Migrations service

Type: ReferenceCreated: Team: Platform
draft
About this prompt

This is the exact review prompt the AI reviewer runs for the migrations service pull requests, applied on top of the shared review rules. Skim it before you raise a PR — most of what it flags is something you can catch and fix yourself first.
# Skapp Migrations Review Guidelines

> Spring Boot 3.5 + Java 21 + Liquibase (SQL format) + HikariCP + Lombok + MySQL + PostgreSQL

## Architecture

Standalone Spring Boot microservice that manages database schema migrations across the Skapp multi-tenant platform. It runs Liquibase programmatically (not auto-run) against a master MySQL database, per-tenant MySQL databases, and per-tenant PostgreSQL schemas.

### Component flow
```
Controller (v1/) → Service (interface) → ServiceImpl → DataSourceConfig → DataSourceFactory → HikariCP pool → DB
                                              ↳ TenantContext (set/clear)            ↳ Liquibase.update()
```
ServiceImpl sets `TenantContext`, gets a tenant-specific `DataSource` from `DataSourceConfig`/`DataSourceFactory`, passes it to `Liquibase` for schema migration, then clears context and closes the pool.

### Multi-tenancy model
- **Master DB** (MySQL): shared platform-level data
- **Tenant DBs** (MySQL): one database per tenant
- **Tenant schemas** (PostgreSQL): one schema per tenant within a shared database

### Package structure
```
com.skapp.migration/
  config/          → Configuration classes
  constant/        → Shared constants
  controller/v1/   → REST endpoints
  enums/           → Domain enums
  exception/       → Custom exceptions and global handler
  filters/         → Request filters
  model/           → JPA entities
  payload/         → Request/response DTOs
  repository/      → Spring Data JPA repositories
  security/        → Authentication
  service/         → Service interfaces
  service/impl/    → Service implementations
  utils/           → Utility classes
```

### Key conventions
- **Constructor injection** via Lombok `@RequiredArgsConstructor` — never field injection
- **Interface-first services** — every service has an interface, implementation is `*ServiceImpl`
- **Programmatic Liquibase** — migrations are executed via `Liquibase.update()` in service code, not Spring Boot auto-configuration (`spring.liquibase.enabled: false`)
- **API key authentication** — two key types: `SERVICE_API` (internal service-to-service) and `CLIENT_API` (admin-triggered operations)
- **TenantContext** — thread-local context holding current tenant, database type, operation type, and process type
- **Connection lifecycle** — tenant datasources are created on-demand via `DataSourceFactory`, cached in a `ConcurrentHashMap`, and closed after migration via `closeTenantDataSource()`
- **Migration locking** — in-memory `ConcurrentHashMap`-based lock prevents concurrent migration runs for the same scope

---

## SQL Migration File Standards

### File naming convention
```
NNNNN_action_description[.dialect].sql
```
- **5-digit zero-padded sequential number** (`00001`, `00024`, `00079`)
- **Action prefix in filename**: `create_table_*`, `alter_table_*`, `drop_*`, `create_triggers_*`
- **Descriptive suffix**: table name + what changed (e.g., `alter_table_organization_add_column_partner_id`)
- **Dialect suffix** (optional): `.mysql.sql` for initial scripts, plain `.sql` for incremental MySQL migrations, `.postgresql.sql` for PostgreSQL
- Numbers must be strictly sequential with no gaps or duplicates within a changelog

### Liquibase header format
Every SQL file must start with:
```sql
-- liquibase formatted sql

-- changeset AuthorName:NNNNN_description
```
- Author is the developer's name (PascalCase, e.g., `AkilaSilva`, `ErandiDeSilva`, `PrasadL`)
- Changeset ID matches the filename (without extension)
- One changeset per file — never multiple changesets in a single file

### Rollback comments
Every migration must include a rollback comment at the end:
```sql
-- rollback DROP TABLE `table_name`;
-- rollback ALTER TABLE `table_name` DROP COLUMN `column_name`;
```
- For multi-column ALTERs, each rollback statement on its own `-- rollback` line
- For FK drops, rollback must drop both the constraint and the auto-created index:
  ```sql
  -- rollback ALTER TABLE `table` DROP FOREIGN KEY FK_name;
  -- rollback ALTER TABLE `table` DROP INDEX FK_name;
  -- rollback ALTER TABLE `table` DROP COLUMN `column`;
  ```

### Changelog YAML registration
Every new SQL file must be added to the appropriate changelog YAML:
- MySQL master: `db/changelog/mysql/master/master-db.mysql.changelog.yml`
- MySQL tenant: `db/changelog/mysql/tenant/tenant-db.mysql.changelog.yml`
- PostgreSQL tenant: `db/changelog/postgresql/tenant/tenant-db.postgresql.changelog.yml`

Entry format:
```yaml
  - include:
      file: db/changelog/mysql/tenant/core/ddl-scripts/NNNNN_description.sql
```

---

## MySQL Schema Conventions

### Table creation
```sql
CREATE TABLE IF NOT EXISTS `table_name`
(
    `id`                 bigint       NOT NULL AUTO_INCREMENT,
    ...columns...
    `created_by`         varchar(255) DEFAULT NULL,
    `created_date`       datetime(6)  DEFAULT NULL,
    `last_modified_by`   varchar(255) DEFAULT NULL,
    `last_modified_date` datetime(6)  DEFAULT NULL,
    PRIMARY KEY (`id`),
    ...keys and constraints...
);
```

### Data types
| Use case | Type | Notes |
|----------|------|-------|
| Primary key | `bigint NOT NULL AUTO_INCREMENT` | Always `bigint`, never `int` |
| Boolean | `bit(1)` | Use `DEFAULT b'1'` or `b'0'` for defaults |
| Short string | `varchar(N)` | Use appropriate lengths: names `varchar(50-255)`, emails `varchar(255)`, phone `varchar(15)`, URLs `varchar(2083)` |
| Long text | `text` | For unbounded content (signatures, descriptions, JSON strings) |
| Enum values | `varchar(255)` | Stored as strings, not MySQL `ENUM` type (exception: legacy `time_config.day_of_week`, `employee_personal_info.blood_group`) |
| Fixed enum | `enum(...)` | Only for truly fixed sets inherited from older schemas — new enums use `varchar(255)` |
| Date | `date` | For date-only values |
| Timestamp | `datetime(6)` | Microsecond precision for audit fields |
| Nullable timestamp | `timestamp DEFAULT NULL` | For optional timestamp fields |
| Monetary | `double` | Used for amounts (e.g., tax_amount) |
| Counts | `int` | For counters, use `NOT NULL DEFAULT 0` |
| UUID | `varchar(36)` | For external-facing identifiers |
| JSON | `json` | For structured nested data |

### Naming conventions
| Element | Pattern | Example |
|---------|---------|---------|
| Table | `snake_case`, singular or domain-prefixed | `employee`, `es_envelope`, `in_customer`, `leave_type` |
| Column | `snake_case` | `employee_id`, `created_date`, `is_active` |
| Primary key | `PRIMARY KEY (\`column\`)` | Inline, no named constraint |
| Foreign key | `FK_table_reference_column` | `FK_employee_job_family_job_family_id` |
| Index | `IDX_table_column` | `IDX_employee_job_family_id` |
| Unique key | `UK_table_column` | `UK_user_email`, `UK_ai_tokens_user_id` |
| Boolean column | `is_*` prefix | `is_active`, `is_verified`, `is_deleted` |

### Domain prefixes
Tables are prefixed by module:
- `es_*` — eSign module (e.g., `es_envelope`, `es_document`, `es_recipient`)
- `in_*` — Invoicing module (e.g., `in_customer`, `in_tax`)
- No prefix — core HR/people module (e.g., `employee`, `leave_type`, `team`)

### Audit columns
All new tables with business data must include the standard audit columns:
```sql
`created_by`         varchar(255) DEFAULT NULL,
`created_date`       datetime(6)  DEFAULT NULL,
`last_modified_by`   varchar(255) DEFAULT NULL,
`last_modified_date` datetime(6)  DEFAULT NULL,
```

### Foreign keys
- Always declare FK constraints explicitly with `CONSTRAINT` keyword
- Always add an index (`KEY`) on FK columns for query performance
- FK constraint and its index should share the same name
- Use `REFERENCES` with the full `table (column)` syntax
- Cascade rules: use sparingly — only `ON DELETE CASCADE` on true parent-child composition relationships (e.g., `support_request_attachment` → `support_request`)

---

## PostgreSQL Schema Conventions

### Table creation
```sql
CREATE TABLE IF NOT EXISTS table_name
(
    id         BIGSERIAL NOT NULL,
    ...columns...
    created_by BIGINT,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_by BIGINT,
    updated_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,

    CONSTRAINT pk_table_name PRIMARY KEY (id),
    ...constraints...
);
```

### Key differences from MySQL
| Aspect | MySQL | PostgreSQL |
|--------|-------|------------|
| Auto-increment | `bigint NOT NULL AUTO_INCREMENT` | `BIGSERIAL NOT NULL` |
| Strings | `varchar(N)` with backticks | `TEXT` (unquoted), occasionally `VARCHAR(N)` for constrained fields |
| Boolean | `bit(1)` | `BOOLEAN` with `DEFAULT FALSE/TRUE` |
| Timestamp | `datetime(6)` | `TIMESTAMP` |
| JSON | `json` | `JSONB` (binary, indexable) |
| Identifier quoting | Backticks `` ` `` | No quoting (lowercase names) or double quotes `"` for special cases |
| Audit columns | `created_by VARCHAR(255)`, `created_date DATETIME(6)` | `created_by BIGINT`, `created_at TIMESTAMP` |
| PK naming | Unnamed `PRIMARY KEY` | Named `CONSTRAINT pk_table_name PRIMARY KEY` |
| FK naming | `FK_Table_Ref_Column` | `fk_table_ref_column` (lowercase) |
| UK naming | `UK_Table_Column` | `uk_table_constraint` (lowercase) |
| Cascade | Rare | `ON DELETE CASCADE` is common |
| Collation | Default | `COLLATE "C"` for sort-sensitive TEXT columns (e.g., fractional indexes) |

---

## What to flag

### Migration file issues
- Missing `-- liquibase formatted sql` header — every file must start with this
- Missing or malformed changeset comment — must be `-- changeset Author:changeset_id`
- Missing rollback comment — every migration must have a rollback strategy
- Changeset ID not matching filename
- Author name not in PascalCase format
- Multiple changesets in a single file
- Sequence number gaps or duplicates within the same changelog
- New SQL file not registered in the corresponding changelog YAML
- Changelog YAML entry order not matching the sequential file numbering
- File placed in wrong directory (MySQL migration in PostgreSQL folder or vice versa)

### Schema design issues
- Missing `IF NOT EXISTS` on CREATE TABLE — all creates must be idempotent
- Missing `IF EXISTS` on DROP TABLE/DROP TRIGGER in rollbacks
- Using `int` for primary keys — must be `bigint` (MySQL) or `BIGSERIAL` (PostgreSQL)
- Missing audit columns on new business tables (`created_by`, `created_date`, `last_modified_by`, `last_modified_date` for MySQL; `created_by`, `created_at`, `updated_by`, `updated_at` for PostgreSQL)
- Missing index on foreign key columns in MySQL — every FK column needs a `KEY`
- Unnamed constraints — all FK, UK constraints must be explicitly named following the naming convention
- Column type inconsistencies with existing tables (e.g., using `varchar(100)` for an email when the convention is `varchar(255)`)
- Using MySQL `ENUM` type for new tables — use `varchar(255)` instead (enums are managed at application level)
- Missing `NOT NULL` on columns that should never be null (especially PKs, required business fields)
- Missing `DEFAULT` values where appropriate (booleans, counters, status fields)
- Overly broad `CASCADE` rules — only use on true parent-child compositions
- `varchar(255)` for fields that clearly need constraints (phone should be `varchar(15)`, country `varchar(100)`)

### ALTER TABLE issues
- Adding a `NOT NULL` column without a `DEFAULT` value — will fail on tables with existing data
- Changing column types without considering data loss (e.g., `varchar(255)` → `varchar(50)`)
- Dropping columns that might be referenced by application code or other FK constraints
- Multi-concern migrations — each ALTER should address a single logical change
- Missing rollback for ALTER operations (especially column type changes — rollback must specify the original type)

### Trigger & advanced DDL
- Triggers without matching drop trigger migration (lifecycle management)
- Stored procedures or complex DDL without thorough rollback strategy
- Triggers that could interfere with Liquibase's own operations on tracked tables

### Data safety
- DML statements (INSERT, UPDATE, DELETE) in DDL migration files without explicit transaction handling
- Destructive operations (DROP TABLE, DROP COLUMN) without confirmation that no application code references the dropped object
- Migrations that modify data in `DATABASECHANGELOG` or `DATABASECHANGELOGLOCK` tables directly

---

## Java Code Standards

### Service patterns
- All services must have an interface and `*ServiceImpl` implementation
- Use `@RequiredArgsConstructor` for dependency injection — never `@Autowired` on fields
- Use `@Slf4j` for logging — never `System.out.println`
- Configuration values via `@Value("${...}")` — never hardcoded connection strings or credentials

### Tenant context management
- Always set tenant context before database operations: `TenantContext.setContext(tenantId, dbType, opType, processType)`
- Always clear context in `finally` block: `TenantContext.clear()`
- Always close tenant datasource after migration: `dataSourceConfig.closeTenantDataSource(tenantId)`
- Pattern:
  ```java
  try {
      TenantContext.setContext(tenantId, DatabaseType.MYSQL, OperationType.WRITE, ProcessType.TENANT_MIGRATION);
      // ... perform work ...
  } finally {
      TenantContext.clear();
      dataSourceConfig.closeTenantDataSource(tenantId);
  }
  ```

### Connection & resource management
- All JDBC connections must use try-with-resources: `try (Connection conn = ds.getConnection())`
- All PreparedStatements and ResultSets must be closed via try-with-resources
- Never use string concatenation for SQL queries with user input — use `PreparedStatement` with `?` placeholders
- Exception: database/schema names cannot be parameterized — validate with `isInvalidDatabaseName()` regex before concatenation
- Batch operations must use `addBatch()` + `executeBatch()` for bulk inserts/updates

### Error handling
- Throw `ModuleException` with an `ErrorMessage` enum constant — never raw `RuntimeException` with string messages (exception: `MasterMigrationServiceImpl.handleMigrationError` wraps as `RuntimeException` — this is a known pattern for master-level errors)
- All new error scenarios must have a corresponding entry in the `ErrorMessage` enum
- Log errors with stack trace: `log.error("message", exception)` — not just the message
- Migration failures for individual tenants should be caught and logged, not abort the entire batch

### Controller conventions
- All endpoints under `/v1/` path prefix
- `@PreAuthorize` with `hasAuthority('CLIENT_API')` for admin operations or `hasAuthority('SERVICE_API')` for internal service calls
- Return `ResponseEntity<Object>` with appropriate HTTP status codes
- Catch `ModuleException` for known error cases and return meaningful HTTP status (e.g., `409 CONFLICT` for already-running migrations)

### Security
- API keys read from `x-api-key` header — never from query params or request body
- Separate `SecurityFilterChain` beans for each API path prefix with correct `@Order`
- Never log full API keys — only log masked/partial identifiers

### Naming
| Element | Pattern | Example |
|---------|---------|---------|
| Service interface | `*Service` | `TenantMigrationService`, `MasterMigrationService` |
| Service implementation | `*ServiceImpl` | `TenantMigrationServiceImpl` |
| Controller | `*Controller` | `MigrationController` |
| Entity | PascalCase noun | `Tenant`, `Organization` |
| DTO | `*Dto`, `*Request`, `*Response` | `SchedulerRequest`, `HealthResponse`, `MigrationExecutionResponseDto` |
| Enum | PascalCase | `DatabaseType`, `OperationType`, `ErrorMessage` |
| Constants | `UPPER_SNAKE_CASE` | `MASTER_DATABASE`, `API_KEY_HEADER` |

---

## What NOT to flag (false positives)

- Lombok usage (`@Getter`, `@Setter`, `@RequiredArgsConstructor`, `@Slf4j`, `@Builder`) — this is standard
- `spring.liquibase.enabled: false` — migrations are executed programmatically, not auto-run
- `spring.jpa.hibernate.ddl-auto: none` — all DDL is managed by Liquibase, never Hibernate auto
- `spring.jpa.open-in-view: true` — acceptable for this service's request scope
- Direct JDBC usage (PreparedStatement, ResultSet) in service layer — this is the established pattern for migration-specific data operations, not a sign of missing JPA
- Raw SQL in data migration services (e.g., `EsMigrationServiceImpl`, `TenantMigrationServiceImpl`) — these perform bulk operations that don't map well to JPA
- `ConcurrentHashMap`-based locking in `MigrationLockService` — intentionally in-memory, sufficient for single-instance deployment
- String concatenation for database/schema names in SQL — these are validated before use, cannot be parameterized
- `DataSourceFactory` cache using `computeIfAbsent` — this is the intended connection pooling strategy
- No unit tests for migration SQL — SQL migrations are validated by Liquibase checksums and integration testing
- `spring-javaformat-maven-plugin` handling formatting — don't flag style issues caught by the formatter
- PostgreSQL `TEXT` type instead of `VARCHAR` — this is the PostgreSQL convention in this project
- Different audit column naming between MySQL (`created_date`/`last_modified_date`) and PostgreSQL (`created_at`/`updated_at`) — these are intentionally different patterns for the two database engines
- Duplicate sequence numbers across different changelogs (e.g., `00009` appears multiple times in PostgreSQL changelog) — these were registered and are tracked by Liquibase by their full changeset ID, not just the number

---

## Checklist

Verify each item for every changed file:

### SQL migration files
1. **Liquibase header**: Starts with `-- liquibase formatted sql`? Changeset comment matches filename? Author in PascalCase?
2. **Rollback present**: Every migration has `-- rollback` comment(s)? FK drops include both constraint and index?
3. **Changelog registered**: New SQL file added to the correct changelog YAML in the right order?
4. **Idempotent DDL**: `CREATE TABLE IF NOT EXISTS` used? `IF EXISTS` in rollback drops?
5. **Data types correct**: `bigint` (not `int`) for PKs? `bit(1)` for MySQL booleans / `BOOLEAN` for PostgreSQL? `datetime(6)` for MySQL timestamps / `TIMESTAMP` for PostgreSQL? `varchar(255)` for enums?
6. **Audit columns present**: New business tables include `created_by`, `created_date`, `last_modified_by`, `last_modified_date` (MySQL) or `created_by`, `created_at`, `updated_by`, `updated_at` (PostgreSQL)?
7. **FK indexes**: Every FK column in MySQL has a corresponding `KEY` index?
8. **Naming conventions**: FK/IDX/UK prefixes follow established patterns? Table names use correct domain prefix (`es_*`, `in_*`)?
9. **Safe ALTERs**: `NOT NULL` columns have `DEFAULT`? Column type changes won't lose data? Single concern per migration file?
10. **Sequence numbers**: No gaps or duplicates within the same changelog?

### Java code
1. **Tenant context lifecycle**: `TenantContext.setContext()` called before DB ops? `TenantContext.clear()` in `finally`? `closeTenantDataSource()` in `finally`?
2. **Resource management**: All `Connection`, `PreparedStatement`, `ResultSet` in try-with-resources?
3. **Error handling**: `ModuleException` with `ErrorMessage` enum (not raw `RuntimeException`)? New error scenarios have `ErrorMessage` entries?
4. **SQL injection**: User input parameterized with `?`? DB names validated with `isInvalidDatabaseName()` before concatenation?
5. **Naming**: Service interface + `*ServiceImpl` pattern? Controller under `/v1/`? `@PreAuthorize` with correct authority?
About this prompt​

About this prompt