
Trt-1989: partition management #3334

Open
neisw wants to merge 5 commits into openshift:main from neisw:trt-1989-partition-management

Conversation

Contributor

@neisw neisw commented Mar 17, 2026

  • Adds migration helpers for one-off partition migration steps
  • Adds FK analysis for pre-migration review
  • Adds an example delete helper to handle deletes without relying on cascading deletes / FK constraints
  • Adds partition management for:
    • Updating partitioned tables
    • Detaching old partitions
    • Dropping older detached partitions

Summary by CodeRabbit

  • New Features

    • End-to-end partition management: discovery, stats, automated partition creation/aging, attach/detach, safe removal and dry-run support
    • Database migration toolkit: create, update and finalize partitioned tables, incremental data migration, identity/sequence sync, and partition-aware renames
    • Partition impact analysis and controlled deletion of test data
  • Refactor

    • Models and migration flow adjusted to support partitioned tables and explicit timestamp/primary-key handling
  • Tests

    • Extensive unit tests covering partitioning, validation, SQL generation and migration helpers

@openshift-ci-robot

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after the lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will issue /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To manually trigger all second-stage jobs, use the /pipeline required command.

This repository is configured in: automatic mode

@openshift-ci openshift-ci bot requested review from dgoodwin and petr-muller March 17, 2026 12:33
@openshift-ci
Contributor

openshift-ci bot commented Mar 17, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: neisw

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details: Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 17, 2026
@coderabbitai
Contributor

coderabbitai bot commented Mar 17, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: da10bc4d-7820-48e4-b121-b9aac6ba875c

📥 Commits

Reviewing files that changed from the base of the PR and between 98d456e and a297fe9.

📒 Files selected for processing (2)
  • pkg/db/partitions.go
  • pkg/db/utils.go

Walkthrough

Adds a comprehensive PostgreSQL partitioning and migration subsystem, migrates prow test models to explicit ID/timestamp fields, integrates partition lifecycle management into DB init and the Prow dataloader, and adds extensive partition- and migration-focused tests.

Changes

Cohort / File(s) — Summary

  • Partition engine & utilities (pkg/db/partitions.go, pkg/db/utils.go): New end-to-end partition management and DB utilities: discovery, stats, create/attach/detach/drop, retention analysis, partitioned table creation/update, FK/sequence handling, schema introspection, migration helpers, rename/sync utilities, dry-run and transactional safety. Large API surface with many exported types and functions added.
  • Partition migration APIs (pkg/db/migration.go): New staged migration APIs: MigrateToPartitionedTable, UpdatePartitionedTableMigration, FinalizePartitionedTableMigration, AnalyzePartitioningImpact, and DeleteTestsByName, with logging and dry-run support.
  • DB init/migration flow change (pkg/db/db.go): Reordered the migration flow to handle partitioned models explicitly: removed AutoMigrate for partitioned models and invokes UpdatePartitionedTable in a partitioned-migration loop instead; retained the syncPartitionedTables call with TODOs.
  • Model changes for partitioning (pkg/db/models/prow.go): Replaced gorm.Model on ProwJobRunTest and ProwJobRunTestOutput with explicit ID, CreatedAt, UpdatedAt, DeletedAt fields; added composite/index annotations and adjusted foreign-key/index metadata for partition compatibility.
  • Partition tests (pkg/db/partitions_test.go, pkg/db/utils_test.go): Extensive unit tests covering partition-name validation, PartitionConfig validation and SQL generation, data-type normalization, schema/migration verification scaffolding, and related utilities and edge cases.
  • Dataloader partition lifecycle (pkg/dataloader/prowloader/prow.go): Added PartitionManagementConfig and partitionConfigs; new helper methods updatePartitions, agePartitions, and preparePartitions; replaced ad-hoc per-date partition creation with a centralized partition-management loop; expanded partition-related logging and error handling.
  • Minor cleanup & comments (pkg/dataloader/prowloader/prow.go): Removed an unused import and added a comment in extractTestCases noting reliance on created_at for test entries.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes


Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 2 warnings)

  • Sql Injection Prevention — ❌ Error: CREATE INDEX statements in partitions.go (lines 1204-1206 and 1566-1603) lack pq.QuoteIdentifier() protection for table/column identifiers, creating SQL injection vulnerabilities. Resolution: wrap all table and column identifiers in CREATE INDEX statements with pq.QuoteIdentifier() and add a code-review checklist item to keep protection consistent across all SQL statement construction.
  • Go Error Handling — ⚠️ Warning: the PR contains multiple Go error-handling violations, including ignored errors, panic() outside init(), and unwrapped error returns without context. Resolution: remove ignored error assignments, replace panic() with proper error returns, and refactor bare return err statements to use fmt.Errorf with the %w verb for context.
  • Single Responsibility And Clear Naming — ⚠️ Warning: the pull request violates the Single Responsibility Principle: utils.go is 2,159 lines spanning 8 conceptual domains, 45 new public methods were added to the DB type, and partition logic is scattered across multiple files. Resolution: restructure into focused modules (schema.go, datamigration.go, sequences.go, foreignkeys.go, tableops.go), consolidate partition operations in partitions.go, move misplaced methods, and restrict migration.go to migration operations only.
✅ Passed checks (1 passed)
  • Excessive Css In React Should Use Styles — ✅ Passed: this custom check is not applicable, as the PR exclusively modifies Go backend files for database management, with no React/JavaScript component changes.
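
The failed SQL-injection check above asks for pq.QuoteIdentifier() around identifiers in CREATE INDEX statements. A minimal sketch of the idea, using a local quoteIdent that mirrors pq.QuoteIdentifier's escaping (double the identifier's embedded double quotes and wrap it in double quotes); the index and table names here are illustrative, not the PR's actual statements:

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdent mirrors pq.QuoteIdentifier: wrap the name in double quotes
// and double any embedded double quotes so a hostile identifier cannot
// break out of the quoted context.
func quoteIdent(name string) string {
	return `"` + strings.ReplaceAll(name, `"`, `""`) + `"`
}

// createIndexSQL builds a CREATE INDEX statement with every identifier quoted.
func createIndexSQL(indexName, tableName string, columns []string) string {
	quoted := make([]string, len(columns))
	for i, c := range columns {
		quoted[i] = quoteIdent(c)
	}
	return fmt.Sprintf("CREATE INDEX IF NOT EXISTS %s ON %s (%s)",
		quoteIdent(indexName), quoteIdent(tableName), strings.Join(quoted, ", "))
}

func main() {
	fmt.Println(createIndexSQL("idx_created_at", "prow_job_run_tests", []string{"created_at"}))
	// prints: CREATE INDEX IF NOT EXISTS "idx_created_at" ON "prow_job_run_tests" ("created_at")
}
```

In the real code the github.com/lib/pq package's QuoteIdentifier should be used directly rather than a hand-rolled copy.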

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.11.3)

Error: can't load config: unsupported version of the configuration: "" (see https://golangci-lint.run/docs/product/migration-guide for migration instructions)
The command was terminated due to this error.


Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 7

🧹 Nitpick comments (1)
pkg/db/utils_test.go (1)

121-183: These tests never exercise the new DB helpers.

These cases only validate local literals or log text, so they all pass even if the production migration/partition helpers regress. For this surface area, either call the real APIs with assertions or move the examples into package docs.
As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."

Also applies to: 251-480

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/db/utils_test.go` around lines 121 - 183, The
TestMigrateTableDataValidation unit test only asserts on literals and never
calls the real migration helpers (e.g., the new migrateTableData / partition
helper functions), so it gives false confidence; update the test to either (A)
call the actual helpers under test with a lightweight in-memory or test DB
fixture and assert their behavior (invoke migrateTableData/PartitionHelper
methods with representative inputs, verify returned errors and SQL/results) or
(B) if you want to keep it as documentation, move these cases into package
examples or godoc and add real integration tests elsewhere; ensure the test
references and exercises the actual functions (migrateTableData,
PartitionHelper, or whatever new DB helper names exist) and covers
dryRun=true/false, omitted columns, and error paths.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/db/db.go`:
- Around line 127-129: The current loop calls UpdatePartitionedTable for each
partitionedModel in partitionedModelsToMigrate but fails when the physical
partitioned table doesn't exist; modify the flow to ensure the base partitioned
table exists before calling UpdatePartitionedTable by either invoking the
table-creation routine (e.g., call CreatePartitionedTable or run AutoMigrate for
partitionedModel.model/tableName) or update UpdatePartitionedTable to detect a
missing table and create it instead; reference partitionedModelsToMigrate,
UpdatePartitionedTable, AutoMigrate, and the partition sync so the missing-table
creation happens prior to the UpdatePartitionedTable call to avoid bootstrapping
failures.
- Around line 121-123: The tableName for the models.ProwJobRunTestOutput schema
entry is incorrect ("prow_job_run_tests_outputs") and must be changed to the
canonical "prow_job_run_test_outputs" so migrations target the real table;
locate the struct literal that sets model: &models.ProwJobRunTestOutput{} and
update its tableName field to "prow_job_run_test_outputs" to match other call
sites.

In `@pkg/db/migration.go`:
- Around line 168-170: The resume window is inclusive because
MigrateTableDataRange uses ">= startDate", so passing maxDate (or migrateFrom)
causes rows at that exact timestamp to be re-copied; to fix it, advance the
start timestamp by the smallest representable unit before passing into
MigrateTableDataRange (e.g., compute a next-timestamp from maxDate and use that
as the start) for the calls that pass maxDate/migrateFrom (references:
MigrateTableDataRange, maxDate, migrateUpTo and the other call sites noted
around lines 256-258) so resumed runs become exclusive of the already-migrated
max timestamp.
- Around line 391-409: The logic in AnalyzePartitioningImpact incorrectly
hardcodes "created_at" as the partition key when checking inbound FKs; change it
to use the actual partition column(s) discovered for the target table (e.g., the
variable holding partition columns like partCols/partitionCols) instead of
hasCreatedAt and "created_at" literals: compute whether the source has all
required partition column(s) (e.g., check each name in partitionCols against
srcColSet), set fields["source_has_<col>"] or a single boolean like
fields["source_has_partition_cols"], and update the fields["reason"] messages
and UNIQUE-constraint suggestion to reference the real partition column list
(join partitionCols) and fk.ReferencedColumns accordingly so the DROP vs EXPAND
decision and explanatory text reflect the true partition key(s) rather than
always using created_at.

In `@pkg/db/partitions.go`:
- Around line 1604-1612: The index cleanup in UpdatePartitionedTable is dropping
PostgreSQL constraint-backed indexes (created by MoveForeignKeys as UNIQUE
constraints) because currentIdxMap contains them; modify UpdatePartitionedTable
to first query pg_constraint/pg_class to collect names of indexes backing
constraints (e.g., select conindid/pg_get_indexdef or join pg_constraint ->
pg_class to get index names) into a set (constraintIdxSet) and then skip any
idxName that is in constraintIdxSet (in addition to the existing
strings.HasSuffix(idxName, "_pkey") check) when iterating currentIdxMap; ensure
the new check references currentIdxMap, UpdatePartitionedTable and Skip
constraint-backed indexes so the UNIQUE constraints retain their backing
indexes.

In `@pkg/db/utils.go`:
- Around line 699-725: VerifyPartitionCoverage currently treats the date range
as inclusive of the day containing endDate, but the helper intends [startDate,
endDate) semantics; change how endDay is computed so the day containing endDate
is excluded. Specifically, in VerifyPartitionCoverage (and where endDay is set),
replace endDay := endDate.Truncate(24 * time.Hour) with endDay :=
endDate.Add(-time.Nanosecond).Truncate(24 * time.Hour) (or equivalent
subtraction of a tiny duration) so the loop over currentDate only checks up to
the last day fully before endDate; keep the rest of the loop, existingDates map
and getPartitionsInDateRange usage unchanged.
- Around line 1603-1628: The current validation skips checks when a target is
also a source (tableRenames/tableRenameMap) but the sequential ALTER TABLE
execution still fails for cycles (e.g., A→B, B→A); update the rename logic to
detect cycles in the rename graph and perform a two-phase rename instead: first
compute strongly connected components or any cycle detection over tableRenames,
then for any cycle rename all source tables to unique temporary names (e.g.,
append a safe suffix or GUID), execute those ALTER TABLE ... RENAME TO temp
operations, and finally rename temps to their intended targets; modify the
execution loop that issues the ALTER TABLE statements to use this temp-phase for
cycles and fall back to direct single-step renames for acyclic mappings.

---

Nitpick comments:
In `@pkg/db/utils_test.go`:
- Around line 121-183: The TestMigrateTableDataValidation unit test only asserts
on literals and never calls the real migration helpers (e.g., the new
migrateTableData / partition helper functions), so it gives false confidence;
update the test to either (A) call the actual helpers under test with a
lightweight in-memory or test DB fixture and assert their behavior (invoke
migrateTableData/PartitionHelper methods with representative inputs, verify
returned errors and SQL/results) or (B) if you want to keep it as documentation,
move these cases into package examples or godoc and add real integration tests
elsewhere; ensure the test references and exercises the actual functions
(migrateTableData, PartitionHelper, or whatever new DB helper names exist) and
covers dryRun=true/false, omitted columns, and error paths.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 39e750cc-5817-48ab-bfbb-df0cc3ff2816

📥 Commits

Reviewing files that changed from the base of the PR and between 5bbcb4d and af7e262.

📒 Files selected for processing (8)
  • pkg/dataloader/prowloader/prow.go
  • pkg/db/db.go
  • pkg/db/migration.go
  • pkg/db/models/prow.go
  • pkg/db/partitions.go
  • pkg/db/partitions_test.go
  • pkg/db/utils.go
  • pkg/db/utils_test.go

Comment on lines +127 to +129
for _, partitionedModel := range partitionedModelsToMigrate {
	if _, err := d.UpdatePartitionedTable(partitionedModel.model, partitionedModel.tableName, false, false); err != nil {
		return err

⚠️ Potential issue | 🔴 Critical

This path only works if the partitioned tables already exist.

UpdatePartitionedTable errors when the target table is missing. Since these models were removed from AutoMigrate, any environment bootstrapping from an empty schema will stop here before the later partition sync can create them.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/db/db.go` around lines 127 - 129, The current loop calls
UpdatePartitionedTable for each partitionedModel in partitionedModelsToMigrate
but fails when the physical partitioned table doesn't exist; modify the flow to
ensure the base partitioned table exists before calling UpdatePartitionedTable
by either invoking the table-creation routine (e.g., call CreatePartitionedTable
or run AutoMigrate for partitionedModel.model/tableName) or update
UpdatePartitionedTable to detect a missing table and create it instead;
reference partitionedModelsToMigrate, UpdatePartitionedTable, AutoMigrate, and
the partition sync so the missing-table creation happens prior to the
UpdatePartitionedTable call to avoid bootstrapping failures.

Comment on lines +391 to +409
// Assume partitioning on created_at (most common case) — check if source has it
hasCreatedAt := srcColSet["created_at"]
fields["source_has_created_at"] = hasCreatedAt

if len(srcPartCols) > 0 {
// Source table is also partitioned
fields["action"] = "DROP"
fields["reason"] = "both tables are partitioned; FK constraints between partitioned tables are not supported"
} else if !hasCreatedAt {
// Source table lacks the partition key column — FK must be dropped
fields["action"] = "DROP"
fields["reason"] = fmt.Sprintf("source table %s lacks created_at column needed to reference partitioned %s; FK cannot be expanded",
fk.SourceTable, tableName)
} else {
// Source table has the partition key — FK could be expanded
fields["action"] = "EXPAND or DROP"
fields["reason"] = fmt.Sprintf("source table %s has created_at column; FK could be expanded to include partition key, "+
"but this requires a UNIQUE constraint on %s(%s, created_at) and adds complexity",
fk.SourceTable, tableName, fk.ReferencedColumns)

⚠️ Potential issue | 🟠 Major

AnalyzePartitioningImpact is only correct for created_at partitions.

This helper already looks up the table's real partition columns, but the inbound-FK branch ignores them and hard-codes created_at. For any table partitioned on a different key, the report will recommend dropping or expanding the wrong FK shape.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/db/migration.go` around lines 391 - 409, The logic in
AnalyzePartitioningImpact incorrectly hardcodes "created_at" as the partition
key when checking inbound FKs; change it to use the actual partition column(s)
discovered for the target table (e.g., the variable holding partition columns
like partCols/partitionCols) instead of hasCreatedAt and "created_at" literals:
compute whether the source has all required partition column(s) (e.g., check
each name in partitionCols against srcColSet), set fields["source_has_<col>"] or
a single boolean like fields["source_has_partition_cols"], and update the
fields["reason"] messages and UNIQUE-constraint suggestion to reference the real
partition column list (join partitionCols) and fk.ReferencedColumns accordingly
so the DROP vs EXPAND decision and explanatory text reflect the true partition
key(s) rather than always using created_at.
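
The coverage test the prompt asks for is a small pure check: every discovered partition column must exist in the FK source table before an EXPAND is even possible. A sketch, where partitionCols and srcColSet are shapes assumed from the comment rather than the PR's exact identifiers:

```go
package main

import "fmt"

// sourceHasPartitionCols reports whether the FK source table contains
// every column in the target table's partition key, which is the
// precondition for expanding the FK instead of dropping it.
func sourceHasPartitionCols(partitionCols []string, srcColSet map[string]bool) bool {
	for _, col := range partitionCols {
		if !srcColSet[col] {
			return false
		}
	}
	return true
}

func main() {
	src := map[string]bool{"id": true, "created_at": true}
	fmt.Println(sourceHasPartitionCols([]string{"created_at"}, src))     // prints "true"
	fmt.Println(sourceHasPartitionCols([]string{"partition_time"}, src)) // prints "false"
}
```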

Comment on lines +1604 to +1612
// Drop indexes that are no longer in the model
for idxName := range currentIdxMap {
// Skip primary key and system indexes
if strings.HasSuffix(idxName, "_pkey") {
continue
}
alterStatements = append(alterStatements,
fmt.Sprintf("DROP INDEX IF EXISTS %s", idxName))
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the specific lines in question
sed -n '1604,1612p' pkg/db/partitions.go

Repository: openshift/sippy

Length of output: 344


🏁 Script executed:

# Let's also see more context around those lines
sed -n '1590,1625p' pkg/db/partitions.go

Repository: openshift/sippy

Length of output: 1116


🏁 Script executed:

# Find the UpdatePartitionedTable function definition
grep -n "func.*UpdatePartitionedTable" pkg/db/partitions.go

Repository: openshift/sippy

Length of output: 187


🏁 Script executed:

# Find MoveForeignKeys function
grep -n "func.*MoveForeignKeys" pkg/db/utils.go

Repository: openshift/sippy

Length of output: 148


🏁 Script executed:

# Let's look at the MoveForeignKeys function
sed -n '963,1050p' pkg/db/utils.go

Repository: openshift/sippy

Length of output: 2825


🏁 Script executed:

# Check what creates UNIQUE constraints - look for more of MoveForeignKeys
sed -n '963,1100p' pkg/db/utils.go

Repository: openshift/sippy

Length of output: 4334


🏁 Script executed:

# Let's understand how currentIdxMap is built - look earlier in UpdatePartitionedTable
sed -n '1293,1400p' pkg/db/partitions.go

Repository: openshift/sippy

Length of output: 3324


🏁 Script executed:

# See how getCurrentIndexes is implemented
grep -n "func.*getCurrentIndexes" pkg/db/partitions.go

Repository: openshift/sippy

Length of output: 137


🏁 Script executed:

# Read that function
sed -n '1200,1280p' pkg/db/partitions.go

Repository: openshift/sippy

Length of output: 2320


🏁 Script executed:

# Continue reading UpdatePartitionedTable to see how modelIndexes is built
sed -n '1400,1550p' pkg/db/partitions.go

Repository: openshift/sippy

Length of output: 5399


🏁 Script executed:

# Read getCurrentIndexes function
sed -n '1675,1750p' pkg/db/partitions.go

Repository: openshift/sippy

Length of output: 2153


🏁 Script executed:

# Also check what MoveForeignKeys specifically creates - look for UNIQUE constraint creation
sed -n '1050,1200p' pkg/db/utils.go

Repository: openshift/sippy

Length of output: 4972


Exclude constraint-backed indexes from the index cleanup loop.

When MoveForeignKeys creates UNIQUE constraints to support expanded FKs, PostgreSQL automatically creates backing indexes with the constraint name. The index cleanup in UpdatePartitionedTable will drop these indexes (they're in the current database but not in the GORM model definition), leaving the UNIQUE constraints without their backing indexes and corrupting the schema.

Constraint-backed indexes can be identified via the pg_constraint system catalog and should be skipped during cleanup alongside the _pkey indexes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/db/partitions.go` around lines 1604 - 1612, The index cleanup in
UpdatePartitionedTable is dropping PostgreSQL constraint-backed indexes (created
by MoveForeignKeys as UNIQUE constraints) because currentIdxMap contains them;
modify UpdatePartitionedTable to first query pg_constraint/pg_class to collect
names of indexes backing constraints (e.g., select conindid/pg_get_indexdef or
join pg_constraint -> pg_class to get index names) into a set (constraintIdxSet)
and then skip any idxName that is in constraintIdxSet (in addition to the
existing strings.HasSuffix(idxName, "_pkey") check) when iterating
currentIdxMap; ensure the new check references currentIdxMap,
UpdatePartitionedTable and Skip constraint-backed indexes so the UNIQUE
constraints retain their backing indexes.

Comment on lines +699 to +725
func (dbc *DB) VerifyPartitionCoverage(tableName string, startDate, endDate time.Time) error {
partitions, err := dbc.getPartitionsInDateRange(tableName, startDate, endDate)
if err != nil {
return fmt.Errorf("failed to get partitions: %w", err)
}

// Create a map of existing partition dates for quick lookup
existingDates := make(map[string]bool)
for _, p := range partitions {
dateStr := p.PartitionDate.Format("2006-01-02")
existingDates[dateStr] = true
}

// Normalize to calendar days so timestamps within a day are handled correctly
startDay := startDate.Truncate(24 * time.Hour)
endDay := endDate.Truncate(24 * time.Hour)

// Check that we have a partition for each day in the range
var missingDates []string
currentDate := startDay
for !currentDate.After(endDay) {
dateStr := currentDate.Format("2006-01-02")
if !existingDates[dateStr] {
missingDates = append(missingDates, dateStr)
}
currentDate = currentDate.AddDate(0, 0, 1) // Move to next day
}

⚠️ Potential issue | 🟠 Major

The partition coverage check is off by one day.

This helper documents [startDate, endDate), but the loop still validates the day containing endDate itself. A migration ending exactly at midnight should not require a partition for the first day outside the copy window.

Suggested fix
-	// Normalize to calendar days so timestamps within a day are handled correctly
-	startDay := startDate.Truncate(24 * time.Hour)
-	endDay := endDate.Truncate(24 * time.Hour)
+	// Normalize to calendar days so timestamps within a day are handled correctly.
+	// The required coverage is [startDate, endDate), so compute the last covered day
+	// from the instant just before endDate.
+	startDay := startDate.Truncate(24 * time.Hour)
+	lastRequiredDay := endDate.Add(-time.Nanosecond).Truncate(24 * time.Hour)

 	// Check that we have a partition for each day in the range
 	var missingDates []string
 	currentDate := startDay
-	for !currentDate.After(endDay) {
+	for !currentDate.After(lastRequiredDay) {
 		dateStr := currentDate.Format("2006-01-02")
 		if !existingDates[dateStr] {
 			missingDates = append(missingDates, dateStr)
 		}
 		currentDate = currentDate.AddDate(0, 0, 1) // Move to next day
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

Before:

func (dbc *DB) VerifyPartitionCoverage(tableName string, startDate, endDate time.Time) error {
	partitions, err := dbc.getPartitionsInDateRange(tableName, startDate, endDate)
	if err != nil {
		return fmt.Errorf("failed to get partitions: %w", err)
	}
	// Create a map of existing partition dates for quick lookup
	existingDates := make(map[string]bool)
	for _, p := range partitions {
		dateStr := p.PartitionDate.Format("2006-01-02")
		existingDates[dateStr] = true
	}
	// Normalize to calendar days so timestamps within a day are handled correctly
	startDay := startDate.Truncate(24 * time.Hour)
	endDay := endDate.Truncate(24 * time.Hour)
	// Check that we have a partition for each day in the range
	var missingDates []string
	currentDate := startDay
	for !currentDate.After(endDay) {
		dateStr := currentDate.Format("2006-01-02")
		if !existingDates[dateStr] {
			missingDates = append(missingDates, dateStr)
		}
		currentDate = currentDate.AddDate(0, 0, 1) // Move to next day
	}

After:

func (dbc *DB) VerifyPartitionCoverage(tableName string, startDate, endDate time.Time) error {
	partitions, err := dbc.getPartitionsInDateRange(tableName, startDate, endDate)
	if err != nil {
		return fmt.Errorf("failed to get partitions: %w", err)
	}
	// Create a map of existing partition dates for quick lookup
	existingDates := make(map[string]bool)
	for _, p := range partitions {
		dateStr := p.PartitionDate.Format("2006-01-02")
		existingDates[dateStr] = true
	}
	// Normalize to calendar days so timestamps within a day are handled correctly.
	// The required coverage is [startDate, endDate), so compute the last covered day
	// from the instant just before endDate.
	startDay := startDate.Truncate(24 * time.Hour)
	lastRequiredDay := endDate.Add(-time.Nanosecond).Truncate(24 * time.Hour)
	// Check that we have a partition for each day in the range
	var missingDates []string
	currentDate := startDay
	for !currentDate.After(lastRequiredDay) {
		dateStr := currentDate.Format("2006-01-02")
		if !existingDates[dateStr] {
			missingDates = append(missingDates, dateStr)
		}
		currentDate = currentDate.AddDate(0, 0, 1) // Move to next day
	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/db/utils.go` around lines 699 - 725, VerifyPartitionCoverage currently
treats the date range as inclusive of the day containing endDate, but the helper
intends [startDate, endDate) semantics; change how endDay is computed so the day
containing endDate is excluded. Specifically, in VerifyPartitionCoverage (and
where endDay is set), replace endDay := endDate.Truncate(24 * time.Hour) with
endDay := endDate.Add(-time.Nanosecond).Truncate(24 * time.Hour) (or equivalent
subtraction of a tiny duration) so the loop over currentDate only checks up to
the last day fully before endDate; keep the rest of the loop, existingDates map
and getPartitionsInDateRange usage unchanged.

Comment on lines +1603 to +1628
for _, rename := range tableRenames {
source := rename.From
target := rename.To
// Skip check if this target is also a source (table swap scenario)
if _, isAlsoSource := tableRenameMap[target]; isAlsoSource {
log.WithFields(log.Fields{
"source": source,
"target": target,
}).Debug("target is also a source - table swap detected")
continue
}

var exists bool
query := `
SELECT EXISTS (
SELECT 1 FROM pg_tables
WHERE schemaname = 'public' AND tablename = @table_name
)
`
result := dbc.DB.Raw(query, sql.Named("table_name", target)).Scan(&exists)
if result.Error != nil {
return 0, fmt.Errorf("failed to check if target table %s exists: %w", target, result.Error)
}
if exists {
return 0, fmt.Errorf("target table %s already exists (conflict with rename from %s)", target, source)
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Get the full context of the function containing lines 1603-1628
cat -n pkg/db/utils.go | sed -n '1600,1650p'

Repository: openshift/sippy

Length of output: 2226


🏁 Script executed:

# Also check lines 1955-1968 mentioned in "Also applies to"
cat -n pkg/db/utils.go | sed -n '1950,1975p'

Repository: openshift/sippy

Length of output: 903


🏁 Script executed:

# Search for where tableRenames are actually executed (the ALTER TABLE statements)
rg -n "ALTER TABLE" pkg/db/utils.go -A 3 -B 3

Repository: openshift/sippy

Length of output: 3581


🏁 Script executed:

# Search for the full function that contains this rename logic
rg -n "func.*rename" pkg/db/utils.go -i

Repository: openshift/sippy

Length of output: 425


Swap/cycle renames still fail at execution time.

The validation at lines 1603-1628 detects when a target is also a source and skips the pre-execution conflict check. However, this does not address the core problem: the sequential execution at lines 1955-1968 will fail for any cycle (e.g., A→B, B→A). After the first ALTER TABLE A RENAME TO B succeeds, the second statement ALTER TABLE B RENAME TO A fails because table B already exists. Cycles require either temporary intermediate names or a two-phase rename strategy.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/db/utils.go` around lines 1603 - 1628, The current validation skips
checks when a target is also a source (tableRenames/tableRenameMap) but the
sequential ALTER TABLE execution still fails for cycles (e.g., A→B, B→A); update
the rename logic to detect cycles in the rename graph and perform a two-phase
rename instead: first compute strongly connected components or any cycle
detection over tableRenames, then for any cycle rename all source tables to
unique temporary names (e.g., append a safe suffix or GUID), execute those ALTER
TABLE ... RENAME TO temp operations, and finally rename temps to their intended
targets; modify the execution loop that issues the ALTER TABLE statements to use
this temp-phase for cycles and fall back to direct single-step renames for
acyclic mappings.
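A minimal Go sketch of the two-phase strategy the review asks for — `planRenames` is a hypothetical helper, not code from this PR; any source whose target is itself a rename source is routed through a temporary name so swaps never collide:

```go
package main

import (
	"fmt"
	"sort"
)

// planRenames orders ALTER TABLE statements so that rename cycles
// (e.g. a→b together with b→a) go through temporary names: any source
// whose target is itself a rename source is conservatively moved to a
// temp name first, then renamed to its final target in a second phase.
// Hypothetical helper, not code from this PR.
func planRenames(renames map[string]string) []string {
	sources := make(map[string]bool, len(renames))
	keys := make([]string, 0, len(renames))
	for src := range renames {
		sources[src] = true
		keys = append(keys, src)
	}
	sort.Strings(keys) // deterministic statement order

	var phase1, phase2 []string
	for _, src := range keys {
		dst := renames[src]
		if sources[dst] {
			// Target name is still occupied by another rename source.
			tmp := src + "_rename_tmp"
			phase1 = append(phase1, fmt.Sprintf("ALTER TABLE %s RENAME TO %s", src, tmp))
			phase2 = append(phase2, fmt.Sprintf("ALTER TABLE %s RENAME TO %s", tmp, dst))
		} else {
			phase2 = append(phase2, fmt.Sprintf("ALTER TABLE %s RENAME TO %s", src, dst))
		}
	}
	return append(phase1, phase2...)
}

func main() {
	for _, s := range planRenames(map[string]string{"a": "b", "b": "a"}) {
		fmt.Println(s)
	}
}
```

For the swap case this emits both temp renames before either final rename, so no ALTER TABLE ever targets an occupied name; acyclic mappings stay single-step.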

@neisw neisw changed the title Trt 1989 partition management Trt-1989: partition management Mar 17, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 17, 2026
@openshift-ci-robot

openshift-ci-robot commented Mar 17, 2026

@neisw: This pull request references Trt-1989 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

  • Adds migration helpers for one-off partition migration steps
  • Adds FK analysis for pre-migration review
  • Adds example delete helper to handle deletes without relying on cascading deletes / FK constraints
  • Adds partition management for
    • Updating partitioned tables
    • Detaching old partitions
    • Dropping older detached partitions

Summary by CodeRabbit

Release Notes

  • New Features

  • Automated partition lifecycle management with configurable retention policies, automatic aging, and old partition cleanup

  • Database table migration utilities supporting partitioned table transitions with dry-run capabilities

  • Partition discovery, statistics, and retention analysis tools for improved data management

  • Enhanced data models to support partitioned table architecture

  • Tests

  • Comprehensive test coverage for partition management and migration utilities

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot

openshift-ci-robot commented Mar 17, 2026

@neisw: This pull request references Trt-1989 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

  • Adds migration helpers for one-off partition migration steps
  • Adds FK analysis for pre-migration review
  • Adds example delete helper to handle deletes without relying on cascading deletes / FK constraints
  • Adds partition management for
    • Updating partitioned tables
    • Detaching old partitions
    • Dropping older detached partitions

Summary by CodeRabbit

Release Notes

  • New Features

  • End-to-end partition management: discovery, stats, creation, attach/detach, aging and safe removal with dry-run support

  • Database migration utilities to convert and sync tables to partitioned layouts, verify schemas, and migrate data incrementally

  • Tools for partition impact analysis, identity/sequence sync, and controlled deletion of test-related data

  • Tests

  • Extensive unit tests covering partition logic, validation, SQL generation, and migration helpers

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (2)
pkg/db/migration.go (1)

393-411: ⚠️ Potential issue | 🟠 Major

Inbound FK analysis is still hard-coded to created_at.

The function already discovers actual partition columns (tablePartCols), but this branch still assumes created_at, so recommendations are wrong for other partition keys.

Suggested fix
-			// Assume partitioning on created_at (most common case) — check if source has it
-			hasCreatedAt := srcColSet["created_at"]
-			fields["source_has_created_at"] = hasCreatedAt
+			missingPartCols := []string{}
+			for _, partCol := range tablePartCols {
+				hasCol := srcColSet[partCol]
+				fields[fmt.Sprintf("source_has_%s", partCol)] = hasCol
+				if !hasCol {
+					missingPartCols = append(missingPartCols, partCol)
+				}
+			}
@@
-			} else if !hasCreatedAt {
+			} else if len(missingPartCols) > 0 {
 				fields["action"] = "DROP"
-				fields["reason"] = fmt.Sprintf("source table %s lacks created_at column needed to reference partitioned %s; FK cannot be expanded",
-					fk.SourceTable, tableName)
+				fields["reason"] = fmt.Sprintf(
+					"source table %s lacks required partition column(s) [%s] to reference partitioned %s; FK cannot be expanded",
+					fk.SourceTable, strings.Join(missingPartCols, ", "), tableName,
+				)
 			} else {
 				fields["action"] = "EXPAND or DROP"
-				fields["reason"] = fmt.Sprintf("source table %s has created_at column; FK could be expanded to include partition key, "+
-					"but this requires a UNIQUE constraint on %s(%s, created_at) and adds complexity",
-					fk.SourceTable, tableName, fk.ReferencedColumns)
+				fields["reason"] = fmt.Sprintf(
+					"source table %s has required partition column(s); FK could be expanded to include partition key(s), "+
+						"but this requires a UNIQUE constraint on %s(%s, %s) and adds complexity",
+					fk.SourceTable, tableName, fk.ReferencedColumns, strings.Join(tablePartCols, ", "),
+				)
 			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/db/migration.go` around lines 393 - 411, The branch is incorrectly
hard-coded to "created_at"; instead derive the partition key(s) from
tablePartCols and check that all those columns exist in srcColSet (e.g.,
hasPartitionKeys := all cols in tablePartCols are present in srcColSet). Replace
uses of hasCreatedAt and the literal "created_at" in messages with that computed
hasPartitionKeys and the actual tablePartCols list, and update the reason text
to reference the required UNIQUE constraint on (fk.ReferencedColumns,
<comma-separated tablePartCols>) rather than created_at so the recommendation
correctly reflects the real partition key(s); adjust fields keys/names
accordingly (srcColSet, srcPartCols, fields, fk, tableName,
fk.ReferencedColumns, tablePartCols).
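The generalized check the suggested fix describes can be reduced to a small pure helper — `missingPartitionCols` is a hypothetical name, with column sets mirroring the PR's srcColSet/tablePartCols shapes:

```go
package main

import "fmt"

// missingPartitionCols generalizes the hard-coded created_at check: it
// returns every planned partition column that the source table lacks,
// so the caller can recommend DROP only when the list is non-empty.
// Hypothetical helper name; not code from this PR.
func missingPartitionCols(partCols []string, srcCols map[string]bool) []string {
	var missing []string
	for _, c := range partCols {
		if !srcCols[c] {
			missing = append(missing, c)
		}
	}
	return missing
}

func main() {
	src := map[string]bool{"id": true, "created_at": true}
	fmt.Println(missingPartitionCols([]string{"created_at", "org"}, src)) // [org]
}
```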
pkg/db/partitions.go (1)

1608-1616: ⚠️ Potential issue | 🟠 Major

Do not drop constraint-backed indexes during cleanup.

This loop still drops indexes that back constraints (not only _pkey should be protected). Dropping them can break constraint integrity after FK migration steps.

Suggested fix
+	// Gather indexes owned by constraints so we never drop them.
+	constraintIdxSet, err := dbc.getConstraintBackedIndexes(tableName)
+	if err != nil {
+		return "", fmt.Errorf("failed to get constraint-backed indexes: %w", err)
+	}
+
 	// Drop indexes that are no longer in the model
 	for idxName := range currentIdxMap {
 		// Skip primary key and system indexes
 		if strings.HasSuffix(idxName, "_pkey") {
 			continue
 		}
+		if constraintIdxSet[idxName] {
+			continue
+		}
 		alterStatements = append(alterStatements,
 			fmt.Sprintf("DROP INDEX IF EXISTS %s", idxName))
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/db/partitions.go` around lines 1608 - 1616, The loop that drops indexes
(iterating currentIdxMap and appending to alterStatements) must skip any
constraint-backed indexes, not just ones ending with "_pkey"; update the loop in
partitions.go to check against the set/map of constraint-backed index names
(e.g., constraintIndexMap or whatever constraint map you populated earlier) and
continue if idxName is present there (or alternatively detect constraint-backed
names via the existing constraint metadata); only append DROP INDEX for indexes
that are neither primary/system nor listed as constraint-backed.
🧹 Nitpick comments (1)
pkg/db/partitions.go (1)

831-846: Avoid per-day IsPartitionAttached lookups in the date loop.

This loop issues an attachment query per existing date. For large ranges, this becomes an N+1 catalog-query path and slows partition sync significantly.

Suggested refactor
+	attachedPartitions, err := dbc.ListAttachedPartitions(tableName)
+	if err != nil {
+		return 0, fmt.Errorf("failed to list attached partitions: %w", err)
+	}
+	attachedSet := make(map[string]bool, len(attachedPartitions))
+	for _, p := range attachedPartitions {
+		attachedSet[p.TableName] = true
+	}
@@
-			attached, err := dbc.IsPartitionAttached(partitionName)
-			if err != nil {
-				return 0, fmt.Errorf("failed to check if partition %s is attached: %w", partitionName, err)
-			}
-			if !attached {
+			if !attachedSet[partitionName] {
 				partitionsToReattach = append(partitionsToReattach, partitionName)
 			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/db/partitions.go` around lines 831 - 846, The per-day attachment check
inside the date loop causes N+1 catalog queries; instead gather all partition
names for dates that exist into a slice (use the same partitionName format
"%s_%s" as in the loop) and call a single batched check (implement or use a
helper like IsPartitionsAttached(partitionNames []string) returning a
map[string]bool or set) to determine which are detached, then populate
partitionsToReattach from that result; keep the existing logic that appends new
dates to partitionsToCreate but remove the per-iteration call to
dbc.IsPartitionAttached and replace it with the batched lookup and corresponding
error handling.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/db/migration.go`:
- Around line 168-171: The current use of
maxDate.UTC().Truncate(24*time.Hour).AddDate(0,0,1) moves migrateFrom to next
midnight and skips rows between maxDate and that midnight; change migrateFrom to
start just after the last migrated timestamp instead (e.g., migrateFrom :=
maxDate.UTC().Add(time.Nanosecond)) so rows with timestamps > maxDate are
included without reprocessing the exact maxDate row; apply the same fix to the
other occurrence around lines 257-260 where migrateFrom is computed and ensure
the call to MigrateTableDataRange still receives the adjusted migrateFrom.
- Around line 56-57: The code builds raw SQL with interpolated identifiers
(dateColumn, sourceTable) in the query variable used with dbc.DB.Raw, which
risks SQL injection and query breakage; fix by validating identifiers against a
strict whitelist/regex (e.g. ^[A-Za-z_][A-Za-z0-9_]*$) or by mapping to known
table/column names, then escape/quote the validated identifiers using the DB's
identifier quote (e.g. backticks or double quotes) before using fmt.Sprintf to
build the SQL; apply this change to the query construction sites that set query
and call dbc.DB.Raw (the SELECT MIN(...) block around query, and the other
occurrences referenced at the query-building code near the other two locations).

---

Duplicate comments:
In `@pkg/db/migration.go`:
- Around line 393-411: The branch is incorrectly hard-coded to "created_at";
instead derive the partition key(s) from tablePartCols and check that all those
columns exist in srcColSet (e.g., hasPartitionKeys := all cols in tablePartCols
are present in srcColSet). Replace uses of hasCreatedAt and the literal
"created_at" in messages with that computed hasPartitionKeys and the actual
tablePartCols list, and update the reason text to reference the required UNIQUE
constraint on (fk.ReferencedColumns, <comma-separated tablePartCols>) rather
than created_at so the recommendation correctly reflects the real partition
key(s); adjust fields keys/names accordingly (srcColSet, srcPartCols, fields,
fk, tableName, fk.ReferencedColumns, tablePartCols).

In `@pkg/db/partitions.go`:
- Around line 1608-1616: The loop that drops indexes (iterating currentIdxMap
and appending to alterStatements) must skip any constraint-backed indexes, not
just ones ending with "_pkey"; update the loop in partitions.go to check against
the set/map of constraint-backed index names (e.g., constraintIndexMap or
whatever constraint map you populated earlier) and continue if idxName is
present there (or alternatively detect constraint-backed names via the existing
constraint metadata); only append DROP INDEX for indexes that are neither
primary/system nor listed as constraint-backed.

---

Nitpick comments:
In `@pkg/db/partitions.go`:
- Around line 831-846: The per-day attachment check inside the date loop causes
N+1 catalog queries; instead gather all partition names for dates that exist
into a slice (use the same partitionName format "%s_%s" as in the loop) and call
a single batched check (implement or use a helper like
IsPartitionsAttached(partitionNames []string) returning a map[string]bool or
set) to determine which are detached, then populate partitionsToReattach from
that result; keep the existing logic that appends new dates to
partitionsToCreate but remove the per-iteration call to dbc.IsPartitionAttached
and replace it with the batched lookup and corresponding error handling.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: e3f3880f-7afe-476b-ae4b-3ed8ecb640ff

📥 Commits

Reviewing files that changed from the base of the PR and between af7e262 and b2d93df.

📒 Files selected for processing (3)
  • pkg/db/db.go
  • pkg/db/migration.go
  • pkg/db/partitions.go

@openshift-ci-robot

openshift-ci-robot commented Mar 17, 2026

@neisw: This pull request references Trt-1989 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

  • Adds migration helpers for one-off partition migration steps
  • Adds FK analysis for pre-migration review
  • Adds example delete helper to handle deletes without relying on cascading deletes / FK constraints
  • Adds partition management for
    • Updating partitioned tables
    • Detaching old partitions
    • Dropping older detached partitions

Summary by CodeRabbit

  • New Features

  • End-to-end partition management: discovery, stats, automated partition creation/aging, attach/detach, safe removal and dry-run support

  • Database migration toolkit: create, update and finalize partitioned tables, incremental data migration, identity/sequence sync, and partition-aware renames

  • Partition impact analysis and controlled deletion of test data

  • Tests

  • Extensive unit tests covering partitioning, validation, SQL generation and migration helpers

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (4)
pkg/db/partitions.go (1)

1612-1620: ⚠️ Potential issue | 🟠 Major

Don't drop indexes that back live constraints.

getCurrentIndexes includes PostgreSQL-managed backing indexes for UNIQUE constraints. MoveForeignKeys in pkg/db/utils.go creates those constraints during FK expansion, and this cleanup only exempts _pkey, so a later UpdatePartitionedTable can fail with cannot drop index ... because constraint ... requires it. Filter out indexes referenced by pg_constraint.conindid before this loop.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/db/partitions.go` around lines 1612 - 1620, The loop that drops indexes
(iterating currentIdxMap) is erroneously removing PostgreSQL-managed backing
indexes for UNIQUE/other constraints; update the logic so it first queries
pg_constraint.conindid (or uses getCurrentIndexes/MoveForeignKeys helpers) to
build a set of index OIDs/names that are referenced by constraints and then skip
those when appending DROP INDEX statements (in the same scope referencing
currentIdxMap and the DROP loop used by UpdatePartitionedTable). In short:
collect constraint-backed index identifiers from pg_constraint.conindid, add
them to a skip set, and change the existing DROP loop to continue when an
idxName is in that skip set (still keeping the existing "_pkey" exemption).
pkg/db/utils.go (2)

1603-1615: ⚠️ Potential issue | 🟠 Major

Table swaps still need a temp-name phase.

Lines 1608-1614 permit targets that are also sources, but the execution path is still a single-pass ALTER TABLE ... RENAME TO .... A true swap/cycle like A→B, B→A will still fail on the first conflicting rename. Detect cycles and route them through temporary names before the final pass.

Also applies to: 1957-1969

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/db/utils.go` around lines 1603 - 1615, The current loop over tableRenames
in pkg/db/utils.go skips targets that are also sources but still performs
single-pass ALTER TABLE ... RENAME TO ..., which will fail for cycles (e.g.,
A→B, B→A); modify the rename logic to detect cycles in
tableRenames/tableRenameMap (e.g., by building a directed graph of rename.From →
rename.To and finding strongly connected components or cycles) and for any cycle
generate unique temporary names and route those tables through a temp-name phase
before performing the final renames; update the code paths that iterate
tableRenames (the loop referencing source/target and tableRenameMap) to branch:
non-cyclic renames can be executed normally, while cyclic components are
transformed into two-phase renames using the temp names to avoid conflicts.

712-725: ⚠️ Potential issue | 🟠 Major

Keep endDate exclusive in coverage checks.

MigrateTableDataRange copies [startDate, endDate), but this loop still requires the partition for the day containing endDate. A run ending exactly at midnight, or an empty range with startDate == endDate, is reported as missing a partition that is never read.

Suggested fix
-	// Normalize to calendar days so timestamps within a day are handled correctly
-	startDay := startDate.Truncate(24 * time.Hour)
-	endDay := endDate.Truncate(24 * time.Hour)
+	// Normalize to calendar days so timestamps within a day are handled correctly.
+	// Required coverage is [startDate, endDate).
+	startDay := startDate.Truncate(24 * time.Hour)
+	lastRequiredDay := endDate.Add(-time.Nanosecond).Truncate(24 * time.Hour)

 	// Check that we have a partition for each day in the range
 	var missingDates []string
 	currentDate := startDay
-	for !currentDate.After(endDay) {
+	for !currentDate.After(lastRequiredDay) {
 		dateStr := currentDate.Format("2006-01-02")
 		if !existingDates[dateStr] {
 			missingDates = append(missingDates, dateStr)
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/db/utils.go` around lines 712 - 725, The coverage check currently treats
endDate as inclusive and thus may require a partition for the day containing
endDate even though MigrateTableDataRange copies [startDate, endDate) (causing
false missing partitions or when startDate==endDate). Fix by making the
end-of-range exclusive before normalizing: compute an exclusive endDay from
endDate (e.g., endDate.Add(-time.Nanosecond) then Truncate(24*time.Hour)) or
skip the loop when endDate <= startDate, and then iterate from startDay to that
exclusive endDay when building missingDates; update references in the same
function (startDay, endDay/currentDate, existingDates, missingDates,
MigrateTableDataRange usage) accordingly.
pkg/db/migration.go (1)

394-412: ⚠️ Potential issue | 🟠 Major

Use the real partition key in the inbound FK analysis.

This branch still hard-codes created_at, so any table partitioned on a different column gets the wrong DROP/EXPAND guidance. If this helper is meant for pre-migration review, it needs the planned partition column(s) as input instead of assuming one.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/db/migration.go` around lines 394 - 412, The code currently hard-codes
"created_at" when deciding whether an inbound FK can be expanded; replace that
assumption by using the actual planned partition key(s) passed into this helper
(e.g., add or use a parameter like plannedPartitionCols or partitionKey) instead
of "created_at". Update the existence check to use srcColSet[partitionKey] (or
iterate plannedPartitionCols and ensure all are present), adjust the fields
entries and error/reason messages to reference the real partition key
variable(s) rather than the literal "created_at", and when describing the UNIQUE
constraint change the referenced columns from "%s(%s, created_at)" to include
the actual partition key(s) (e.g., "%s(%s, <partitionKeyList>)"); keep logic
around srcPartCols, fields["action"], and fields["reason"] otherwise the same.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/db/migration.go`:
- Around line 283-301: The swap and FK rewrite must run atomically: perform
RenameTables and MoveForeignKeys inside a single DB transaction so failures roll
back both operations. Either (a) open a transaction (e.g., dbc.BeginTx) and call
tx-aware versions of RenameTables and MoveForeignKeys (or modify those functions
to accept a tx param like RenameTables(tx, ...) and MoveForeignKeys(tx, ...)),
or (b) add overloads to RenameTables/MoveForeignKeys to accept an existing tx
and use that tx for the DDL; on any error rollback the transaction and return
the error, otherwise commit and then log the successful renames/moves. Ensure
you update the code paths that currently call RenameTables and MoveForeignKeys
(and use variables renames, oldTable, sourceTable, moveForeignKeys, dryRun) to
use the transactional variants so the swap + FK rewrite are atomic.

In `@pkg/db/partitions.go`:
- Around line 277-305: The query uses cutoffDate (set as
time.Now().AddDate(...)) and compares it to TO_DATE(...) causing off-by-one
deletions; change to compare dates only by either replacing the bound parameter
with SQL CURRENT_DATE (i.e., keep the WHERE clause using CURRENT_DATE -
@retentionDays or TO_DATE(...) < CURRENT_DATE - @retentionDays) or compute
cutoffDate as a date-only value before binding (e.g., truncate to midnight with
time.Now().Truncate(24*time.Hour).AddDate(0,0,-retentionDays)) and bind that
instead; update both places where cutoffDate is used (the attachedOnly query
block referencing TO_DATE(... ) < `@cutoff_date` and the GetRetentionSummary
function) and ensure the parameter/SQL uses date-only semantics so TO_DATE(...)
compares to a date, not a timestamp.

---

Duplicate comments:
In `@pkg/db/migration.go`:
- Around line 394-412: The code currently hard-codes "created_at" when deciding
whether an inbound FK can be expanded; replace that assumption by using the
actual planned partition key(s) passed into this helper (e.g., add or use a
parameter like plannedPartitionCols or partitionKey) instead of "created_at".
Update the existence check to use srcColSet[partitionKey] (or iterate
plannedPartitionCols and ensure all are present), adjust the fields entries and
error/reason messages to reference the real partition key variable(s) rather
than the literal "created_at", and when describing the UNIQUE constraint change
the referenced columns from "%s(%s, created_at)" to include the actual partition
key(s) (e.g., "%s(%s, <partitionKeyList>)"); keep logic around srcPartCols,
fields["action"], and fields["reason"] otherwise the same.

In `@pkg/db/partitions.go`:
- Around line 1612-1620: The loop that drops indexes (iterating currentIdxMap)
is erroneously removing PostgreSQL-managed backing indexes for UNIQUE/other
constraints; update the logic so it first queries pg_constraint.conindid (or
uses getCurrentIndexes/MoveForeignKeys helpers) to build a set of index
OIDs/names that are referenced by constraints and then skip those when appending
DROP INDEX statements (in the same scope referencing currentIdxMap and the DROP
loop used by UpdatePartitionedTable). In short: collect constraint-backed index
identifiers from pg_constraint.conindid, add them to a skip set, and change the
existing DROP loop to continue when an idxName is in that skip set (still
keeping the existing "_pkey" exemption).

In `@pkg/db/utils.go`:
- Around line 1603-1615: The current loop over tableRenames in pkg/db/utils.go
skips targets that are also sources but still performs single-pass ALTER TABLE
... RENAME TO ..., which will fail for cycles (e.g., A→B, B→A); modify the
rename logic to detect cycles in tableRenames/tableRenameMap (e.g., by building
a directed graph of rename.From → rename.To and finding strongly connected
components or cycles) and for any cycle generate unique temporary names and
route those tables through a temp-name phase before performing the final
renames; update the code paths that iterate tableRenames (the loop referencing
source/target and tableRenameMap) to branch: non-cyclic renames can be executed
normally, while cyclic components are transformed into two-phase renames using
the temp names to avoid conflicts.
- Around line 712-725: The coverage check currently treats endDate as inclusive
and thus may require a partition for the day containing endDate even though
MigrateTableDataRange copies [startDate, endDate) (causing false missing
partitions or when startDate==endDate). Fix by making the end-of-range exclusive
before normalizing: compute an exclusive endDay from endDate (e.g.,
endDate.Add(-time.Nanosecond) then Truncate(24*time.Hour)) or skip the loop when
endDate <= startDate, and then iterate from startDay to that exclusive endDay
when building missingDates; update references in the same function (startDay,
endDay/currentDate, existingDates, missingDates, MigrateTableDataRange usage)
accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: c2833840-5fba-43f1-80e6-2e5878486b73

📥 Commits

Reviewing files that changed from the base of the PR and between b2d93df and 98d456e.

📒 Files selected for processing (3)
  • pkg/db/migration.go
  • pkg/db/partitions.go
  • pkg/db/utils.go

Comment on lines +283 to +301
	count, err := dbc.RenameTables(renames, true, true, true, true, dryRun)
	if err != nil {
		return fmt.Errorf("failed to swap tables: %w", err)
	}
	l.WithField("renames", count).Info("tables swapped")

	// Step 6: Move or drop foreign keys
	fkTarget := sourceTable
	if !moveForeignKeys {
		fkTarget = ""
	}
	if moveForeignKeys {
		l.Info("step 6: moving foreign keys")
	} else {
		l.Info("step 6: dropping foreign keys")
	}
	moved, err := dbc.MoveForeignKeys(oldTable, fkTarget, dryRun)
	if err != nil {
		return fmt.Errorf("failed to process foreign keys: %w", err)

⚠️ Potential issue | 🔴 Critical

Make the swap and FK rewrite atomic.

Line 283 commits the table swap before Line 299 starts a separate FK rewrite. If MoveForeignKeys fails, the migration is left half-finished: the app now points at the new sourceTable, while some FKs still reference *_old. Please run these steps in one transaction, or make RenameTables/MoveForeignKeys accept an existing tx.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/db/migration.go` around lines 283 - 301, The swap and FK rewrite must run
atomically: perform RenameTables and MoveForeignKeys inside a single DB
transaction so failures roll back both operations. Either (a) open a transaction
(e.g., dbc.BeginTx) and call tx-aware versions of RenameTables and
MoveForeignKeys (or modify those functions to accept a tx param like
RenameTables(tx, ...) and MoveForeignKeys(tx, ...)), or (b) add overloads to
RenameTables/MoveForeignKeys to accept an existing tx and use that tx for the
DDL; on any error rollback the transaction and return the error, otherwise
commit and then log the successful renames/moves. Ensure you update the code
paths that currently call RenameTables and MoveForeignKeys (and use variables
renames, oldTable, sourceTable, moveForeignKeys, dryRun) to use the
transactional variants so the swap + FK rewrite are atomic.
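The all-or-nothing behavior can be sketched with the handle abstracted away — in the real code the handle would be a *gorm.DB bound to one transaction (e.g. dbc.DB.Transaction(func(tx *gorm.DB) error { ... })), so a failure in either phase rolls back both; the interface, helper names, and fake below are illustrative, not the PR's API:

```go
package main

import (
	"errors"
	"fmt"
)

// execer abstracts the handle both phases share; in production it would
// wrap a single database transaction so the caller can roll back.
type execer interface {
	Exec(stmt string) error
}

// swapAndMoveFKs runs the table swap and the FK rewrite on the same
// handle; the first error aborts the batch so nothing after it executes.
func swapAndMoveFKs(tx execer, renameSQL, fkSQL []string) error {
	for _, s := range append(append([]string{}, renameSQL...), fkSQL...) {
		if err := tx.Exec(s); err != nil {
			return fmt.Errorf("aborting swap/FK rewrite: %w", err)
		}
	}
	return nil
}

// fakeTx records statements and fails on a chosen one, demonstrating
// that statements after the failure never run.
type fakeTx struct {
	ran    []string
	failOn string
}

func (f *fakeTx) Exec(stmt string) error {
	if stmt == f.failOn {
		return errors.New("boom")
	}
	f.ran = append(f.ran, stmt)
	return nil
}

func main() {
	tx := &fakeTx{failOn: "MOVE FK 1"}
	err := swapAndMoveFKs(tx,
		[]string{"RENAME A", "RENAME B"},
		[]string{"MOVE FK 1", "MOVE FK 2"})
	fmt.Println(err != nil, tx.ran) // true [RENAME A RENAME B]
}
```

With the real transaction wrapper, the returned error triggers a rollback, so the swap is never left committed while the FK rewrite is half-done.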

Comment on lines +277 to +305
	cutoffDate := time.Now().AddDate(0, 0, -retentionDays)

	var query string
	if attachedOnly {
		// Only return attached partitions
		query = `
			WITH attached_partitions AS (
				SELECT c.relname AS tablename
				FROM pg_inherits i
				JOIN pg_class c ON i.inhrelid = c.oid
				JOIN pg_class p ON i.inhparent = p.oid
				WHERE p.relname = @table_name
			)
			SELECT
				tablename,
				'public' as schemaname,
				TO_DATE(SUBSTRING(tablename FROM '_(\d{4}_\d{2}_\d{2})$'), 'YYYY_MM_DD') AS partition_date,
				(CURRENT_DATE - TO_DATE(SUBSTRING(tablename FROM '_(\d{4}_\d{2}_\d{2})$'), 'YYYY_MM_DD'))::INT AS age_days,
				pg_total_relation_size('public.'||tablename) AS size_bytes,
				pg_size_pretty(pg_total_relation_size('public.'||tablename)) AS size_pretty,
				COALESCE(n_live_tup, 0) AS row_estimate
			FROM pg_tables
			LEFT JOIN pg_stat_user_tables ON pg_stat_user_tables.relname = pg_tables.tablename
				AND pg_stat_user_tables.schemaname = pg_tables.schemaname
			WHERE pg_tables.schemaname = 'public'
				AND pg_tables.tablename LIKE @table_pattern ESCAPE '\'
				AND pg_tables.tablename IN (SELECT tablename FROM attached_partitions)
				AND TO_DATE(SUBSTRING(tablename FROM '_(\d{4}_\d{2}_\d{2})$'), 'YYYY_MM_DD') < @cutoff_date
			ORDER BY partition_date ASC

⚠️ Potential issue | 🟠 Major

🏁 Script executed:

python3 << 'EOF'
# Demonstrate the PostgreSQL date/timestamp comparison issue

# Scenario: retentionDays = 30, current time = 2025-03-15 14:30:45
import datetime

now = datetime.datetime(2025, 3, 15, 14, 30, 45)
retention_days = 30
cutoff_date = now - datetime.timedelta(days=retention_days)

print("Current time:", now)
print("Retention days:", retention_days)
print("Cutoff timestamp:", cutoff_date)
print("Cutoff date only:", cutoff_date.date())
print()

# Example partition dates to check
partition_dates = [
    datetime.date(2025, 2, 12),  # 31 days old
    datetime.date(2025, 2, 13),  # 30 days old (exactly)
    datetime.date(2025, 2, 14),  # 29 days old
]

print("PostgreSQL comparison: TO_DATE(...) < @cutoff_date")
print("where TO_DATE extracts the date from the partition name")
print()

for pdate in partition_dates:
    # PostgreSQL TO_DATE creates midnight
    date_as_timestamp_midnight = datetime.datetime.combine(pdate, datetime.time(0, 0, 0))
    
    # Comparison with the timestamp cutoff
    is_selected = date_as_timestamp_midnight < cutoff_date
    
    print(f"Partition {pdate} (age {(now.date() - pdate).days} days)")
    print(f"  {date_as_timestamp_midnight} < {cutoff_date}")
    print(f"  Selected for deletion: {is_selected}")
    print()
EOF



🏁 Script executed:

# Check if there are calls to these functions and what retentionDays values are used
rg -n "GetPartitionsForRemoval|GetRetentionSummary" pkg/db/partitions.go | head -10

Repository: openshift/sippy

Length of output: 685


🏁 Script executed:

rg -n "GetPartitionsForRemoval|GetRetentionSummary" --type go -A 3 | grep -E "GetPartitionsForRemoval|GetRetentionSummary|retentionDays" | head -20

Repository: openshift/sippy

Length of output: 892


Use CURRENT_DATE for the partition age cutoff to avoid off-by-one deletions.

The cutoff is computed as time.Now().AddDate(0, 0, -retentionDays), which carries the current time of day. When that timestamp is compared against TO_DATE(...) in the WHERE clause, which yields midnight, a partition exactly at the retention boundary is selected for deletion at any time after midnight on the boundary day. For example, with retentionDays=30 at 2:30 PM on March 15, the partition from February 13 (exactly 30 days old) is incorrectly selected because 2025-02-13 00:00:00 < 2025-02-13 14:30:45. Replace the bound parameter with CURRENT_DATE in the SQL query, or truncate the cutoff to its date component before binding the parameter.

Also applies to: 357–406 (GetRetentionSummary)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In pkg/db/partitions.go around lines 277-305, the query uses cutoffDate (set as
time.Now().AddDate(...)) and compares it to TO_DATE(...), causing off-by-one
deletions. Change the comparison to use dates only: either replace the bound
parameter with SQL CURRENT_DATE (i.e., use a WHERE clause of the form
TO_DATE(...) < CURRENT_DATE - @retentionDays), or compute cutoffDate as a
date-only value before binding (e.g., truncate to UTC midnight before calling
AddDate(0, 0, -retentionDays)) and bind that instead. Update both places where
cutoffDate is used (the attachedOnly query block referencing
TO_DATE(...) < @cutoff_date and the GetRetentionSummary function) and ensure
the parameter/SQL uses date-only semantics so TO_DATE(...) compares to a date,
not a timestamp.

@openshift-ci-robot

openshift-ci-robot commented Mar 17, 2026

@neisw: This pull request references Trt-1989 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

  • Adds migration helpers for one off partition migration steps
  • Adds FK analysis for pre-migration review
  • Adds example delete helper to handle deletes without relying on cascading deletes / FK constraints
  • Adds partition management for:
    • Updating partitioned tables
    • Detaching old partitions
    • Dropping older detached partitions

Summary by CodeRabbit

  • New Features

  • End-to-end partition management: discovery, stats, automated partition creation/aging, attach/detach, safe removal and dry-run support

  • Database migration toolkit: create, update and finalize partitioned tables, incremental data migration, identity/sequence sync, and partition-aware renames

  • Partition impact analysis and controlled deletion of test data

  • Refactor

  • Models and migration flow adjusted to support partitioned tables and explicit timestamp/primary-key handling

  • Tests

  • Extensive unit tests covering partitioning, validation, SQL generation and migration helpers

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot

Scheduling required tests:
/test e2e

@openshift-ci
Contributor

openshift-ci bot commented Mar 17, 2026

@neisw: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name | Commit | Details | Required | Rerun command
ci/prow/e2e | a297fe9 | link | true | /test e2e

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@neisw
Contributor Author

neisw commented Mar 17, 2026

/hold
until after #3333 merges and migration is complete

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 17, 2026

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.
