Skip to content

P4 #6: Clean up recording_releases UNIQUE constraints and document dual-MB-ID design on songs #124

@dprodger

Description

@dprodger

Context

From doc/architecture-review-2026-04.mdP4 #6. Two distinct schema hygiene items.

Item A: Redundant UNIQUE constraints on recording_releases

The recording_releases junction table has two overlapping unique constraints:

  • UNIQUE (recording_id, release_id, disc_number, track_number)
  • UNIQUE (recording_id, release_id)

The first is strictly stronger than the second — any pair (recording_id, release_id) that is unique alone is automatically unique with any additional columns. The review calls this out as redundant.

Which to keep?

Depends on actual data:

  • If the same (recording, release) pair can legitimately appear at multiple positions in a release (e.g., a hidden track + a main track on the same album), then the (recording_id, release_id) constraint is too strict and should be dropped, leaving the four-column one.
  • If a (recording, release) pair only ever appears once, the four-column constraint is redundant — drop it, leaving the two-column one.

Check the data with:

SELECT recording_id, release_id, COUNT(*)
FROM recording_releases
GROUP BY recording_id, release_id
HAVING COUNT(*) > 1;

If zero rows: drop the four-column constraint. If any rows: drop the two-column constraint.

Item B: Dual-MB-ID design on songs

The songs table has two MusicBrainz ID columns:

  • musicbrainz_id
  • second_mb_id

This is not documented anywhere and reads as confusing. The review asks for it to be documented (in a SQL comment on the column, in CLAUDE.md, or both) so future developers understand why two columns exist.

Assumption (needs verification): some jazz standards have multiple MB work IDs that the service treats as the same canonical song for some operations, e.g., a standard and its famous recording being modeled as separate MB works.

Proposed fix

  1. Run the query above against production to determine which UNIQUE constraint to drop.
  2. Write an Alembic migration (see P3 Returning to list view ignores search string #1 issue) that drops the redundant constraint. Or, if Alembic is not yet adopted, a one-off SQL file + hand-application.
  3. Add a comment on the songs.second_mb_id column:
    COMMENT ON COLUMN songs.second_mb_id IS
      'Secondary MusicBrainz work ID. Some jazz standards correspond to
       multiple MB works; this column pairs the canonical entry with an
       alternate that should be treated as the same song for <reason>.';
  4. Add a note in CLAUDE.md under the database schema section.

Risks

  • Dropping a UNIQUE constraint is low-risk if the data already satisfies it (which it must, since the constraint is currently enforced).
  • Getting the second_mb_id documentation wrong is a medium risk — need to confirm the actual semantics with someone who knows the history, not just guess from the column name.

Not in scope

  • Deeper schema normalization / other tables.
  • Eliminating second_mb_id entirely (that would require a real migration strategy, not documentation).

Context links

🤖 Generated with Claude Code

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions