design(tdd 0043 + adr 0008): rework authors on the build model (Opus); Sonnet only for reviews #118
Merged
…, Sonnet only for reviews
Gap-closure TDD 0043 flips the default rework model sonnet→opus so the pipeline uses Opus for ALL code-writing (build AND rework) and Sonnet only for the review gates.
Rationale (NFR-3): the review gate runs on Sonnet, so Sonnet-rework means the rework AUTHOR and the REVIEWER are the same model, undercutting NFR-3's author↔reviewer diversity on rework iterations. Opus-rework → Sonnet-review restores it.
Changes (TDD 0043): scripts/lib/gates.sh (two THROUGHLINE_REWORK_MODEL fallbacks sonnet→opus), skills/implement/SKILL.md + README.md (doc reconciliation), tests/bounded-rework-loop.test.sh (four assertion sites: A1/E1/E2→opus, A2 override→sonnet). ~20 lines across 4 files.
ADR 0008 (accepted): records the rework-on-build-model decision and REVISES ADR 0007's rework-model cost-reduction consequence specifically. ADR 0007's core halt-model decision stays accepted/binding (not superseded); append-only; 0007's body untouched; INDEX annotated.
Tradeoff accepted: Opus rework costs more and reintroduces the opportunistic-refactoring risk 0019/0007 guarded against, bounded by the model-INDEPENDENT FR-66 scope cap (oversized rework hard-reset before ship) + Opus 4.8's restraint. Override path unchanged (THROUGHLINE_REWORK_MODEL=sonnet restores the old profile).
Gates: tdd-lint structural + --bounds exit 0; independent design-reviewer DESIGN_REVIEW: PASS (first pass BLOCKed on the ADR 0007 conflict + an incomplete test plan; both resolved: ADR 0008 scoped revision + all four test sites named).
Open assumptions & waivers:
- wander risk accepted; resolved: Opus 4.8 is materially less wander-prone, and the FR-66 per-attempt scope cap + pre-pass hard-reset oversized rework regardless of model, so a wandering Opus rework is bounded/rejected, not shipped.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
What
Makes the pipeline use Opus for all code-writing (build AND rework) and Sonnet only for the review gates, by flipping the default rework model sonnet → opus. Gap-closure TDD 0043 + new ADR 0008.
Why (NFR-3: this is the principled part)
NFR-3: "the review gate runs on a different model so the reviewer does not share the author's blind spots." The review gate runs on Sonnet. With Sonnet-rework, the rework author (Sonnet) and the reviewer (Sonnet) are the same model — so on every rework iteration the reviewer shares the author's blind spots, the exact thing NFR-3 prevents. Opus-rework → Sonnet-review restores author↔reviewer diversity. So this isn't just preference — it closes an NFR-3 gap on rework iterations. (Operator consistency on the strongest model is the secondary benefit.)
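The diversity argument above can be made concrete with a small check. This is a minimal sketch, not code from the repository: the function name check_author_reviewer_diversity and the model names passed to it are hypothetical, used only to illustrate why Sonnet rework under a Sonnet review gate conflicts with NFR-3 while Opus rework does not.

```shell
# Hypothetical helper (not part of the pipeline): succeeds when the author
# and reviewer models differ, fails when they are the same model.
check_author_reviewer_diversity() {
  local author="$1" reviewer="$2"
  if [ "$author" = "$reviewer" ]; then
    # Same model on both sides: the reviewer shares the author's blind spots.
    echo "NFR-3 violated: author and reviewer are both '$author'" >&2
    return 1
  fi
  echo "NFR-3 ok: author=$author reviewer=$reviewer"
}
```

Under these assumptions, calling it with opus and sonnet succeeds, while calling it with sonnet and sonnet returns a non-zero status, which is the situation the old default produced on every rework iteration.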
Changes
- TDD 0043: scripts/lib/gates.sh (two THROUGHLINE_REWORK_MODEL fallbacks), skills/implement/SKILL.md + README.md (doc reconciliation), tests/bounded-rework-loop.test.sh (four assertion sites: A1/E1/E2 → opus, A2 override → sonnet).
- ADR 0008 (accepted): records the decision and revises ADR 0007's rework-model consequence only. ADR 0007's halt-model core decision stays accepted/binding (not superseded; append-only; 0007's body untouched; INDEX annotated).
- Runtime-verify gate is unchanged (stays tiered, per your call).
Honest tradeoff
This reverses a deliberate 0019/ADR-0007 disposition (Sonnet-rework was chosen as a cost reduction + to avoid Opus "opportunistic refactoring"). Accepted because: Opus 4.8 is materially less wander-prone, and the model-independent FR-66 scope cap hard-resets an oversized rework before it ships regardless of author.
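The default flip and the override path can be sketched with shell parameter expansion. This is an illustrative sketch only: the function name resolve_rework_model is hypothetical, and the actual fallbacks in scripts/lib/gates.sh may be written differently. Only the variable THROUGHLINE_REWORK_MODEL and the opus/sonnet values come from this PR.

```shell
# Hypothetical sketch of a default-with-override lookup; not the real gates.sh code.
resolve_rework_model() {
  # Use the operator's override when it is set and non-empty; otherwise fall
  # back to opus (the fallback was sonnet before this change).
  printf '%s\n' "${THROUGHLINE_REWORK_MODEL:-opus}"
}

# Usage, run in subshells so the caller's environment is left untouched:
( unset THROUGHLINE_REWORK_MODEL; resolve_rework_model )   # prints "opus"
( THROUGHLINE_REWORK_MODEL=sonnet; resolve_rework_model )  # prints "sonnet"
```

The `:-` form treats an empty value the same as an unset one, so an accidentally blank override would still resolve to the default in this sketch.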
THROUGHLINE_REWORK_MODEL=sonnet restores the old profile.
Gates
- tdd-lint structural + --bounds: exit 0.
- DESIGN_REVIEW: PASS (ADR 0008 scoped revision verified sound + proportional; all four test sites named; honest-framing added).
Open assumptions & waivers
Note on landing
This affects every future build's rework gate. It does not touch any in-flight run (runner reads its scripts from the plugin cache). Merge when ready; the next /implement run after merge picks up Opus rework. (If you want it in the current run, I'll set THROUGHLINE_REWORK_MODEL=opus on the next resume as already discussed.)
🤖 Generated with Claude Code