
docs: log adversarial round 8 (reverse defect-catch; bidirectional evidence) #23

Merged: vpciii merged 2 commits into main from docs/adversarial-round8 on Jun 21, 2026

@vpciii vpciii commented Jun 21, 2026

Owner

Round 8 is the data point rounds 5–7 were building toward: a reverse defect-catch.

Roles swapped and un-steered — Gemini authored the 6-practice expansion (methodology PR #22), Claude adversary-reviewed it cold, and this time the authored change had real defects: §15/§17 duplicate ADR 0014/0020 (violating ADR 0022), §13/§14 contradict the "no premature tooling" exclusion, §14 lacks deployable-service scoping, the set inverts scale-to-work, and it left the doc self-inconsistent. BLOCK.

Combined with rounds 1/4 (Gemini-adversary catching Claude-author defects), the trial now has independent defect-catches in both directions, which is genuine bidirectional support for model-agnosticism. Rounds 7+8 characterize the reverse direction fully: no false positives on clean work (7), real catches on flawed work (8).

Residual: no third model yet; small same-week sample. The author-steering confound still applies to the design-ADR rounds (3/5/6).

🤖 Generated with Claude Code

…ional)

Round 8 supplies the evidence round 7 couldn't: a Gemini-authored change
(PR #22, the 6-practice expansion) that DID contain real defects, caught
cold by Claude-as-adversary (duplication vs ADR 0014/0020, contradiction
with the no-premature-tooling exclusion, missing CD scoping, self-
inconsistency). Combined with rounds 1/4 (Gemini-adversary catching
Claude-author defects), independent defect-catches now exist in BOTH
directions — bidirectional support for model-agnosticism. Rounds 7+8
together characterize the reverse direction: no false positives on clean
work, real catches on flawed work. Residual gap: no third model yet.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

claude Bot commented Jun 21, 2026


Claude finished @vpciii's task in 1m 55s.


Review — adversarial round 8 log entry

  • Read the PR diff
  • Read repo standards (experiment README/trial guide, ADR index)
  • Verify claims against the log's own framing + prior rows
  • Check Conventional Commits / process drift
  • Post review

Reviewed against this project's own artifacts — the trial guide (experiments/adversarial-review/README.md), the ADR index, and the log's internal conventions. This is a documentation-only change to the trial log; judged on accuracy and self-consistency, not generic style.

Verdict: clean, with one process-drift item to resolve before/with merge.

What checks out

  • Tally arithmetic is correct. Runs 8 = opn-mcp ×4 (PR 13/14/15/16) + methodology ×4 (PR 18/19/21/22) ✓. Modes 3 code review (13/14/16) + 5 design decision (18/15/19/21/22) ✓. Verdicts 5 BLOCK + 1 PUSH BACK + 2 NO STRONG OBJECTION = 8 ✓. 1 retrospective + 7 pre-merge = 8 ✓.
  • ADR references are real and correctly attributed — 0014 (refactoring guide), 0020 (user-facing docs ride-along), 0022 (single-source), 0018 (global summary). The §15/§17-duplicate-0014/0020 and ADR-0022-violation framing maps onto actual ADRs.
  • The narrative is honestly hedged, which is the point of this log: the new tally bullet preserves the third-model gap, the small/same-week sample, the varied adversary tier, and keeps the author-steering confound attached to rounds 3/5/6. The "bidirectional support" claim is correctly scoped to the two independent directions (rounds 1/4 and round 8) rather than overclaiming from the steered rounds.
  • Conventional Commits ✓ — docs: prefix, accurate scope.
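The tally arithmetic in the first bullet can be re-checked mechanically. The following is a minimal sketch, not the project's tooling: the figures are transcribed from that bullet, while the variable and function names are illustrative assumptions. Per-row verdicts are not listed in this review, so only the stated counts are summed.

```python
# Figures transcribed from the review's tally bullet.
# All names here are illustrative, not taken from the repository.
runs_by_project = {
    "opn-mcp": [13, 14, 15, 16],
    "methodology": [18, 19, 21, 22],
}
prs_by_mode = {
    "code review": [13, 14, 16],
    "design decision": [18, 15, 19, 21, 22],
}
# Only counts are given in the review, not which PR got which verdict.
verdict_counts = {"BLOCK": 5, "PUSH BACK": 1, "NO STRONG OBJECTION": 2}
timing_counts = {"retrospective": 1, "pre-merge": 7}


def check_tally():
    """Check that every breakdown accounts for the same set of runs."""
    all_prs = sorted(pr for prs in runs_by_project.values() for pr in prs)
    mode_prs = sorted(pr for prs in prs_by_mode.values() for pr in prs)
    total = len(all_prs)
    return {
        "total_runs": total,
        # The mode split must cover exactly the PRs in the project split.
        "modes_cover_same_prs": mode_prs == all_prs,
        "verdicts_sum_to_total": sum(verdict_counts.values()) == total,
        "timing_sums_to_total": sum(timing_counts.values()) == total,
    }


print(check_tally())
```

Comparing the sorted PR lists, rather than only the counts, also catches a PR that is double-counted in one mode and missing from the other.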

One finding — dangling detail-doc reference (process drift)

experiments/adversarial-review/log.md:17 — the new row points to methodology docs/reviews/pr22_adversarial_review.md, but that file does not exist in this repo (on this branch or any branch). Every other methodology row resolves to a real file — pr18_, pr19_, pr21_adversarial_review.md are all present in docs/reviews/. Round 8 is the lone methodology row whose write-up is missing.

The README is explicit that "Detailed per-review write-ups live in the reviewed project (e.g. a docs/reviews/ file in that repo)" — and for a methodology PR, the reviewed project is this repo. So the log is asserting a primary-evidence artifact for its headline data point ("the reverse defect-catch") that isn't actually committed. For a log whose whole purpose is to be the auditable basis of a graduate-or-kill decision, the one row carrying the most weight shouldn't be the one with an unresolvable citation.

Recommend committing docs/reviews/pr22_adversarial_review.md (ideally in this PR, to keep the doc and the behavior it describes in the same change) before this lands.
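A check for this kind of dangling citation could be sketched as follows. This is a hedged illustration rather than anything in the repo: the function name, the regex for the citation shape, and the sample log rows are all assumptions, and the demo runs against a throwaway directory, not the real tree. It only inspects the working tree, so it does not reproduce the "any branch" part of the finding.

```python
import re
import tempfile
from pathlib import Path

# Assumed citation shape: a docs/reviews/<name>.md path somewhere in the log text.
CITED_PATH = re.compile(r"docs/reviews/[\w.-]+\.md")


def find_dangling_citations(log_text, repo_root):
    """Return cited docs/reviews paths that are not files under repo_root."""
    cited = sorted(set(CITED_PATH.findall(log_text)))
    return [p for p in cited if not (Path(repo_root) / p).is_file()]


# Demo: a temporary "repo" holding one write-up, and a hypothetical log
# excerpt citing two. The row layout is made up for illustration.
with tempfile.TemporaryDirectory() as root:
    reviews = Path(root) / "docs" / "reviews"
    reviews.mkdir(parents=True)
    (reviews / "pr21_adversarial_review.md").write_text("stub\n")
    sample_log = (
        "| methodology | docs/reviews/pr21_adversarial_review.md |\n"
        "| methodology | docs/reviews/pr22_adversarial_review.md |\n"
    )
    dangling = find_dangling_citations(sample_log, root)

print(dangling)  # ['docs/reviews/pr22_adversarial_review.md']
```

Against the real repository, the same function could be called with the contents of experiments/adversarial-review/log.md and the repo root; any non-empty result would correspond to a row whose write-up is missing.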

Minor (non-blocking)

  • log.md:21 — the prose says "modes: 3 code review, 5 design decision" but the table header column is "Mode". Consistent enough; no change needed.

Nice work keeping the honest-confound framing intact while adding the new direction — the residuals section is the most valuable part of this entry.

Round 9: Gemini BLOCKed Claude's counter-proposal (PR #24) — and most of it
held (the refinement duplicates ADR 0015; §12 overlaps §6/§10; asymmetric CD
treatment; ADR-number collision). Claude conceded those and refuted one
finding as a false positive (the "stale §11 refs" claim — §12 was appended,
not renumbered). That false positive is the trial's first, and a useful
calibration point: a verifiable factual overreach, refutable by grep.

Both PR #22 (Gemini's expansion) and PR #24 (Claude's counter) are now
withdrawn — the methodology resisted expansion in both directions. Records
the adjudication (pr24_adjudication.md) and the new named failure mode
(adversary-authored-replacement bias).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@vpciii vpciii merged commit 7dfea7c into main Jun 21, 2026
@vpciii vpciii deleted the docs/adversarial-round8 branch June 21, 2026 15:30