docs: cross-model adversarial review trial (Antigravity/Gemini vs Claude) by vpciii · Pull Request #16 · vpciii/methodology

vpciii · 2026-06-17T13:51:15Z

Summary

Sets up a manual, global process for using a second, different model — Google Antigravity (Gemini 3) — to adversarially review work Claude produced, for both code reviews and design decisions. It complements the same-model Claude PR reviewer (ADR 0021): a different model has different blind spots, so it catches a class of defect the author's own model structurally shares.

Per your call: trial-first (no ADR yet) and portable prompt + practice only (no Antigravity config wiring).

What's here

experiments/adversarial-review/adversary-prompt.md — the skeptic persona you paste into Antigravity/Gemini. Refute-don't-bless framing, grounded in the project's own artifacts, with separate Code review and Design decision blocks and a BLOCK / PUSH BACK / NO STRONG OBJECTION verdict.
experiments/adversarial-review/README.md — the trial guide: how to run it manually (Antigravity IDE or agy CLI), a trial-log template so graduate-or-kill rests on evidence (signal vs. same-model+CI / false-positive rate / cost), and the graduation trigger → ADR with the automation rollout path (run-gemini-cli / agy headless) as an explicit ADR-gated next step.

Deliberately a trial, not a ratified practice

Lives in a new experiments/ dir, not methodology.md's numbered practices or the decision guide.
No merge gate, no adoption obligation for any project.
Graduates to an ADR only if the log shows it earns its keep across >1 repo; otherwise the directory is deleted and the negative result recorded.

Repo map kept honest: README layout row + CHANGELOG [Unreleased], both marked trial / no-adoption.

🤖 Generated with Claude Code

Set up a manual, global process for using a second, different model (Google Antigravity / Gemini 3) to adversarially review work Claude produced — code reviews and design decisions. Complements the same-model Claude PR reviewer (ADR 0021): a different model has different blind spots, so it catches a class of defect the author's own model shares. Deliberately a TRIAL, not a ratified practice — absent from the numbered practices and the decision guide, no merge gate, no adoption obligation. Ships the adversary prompt (skeptical persona, refute-don't-bless, with code-review and design-decision modes) and a trial-log template so the graduate-or-kill decision rests on evidence. Documents the automation rollout path (run-gemini-cli / agy headless, mirroring ADR 0021) as an explicit ADR-gated next step, not part of the trial. Per the methodology's keep-artifacts-honest rule, the README layout table and CHANGELOG [Unreleased] record the addition, marked trial / no-adoption. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

claude · 2026-06-17T13:51:47Z

Claude finished @vpciii's task in 1m 48s —— View job

Cross-model adversarial review trial — review

Read governing artifacts (ADRs 0001/0004/0005/0019/0021, methodology.md, README/CHANGELOG)
Review the 4 changed files against this project's own standards
Post findings

This is a clean, well-disciplined docs PR. It reads like it was written against this repo's own rules rather than generic best practice, and it holds up. No blocking issues.

What's right (against this project's standards):

Reversible-by-default (ADR 0004) is honored structurally. Trial-first, lives in a quarantined experiments/ dir, no merge gate, explicit graduation-trigger and a kill path ("delete the directory, record the negative result in an ADR"). A defended "no" as a first-class outcome also matches the spirit of ADR 0019.
Artifact honesty (ADR 0005) is kept. Both the README layout row and CHANGELOG.md [Unreleased] ride along in the same PR and are explicitly marked trial / no-adoption — no drift between the repo map and reality.
Terms align with the glossary/ADRs. expand-contract, blast radius/reversibility, appetite-vs-scope, and the ADR-trigger framing all match methodology.md and ADR 0004 — no domain-term drift.
Process: Conventional Commit (docs:), Keep-a-Changelog Added under [Unreleased], and the ../../adr/0021-... link resolves correctly from experiments/adversarial-review/.

Two low-severity notes (neither blocks):

The portable prompt's Inputs: lines are project-shaped and won't literally resolve when dogfooded on this repo — adversary-prompt.md:60-61 and :80-81 name spec.md, docs/glossary.md, CONTRIBUTING.md, architecture.md. This methodology repo's own layout is adr/ (not docs/adr/), methodology.md (not spec.md), templates/project-CONTRIBUTING.md, and no top-level glossary. The shared framing already says "the project's own artifacts" generically, which mirrors ADR 0021's "reads whichever governing artifacts a repo has" — so this is just the Inputs: enumeration being downstream-project-flavored. Consider a one-line "use whichever of these your repo has" caveat so the prompt is honest when pointed at this repo.
README.md:47 links to ./log.md, which doesn't exist yet (create-on-first-use). There's no markdown link-checker in CI (.github/workflows/ only has the two review workflows), so this won't fail a check — but a relative link to a not-yet-existing file is the kind of thing ADR 0005 cares about. The inline "create on first use" wording makes the intent clear, so this is borderline; flagging only for completeness.

One question worth the author answering (not a blocker): introducing a new top-level experiments/ directory is itself a small structural decision (the territory ADRs 0001/0005 govern). The PR makes a deliberate, well-argued case for trial-first, no ADR yet, and the reversibility + explicit graduation→ADR path is exactly the right way to defer that — so I read this as correctly handled rather than process drift. Noting it only because it's the one judgment call the change rests on.
· experiment/adversarial-review

Express the practice in terms of roles (author model / adversary model) rather than specific vendors. The adversary prompt and guide no longer name Claude or Gemini in their body; a single "Current roster" table is the one place that binds the roles to today's models. Swapping or upgrading a model is now a one-table edit — the prompt, the run steps, and the role language stay untouched. This keeps the global methodology model-agnostic: new or updated models drop in without rewrites. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Make the graduation section explicit: the existing same-model reviewer (ADR 0021) is vendor-bound in CI (the prompt is already model-neutral, the binding is not), and the graduating ADR will generalize BOTH the author-model and adversary-model reviewers together on one pattern — a shared model-neutral review policy plus thin, swappable per-provider bindings. Deliberately deferred so the live, cross-repo CI path isn't refactored twice. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Roles swapped for the first time: Gemini authored opn-mcp PR #16 (the timezone fix), Claude adversary-reviewed it cold and un-steered. Verdict NO STRONG OBJECTION — the fix was correct, complete, and single-source- respecting, so the adversary raised no false positives. Partial validation: shows the practice doesn't false-positive in reverse, but doesn't yet demonstrate reverse-direction defect-catching (the change had no catchable defect). Still untested: a real flaw in a Gemini-authored change, and a third model. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

vpciii and others added 2 commits June 17, 2026 09:54

vpciii merged commit 31f05ee into main Jun 17, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

docs: cross-model adversarial review trial (Antigravity/Gemini vs Claude)#16

docs: cross-model adversarial review trial (Antigravity/Gemini vs Claude)#16
vpciii merged 3 commits into
mainfrom
experiment/adversarial-cross-model-review

vpciii commented Jun 17, 2026

Uh oh!

claude Bot commented Jun 17, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

vpciii commented Jun 17, 2026

Summary

What's here

Deliberately a trial, not a ratified practice

Uh oh!

claude Bot commented Jun 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Cross-model adversarial review trial — review

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

claude Bot commented Jun 17, 2026 •

edited

Loading