From 1840325a415d219e53fd6a95461968ea716b5aec Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Thu, 21 May 2026 22:42:52 +0700
Subject: [PATCH 01/56] docs(spec): add validated SPEC-024 + accept ADR-0013

AGENTS.md operating-layer cycle planning artifacts:
- SPEC-024 drafted and validated via /user:spec-validate (5 lenses);
  revisions DEC-004 corrected + DEC-005..008 added (four-zone contract,
  install.sh jq ownership, WORKFLOW dedup, edge-case tip model, doc-map
  file-trigger row).
- ADR-0013 flipped proposed -> accepted with the After-state addendum.
- BACKLOG: ID-015 (this work), plus ID-016/017 from the SPEC-025 retro.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 _meta/BACKLOG.md                              |   4 +
 .../0013-agents-md-operating-layer.md         |   4 +-
 .../SPEC-024-agents-md-operating-layer.md     | 141 ++++++++++++++++++
 3 files changed, 148 insertions(+), 1 deletion(-)
 create mode 100644 docs/specs/SPEC-024-agents-md-operating-layer.md
diff --git a/_meta/BACKLOG.md b/_meta/BACKLOG.md
index 5da6a94..2696e1c 100644
--- a/_meta/BACKLOG.md
+++ b/_meta/BACKLOG.md
@@ -34,9 +34,13 @@ Lane = the WORKFLOW.md risk tier (`tiny` / `normal` / `full`).
 | ID-002 | Absorb skills/hooks developed in ops-toolkit into the kit (internal lane) | item e | SPEC-007 (PARKED; see its Parked note) | full | parked |
 | ID-003 | Deep orchestration scan of the copied repos vs our WORKFLOW (report + absorb plan) | item 2 | `docs/research/2026-05-20-orchestration-deep-scan.md` | normal | shipped (report delivered; recs folded into SPEC-006) |
 | ID-012 | Goal-loop fidelity (one theme, two parts). P1: stop-criteria in the `/spec` template (pin `## Verification` + `## Open questions` so specs are pointer-`/goal`-ready) -- build now, tiny lane. P2: QA gate around the autonomous loop (verify-into-loops + SPEC-006 completeness clauses + Reviewer 5) -- held until the pointer-`/goal` pattern has real runs | maintainer Q3 2026-05-20 + goal-readiness convergence 2026-05-21 | SPEC-012 | normal (P1) / full (P2) | P1 shipped (see CHANGELOG); P2 held |
+| ID-015 | AGENTS.md operating layer: tool-agnostic entrypoint + brownfield backfill lane + six-section goal projection + observable After-state spec section (ships ADR-0013) | ADR-0013 (hoangnb24/harness-experimental study, 2026-05-21) | SPEC-024 | full | speccing |
+| ID-016 | Promote the "new SPEC -> BACKLOG status row" doc-impact check to a guard (a hook or a kit-health line); it was missed across multiple cycles | SPEC-025 retro 2026-05-21 (recurrence clears the PHILOSOPHY section 5 bar) | TBD | normal | queued |
+| ID-017 | Reconcile retro-file naming across the `retro` skill body + description, CLAUDE.md, and WORKFLOW.md to the repo's actual `RETRO-YYYY-MM-DD-<slug>.md` convention | SPEC-025 retro 2026-05-21 | TBD | tiny | queued |
 Dependency notes:
 - ID-012 P1 (spec stop-criteria) shipped (SPEC-012, normal lane; see CHANGELOG); P2 (loop QA gate) is held until the pointer-`/goal` pattern has real runs; it will build on the now-shipped SPEC-006 completeness clauses + the existing verification pipeline.
 - ID-002 (internal absorption lane, SPEC-007) is parked; independent of the orchestration chain.
+- ID-016 / ID-017 came out of the SPEC-025 retro; both are kit-internal hygiene, independent of the SPEC-024 chain.
 
 ## Schema
 
diff --git a/docs/decisions/0013-agents-md-operating-layer.md b/docs/decisions/0013-agents-md-operating-layer.md
index d870b75..2f565c9 100644
--- a/docs/decisions/0013-agents-md-operating-layer.md
+++ b/docs/decisions/0013-agents-md-operating-layer.md
@@ -1,6 +1,8 @@
 # ADR-0013: AGENTS.md as the tool-agnostic operating-layer entrypoint
 
-## Status: proposed (2026-05-21).
+## Status: accepted (2026-05-21). Scope approved 2026-05-21; ships as backlog item ID-015 / SPEC-024 (full lane, all six decisions). Entry stays BACKLOG-ID-first; the freeform "griller" entry (a casual intent with no ID) is deferred per that scope call.
+
+**Addendum (2026-05-21):** SPEC-024 scope also adds an observable **After-state** section: a definition-of-done picture made of statements that are false-now / true-after and each checkable by a human or command. It lands as a `## After state` spec-template section (feeding `## Acceptance Criteria`) and is projected into the six-section goal's `Done-when`. The matching anatomy piece is already added to the personal `goal-craft` skill so it applies beyond this repo. Load-bearing rule: observable, not narrated, or it is fluff and gets cut (PHILOSOPHY: every file justifies itself).
 
 ## Context
 A study of `hoangnb24/harness-experimental` (OpenAI "harness engineering" framing) showed an operator writing a rich, multi-section `/goal` (Context to read first / Constraints / Operating rules / Validation loop / Done when / Pause if) and running it reliably against a brownfield repo. That altitude is not prompt skill. Each section is a pointer to a *named artifact* the harness installs into the consuming repo: an ordered source-of-truth list, a feature-intake/risk doc, a test matrix, a done-definition, and an "ask before" list, all anchored by a single agent entrypoint, `AGENTS.md`. The operator references structure the repo already carries; they do not invent it in the prompt.
diff --git a/docs/specs/SPEC-024-agents-md-operating-layer.md b/docs/specs/SPEC-024-agents-md-operating-layer.md
new file mode 100644
index 0000000..c338ad5
--- /dev/null
+++ b/docs/specs/SPEC-024-agents-md-operating-layer.md
@@ -0,0 +1,141 @@
+# Spec: AGENTS.md operating layer + brownfield backfill lane
+Generated: 2026-05-21
+Status: VALIDATED
+
+> Implements ADR-0013 (accepted 2026-05-21) and its After-state addendum. Backlog: ID-015 (full lane).
+> Validated 2026-05-21 via /user:spec-validate (5 lenses); revisions recorded in the Decision Log (DEC-004 corrected, DEC-005..DEC-008 added).
+> Research note: the standard brownfield 4-agent research pass was skipped because the relevant
+> surfaces were already mapped in the session that produced ADR-0013 (WORKFLOW.md, CLAUDE.md,
+> commands/assign.md, commands/spec.md, install.sh, the WORKFLOW doc-impact map). No CONTEXT.md was
+> regenerated; this spec + ADR-0013 are the implementation context. Regenerate CONTEXT.md at
+> /user:execute time only if a worker needs more than the files named per task.
+
+## Problem
+The kit out-enforces `hoangnb24/harness-experimental` (real hooks, verifier, push-blocker vs their advisory markdown) but under-ships the *legible in-repo operating layer a `/goal` can point at*:
+
+- The entrypoint is `CLAUDE.md` (+ `WORKFLOW.md`), which is Claude-Code-specific. There is no tool-agnostic front door.
+- "Pause if / ask a human" is enforced by safety hooks but never *stated* as a directive a goal can mirror.
+- There is no brownfield "review the codebase and backfill docs" entry; the cycle is greenfield-feature oriented (`/think -> /spec`).
+
+Consequence: operators hand-write the rich six-section `/goal` structure every time, with no in-repo referent, so they never reach the altitude shown in the trigger screenshot. The harness-experimental operator does not write that structure into the prompt; they reference structure the repo already carries (its `AGENTS.md` read-order, done-definition, and "ask before" list). We lack that carrier.
+
+## Solution
+
+### Approaches considered
+- **A: `AGENTS.md` canonical operating layer; `CLAUDE.md` becomes a thin CC-specific pointer (chosen).** Tradeoff: some operate-contract prose moves out of CLAUDE.md; one-time churn.
+- **B: Keep `CLAUDE.md` canonical, add `AGENTS.md` as a pointer to it.** Rejected: a Codex/Gemini agent reading `AGENTS.md` gets redirected to a CC-specific file; defeats portability.
+- **C: Ship the full harness-experimental scaffold (empty `product/`, `stories/`, `TEST_MATRIX.md`).** Rejected: violates PHILOSOPHY "every file must justify its existence" + "no phantom features". Their product *is* the scaffold; ours is not.
+
+### Chosen approach + why
+Adopt `AGENTS.md` as the tool-agnostic front door carrying the portable operate-contract (ordered read list, task loop, done-definition, explicit "Pause if" list). `CLAUDE.md` shrinks to the CC-only layer (hooks, slash commands, plugin) and points at `AGENTS.md` for the operate-contract. Add a `backfill` brownfield lane to `WORKFLOW.md`. Make the goal-crafter (`commands/assign.md`) emit the six-section operating directive, each section projecting an `AGENTS.md` artifact. Add an observable `## After state` section to the spec template and project it into the goal's `Done-when`. (ADR-0013 decisions 1, 2, 4, 5 + addendum. Decisions 3 and 6 are honored as Out-of-Scope guards below.)
+
+### Extensibility & boundaries
+- **Load-bearing dimension: number of agent runtimes.** `AGENTS.md` is plain markdown any runtime reads. Enforcement stays Claude-Code-only (the hooks). Adding a runtime means it *reads* `AGENTS.md`; it does NOT inherit the guardrails until the v3.x agent-hook work lands. The spec must not over-claim portability of enforcement.
+- **Units (each independently testable):** `AGENTS.md` (entrypoint), `CLAUDE.md` (CC layer), `WORKFLOW.md` (lanes), `commands/assign.md` (projection), `commands/spec.md` (after-state template). Each describable in <=3 sentences.
+
+### Architecture
+```
+AGENTS.md  (tool-agnostic front door)
+  | sections: ordered read list, task loop, done-definition, "Pause if"
+  |
+  +-- CLAUDE.md  (CC-specific: hooks/commands/plugin) --points to--> AGENTS.md (no restating)
+  +-- WORKFLOW.md  (lanes incl. new `backfill`) <-- AGENTS.md points here for lane selection
+  +-- commands/assign.md  (goal-crafter)
+  |        projects --> six-section /goal:
+  |          Context-to-read  <- AGENTS.md read list
+  |          Constraints      <- AGENTS.md/CLAUDE.md rules
+  |          Operating rules  <- AGENTS.md task loop
+  |          Validation loop  <- spec ## Verification
+  |          Done-when        <- AGENTS.md done-definition + spec ## After state
+  |          Pause-if         <- AGENTS.md "Pause if" list
+  +-- commands/spec.md template  (new ## After state, feeds ## Acceptance Criteria)
+```
+
+## Technical Design
+
+### Interfaces (I/O contract)
+- **`AGENTS.md`** consumes: nothing (it is the root). Produces: **four portable operate-contract zones** (ordered read list, task loop, done-definition, "Pause if" list). The goal-crafter *composes* the six-section `/goal` from these four zones plus the active spec's `## Verification` and `## After state` (and the CLAUDE.md rules), per the Architecture diagram above; the mapping is a composition, not 1:1. Invariant: the four zone names are stable; renaming one without updating `assign.md` breaks the projection.
+- **`commands/assign.md`** consumes: the resolved `AGENTS.md` sections + the `_meta/BACKLOG.md` row for `ID-NNN`. Produces: a six-section operating directive in `.claude/goals/<slug>.md`. Invariant: BACKLOG-ID-first entry (no freeform-intent path in this spec).
+- **`commands/spec.md`** template: gains a `## After state` block between `## Solution` and `## Acceptance Criteria`. Invariant: every After-state bullet is observable (checkable by a human or a command), never narrated prose.
+
+### Data model changes
+- New file `AGENTS.md` at kit root and `examples/hello-spec/AGENTS.md`.
+- `commands/spec.md` template gains a `## After state` section.
+
+### API changes (the goal-crafter contract)
+- `commands/assign.md` output shape changes from a tight contract goal (outcome/verify/scope/blocker) to the six-section operating directive, each section pointing at `AGENTS.md`. `Done-when` includes the spec's observable After-state.
+
+### UI changes
+None. The kit ships no UI.
+
+### Infrastructure changes
+- `WORKFLOW.md`: add the `backfill` lane to the lane table + the cycle. Add an `AGENTS.md` companion-doc row to the doc-impact map. The existing bolded self-maintaining rule is scoped to a new top-level *dir*; `AGENTS.md` is a top-level *file*, so TASK-007 also adds a "new top-level file" trigger row so the map self-maintains for files too.
+- `install.sh`: add the `AGENTS.md` copy tip alongside the existing `CLAUDE.md` tip. (Separately, this cycle's install dogfood exercised the settings merge/clean against a `settings.json` that already carried third-party hooks, the previously-untested path; TASK-009 owns it as a regression test and applies any jq fix the test surfaces. See DEC-004.)
+- `tests/test-meta.sh`: assert `AGENTS.md` carries the four portable zones, and that `assign.md` carries the six-section projection.
+
+## Task Breakdown
+Each task is atomic: implementable in one session, fits in ~50% of a context window.
+
+### Phase 1: Operating layer
+- [ ] TASK-001: Write `AGENTS.md` at kit root: ordered read list, task loop, done-definition (which MUST include "review recorded + report written, and the final response says what changed and what was not attempted"), and an explicit "Pause if / ask a human" list (architecture direction, source-of-truth hierarchy, validation removal, risk-classification change, privacy/security). State plainly that enforcement is Claude-Code-only; under other runtimes `AGENTS.md` is advisory. - AC: file exists; `grep -q 'Pause if' AGENTS.md`; the four portable zones (read list, task loop, done-definition, Pause-if) are present and named so `assign.md` can project them into the six-section goal.
+- [ ] TASK-002: Make the CC-layer docs point at `AGENTS.md` for the operate-contract (replace, don't duplicate). Two files: (a) kit-root `CLAUDE.md` (NOT `examples/hello-spec/CLAUDE.md`, which TASK-003 handles) gets/keeps a one-line pointer to `AGENTS.md`; (b) `WORKFLOW.md` is the file that actually carries the operate-contract prose today (`## Required reading` ordered list + `## Completion contract` done-definition), so reconcile it to point at `AGENTS.md` as the read-order/done source rather than restating it. Pick one direction (`AGENTS.md` canonical, `WORKFLOW.md` points) and apply it to both. - AC: neither `CLAUDE.md` nor `WORKFLOW.md` restates the ordered read-order + done-definition that now live in `AGENTS.md`; both reference `AGENTS.md`; `WORKFLOW.md`'s required-reading list names `AGENTS.md` first.
+- [ ] TASK-003: Add `examples/hello-spec/AGENTS.md` as the downstream template (realistic placeholder content, same shape). - AC: file exists; the hello-spec README/template note references it.
+
+### Phase 2: Lanes + projection
+- [ ] TASK-004: Add a `backfill` brownfield lane to `WORKFLOW.md` (lane table + cycle): review an existing codebase and write the operating-layer docs without changing application behavior; spec-optional, doc-output, no app-code edits. - AC: `grep -q backfill WORKFLOW.md`; lane row + a one-line description present.
+- [ ] TASK-005: Make `commands/assign.md` emit the six-section operating directive, each section projecting an `AGENTS.md` artifact, with `Done-when` including the spec's observable After-state. - AC: `assign.md` documents the six-section projection and the After-state in `Done-when`; the contract-goal-only shape is replaced, not left alongside.
+- [ ] TASK-006: Add `## After state` to the `commands/spec.md` template (between `## Solution` and `## Acceptance Criteria`), with the observable-not-narrated rule inline. - AC: `grep -q '## After state' commands/spec.md`; the rule text is present.
+
+### Phase 3: Sync (the doc-impact map)
+- [ ] TASK-007: Update the `WORKFLOW.md` doc-impact map: add a "new top-level file" trigger row (the current bolded self-maintaining rule covers only a new top-level *dir*, so a top-level *file* like `AGENTS.md` slips through), add the `AGENTS.md` companion-doc row, and a `backfill`-lane reference. - AC: the map names `AGENTS.md` and carries a top-level-file trigger row.
+- [ ] TASK-008: Update `README.md` "Project structure" and `docs/architecture.md` cross-refs to include `AGENTS.md` and the `backfill` lane. - AC: both name `AGENTS.md`.
+- [ ] TASK-009: Update `tests/test-meta.sh` to assert `AGENTS.md` exists and carries the four portable zones, and that `assign.md` carries the six-section projection. Add a regression test that runs `install.sh` into a throwaway HOME whose `settings.json` already contains a third-party hook (the dogfood case), and asserts the user's hook survives the merge and the result is valid JSON. If that test fails on the current `install.sh`, apply the minimal jq fix to the merge/clean within this task. - AC: `bash tests/test-meta.sh` exercises the four-zone + projection checks AND the merge-with-existing-hooks case, and passes.
+- [ ] TASK-010: Version bump (`VERSION`, `.claude-plugin/plugin.json`, `tool.toml`), `CHANGELOG.md` entry (include the install.sh jq fix as a fix line), and update the `install.sh` `AGENTS.md` tip. - AC: versions in sync; CHANGELOG records the feature + the install fix.
+
+## Acceptance Criteria (global)
+- [ ] All tasks pass their individual acceptance criteria.
+- [ ] `AGENTS.md` is the front door (root + examples) carrying the four portable zones; `CLAUDE.md` and `WORKFLOW.md` point at it (no restating); `backfill` lane present; `assign.md` projects the six sections; `spec.md` template has `## After state`.
+- [ ] No regressions: `bash tests/test-meta.sh && bash tests/test-hooks.sh` both exit 0.
+
+## Verification
+`bash tests/test-meta.sh && bash tests/test-hooks.sh` exit 0 AND `grep -q 'Pause if' AGENTS.md` AND `grep -q backfill WORKFLOW.md` AND `grep -q '## After state' commands/spec.md`
+
+## After state
+Observable; each bullet is false today and a real check once shipped.
+- [ ] `AGENTS.md` exists at the kit root and `grep -q 'Pause if' AGENTS.md` passes. (Today: no `AGENTS.md` anywhere in the kit.)
+- [ ] `CLAUDE.md` points at `AGENTS.md` and no longer restates the read-order / done-definition / pause list.
+- [ ] `WORKFLOW.md` lists a `backfill` lane. (Today: only `tiny` / `normal` / `full` / `bug`.)
+- [ ] `commands/assign.md` produces a six-section directive (Context-to-read / Constraints / Operating rules / Validation loop / Done-when / Pause-if), not a one-line contract goal.
+- [ ] `commands/spec.md` template contains a `## After state` section, and a newly generated spec carries observable after-state bullets.
+- [ ] `examples/hello-spec/AGENTS.md` exists so a downstream repo inherits the front door.
+
+## Edge Cases
+1. A non-CC runtime (Codex/Gemini) reads `AGENTS.md` but gets no hook enforcement: `AGENTS.md` must state enforcement is CC-only so no operator assumes the guardrails are portable.
+2. A downstream repo already has its own `AGENTS.md`: `install.sh` does not copy `AGENTS.md` at all. Like `CLAUDE.md`, it only prints a copy *tip* (`install.sh:299`), so there is no clobber path to guard. The real requirement is that the printed tip read as copy-if-absent, not a blind overwrite (it must not imply the installer will replace an existing `AGENTS.md`). (Commands are symlinked and rules are copy-with-skip; `AGENTS.md` follows the tip model, not either of those.)
+3. An After-state bullet that cannot be made checkable: `/spec-validate` and `/review` must flag it and require cutting or rewriting it (the observable-not-narrated rule is enforced at the gate, not just stated).
+
+## Failure modes
+| Failure class | Detection signal | Mitigation / recovery |
+|---|---|---|
+| `AGENTS.md` ↔ `CLAUDE.md` / `WORKFLOW.md` drift (a CC-layer doc restates the contract) | doc-verifier or `/review` finds duplicated read-order/done/pause in `CLAUDE.md` or `WORKFLOW.md` | `AGENTS.md` is the single source; both `CLAUDE.md` and `WORKFLOW.md` point only (replace, don't duplicate) |
+| Over-claiming portable enforcement | a reviewer reads "works with Codex" as "enforced under Codex" | `AGENTS.md` states advisory-only under non-CC runtimes; PHILOSOPHY honesty rule |
+| After-state rots into aspirational fluff | bullets are not checkable on re-read | observable-not-narrated rule + the spec-validate/review gate cut non-checkable bullets |
+| `backfill` lane silently edits app code | a backfill run changes behavior, not just docs | lane definition forbids app-behavior change; pause-if covers it |
+
+## Out of Scope
+- **Freeform "griller" entry** (a casual intent with no BACKLOG ID): BACKLOG-ID-first stays canonical (the 2026-05-21 scope call). Why: preserves the detector/mutator + traceability discipline.
+- **Portable enforcement / agent-hooks for Codex/Gemini** (ADR-0013 decision 6): deferred to the v3.x multi-runtime work. We add portable *guidance*, not portable *guardrails*.
+- **Empty `product/` / `stories/` / `TEST_MATRIX.md` scaffolds** (ADR-0013 decision 3): created only when real content exists.
+
+## Decision Log
+- DEC-001: `AGENTS.md` canonical, `CLAUDE.md` a thin pointer. Rationale: tool-agnostic front door + no duplication. Alternatives B (CLAUDE.md canonical) and C (full empty scaffold) rejected. Who: ADR-0013 (human-accepted).
+- DEC-002: `backfill` is a new lane, not a reuse of `normal`. Rationale: brownfield doc-backfill has different inputs (existing code) and a no-app-change constraint. Who: ADR-0013 decision 4.
+- DEC-003: After-state is a spec section + a `Done-when` projection target, governed by observable-not-narrated. Rationale: a checkable picture both human and agent verify; fluff is rejected by PHILOSOPHY. Who: ADR-0013 addendum (human-requested).
+- DEC-004: The install dogfood for this cycle exercised the settings merge/clean against a `settings.json` that already carried third-party hooks, the previously-untested path. TASK-009 owns this end to end: it adds the regression test and applies any minimal jq fix the test surfaces. (Earlier framing claimed the bugs "were already fixed"; `install.sh` is unchanged on this branch, so the work is owned by TASK-009, not pre-done.) Rationale: a merge that drops or crashes on a real user's existing hooks blocks install. Who: auto (raised in dogfood), corrected in spec-validate.
+- DEC-005: `AGENTS.md` carries **four** portable zones (read list, task loop, done-definition, Pause-if); the six-section `/goal` is a *composition* of those four plus the spec's `## Verification` and `## After state`, not a 1:1 mapping. Rationale: the Architecture diagram already shows two goal sections sourced from the spec, so the earlier "six contract zones in `AGENTS.md` / 1:1" framing (Interfaces, TASK-001 AC, TASK-009) was internally contradictory and unsatisfiable. Who: spec-validate Reviewer 5/4.
+- DEC-006: The operate-contract prose (ordered read-order + done-definition) lives in `WORKFLOW.md` today, not kit-root `CLAUDE.md`. TASK-002 therefore reconciles `WORKFLOW.md` (point at `AGENTS.md`), not just `CLAUDE.md`, and a failure-mode row now covers `AGENTS.md`↔`WORKFLOW.md` drift. Rationale: without this the dedup is a no-op on the named file and the real three-way overlap ships. Who: spec-validate Reviewer 3/5.
+- DEC-007: Edge Case 2 reframed to the tip model: `install.sh` never copies `AGENTS.md` (tip-only, like `CLAUDE.md`), so the "must not clobber" guard described a mechanism that does not exist. Rationale: accuracy; the real requirement is the tip wording. Who: spec-validate Reviewer 2.
+- DEC-008: TASK-007 adds a "new top-level file" trigger row to the doc-impact map. Rationale: the bolded self-maintaining rule is scoped to a new top-level *dir*; a top-level *file* like `AGENTS.md` was not actually covered, so line-71's "already requires" claim was false. Who: spec-validate Reviewer 4.
+
+## Open questions
+(none; a /goal loop appends here if it hits a decision this spec does not cover, then stops)

From 7c5015c6110ef67384c075e0e8ffe5794aef837f Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Thu, 21 May 2026 22:45:06 +0700
Subject: [PATCH 02/56] docs(agents): add AGENTS.md operating-layer front door

Tool-agnostic front door carrying the four portable operate-contract
zones (read-in-order list, task loop, done-definition, Pause-if list)
that commands/assign.md projects into the six-section /goal. States
plainly that enforcement is Claude-Code-only (the hooks); under other
runtimes AGENTS.md is advisory only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 AGENTS.md | 89 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 89 insertions(+)
 create mode 100644 AGENTS.md

diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000..6b4b19b
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,89 @@
+# AGENTS.md: the operating layer
+
+> Tool-agnostic front door. Any agent runtime (Claude Code, Codex, Gemini, a
+> human) reads this first to learn how work is done in this repo. It carries the
+> portable operate-contract: what to read, how to run a unit of work, when work
+> is done, and when to stop and ask.
+
+## Enforcement boundary (read this first)
+
+**This file is a contract, not a guardrail.** Enforcement is Claude-Code-only: the
+hooks (safety-gate, push-to-main blocker, anti-rationalization Stop hook, the
+verification pipeline) are what actually block bad outcomes, and they run only
+under Claude Code. Under any other runtime (Codex, Gemini, a bare LLM)
+`AGENTS.md` is **advisory only**: it tells an agent what to do, but nothing
+enforces it. Do not assume the guardrails are portable. They are not, until the
+v3.x multi-runtime agent-hook work lands. See `docs/PHILOSOPHY.md` (honesty rule:
+never over-claim portable enforcement) and `CLAUDE.md` for the CC-specific layer
+(hooks, slash commands, plugin).
+
+The rest of this file is four portable zones. The goal-crafter
+(`commands/assign.md`) projects them into a six-section `/goal` (see "How a goal
+is composed" at the end).
+
+---
+
+## 1. Read in this order
+
+Orient before you touch anything. Read top to bottom; stop when you have enough.
+
+1. **AGENTS.md** (this file) - how work is done here; the operate-contract.
+2. **CLAUDE.md** - the Claude-Code layer: stack, structure, rules, hooks, commands, plugin.
+3. **docs/specs/SPEC-NNN-<slug>.md** - the active spec; the shared contract for the cycle. Read its `## Verification` and `## After state` before implementing.
+4. **docs/architecture.md** / **WORKFLOW.md** - reference, not required per task. Read `docs/architecture.md` for how the pieces fit; read `WORKFLOW.md` for the lanes and the gate at each phase boundary.
+
+## 2. Task loop
+
+How to do one unit of work. The smallest verifiable increment, verified, committed.
+
+1. **Size the lane.** Pick `tiny` / `normal` / `full` / `bug` / `backfill` per `WORKFLOW.md`. When in doubt between two lanes, take the heavier one.
+2. **Read the spec and its acceptance criteria.** For a spec-driven task: the active spec's task row, its AC, its `## Verification`, and its `## After state`. No spec (tiny lane): the one obvious edit.
+3. **Implement the smallest verifiable increment.** One logical change. No speculative features, no premature abstraction; clarity over cleverness.
+4. **Verify.** Run the spec's `## Verification` command (or the lane's check). Do not claim a result you did not run.
+5. **Commit.** Conventional commit, one logical change. No spec/ticket IDs in the subject line.
+
+If you cannot make progress, see zone 4 (Pause if) and stop with a named blocker note. Do not churn.
+
+## 3. Done means
+
+A task or goal is done only when **its acceptance criteria are met AND the
+verifier actually ran the check**, not when you claim they pass. Self-reported
+"done" is not proof.
+
+Concretely, done means: **acceptance criteria met, the check actually ran (not
+just asserted), review recorded + report written, and the final response says
+what changed and what was not attempted.** If you could not run the check, report
+that plainly; the anti-rationalization hook is the backstop for premature
+completion under Claude Code, but the honesty obligation is yours under any
+runtime.
+
+## 4. Pause if (ask a human)
+
+Stop and ask a human before acting on any of these. These are decisions with
+direction or irreversible cost that a goal loop must not make on its own.
+
+- **Architecture direction** - a change to how the pieces fit, a new component, an interface or data-model shape.
+- **Source-of-truth hierarchy** - which file or section is canonical when two disagree (for example, moving the operate-contract between `AGENTS.md`, `CLAUDE.md`, and `WORKFLOW.md`).
+- **Validation removal** - weakening, deleting, or bypassing a test, an assertion, a hook, or any guardrail.
+- **Risk-classification change** - moving work to a lighter lane, or narrowing a `full`-lane trigger (auth, authz, hooks, data model, data loss, audit/security, external provider, API contract, migration).
+- **Privacy / security** - secrets, credentials, access scope, anything that touches what data leaves the repo or who can reach it.
+
+When you pause: write the named blocker, state the decision you are not making and why, and stop.
+
+---
+
+## How a goal is composed (for `commands/assign.md`)
+
+The goal-crafter projects these four zones into a six-section `/goal`. The mapping
+is a **composition, not 1:1**: two of the six sections come from the active spec,
+not from this file. Keep the four zone names stable; renaming one without updating
+`commands/assign.md` breaks the projection.
+
+| `/goal` section | Source |
+|---|---|
+| Context-to-read | AGENTS.md zone 1 (Read in this order) |
+| Constraints | CLAUDE.md / AGENTS.md rules |
+| Operating rules | AGENTS.md zone 2 (Task loop) |
+| Validation loop | the active spec's `## Verification` |
+| Done-when | AGENTS.md zone 3 (Done means) + the active spec's `## After state` |
+| Pause-if | AGENTS.md zone 4 (Pause if) |

From c343d9297e036c5e32939dd2ec7aebb34734a9b2 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Thu, 21 May 2026 22:48:04 +0700
Subject: [PATCH 03/56] docs(workflow): point CC-layer docs at AGENTS.md

AGENTS.md is the canonical operate-contract front door. Remove the
restated ordered read-order and core done-definition from WORKFLOW.md
(now pointers to AGENTS.md zone 1 and zone 3) and add the AGENTS.md
front-door pointer to CLAUDE.md. WORKFLOW.md required-reading now names
AGENTS.md first; kit-specific completeness clauses are kept.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 CLAUDE.md   |  4 +++-
 WORKFLOW.md | 18 +++++++++++-------
 2 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/CLAUDE.md b/CLAUDE.md
index e93ec6a..9640766 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -57,7 +57,9 @@ The kit unified the spec location onto `docs/specs/SPEC-NNN-<slug>.md` for both
 
 ## Workflow
 
-The kit eats its own dog food. The full lifecycle, the risk-tier lanes, and the gate at each phase boundary live in one place: the [`WORKFLOW.md`](WORKFLOW.md) contract. Read it after this file. For kit-on-kit work, spec drafts live in `docs/specs/` (see Spec location above), not `.planning/`.
+The operate-contract (read-order, task loop, done-definition, the "Pause if" list) is canonical in [`AGENTS.md`](AGENTS.md), the tool-agnostic front door. Read it first; this file is the Claude-Code layer on top of it. Do not restate the operate-contract here; point at `AGENTS.md`.
+
+The kit eats its own dog food. The full lifecycle, the risk-tier lanes, and the gate at each phase boundary live in one place: the [`WORKFLOW.md`](WORKFLOW.md) contract. Read it after `AGENTS.md` and this file. For kit-on-kit work, spec drafts live in `docs/specs/` (see Spec location above), not `.planning/`.
 
 `/user:kit-health` is the maintainer-only self-assessment against PHILOSOPHY.md. Run it before tagging.
 
diff --git a/WORKFLOW.md b/WORKFLOW.md
index b2a890c..b672f77 100644
--- a/WORKFLOW.md
+++ b/WORKFLOW.md
@@ -7,9 +7,12 @@
 > hook, and the verification pipeline.
 
 ## Required reading (in order)
-1. CLAUDE.md            - project context: stack, structure, rules
-2. docs/specs/SPEC-NNN-<slug>.md - the active spec; the shared contract for the cycle
-3. docs/architecture.md - how the pieces fit (reference; not required per task)
+`AGENTS.md` is the front door and owns the read-order. Read it first, then this
+list; do not restate the order here. See `AGENTS.md` zone 1 ("Read in this order").
+1. AGENTS.md            - the operate-contract: read-order, task loop, done-definition, "Pause if"
+2. CLAUDE.md            - the Claude-Code layer: stack, structure, rules, hooks, commands, plugin
+3. docs/specs/SPEC-NNN-<slug>.md - the active spec; the shared contract for the cycle
+4. docs/architecture.md - how the pieces fit (reference; not required per task)
 
 ## Size the work first (risk-tiered intake)
 Pick a lane before you start. Smaller work skips ceremony.
@@ -71,10 +74,11 @@ session start
 **Detector/mutator split.** `/user:start` and `/user:next` only read and render; `/user:assign` is the only mutator. **Activator-agnostic.** `/user:assign` writes only the `.claude/goals/<slug>.md` draft and surfaces its body; activation (starting the loop) is done by whatever primitive is present (the built-in `/goal`, the `ralph-loop` plugin, or the `goal-craft` skill). The kit NEVER writes `.claude/last-goal.md`; if no activator exists, the draft is a plain reusable file. **"Even the goal loop follows WORKFLOW"** is delivered honestly: the safety subset is hard-enforced by existing hooks (anti-rationalization, the verification pipeline, the push-to-main blocker); decision/doc completeness is warned + logged to `~/.claude/dwarves-kit/logs/completeness.log` and reviewed at `/user:ship` + `/user:retro`, not hard-blocked mid-loop (PHILOSOPHY rejects hard-gating process completeness).
 
 ## Completion contract
-A task is done only when its acceptance criteria are met and the verifier has
-actually run the tests, not when you claim they pass. If you cannot run the
-check, report that plainly; the anti-rationalization hook is the backstop for
-premature completion. Self-reported "done" is not proof; the task-verifier is.
+The done-definition is canonical in `AGENTS.md` zone 3 ("Done means"); do not
+restate it here. In the kit, the task-verifier is what proves "done" (self-reported
+"done" is not proof), and the anti-rationalization hook is the backstop for
+premature completion. The clauses below add kit-specific completeness checks on top
+of that done-definition.
 
 ### Completeness clauses (warn + log, reviewed at ship)
 Two self-check clauses run during Build/Reflect. Both WARN and LOG to `~/.claude/dwarves-kit/logs/completeness.log` (the `spec-drift-guard` logging shape); neither hard-blocks. `/user:ship` and `/user:retro` review that log at the gate. Hard blocks stay reserved for the safety subset (PHILOSOPHY rejects hard-gating process completeness).

From 5e4bfc1ea143b41fbb88cdfb2475e0e0b3348736 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Thu, 21 May 2026 22:50:41 +0700
Subject: [PATCH 04/56] docs(agents): add hello-spec AGENTS.md downstream
 template

Mirror the kit-root AGENTS.md four-zone shape (Read in this order,
Task loop, Done means, Pause if) as a downstream template for projects
consuming dwarves-kit, voiced for the spm sample CLI rather than copied
verbatim. Keeps the CC-only enforcement caveat and the goal-composition
note. Reference it from the hello-spec README (front-of-reading-order).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 examples/hello-spec/AGENTS.md | 87 +++++++++++++++++++++++++++++++++++
 examples/hello-spec/README.md | 16 ++++---
 2 files changed, 97 insertions(+), 6 deletions(-)
 create mode 100644 examples/hello-spec/AGENTS.md

diff --git a/examples/hello-spec/AGENTS.md b/examples/hello-spec/AGENTS.md
new file mode 100644
index 0000000..1ede012
--- /dev/null
+++ b/examples/hello-spec/AGENTS.md
@@ -0,0 +1,87 @@
+# AGENTS.md: the operating layer (spm)
+
+> Tool-agnostic front door for the `spm` project. Any agent runtime (Claude Code,
+> Codex, Gemini, a human) reads this first to learn how work is done in this repo.
+> It carries the portable operate-contract: what to read, how to run a unit of
+> work, when work is done, and when to stop and ask. This is the downstream
+> template that ships with dwarves-kit; replace the `spm` specifics with your own.
+
+## Enforcement boundary (read this first)
+
+**This file is a contract, not a guardrail.** `spm` uses dwarves-kit, and the
+kit's enforcement is Claude-Code-only: the hooks (safety-gate, push-to-main
+blocker, anti-rationalization Stop hook, the verification pipeline) are what
+actually block bad outcomes, and they run only under Claude Code. Under any other
+runtime (Codex, Gemini, a bare LLM) this file is **advisory only**: it tells an
+agent what to do, but nothing enforces it. Do not assume the guardrails are
+portable. See `CLAUDE.md` for the Claude-Code layer (stack, rules, where the spec
+lives).
+
+The rest of this file is four portable zones. A goal-crafter projects them into a
+six-section operating directive (see "How a goal is composed" at the end).
+
+---
+
+## 1. Read in this order
+
+Orient before you touch anything. Read top to bottom; stop when you have enough.
+
+1. **AGENTS.md** (this file) - how work is done here; the operate-contract.
+2. **CLAUDE.md** - the project anchor: what `spm` is, the stack (Python 3.12 / uv / pytest / ruff), code-quality rules, where specs live.
+3. **docs/specs/SPEC-NNN-<slug>.md** - the active spec; the shared contract for the cycle. Read its `## Verification` and `## After state` before implementing. (Today that is `docs/specs/SPEC-001-version-flag.md`.)
+4. **WORKFLOW.md** - reference, not required per task. Read it for the lanes and the gate at each phase boundary.
+
+## 2. Task loop
+
+How to do one unit of work. The smallest verifiable increment, verified, committed.
+
+1. **Size the lane.** Pick `tiny` / `normal` / `full` / `bug` / `backfill` per `WORKFLOW.md`. When in doubt between two lanes, take the heavier one.
+2. **Read the spec and its acceptance criteria.** For a spec-driven task: the active spec's task row, its AC, its `## Verification`, and its `## After state`. No spec (tiny lane): the one obvious edit.
+3. **Implement the smallest verifiable increment.** One logical change. No speculative features (`spm` does install/freeze/list; new subcommands need a spec), no premature abstraction (no `BaseCommand` until there are 6 commands); clarity over cleverness.
+4. **Verify.** Run the spec's `## Verification` command, or the lane's check: `uv run pytest && uv run ruff check .`. Do not claim a result you did not run.
+5. **Commit.** Conventional commit, one logical change. No spec/ticket IDs in the subject line.
+
+If you cannot make progress, see zone 4 (Pause if) and stop with a named blocker note. Do not churn.
+
+## 3. Done means
+
+A task or goal is done only when **its acceptance criteria are met AND the
+verifier actually ran the check**, not when you claim they pass. Self-reported
+"done" is not proof.
+
+Concretely, done means: **acceptance criteria met, the check actually ran (not
+just asserted), review recorded + report written, and the final response says
+what changed and what was not attempted.** If you could not run the check, report
+that plainly; under Claude Code the anti-rationalization hook is the backstop for
+premature completion, but the honesty obligation is yours under any runtime.
+
+## 4. Pause if (ask a human)
+
+Stop and ask a human before acting on any of these. These are decisions with
+direction or irreversible cost that a goal loop must not make on its own.
+
+- **Architecture direction** - a new subcommand, a new module, a change to the argparse dispatch shape, or any new public interface for the CLI.
+- **Source-of-truth hierarchy** - which file or section is canonical when two disagree (for example, the operate-contract split across `AGENTS.md`, `CLAUDE.md`, and `WORKFLOW.md`).
+- **Validation removal** - weakening, deleting, or skipping a test, a ruff rule, or any check; suppressing a `pip` subprocess error instead of propagating its exit code.
+- **Risk-classification change** - moving work to a lighter lane, or narrowing a `full`-lane trigger (anything touching how `pip` is invoked, the wheel build, the published package, or `pyproject.toml` dependencies).
+- **Privacy / security** - secrets, credentials, access scope, anything that touches what data leaves the repo or who can reach it.
+
+When you pause: write the named blocker, state the decision you are not making and why, and stop.
+
+---
+
+## How a goal is composed
+
+A goal-crafter projects these four zones into a six-section operating directive.
+The mapping is a **composition, not 1:1**: two of the six sections come from the
+active spec, not from this file. Keep the four zone names stable; renaming one
+breaks the projection.
+
+| Goal section | Source |
+|---|---|
+| Context-to-read | AGENTS.md zone 1 (Read in this order) |
+| Constraints | CLAUDE.md / AGENTS.md rules |
+| Operating rules | AGENTS.md zone 2 (Task loop) |
+| Validation loop | the active spec's `## Verification` |
+| Done-when | AGENTS.md zone 3 (Done means) + the active spec's `## After state` |
+| Pause-if | AGENTS.md zone 4 (Pause if) |
diff --git a/examples/hello-spec/README.md b/examples/hello-spec/README.md
index 10b063a..143cf05 100644
--- a/examples/hello-spec/README.md
+++ b/examples/hello-spec/README.md
@@ -4,11 +4,14 @@ A small, self-contained example showing what dwarves-kit produces when you run i
 
 ## What this shows
 
-A hypothetical Python CLI called `spm` ("simple package manager") needs a `--version` flag. This directory contains the three files dwarves-kit would generate or rely on to ship that change end-to-end. Each is a normal output, not a mock.
+A hypothetical Python CLI called `spm` ("simple package manager") needs a `--version` flag. This directory contains the files dwarves-kit would generate or rely on to ship that change end-to-end. Each is a normal output, not a mock.
 
-The point: see what `docs/specs/SPEC-001-version-flag.md` actually looks like; see how `CLAUDE.md` anchors a contractor's session; see how the kit's commands chain.
+The point: see what `docs/specs/SPEC-001-version-flag.md` actually looks like; see how `AGENTS.md` is the tool-agnostic front door a downstream repo inherits; see how `CLAUDE.md` anchors a contractor's session; see how the kit's commands chain.
 
-## The three files
+## The files
+
+### `AGENTS.md`
+The downstream front door. Tool-agnostic operating layer (read-order, task loop, done-definition, "Pause if" list) that any agent runtime reads first. This is the realistic-placeholder template a project inherits from dwarves-kit; `CLAUDE.md` sits under it as the Claude-Code-specific layer. Enforcement (hooks, verifier) is Claude-Code-only; under other runtimes `AGENTS.md` is advisory.
 
 ### `CLAUDE.md`
 Project anchor. Read by Claude Code at session start, by `/user:spec` for stack detection, by every worker subagent in `/user:execute`. Tells Claude what the project is, what the stack is, where the spec lives, and what code-quality rules apply.
@@ -49,9 +52,10 @@ PASS -> next task; FAIL:fixable -> fix-agent (max 2 retries); FAIL:escalate -> h
 
 ## Reading order
 
-1. Read `CLAUDE.md` first; it's the project anchor a contractor sees first.
-2. Then `docs/specs/SPEC-001-version-flag.md`: the actual feature plan.
-3. Notice: no `--version` code yet. The spec is the input to `/user:execute`. The example stops at "spec ready to build", not "feature implemented", to keep the example readable.
+1. Read `AGENTS.md` first; it's the tool-agnostic front door (read-order, task loop, done-definition, Pause-if).
+2. Then `CLAUDE.md`: the Claude-Code project anchor that sits under the front door.
+3. Then `docs/specs/SPEC-001-version-flag.md`: the actual feature plan.
+4. Notice: no `--version` code yet. The spec is the input to `/user:execute`. The example stops at "spec ready to build", not "feature implemented", to keep the example readable.
 
 ---
 

From 81929c65ae6b1163ac551a9d4ee22834b2d944ed Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Thu, 21 May 2026 22:52:52 +0700
Subject: [PATCH 05/56] docs(spec): mark operating-layer tasks verified

TASK-001..003 (Phase 1) passed task-verifier; SPEC-024 checklist updated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 docs/specs/SPEC-024-agents-md-operating-layer.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/specs/SPEC-024-agents-md-operating-layer.md b/docs/specs/SPEC-024-agents-md-operating-layer.md
index c338ad5..e26879e 100644
--- a/docs/specs/SPEC-024-agents-md-operating-layer.md
+++ b/docs/specs/SPEC-024-agents-md-operating-layer.md
@@ -77,8 +77,8 @@ None. The kit ships no UI.
 Each task is atomic: implementable in one session, fits in ~50% of a context window.
 
 ### Phase 1: Operating layer
-- [ ] TASK-001: Write `AGENTS.md` at kit root: ordered read list, task loop, done-definition (which MUST include "review recorded + report written, and the final response says what changed and what was not attempted"), and an explicit "Pause if / ask a human" list (architecture direction, source-of-truth hierarchy, validation removal, risk-classification change, privacy/security). State plainly that enforcement is Claude-Code-only; under other runtimes `AGENTS.md` is advisory. - AC: file exists; `grep -q 'Pause if' AGENTS.md`; the four portable zones (read list, task loop, done-definition, Pause-if) are present and named so `assign.md` can project them into the six-section goal.
-- [ ] TASK-002: Make the CC-layer docs point at `AGENTS.md` for the operate-contract (replace, don't duplicate). Two files: (a) kit-root `CLAUDE.md` (NOT `examples/hello-spec/CLAUDE.md`, which TASK-003 handles) gets/keeps a one-line pointer to `AGENTS.md`; (b) `WORKFLOW.md` is the file that actually carries the operate-contract prose today (`## Required reading` ordered list + `## Completion contract` done-definition), so reconcile it to point at `AGENTS.md` as the read-order/done source rather than restating it. Pick one direction (`AGENTS.md` canonical, `WORKFLOW.md` points) and apply it to both. - AC: neither `CLAUDE.md` nor `WORKFLOW.md` restates the ordered read-order + done-definition that now live in `AGENTS.md`; both reference `AGENTS.md`; `WORKFLOW.md`'s required-reading list names `AGENTS.md` first.
+- [x] TASK-001 (DONE 7c5015c, verified): Write `AGENTS.md` at kit root: ordered read list, task loop, done-definition (which MUST include "review recorded + report written, and the final response says what changed and what was not attempted"), and an explicit "Pause if / ask a human" list (architecture direction, source-of-truth hierarchy, validation removal, risk-classification change, privacy/security). State plainly that enforcement is Claude-Code-only; under other runtimes `AGENTS.md` is advisory. - AC: file exists; `grep -q 'Pause if' AGENTS.md`; the four portable zones (read list, task loop, done-definition, Pause-if) are present and named so `assign.md` can project them into the six-section goal.
+- [x] TASK-002 (DONE c343d92, verified): Make the CC-layer docs point at `AGENTS.md` for the operate-contract (replace, don't duplicate). Two files: (a) kit-root `CLAUDE.md` (NOT `examples/hello-spec/CLAUDE.md`, which TASK-003 handles) gets/keeps a one-line pointer to `AGENTS.md`; (b) `WORKFLOW.md` is the file that actually carries the operate-contract prose today (`## Required reading` ordered list + `## Completion contract` done-definition), so reconcile it to point at `AGENTS.md` as the read-order/done source rather than restating it. Pick one direction (`AGENTS.md` canonical, `WORKFLOW.md` points) and apply it to both. - AC: neither `CLAUDE.md` nor `WORKFLOW.md` restates the ordered read-order + done-definition that now live in `AGENTS.md`; both reference `AGENTS.md`; `WORKFLOW.md`'s required-reading list names `AGENTS.md` first.
 - [ ] TASK-003: Add `examples/hello-spec/AGENTS.md` as the downstream template (realistic placeholder content, same shape). - AC: file exists; the hello-spec README/template note references it.
 
 ### Phase 2: Lanes + projection

From f9c04cf2bda52bccb0044fa0d648a88886f0da86 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Thu, 21 May 2026 22:54:01 +0700
Subject: [PATCH 06/56] feat(workflow): add backfill brownfield lane

Brownfield lane: review an existing codebase and write the
operating-layer docs (AGENTS.md / CLAUDE.md / specs) without
changing application behavior. Doc-output only, no app-code edits,
/spec optional. Consistent with the lanes list in AGENTS.md zone 2.
The lane explicitly forbids app-behavior change, which the
"backfill lane silently edits app code" failure-mode row depends on.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 WORKFLOW.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/WORKFLOW.md b/WORKFLOW.md
index b672f77..8492aed 100644
--- a/WORKFLOW.md
+++ b/WORKFLOW.md
@@ -23,6 +23,7 @@ Pick a lane before you start. Smaller work skips ceremony.
 | normal | one bounded feature or fix | /spec, /execute, /review, /ship |
 | full   | touches auth, authz, hooks, data model, data loss, audit/security, an external provider, an API contract, a migration, or weakens validation | /think, /spec, /spec-validate, /execute, /review-team, /docs, /ship, /retro |
 | bug    | a defect, regression, or failing test (not a new feature) | /debug (root cause before any fix), then /review |
+| backfill | brownfield: review an existing codebase and write the operating-layer docs (AGENTS.md / CLAUDE.md / specs) | review the code, write the docs. Doc-output only; no app-behavior change, no app-code edits. /spec optional. |
 
 When in doubt between two lanes, take the heavier one. Anything in the full-lane
 trigger list uses the full lane unless you explicitly narrow the scope and say why.
@@ -62,7 +63,7 @@ session start
                           for whatever goal-loop activator is present, hand off to the
                           lane's first command. Mutator; does NOT execute, never writes
                           last-goal.md.
-  -> the lane runs        tiny | normal | full (see the cycle table above)
+  -> the lane runs        tiny | normal | full | bug | backfill (see the lane table above)
        normal/full -> /user:spec -> /user:spec-validate -> /user:execute (verify pipeline)
                       (opt-in: /user:devs-team + /user:visual-team before spec; /user:test-plan before execute;
                        /user:ui-design for downstream UI work, after /user:design)

From b7bb3ee547a2c31c9c263a2ee1d1ead994739580 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Thu, 21 May 2026 22:58:40 +0700
Subject: [PATCH 07/56] feat(assign): project six-section goal from AGENTS.md

Replace the four-field contract-goal shape in Step 3 (Objective / Scope
fence / Termination-on-blocker / Verification) with a six-section
operating directive, each section projecting from an AGENTS.md zone or
the active spec, matching AGENTS.md's "How a goal is composed" table:

  Context-to-read <- AGENTS.md zone 1 (Read in this order)
  Constraints     <- AGENTS.md / CLAUDE.md rules + scope fence
  Operating rules <- AGENTS.md zone 2 (Task loop)
  Validation loop <- the active spec's ## Verification
  Done-when       <- AGENTS.md zone 3 + the spec's ## After state
  Pause-if        <- AGENTS.md zone 4 (Pause if)

Done-when quotes the spec's observable ## After state bullets (falls back
to AGENTS.md "Done means" when a spec lacks the section). Scope fence
folds into Constraints; termination-on-blocker folds into Pause-if. Step 4
now points at the six-section directive. Spec-first opening move kept.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 commands/assign.md | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/commands/assign.md b/commands/assign.md
index 3be3c95..3d491fd 100644
--- a/commands/assign.md
+++ b/commands/assign.md
@@ -14,15 +14,18 @@ Read `$ARGUMENTS` for the `ID-NNN`. Find that row in the `_meta/BACKLOG.md` Acti
 
 If a `.claude/goals/<slug>.md` already exists for this id (one draft per id), re-surface the existing draft instead of creating a duplicate, and skip to Step 5. Mirrors SPEC-005 edge 6: the filesystem is the source of truth.
 
-### Step 3: Goal-craft the breakdown
+### Step 3: Project the six-section operating directive
 
-From the item's Title + Target artifact, craft a goal with:
-- **Objective**: the outcome that makes the item done, artifact-shaped and verifiable.
-- **Scope fence**: the files/dirs in play, plus an explicit `Not:` list of adjacent things to leave alone.
-- **Termination-on-blocker**: if blocked, write a named blocker note and stop (no churn).
-- **Verification**: the command(s) that prove it done. If the Target artifact is a SPEC with a `## Verification` section, point the goal at that doc (a pointer-`/goal`); otherwise name the real check.
+From the item's Title + Target artifact, craft the goal as a **six-section operating directive**, not a one-line contract. Each section *projects* from an `AGENTS.md` zone or a section of the active spec; this is the projection contract, and it must match `AGENTS.md`'s "How a goal is composed" table (keep the two in sync, or the `AGENTS.md`↔`assign.md` drift failure-mode fires). The mapping is a composition, not 1:1: two of the six sections come from the active spec, not from `AGENTS.md`.
 
-If the repo is spec-driven and the lane is normal/full, the goal is **spec-first**: its opening move is the lane's first command (`/user:spec`), not building code. This matches the `goal-craft` skill's spec-driven-repo rule.
+1. **Context-to-read** <- `AGENTS.md` zone 1 ("Read in this order"). Carry the ordered read list (AGENTS.md, CLAUDE.md, the active `docs/specs/SPEC-NNN-<slug>.md`, then reference docs) so the loop orients before touching anything.
+2. **Constraints** <- the `AGENTS.md` / `CLAUDE.md` rules plus the item's scope fence: the files/dirs in play and an explicit `Not:` list of adjacent things to leave alone. This is where the old scope fence lives now.
+3. **Operating rules** <- `AGENTS.md` zone 2 ("Task loop"): size the lane, read the spec + its AC, implement the smallest verifiable increment, verify, commit (one logical change, no spec/ticket IDs in the subject).
+4. **Validation loop** <- the active spec's `## Verification`. If the Target artifact is a SPEC, point the goal at that doc's `## Verification` (a pointer-`/goal`); otherwise name the real check command. Do not write "tests pass"; name the command.
+5. **Done-when** <- `AGENTS.md` zone 3 ("Done means") + the active spec's `## After state` (the observable bullets). The goal is done only when its AC are met, the check actually ran, review is recorded + a report written, and the final response says what changed and what was not attempted. Quote the spec's `## After state` observable bullets here verbatim as the done-picture. If the spec lacks an `## After state` section, fall back to `AGENTS.md` "Done means" alone.
+6. **Pause-if** <- `AGENTS.md` zone 4 ("Pause if"): stop and ask a human (with a named blocker note, no churn) on architecture direction, source-of-truth hierarchy, validation removal, risk-classification change, or privacy/security. This is where the old termination-on-blocker lives now.
+
+If the repo is spec-driven and the lane is normal/full, the directive is **spec-first**: its opening move is the lane's first command (`/user:spec`), not building code. This matches the `goal-craft` skill's spec-driven-repo rule.
 
 ### Step 4: Write the draft (the SPEC-005 contract / ADR-0011)
 
@@ -36,7 +39,7 @@ target_spec: <SPEC-NNN, or "(none)">
 status: drafted
 created: <YYYY-MM-DD>
 ---
-<the goal text from Step 3, ready to paste into an activator>
+<the six-section operating directive from Step 3 (Context-to-read / Constraints / Operating rules / Validation loop / Done-when / Pause-if), ready to paste into an activator>
 ```
 
 `.claude/` is gitignored (per-machine drafts). The filesystem (`ls .claude/goals/*.md`) is authoritative; if a `.claude/goals/INDEX.md` cache exists, rebuild its row from the files. Do NOT write `.claude/last-goal.md`: the built-in `/goal` owns that slot (ADR-0011).

From 86e3aa0b4f02f4ffb83b7a592acb7ea27a892bd0 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Thu, 21 May 2026 23:01:04 +0700
Subject: [PATCH 08/56] feat(spec): add After state section to spec template

Insert an observable definition-of-done section between ## Solution and
## Acceptance Criteria in the commands/spec.md template block. Each bullet
is false-now/true-after and checkable by a human or a command. The inline
rule states observable, not narrated, or it is fluff and gets cut, anchored
to PHILOSOPHY's every-file-justifies-itself honesty rule.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 commands/spec.md | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/commands/spec.md b/commands/spec.md
index f8b442d..fb8066e 100644
--- a/commands/spec.md
+++ b/commands/spec.md
@@ -119,6 +119,12 @@ Each task must be atomic: implementable in one session, fits in 50% of a context
 ### Phase 3: Polish
 - [ ] TASK-004: [description] — [acceptance criteria]
 
+## After state
+The definition-of-done picture. Each bullet is false now and true after, and each is checkable by a human or a command. This feeds `## Acceptance Criteria` below and projects into the goal's `Done-when`.
+Rule: observable, not narrated. If a bullet cannot be verified by reading a file, running a command, or seeing a state, it is fluff and gets cut (PHILOSOPHY: every file justifies its existence). Pair the after-state with a "(Today: ...)" current-state note where it sharpens the contrast.
+- [ ] [observable end state]. (Today: [current state].)
+- [ ] [observable end state, checkable by `<command>`].
+
 ## Acceptance Criteria (global)
 - [ ] All tasks pass their individual acceptance criteria
 - [ ] Tests cover happy path + edge cases listed below

From 8a1219178e7c6f6845c7b583a4df4ab09ec8ba44 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Thu, 21 May 2026 23:02:31 +0700
Subject: [PATCH 09/56] docs(spec): record verified lanes and projection
 progress

TASK-004 backfill lane, TASK-005 six-section projection, TASK-006 After-state template: all passed task-verifier.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 docs/specs/SPEC-024-agents-md-operating-layer.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/specs/SPEC-024-agents-md-operating-layer.md b/docs/specs/SPEC-024-agents-md-operating-layer.md
index e26879e..922190f 100644
--- a/docs/specs/SPEC-024-agents-md-operating-layer.md
+++ b/docs/specs/SPEC-024-agents-md-operating-layer.md
@@ -82,9 +82,9 @@ Each task is atomic: implementable in one session, fits in ~50% of a context win
 - [ ] TASK-003: Add `examples/hello-spec/AGENTS.md` as the downstream template (realistic placeholder content, same shape). - AC: file exists; the hello-spec README/template note references it.
 
 ### Phase 2: Lanes + projection
-- [ ] TASK-004: Add a `backfill` brownfield lane to `WORKFLOW.md` (lane table + cycle): review an existing codebase and write the operating-layer docs without changing application behavior; spec-optional, doc-output, no app-code edits. - AC: `grep -q backfill WORKFLOW.md`; lane row + a one-line description present.
-- [ ] TASK-005: Make `commands/assign.md` emit the six-section operating directive, each section projecting an `AGENTS.md` artifact, with `Done-when` including the spec's observable After-state. - AC: `assign.md` documents the six-section projection and the After-state in `Done-when`; the contract-goal-only shape is replaced, not left alongside.
-- [ ] TASK-006: Add `## After state` to the `commands/spec.md` template (between `## Solution` and `## Acceptance Criteria`), with the observable-not-narrated rule inline. - AC: `grep -q '## After state' commands/spec.md`; the rule text is present.
+- [x] TASK-004 (DONE f9c04cf, verified): Add a `backfill` brownfield lane to `WORKFLOW.md` (lane table + cycle): review an existing codebase and write the operating-layer docs without changing application behavior; spec-optional, doc-output, no app-code edits. - AC: `grep -q backfill WORKFLOW.md`; lane row + a one-line description present.
+- [x] TASK-005 (DONE b7bb3ee, verified): Make `commands/assign.md` emit the six-section operating directive, each section projecting an `AGENTS.md` artifact, with `Done-when` including the spec's observable After-state. - AC: `assign.md` documents the six-section projection and the After-state in `Done-when`; the contract-goal-only shape is replaced, not left alongside.
+- [x] TASK-006 (DONE 86e3aa0, verified): Add `## After state` to the `commands/spec.md` template (between `## Solution` and `## Acceptance Criteria`), with the observable-not-narrated rule inline. - AC: `grep -q '## After state' commands/spec.md`; the rule text is present.
 
 ### Phase 3: Sync (the doc-impact map)
 - [ ] TASK-007: Update the `WORKFLOW.md` doc-impact map: add a "new top-level file" trigger row (the current bolded self-maintaining rule covers only a new top-level *dir*, so a top-level *file* like `AGENTS.md` slips through), add the `AGENTS.md` companion-doc row, and a `backfill`-lane reference. - AC: the map names `AGENTS.md` and carries a top-level-file trigger row.

From 9fc3e251ff2d8425518a45b7bada44c661c8007a Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Thu, 21 May 2026 23:04:07 +0700
Subject: [PATCH 10/56] docs(workflow): map AGENTS.md and top-level-file
 impacts

Extend the WORKFLOW.md doc-impact map for the operating-layer docs:

- Add a "new top-level file under the kit root" trigger row, mirroring
  the existing top-level-dir row, so a top-level file like AGENTS.md no
  longer slips through the self-maintaining net.
- Add an AGENTS.md companion-doc row (CLAUDE.md/WORKFLOW.md pointers,
  examples/hello-spec/AGENTS.md, commands/assign.md, tests/test-meta.sh).
- Note that the backfill lane produces operating-layer docs, so a
  backfill run knows which docs it writes.
- Widen the self-maintaining note to cover both dirs and files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 WORKFLOW.md | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/WORKFLOW.md b/WORKFLOW.md
index 8492aed..9daac45 100644
--- a/WORKFLOW.md
+++ b/WORKFLOW.md
@@ -104,9 +104,13 @@ Per change-type, the companion docs that must move with it. This covers the enum
 | a new `docs/decisions/` ADR | README + `docs/architecture.md` cross-refs |
 | a new `docs/specs/SPEC-NNN` | `_meta/BACKLOG.md` status, the spec's `Status:` header |
 | **a new top-level dir under the kit root** | **this doc-impact map (WORKFLOW.md)**, README "Project structure", `docs/architecture.md` |
+| **a new top-level file under the kit root** | **this doc-impact map (WORKFLOW.md)**, README "Project structure", `docs/architecture.md` |
+| `AGENTS.md` (kit root) | `CLAUDE.md` + `WORKFLOW.md` pointers (must not drift), `examples/hello-spec/AGENTS.md` (downstream template), `commands/assign.md` (the six-section projection reads its zones), `tests/test-meta.sh` |
 | any shipped change (normal/full) | `CHANGELOG.md`, `VERSION`, `.claude-plugin/plugin.json` version, `tool.toml` version, `docs/retro/v<ver>.md` |
 
-The bolded row is self-maintaining: adding a new top-level dir must update this map.
+The bolded rows are self-maintaining: adding a new top-level dir or file must update this map.
+
+The `backfill` lane (see the lane table) produces operating-layer docs rather than touching a source path: a backfill run writes `AGENTS.md`, `CLAUDE.md`, and any specs for the reviewed codebase, so its companion docs are those it writes.
 
 **Version surfaces.** The version string is duplicated and must stay in sync: it lives in `VERSION` (the source of truth), `.claude-plugin/plugin.json`, and `tool.toml`. Bumping the version means updating those; `marketplace.json` inherits it via `"source": "."` and needs no bump. The kit does NOT keep component counts (`N hooks`, `N commands`, etc.) in prose: describe the component set qualitatively, never as a hand-maintained number that silently rots.
 

From f51537e470c810e89610f46b219c2edfda46212c Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Thu, 21 May 2026 23:06:15 +0700
Subject: [PATCH 11/56] docs: cross-reference AGENTS.md and backfill lane

README "Project structure" now lists AGENTS.md (the tool-agnostic
operate-contract front door) and WORKFLOW.md, and notes CLAUDE.md is the
Claude-Code layer on top. README Workflow section names the backfill lane.
docs/architecture.md data-flow intro cross-references AGENTS.md as the front
door that CLAUDE.md and WORKFLOW.md point at, and names the backfill lane.

These are structure listings and cross-references; the operate-contract itself
stays in AGENTS.md (not restated here).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 README.md            | 6 +++++-
 docs/architecture.md | 2 +-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index ec08296..6a1615d 100644
--- a/README.md
+++ b/README.md
@@ -156,6 +156,8 @@ Don't run both install paths on the same machine -- hooks would register twice.
 /user:retro          Retrospective (10 min, after shipping)
 ```
 
+Work is sized by risk lane before it starts (tiny / normal / full / bug, plus a `backfill` lane for brownfield: review an existing codebase and write the operating-layer docs without changing app behavior). The lanes, the gate at each phase boundary, and the operate-contract the agent follows live in [`AGENTS.md`](AGENTS.md) and [`WORKFLOW.md`](WORKFLOW.md).
+
 ## Debug mode
 
 Set `DWARVES_KIT_DEBUG=1` to see what every hook is doing:
@@ -216,10 +218,12 @@ These tools complement the kit but are installed separately:
 ```
 dwarves-kit/
   tool.toml                     Kit metadata (name, version, language=bash, deps)
+  AGENTS.md                     Tool-agnostic operate-contract front door (any runtime reads it first)
+  WORKFLOW.md                   The cycle, the risk-tier lanes, the gate at each boundary
   MANUAL.md                     Operator reference for the commands
   RUNBOOK.md                    Hook misbehavior diagnosis + recovery
   README.md / CONTRIBUTING.md / CHANGELOG.md / VERSION / LICENSE
-  CLAUDE.md                     Project template
+  CLAUDE.md                     Project template; the Claude-Code layer on top of AGENTS.md
   install.sh / settings.json    Bash install path
   .claude-plugin/               Plugin install path (plugin.json, marketplace.json)
   .github/workflows/test.yml    CI: macOS + Ubuntu test matrix
diff --git a/docs/architecture.md b/docs/architecture.md
index fba6b98..97aeb79 100644
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -18,7 +18,7 @@ The kit is intentionally flat. Component dirs sit at the top of the repo, not ne
 
 ## Data flow through `docs/specs/SPEC-NNN-<slug>.md`
 
-> The imperative companion to the descriptive map below is [`WORKFLOW.md`](../WORKFLOW.md) (repo root): the same lifecycle phrased as the contract an agent follows, with risk-tier lanes and the gate at each boundary.
+> The front door to all of this is [`AGENTS.md`](../AGENTS.md) (repo root): the tool-agnostic operate-contract that any runtime reads first; `CLAUDE.md` and `WORKFLOW.md` point at it rather than restate it. The imperative companion to the descriptive map below is [`WORKFLOW.md`](../WORKFLOW.md) (repo root): the same lifecycle phrased as the contract an agent follows, with risk-tier lanes (including the `backfill` brownfield lane: review an existing codebase and write the operating-layer docs, doc-output only, no app-behavior change) and the gate at each boundary.
 
 `docs/specs/SPEC-NNN-<slug>.md` is the shared contract for the full lifecycle. It is the single source of truth that crosses command boundaries:
 

From 6ec78805dac145f14eacd5b5f5da9f526d9f502e Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Thu, 21 May 2026 23:09:25 +0700
Subject: [PATCH 12/56] test(meta): assert AGENTS.md zones, projection, and
 install merge

Pin the cycle's structural outputs so a wording flip fails CI:
- kit-root AGENTS.md: the four portable zones, the literal "Pause if",
  and the Claude-Code-only-enforcement statement.
- commands/assign.md: the six-section /goal projection (writer side).
- low-cost guards for hello-spec AGENTS.md and spec.md "## After state".

Add the install merge-with-existing-hooks regression: seed a HOME whose
settings.json already carries a third-party hook, run install.sh through
the jq clean+merge path, and assert the hook survives, the JSON stays
valid, and a dwarves-kit hook is merged in. The test passes against the
current install.sh, so no jq fix was needed (DEC-004: bug not present).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 tests/test-meta.sh | 144 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 144 insertions(+)

diff --git a/tests/test-meta.sh b/tests/test-meta.sh
index f0d62fd..1597a8e 100755
--- a/tests/test-meta.sh
+++ b/tests/test-meta.sh
@@ -149,6 +149,150 @@ INPLACE_BROKEN=$(for f in "$INPLACE_HOME/.claude/dwarves-kit/hooks/"*.sh; do [ -
 assert_eq "in-place install keeps hook scripts resolvable (broken: ${INPLACE_BROKEN:-none})" "" "$INPLACE_BROKEN"
 rm -rf "$INPLACE_HOME"
 
+# ============================================================
+echo ""
+echo "=== AGENTS.md operating layer (SPEC-024) ==="
+# ============================================================
+# Part A: pin the cycle's structural outputs so a wording flip fails CI.
+#   1. kit-root AGENTS.md exists + carries the four portable zones + the literal
+#      "Pause if" + the CC-only-enforcement statement.
+#   2. commands/assign.md carries the six-section /goal projection (the writer
+#      side of the AGENTS.md->assign.md projection).
+#   3. Low-cost regression guards for TASK-003 (hello-spec AGENTS.md) and
+#      TASK-006 (spec.md "## After state").
+
+AGENTS_MD="$KIT_DIR/AGENTS.md"
+TOTAL=$((TOTAL + 1))
+if [ -f "$AGENTS_MD" ]; then
+  echo -e "  ${GREEN}PASS${NC} AGENTS.md exists at kit root (SPEC-024)"
+  PASS=$((PASS + 1))
+else
+  echo -e "  ${RED}FAIL${NC} AGENTS.md missing at kit root"
+  FAIL=$((FAIL + 1))
+fi
+
+# The four portable zones (DEC-005). Pin the heading literals, not prose.
+for ZONE in "## 1. Read in this order" "## 2. Task loop" "## 3. Done means" "## 4. Pause if (ask a human)"; do
+  TOTAL=$((TOTAL + 1))
+  if grep -qF "$ZONE" "$AGENTS_MD" 2>/dev/null; then
+    echo -e "  ${GREEN}PASS${NC} AGENTS.md has zone '$ZONE'"
+    PASS=$((PASS + 1))
+  else
+    echo -e "  ${RED}FAIL${NC} AGENTS.md missing zone '$ZONE'"
+    FAIL=$((FAIL + 1))
+  fi
+done
+
+# The literal "Pause if" (the fourth zone's stable phrase, also the goal section).
+TOTAL=$((TOTAL + 1))
+if grep -qF 'Pause if' "$AGENTS_MD" 2>/dev/null; then
+  echo -e "  ${GREEN}PASS${NC} AGENTS.md carries the literal 'Pause if'"
+  PASS=$((PASS + 1))
+else
+  echo -e "  ${RED}FAIL${NC} AGENTS.md lost the literal 'Pause if'"
+  FAIL=$((FAIL + 1))
+fi
+
+# The CC-only-enforcement statement (PHILOSOPHY honesty rule: never over-claim
+# portable enforcement). A drift to "enforcement is portable" would be a lie.
+TOTAL=$((TOTAL + 1))
+if grep -qF 'Enforcement is Claude-Code-only' "$AGENTS_MD" 2>/dev/null; then
+  echo -e "  ${GREEN}PASS${NC} AGENTS.md states enforcement is Claude-Code-only"
+  PASS=$((PASS + 1))
+else
+  echo -e "  ${RED}FAIL${NC} AGENTS.md lost the CC-only-enforcement statement"
+  FAIL=$((FAIL + 1))
+fi
+
+# commands/assign.md carries the six-section projection (the writer side). A
+# wording flip on any section name breaks the AGENTS.md->assign.md projection.
+ASSIGN_MD="$KIT_DIR/commands/assign.md"
+for SECTION in "Context-to-read" "Constraints" "Operating rules" "Validation loop" "Done-when" "Pause-if"; do
+  TOTAL=$((TOTAL + 1))
+  if grep -qF "$SECTION" "$ASSIGN_MD" 2>/dev/null; then
+    echo -e "  ${GREEN}PASS${NC} assign.md has projection section '$SECTION'"
+    PASS=$((PASS + 1))
+  else
+    echo -e "  ${RED}FAIL${NC} assign.md missing projection section '$SECTION'"
+    FAIL=$((FAIL + 1))
+  fi
+done
+
+# Low-cost regression guards: TASK-003 (hello-spec AGENTS.md w/ "Pause if") and
+# TASK-006 (spec.md template's "## After state"). Pin both so they cannot silently
+# regress.
+DEMO_AGENTS="$KIT_DIR/examples/hello-spec/AGENTS.md"
+TOTAL=$((TOTAL + 1))
+if [ -f "$DEMO_AGENTS" ] && grep -qF 'Pause if' "$DEMO_AGENTS" 2>/dev/null; then
+  echo -e "  ${GREEN}PASS${NC} examples/hello-spec/AGENTS.md exists + carries 'Pause if' (TASK-003)"
+  PASS=$((PASS + 1))
+else
+  echo -e "  ${RED}FAIL${NC} examples/hello-spec/AGENTS.md missing or lost 'Pause if'"
+  FAIL=$((FAIL + 1))
+fi
+
+TOTAL=$((TOTAL + 1))
+if grep -qF '## After state' "$KIT_DIR/commands/spec.md" 2>/dev/null; then
+  echo -e "  ${GREEN}PASS${NC} commands/spec.md template carries '## After state' (TASK-006)"
+  PASS=$((PASS + 1))
+else
+  echo -e "  ${RED}FAIL${NC} commands/spec.md template lost '## After state'"
+  FAIL=$((FAIL + 1))
+fi
+
+# ------------------------------------------------------------
+# Part B: install.sh merge-with-existing-hooks regression (DEC-004).
+# The existing installer test runs into a HOME with NO settings.json, so it never
+# exercises the jq clean+merge path. This test pre-seeds settings.json with a
+# THIRD-PARTY hook (a command that does NOT contain "dwarves-kit") and asserts the
+# merge preserves it, yields valid JSON, and still pulls in a dwarves-kit hook.
+MERGE_HOME=$(mktemp -d)
+mkdir -p "$MERGE_HOME/.claude"
+THIRD_PARTY_CMD="/opt/acme/hooks/audit-log.sh"
+# Build the pre-existing settings via jq so it is always well-formed JSON.
+jq -n --arg cmd "$THIRD_PARTY_CMD" '{
+  hooks: {
+    PreToolUse: [
+      { matcher: "Bash", hooks: [ { type: "command", command: $cmd } ] }
+    ]
+  }
+}' > "$MERGE_HOME/.claude/settings.json"
+
+HOME="$MERGE_HOME" bash "$KIT_DIR/install.sh" >/dev/null 2>&1
+MERGED_SETTINGS="$MERGE_HOME/.claude/settings.json"
+
+# (a) the third-party hook command survived the merge.
+TOTAL=$((TOTAL + 1))
+if grep -qF "$THIRD_PARTY_CMD" "$MERGED_SETTINGS" 2>/dev/null; then
+  echo -e "  ${GREEN}PASS${NC} install merge preserves the third-party hook (DEC-004)"
+  PASS=$((PASS + 1))
+else
+  echo -e "  ${RED}FAIL${NC} install merge DROPPED the third-party hook (merge bug)"
+  FAIL=$((FAIL + 1))
+fi
+
+# (b) the resulting settings.json is valid JSON.
+TOTAL=$((TOTAL + 1))
+if jq '.' "$MERGED_SETTINGS" >/dev/null 2>&1; then
+  echo -e "  ${GREEN}PASS${NC} merged settings.json is valid JSON"
+  PASS=$((PASS + 1))
+else
+  echo -e "  ${RED}FAIL${NC} merged settings.json is not valid JSON (merge corrupted it)"
+  FAIL=$((FAIL + 1))
+fi
+
+# (c) at least one dwarves-kit hook was merged in alongside the third-party one.
+TOTAL=$((TOTAL + 1))
+KIT_HOOK_COUNT=$(jq '[.hooks | to_entries[] | .value[] | .hooks[] | select(.command | tostring | contains("dwarves-kit"))] | length' "$MERGED_SETTINGS" 2>/dev/null || echo 0)
+if [ "${KIT_HOOK_COUNT:-0}" -gt 0 ]; then
+  echo -e "  ${GREEN}PASS${NC} install merge added at least one dwarves-kit hook ($KIT_HOOK_COUNT)"
+  PASS=$((PASS + 1))
+else
+  echo -e "  ${RED}FAIL${NC} install merge added no dwarves-kit hooks"
+  FAIL=$((FAIL + 1))
+fi
+rm -rf "$MERGE_HOME"
+
 # ============================================================
 echo ""
 echo "=== Agent files ==="

From 43fa85d6ef3fbfd90ccab286d61b1f527a01126f Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Thu, 21 May 2026 23:12:51 +0700
Subject: [PATCH 13/56] chore(release): v1.7.0 AGENTS.md operating layer

Bump VERSION, plugin.json, and tool.toml 1.6.0 -> 1.7.0 for the
SPEC-024 feature release. Add the 1.7.0 CHANGELOG entry (AGENTS.md
front door, backfill lane, assign goal projection, spec After-state
section, doc-impact-map rows, downstream AGENTS.md template, and the
added install settings-merge regression coverage framed as a guard,
not a fix). Add a copy-if-absent AGENTS.md tip to install.sh.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 .claude-plugin/plugin.json |  2 +-
 CHANGELOG.md               | 13 +++++++++++--
 VERSION                    |  2 +-
 install.sh                 |  3 +++
 tool.toml                  |  2 +-
 5 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/.claude-plugin/plugin.json b/.claude-plugin/plugin.json
index 9d9046e..78fa52a 100644
--- a/.claude-plugin/plugin.json
+++ b/.claude-plugin/plugin.json
@@ -1,6 +1,6 @@
 {
   "name": "dwarves-kit",
-  "version": "1.6.0",
+  "version": "1.7.0",
   "description": "Spec-driven Claude Code workflow with verification pipeline. Hooks, slash commands, and verification subagents.",
   "author": {
     "name": "Dwarves Foundation",
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 9b9bc88..a4a5733 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -42,9 +42,18 @@ All notable changes to dwarves-kit are documented here.
 - **`context-readiness.sh` + `session-state-save.sh` count fragility**: `find ... | grep -v` and `grep -c ... || echo 0` both mishandled the zero-match case under `set -e`/pipefail. In a source-file-free repo the hook aborted with no output; a spec with 0 done tasks rendered `tasks:0\n0/N` plus an `integer expected` error. Now uses `{ grep -v ... || true; }` and `grep -c ... || true`. Found during SPEC-010 execution (ID-013).
 - **`install.sh` never materialized the hooks `settings.json` references (SPEC-025)**: settings hard-code every hook (and the statusline) at `$HOME/.claude/dwarves-kit/hooks/<script>.sh`, but the bash installer only `chmod`'d the scripts at its own `$KIT_DIR/hooks/`, merged settings, and created `logs/`. That coincidence held only for the documented in-place clone (README Option 2: clone to `~/.claude/dwarves-kit`); from a dev checkout or CI clone elsewhere, `~/.claude/dwarves-kit/` got only `logs/` and all 14 hooks plus the statusline pointed at missing files, so a fresh session opened with `SessionStart ... No such file or directory` and every hook was silently dead. `install.sh` now links each `hooks/*.sh` into `~/.claude/dwarves-kit/hooks/` when the kit lives elsewhere (per-file symlinks, mirroring the existing `commands/` step, so dev edits stay live), detects the in-place layout and skips linking (a naive loop there deletes the real scripts and leaves broken self-referential symlinks, a regression the first cut shipped and the SDD pass then caught), and the uninstall removes only the symlinks it created. `tests/test-meta.sh` gains a 3-assertion guard (referenced scripts exist in `hooks/`; an isolated out-of-place install resolves every path; an in-place install keeps the scripts resolvable) so neither failure mode can regress past CI. Meta suite 213 -> 216.
 
-## [1.6.0] - 2026-05-20
+## [1.7.0] - 2026-05-21
 
-Orchestration layer (SPEC-003) plus upstream-audit absorption and lineage hygiene (SPEC-002).
+AGENTS.md operating-layer front door plus a brownfield backfill lane and a goal-projection enrichment (SPEC-024).
+
+### Added
+
+- **Kit-root `AGENTS.md`: a tool-agnostic operate-contract front door (SPEC-024).** A portable, four-zone operating layer (the same shape any agent runner can read) that fronts the kit's behavioral contract; `CLAUDE.md` and `WORKFLOW.md` now point at it so the cycle and house rules have one neutral entry point instead of being CLAUDE-only. Ships a downstream template at `examples/hello-spec/AGENTS.md` so a consuming project starts with the same front door. Source: SPEC-024.
+- **`backfill` brownfield lane (SPEC-024)**: an intake lane for adopting the kit into an existing repo (bring the operate-contract, specs convention, and guardrails onto code that predates them), recorded in `WORKFLOW.md` alongside the existing intake lanes.
+- **Six-section goal projection in `/user:assign`**: the goal draft `/user:assign` writes now projects the backlog item across six fixed sections, so a handed-off goal carries fuller context into its lane.
+- **`## After state` section in the `/user:spec` template**: specs now scaffold the post-implementation end state, complementing the existing Verification / Open-questions stop-criteria.
+- **doc-impact-map rows for `AGENTS.md` and the top-level files**: the WORKFLOW doc-impact map (reviewed by `/user:ship` + `/user:retro`) now lists `AGENTS.md` and the other root-level docs so a change that touches them is flagged for a doc update.
+- **Regression coverage for the install settings-merge against a pre-existing third-party hook (test/guard, not a fix)**: `tests/test-meta.sh` gains a block that asserts `install.sh`'s `settings.json` merge preserves an existing third-party hook entry. This is added coverage, not a bug fix: the current installer already preserves third-party hooks (the jq merge unions correctly), so no installer change was needed. The test pins that behavior so it cannot silently regress. Meta suite to 234 asserts.
 
 ### Added
 
diff --git a/VERSION b/VERSION
index dc1e644..bd8bf88 100644
--- a/VERSION
+++ b/VERSION
@@ -1 +1 @@
-1.6.0
+1.7.0
diff --git a/install.sh b/install.sh
index 9f3f146..134d396 100755
--- a/install.sh
+++ b/install.sh
@@ -299,4 +299,7 @@ echo ""
 echo "Tip: Copy CLAUDE.md template to your project root:"
 echo "  cp $KIT_DIR/CLAUDE.md ./CLAUDE.md"
 echo ""
+echo "Tip: If your project does not already have an AGENTS.md, copy one to its root:"
+echo "  cp $KIT_DIR/AGENTS.md ./AGENTS.md"
+echo ""
 echo "To uninstall: bash $KIT_DIR/install.sh --uninstall"
diff --git a/tool.toml b/tool.toml
index 5decea8..5ae93ee 100644
--- a/tool.toml
+++ b/tool.toml
@@ -4,7 +4,7 @@ language    = "bash"
 status      = "stable"
 entry       = "install.sh"
 created     = 2026-03-29
-version     = "1.6.0"
+version     = "1.7.0"
 
 # Consumer repos: any Dwarves project that adopts the spec-driven flow.
 # Not bound to a specific repo; the kit is per-machine, not per-project.

From 2f2b0a3aa6a8b96e8c5bf1ccd201370f0371190e Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Thu, 21 May 2026 23:14:37 +0700
Subject: [PATCH 14/56] docs(spec): record verified sync-phase progress

TASK-007..010 (doc-impact map, README/architecture cross-refs, test asserts + install-merge guard, v1.7.0 release) passed task-verifier.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 docs/specs/SPEC-024-agents-md-operating-layer.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/specs/SPEC-024-agents-md-operating-layer.md b/docs/specs/SPEC-024-agents-md-operating-layer.md
index 922190f..5fb1459 100644
--- a/docs/specs/SPEC-024-agents-md-operating-layer.md
+++ b/docs/specs/SPEC-024-agents-md-operating-layer.md
@@ -87,10 +87,10 @@ Each task is atomic: implementable in one session, fits in ~50% of a context win
 - [x] TASK-006 (DONE 86e3aa0, verified): Add `## After state` to the `commands/spec.md` template (between `## Solution` and `## Acceptance Criteria`), with the observable-not-narrated rule inline. - AC: `grep -q '## After state' commands/spec.md`; the rule text is present.
 
 ### Phase 3: Sync (the doc-impact map)
-- [ ] TASK-007: Update the `WORKFLOW.md` doc-impact map: add a "new top-level file" trigger row (the current bolded self-maintaining rule covers only a new top-level *dir*, so a top-level *file* like `AGENTS.md` slips through), add the `AGENTS.md` companion-doc row, and a `backfill`-lane reference. - AC: the map names `AGENTS.md` and carries a top-level-file trigger row.
-- [ ] TASK-008: Update `README.md` "Project structure" and `docs/architecture.md` cross-refs to include `AGENTS.md` and the `backfill` lane. - AC: both name `AGENTS.md`.
-- [ ] TASK-009: Update `tests/test-meta.sh` to assert `AGENTS.md` exists and carries the four portable zones, and that `assign.md` carries the six-section projection. Add a regression test that runs `install.sh` into a throwaway HOME whose `settings.json` already contains a third-party hook (the dogfood case), and asserts the user's hook survives the merge and the result is valid JSON. If that test fails on the current `install.sh`, apply the minimal jq fix to the merge/clean within this task. - AC: `bash tests/test-meta.sh` exercises the four-zone + projection checks AND the merge-with-existing-hooks case, and passes.
-- [ ] TASK-010: Version bump (`VERSION`, `.claude-plugin/plugin.json`, `tool.toml`), `CHANGELOG.md` entry (include the install.sh jq fix as a fix line), and update the `install.sh` `AGENTS.md` tip. - AC: versions in sync; CHANGELOG records the feature + the install fix.
+- [x] TASK-007 (DONE 9fc3e25, verified): Update the `WORKFLOW.md` doc-impact map: add a "new top-level file" trigger row (the current bolded self-maintaining rule covers only a new top-level *dir*, so a top-level *file* like `AGENTS.md` slips through), add the `AGENTS.md` companion-doc row, and a `backfill`-lane reference. - AC: the map names `AGENTS.md` and carries a top-level-file trigger row.
+- [x] TASK-008 (DONE f51537e, verified): Update `README.md` "Project structure" and `docs/architecture.md` cross-refs to include `AGENTS.md` and the `backfill` lane. - AC: both name `AGENTS.md`.
+- [x] TASK-009 (DONE 6ec7880, verified; merge test passes on current install.sh -> no jq fix needed, DEC-004 resolved as no-bug): Update `tests/test-meta.sh` to assert `AGENTS.md` exists and carries the four portable zones, and that `assign.md` carries the six-section projection. Add a regression test that runs `install.sh` into a throwaway HOME whose `settings.json` already contains a third-party hook (the dogfood case), and asserts the user's hook survives the merge and the result is valid JSON. If that test fails on the current `install.sh`, apply the minimal jq fix to the merge/clean within this task. - AC: `bash tests/test-meta.sh` exercises the four-zone + projection checks AND the merge-with-existing-hooks case, and passes.
+- [x] TASK-010 (DONE 43fa85d, verified): Version bump (`VERSION`, `.claude-plugin/plugin.json`, `tool.toml`), `CHANGELOG.md` entry (include the install.sh jq fix as a fix line), and update the `install.sh` `AGENTS.md` tip. - AC: versions in sync; CHANGELOG records the feature + the install fix.
 
 ## Acceptance Criteria (global)
 - [ ] All tasks pass their individual acceptance criteria.

From 3ba70226675257790e84b315b5d7c690a980d037 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Thu, 21 May 2026 23:29:45 +0700
Subject: [PATCH 15/56] fix(workflow): drop read-order restatement; add drift
 guards

Addresses /review-team findings on the SPEC-024 build:
- WORKFLOW.md Required-reading no longer restates the AGENTS.md-owned ordered list (the spec's primary drift failure mode, realized in output); it now points only.
- tests/test-meta.sh: negative anti-drift assertions (WORKFLOW.md + CLAUDE.md carry no numbered read-order restatement); pin all four hello-spec AGENTS.md zones; pin assign.md Done-when references the spec After-state.
- backfill lane now has a routing line in the spine.
- SPEC-024 bookkeeping: TASK-003 + global AC + After-state boxes checked; TASK-010 wording corrected to no-fix reality.
- BACKLOG: ID-018 (install tip cp -n) and ID-019 (demo SPEC After-state) logged as deferred LOW findings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 WORKFLOW.md                                   | 13 ++++---
 _meta/BACKLOG.md                              |  5 ++-
 .../SPEC-024-agents-md-operating-layer.md     | 24 ++++++-------
 tests/test-meta.sh                            | 34 +++++++++++++++++++
 4 files changed, 56 insertions(+), 20 deletions(-)

diff --git a/WORKFLOW.md b/WORKFLOW.md
index 9daac45..45ff4fe 100644
--- a/WORKFLOW.md
+++ b/WORKFLOW.md
@@ -6,13 +6,10 @@
 > safety-gate hook, the push-to-main blocker, the anti-rationalization Stop
 > hook, and the verification pipeline.
 
-## Required reading (in order)
-`AGENTS.md` is the front door and owns the read-order. Read it first, then this
-list; do not restate the order here. See `AGENTS.md` zone 1 ("Read in this order").
-1. AGENTS.md            - the operate-contract: read-order, task loop, done-definition, "Pause if"
-2. CLAUDE.md            - the Claude-Code layer: stack, structure, rules, hooks, commands, plugin
-3. docs/specs/SPEC-NNN-<slug>.md - the active spec; the shared contract for the cycle
-4. docs/architecture.md - how the pieces fit (reference; not required per task)
+## Required reading
+`AGENTS.md` is the front door and owns the read-order; it is the single source.
+Read `AGENTS.md` zone 1 ("Read in this order") for the full ordered list, then
+return here. This file does not restate the list, so the two cannot drift.
 
 ## Size the work first (risk-tiered intake)
 Pick a lane before you start. Smaller work skips ceremony.
@@ -68,6 +65,8 @@ session start
                       (opt-in: /user:devs-team + /user:visual-team before spec; /user:test-plan before execute;
                        /user:ui-design for downstream UI work, after /user:design)
                       -> /user:review -> /user:docs -> /user:ship -> /user:retro
+       backfill    -> review the codebase, write AGENTS.md / CLAUDE.md / specs, then
+                      /user:review (optional). No /user:execute; no app-code edits.
   -> on ship              /user:ship reviews the completeness log; ID-NNN drops off the
                           queue (CHANGELOG is the canonical shipped record).
 ```
diff --git a/_meta/BACKLOG.md b/_meta/BACKLOG.md
index 2696e1c..4577968 100644
--- a/_meta/BACKLOG.md
+++ b/_meta/BACKLOG.md
@@ -34,13 +34,16 @@ Lane = the WORKFLOW.md risk tier (`tiny` / `normal` / `full`).
 | ID-002 | Absorb skills/hooks developed in ops-toolkit into the kit (internal lane) | item e | SPEC-007 (PARKED; see its Parked note) | full | parked |
 | ID-003 | Deep orchestration scan of the copied repos vs our WORKFLOW (report + absorb plan) | item 2 | `docs/research/2026-05-20-orchestration-deep-scan.md` | normal | shipped (report delivered; recs folded into SPEC-006) |
 | ID-012 | Goal-loop fidelity (one theme, two parts). P1: stop-criteria in the `/spec` template (pin `## Verification` + `## Open questions` so specs are pointer-`/goal`-ready) -- build now, tiny lane. P2: QA gate around the autonomous loop (verify-into-loops + SPEC-006 completeness clauses + Reviewer 5) -- held until the pointer-`/goal` pattern has real runs | maintainer Q3 2026-05-20 + goal-readiness convergence 2026-05-21 | SPEC-012 | normal (P1) / full (P2) | P1 shipped (see CHANGELOG); P2 held |
-| ID-015 | AGENTS.md operating layer: tool-agnostic entrypoint + brownfield backfill lane + six-section goal projection + observable After-state spec section (ships ADR-0013) | ADR-0013 (hoangnb24/harness-experimental study, 2026-05-21) | SPEC-024 | full | speccing |
+| ID-015 | AGENTS.md operating layer: tool-agnostic entrypoint + brownfield backfill lane + six-section goal projection + observable After-state spec section (ships ADR-0013) | ADR-0013 (hoangnb24/harness-experimental study, 2026-05-21) | SPEC-024 | full | executing (built + reviewed; ship next) |
 | ID-016 | Promote the "new SPEC -> BACKLOG status row" doc-impact check to a guard (a hook or a kit-health line); it was missed across multiple cycles | SPEC-025 retro 2026-05-21 (recurrence clears the PHILOSOPHY section 5 bar) | TBD | normal | queued |
 | ID-017 | Reconcile retro-file naming across the `retro` skill body + description, CLAUDE.md, and WORKFLOW.md to the repo's actual `RETRO-YYYY-MM-DD-<slug>.md` convention | SPEC-025 retro 2026-05-21 | TBD | tiny | queued |
+| ID-018 | install.sh prints blind `cp` tips for AGENTS.md and CLAUDE.md; harden both to copy-if-absent at the command level (`cp -n` or a guard) so the printed command matches its "if absent" prose, or record why prose-only suffices | SPEC-024 review 2026-05-21 (security lens) | TBD | tiny | queued |
+| ID-019 | Demo `examples/hello-spec/docs/specs/SPEC-001-version-flag.md` lacks a `## After state` section; add one so the demo is a complete exemplar of the post-SPEC-024 spec template | SPEC-024 review 2026-05-21 (test-coverage lens) | TBD | tiny | queued |
 Dependency notes:
 - ID-012 P1 (spec stop-criteria) shipped (SPEC-012, normal lane; see CHANGELOG); P2 (loop QA gate) is held until the pointer-`/goal` pattern has real runs; it will build on the now-shipped SPEC-006 completeness clauses + the existing verification pipeline.
 - ID-002 (internal absorption lane, SPEC-007) is parked; independent of the orchestration chain.
 - ID-016 / ID-017 came out of the SPEC-025 retro; both are kit-internal hygiene, independent of the SPEC-024 chain.
+- ID-018 / ID-019 came out of the SPEC-024 /review-team pass (deferred LOW findings); both are tiny-lane polish, independent of the SPEC-024 ship.
 
 ## Schema
 
diff --git a/docs/specs/SPEC-024-agents-md-operating-layer.md b/docs/specs/SPEC-024-agents-md-operating-layer.md
index 5fb1459..b15cf44 100644
--- a/docs/specs/SPEC-024-agents-md-operating-layer.md
+++ b/docs/specs/SPEC-024-agents-md-operating-layer.md
@@ -79,7 +79,7 @@ Each task is atomic: implementable in one session, fits in ~50% of a context win
 ### Phase 1: Operating layer
 - [x] TASK-001 (DONE 7c5015c, verified): Write `AGENTS.md` at kit root: ordered read list, task loop, done-definition (which MUST include "review recorded + report written, and the final response says what changed and what was not attempted"), and an explicit "Pause if / ask a human" list (architecture direction, source-of-truth hierarchy, validation removal, risk-classification change, privacy/security). State plainly that enforcement is Claude-Code-only; under other runtimes `AGENTS.md` is advisory. - AC: file exists; `grep -q 'Pause if' AGENTS.md`; the four portable zones (read list, task loop, done-definition, Pause-if) are present and named so `assign.md` can project them into the six-section goal.
 - [x] TASK-002 (DONE c343d92, verified): Make the CC-layer docs point at `AGENTS.md` for the operate-contract (replace, don't duplicate). Two files: (a) kit-root `CLAUDE.md` (NOT `examples/hello-spec/CLAUDE.md`, which TASK-003 handles) gets/keeps a one-line pointer to `AGENTS.md`; (b) `WORKFLOW.md` is the file that actually carries the operate-contract prose today (`## Required reading` ordered list + `## Completion contract` done-definition), so reconcile it to point at `AGENTS.md` as the read-order/done source rather than restating it. Pick one direction (`AGENTS.md` canonical, `WORKFLOW.md` points) and apply it to both. - AC: neither `CLAUDE.md` nor `WORKFLOW.md` restates the ordered read-order + done-definition that now live in `AGENTS.md`; both reference `AGENTS.md`; `WORKFLOW.md`'s required-reading list names `AGENTS.md` first.
-- [ ] TASK-003: Add `examples/hello-spec/AGENTS.md` as the downstream template (realistic placeholder content, same shape). - AC: file exists; the hello-spec README/template note references it.
+- [x] TASK-003 (DONE 5e4bfc1, verified): Add `examples/hello-spec/AGENTS.md` as the downstream template (realistic placeholder content, same shape). - AC: file exists; the hello-spec README/template note references it.
 
 ### Phase 2: Lanes + projection
 - [x] TASK-004 (DONE f9c04cf, verified): Add a `backfill` brownfield lane to `WORKFLOW.md` (lane table + cycle): review an existing codebase and write the operating-layer docs without changing application behavior; spec-optional, doc-output, no app-code edits. - AC: `grep -q backfill WORKFLOW.md`; lane row + a one-line description present.
@@ -90,24 +90,24 @@ Each task is atomic: implementable in one session, fits in ~50% of a context win
 - [x] TASK-007 (DONE 9fc3e25, verified): Update the `WORKFLOW.md` doc-impact map: add a "new top-level file" trigger row (the current bolded self-maintaining rule covers only a new top-level *dir*, so a top-level *file* like `AGENTS.md` slips through), add the `AGENTS.md` companion-doc row, and a `backfill`-lane reference. - AC: the map names `AGENTS.md` and carries a top-level-file trigger row.
 - [x] TASK-008 (DONE f51537e, verified): Update `README.md` "Project structure" and `docs/architecture.md` cross-refs to include `AGENTS.md` and the `backfill` lane. - AC: both name `AGENTS.md`.
 - [x] TASK-009 (DONE 6ec7880, verified; merge test passes on current install.sh -> no jq fix needed, DEC-004 resolved as no-bug): Update `tests/test-meta.sh` to assert `AGENTS.md` exists and carries the four portable zones, and that `assign.md` carries the six-section projection. Add a regression test that runs `install.sh` into a throwaway HOME whose `settings.json` already contains a third-party hook (the dogfood case), and asserts the user's hook survives the merge and the result is valid JSON. If that test fails on the current `install.sh`, apply the minimal jq fix to the merge/clean within this task. - AC: `bash tests/test-meta.sh` exercises the four-zone + projection checks AND the merge-with-existing-hooks case, and passes.
-- [x] TASK-010 (DONE 43fa85d, verified): Version bump (`VERSION`, `.claude-plugin/plugin.json`, `tool.toml`), `CHANGELOG.md` entry (include the install.sh jq fix as a fix line), and update the `install.sh` `AGENTS.md` tip. - AC: versions in sync; CHANGELOG records the feature + the install fix.
+- [x] TASK-010 (DONE 43fa85d, verified): Version bump (`VERSION`, `.claude-plugin/plugin.json`, `tool.toml`), `CHANGELOG.md` entry, and update the `install.sh` `AGENTS.md` tip. - AC: versions in sync; CHANGELOG records the feature and the install-merge regression test as coverage (not a fix; DEC-004 resolved as no-bug, see TASK-009).
 
 ## Acceptance Criteria (global)
-- [ ] All tasks pass their individual acceptance criteria.
-- [ ] `AGENTS.md` is the front door (root + examples) carrying the four portable zones; `CLAUDE.md` and `WORKFLOW.md` point at it (no restating); `backfill` lane present; `assign.md` projects the six sections; `spec.md` template has `## After state`.
-- [ ] No regressions: `bash tests/test-meta.sh && bash tests/test-hooks.sh` both exit 0.
+- [x] All tasks pass their individual acceptance criteria.
+- [x] `AGENTS.md` is the front door (root + examples) carrying the four portable zones; `CLAUDE.md` and `WORKFLOW.md` point at it (no restating); `backfill` lane present; `assign.md` projects the six sections; `spec.md` template has `## After state`.
+- [x] No regressions: `bash tests/test-meta.sh && bash tests/test-hooks.sh` both exit 0.
 
 ## Verification
 `bash tests/test-meta.sh && bash tests/test-hooks.sh` exit 0 AND `grep -q 'Pause if' AGENTS.md` AND `grep -q backfill WORKFLOW.md` AND `grep -q '## After state' commands/spec.md`
 
 ## After state
-Observable; each bullet is false today and a real check once shipped.
-- [ ] `AGENTS.md` exists at the kit root and `grep -q 'Pause if' AGENTS.md` passes. (Today: no `AGENTS.md` anywhere in the kit.)
-- [ ] `CLAUDE.md` points at `AGENTS.md` and no longer restates the read-order / done-definition / pause list.
-- [ ] `WORKFLOW.md` lists a `backfill` lane. (Today: only `tiny` / `normal` / `full` / `bug`.)
-- [ ] `commands/assign.md` produces a six-section directive (Context-to-read / Constraints / Operating rules / Validation loop / Done-when / Pause-if), not a one-line contract goal.
-- [ ] `commands/spec.md` template contains a `## After state` section, and a newly generated spec carries observable after-state bullets.
-- [ ] `examples/hello-spec/AGENTS.md` exists so a downstream repo inherits the front door.
+Observable; each bullet was false before this cycle and is a real check now (all verified at review).
+- [x] `AGENTS.md` exists at the kit root and `grep -q 'Pause if' AGENTS.md` passes. (Was: no `AGENTS.md` anywhere in the kit.)
+- [x] `CLAUDE.md` points at `AGENTS.md` and no longer restates the read-order / done-definition / pause list. (`WORKFLOW.md` likewise points; the no-restate invariant is now pinned by test-meta.)
+- [x] `WORKFLOW.md` lists a `backfill` lane. (Was: only `tiny` / `normal` / `full` / `bug`.)
+- [x] `commands/assign.md` produces a six-section directive (Context-to-read / Constraints / Operating rules / Validation loop / Done-when / Pause-if), not a one-line contract goal.
+- [x] `commands/spec.md` template contains a `## After state` section, and a newly generated spec carries observable after-state bullets.
+- [x] `examples/hello-spec/AGENTS.md` exists so a downstream repo inherits the front door.
 
 ## Edge Cases
 1. A non-CC runtime (Codex/Gemini) reads `AGENTS.md` but gets no hook enforcement: `AGENTS.md` must state enforcement is CC-only so no operator assumes the guardrails are portable.
diff --git a/tests/test-meta.sh b/tests/test-meta.sh
index 1597a8e..d9d0be9 100755
--- a/tests/test-meta.sh
+++ b/tests/test-meta.sh
@@ -231,6 +231,19 @@ else
   FAIL=$((FAIL + 1))
 fi
 
+# Review issue 2: the downstream template (the file real projects copy) must pin
+# all four zone headings too, not just "Pause if" - same teeth as the kit root.
+for ZONE in "## 1. Read in this order" "## 2. Task loop" "## 3. Done means" "## 4. Pause if (ask a human)"; do
+  TOTAL=$((TOTAL + 1))
+  if grep -qF "$ZONE" "$DEMO_AGENTS" 2>/dev/null; then
+    echo -e "  ${GREEN}PASS${NC} examples/hello-spec/AGENTS.md has zone '$ZONE'"
+    PASS=$((PASS + 1))
+  else
+    echo -e "  ${RED}FAIL${NC} examples/hello-spec/AGENTS.md missing zone '$ZONE'"
+    FAIL=$((FAIL + 1))
+  fi
+done
+
 TOTAL=$((TOTAL + 1))
 if grep -qF '## After state' "$KIT_DIR/commands/spec.md" 2>/dev/null; then
   echo -e "  ${GREEN}PASS${NC} commands/spec.md template carries '## After state' (TASK-006)"
@@ -240,6 +253,27 @@ else
   FAIL=$((FAIL + 1))
 fi
 
+# Review issue 6: assign.md Done-when must reference the spec's "## After state"
+# (the projection source), not merely carry the "Done-when" label.
+TOTAL=$((TOTAL + 1))
+if grep -qF '## After state' "$ASSIGN_MD" 2>/dev/null; then
+  echo -e "  ${GREEN}PASS${NC} assign.md Done-when references the spec's '## After state'"
+  PASS=$((PASS + 1))
+else
+  echo -e "  ${RED}FAIL${NC} assign.md Done-when lost the '## After state' projection source"
+  FAIL=$((FAIL + 1))
+fi
+
+# Review issue 1 (anti-drift): the spec's primary failure mode is a CC-layer doc
+# RESTATING the ordered read-list that AGENTS.md owns (zone 1 is the single source).
+# WORKFLOW.md and CLAUDE.md must point, not carry a numbered "1. AGENTS.md /
+# 2. CLAUDE.md ..." restatement. A reappearance is drift; fail loudly. Scoped to
+# these two CC-layer docs; AGENTS.md itself legitimately carries the list.
+for DOC in WORKFLOW.md CLAUDE.md; do
+  RESTATE=$(grep -cE '^[0-9]+\.[[:space:]]+(AGENTS|CLAUDE)\.md' "$KIT_DIR/$DOC" 2>/dev/null || true)
+  assert_eq "$DOC does not restate the AGENTS.md read-order list (no drift)" "0" "$RESTATE"
+done
+
 # ------------------------------------------------------------
 # Part B: install.sh merge-with-existing-hooks regression (DEC-004).
 # The existing installer test runs into a HOME with NO settings.json, so it never

From 0aa5b784079f71e46933325ce10cefa9d728af6a Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Thu, 21 May 2026 23:31:56 +0700
Subject: [PATCH 16/56] docs: sync MANUAL reads/writes and CLAUDE downstream
 note for AGENTS.md

doc-impact-map companions the build missed: MANUAL /user:assign now lists the AGENTS.md/spec projection source + the six-section output; CLAUDE downstream-template note names examples/hello-spec/AGENTS.md as the front door. doc-verifier PASS (9 claims).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 CLAUDE.md | 2 +-
 MANUAL.md | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/CLAUDE.md b/CLAUDE.md
index 9640766..05306f7 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -67,4 +67,4 @@ Hooks fire on Claude Code events automatically; do not call them manually. Debug
 
 ## Template for downstream projects
 
-If you are scaffolding a new project that will use dwarves-kit, do not copy THIS file. Use `examples/hello-spec/CLAUDE.md` as the template; it shows the full shape (Project, Tech Stack, Commands, Repository Structure, Code Quality Rules, Workflow, Spec Location) with realistic placeholder content.
+If you are scaffolding a new project that will use dwarves-kit, do not copy THIS file. Use `examples/hello-spec/CLAUDE.md` as the template; it shows the full shape (Project, Tech Stack, Commands, Repository Structure, Code Quality Rules, Workflow, Spec Location) with realistic placeholder content. The downstream front door is `examples/hello-spec/AGENTS.md` (the tool-agnostic operate-contract); copy it alongside, with `CLAUDE.md` as the Claude-Code layer on top.
diff --git a/MANUAL.md b/MANUAL.md
index b53ba12..3d14202 100644
--- a/MANUAL.md
+++ b/MANUAL.md
@@ -60,8 +60,8 @@ Opt-in lane between `/user:spec-validate` and `/user:execute`. Reads the active
 ### `/user:assign`
 
 **Phase:** orchestrate (backlog item -> goal draft -> lane)
-**Reads:** `_meta/BACKLOG.md` Active queue (by `ID-NNN`), the item's Lane column
-**Writes:** `.claude/goals/<slug>.md` (the SPEC-005 draft contract; never `.claude/last-goal.md`)
+**Reads:** `_meta/BACKLOG.md` Active queue (by `ID-NNN`), the item's Lane column, `AGENTS.md` zones (the projection source for the six-section goal) + the active spec's `## Verification` / `## After state`
+**Writes:** `.claude/goals/<slug>.md` (the SPEC-005 draft contract; never `.claude/last-goal.md`), a six-section operating directive (Context-to-read / Constraints / Operating rules / Validation loop / Done-when / Pause-if)
 **When to invoke:** you picked an item from "what's left?" and want it scoped into a goal and routed into the right lane
 **Common gotcha:** it is a mutator-dispatcher: it sets up the goal and hands off, it does NOT execute. It detects the goal-loop activator (built-in `/goal`, `ralph-loop`, or `goal-craft`) and degrades to a plain draft file if none is installed. Idempotent per id. Source: SPEC-006 + ADR-0011.
 

From ca6e9417fa5d0293f49a8e2be03999849159de5d Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Thu, 21 May 2026 23:35:26 +0700
Subject: [PATCH 17/56] chore(release): mark operating-layer spec shipped

Flip SPEC-024 Status VALIDATED -> SHIPPED; drop ID-015 from the BACKLOG active queue (v1.7.0; CHANGELOG is the canonical shipped record).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 _meta/BACKLOG.md                                 | 2 +-
 docs/specs/SPEC-024-agents-md-operating-layer.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/_meta/BACKLOG.md b/_meta/BACKLOG.md
index 4577968..a324876 100644
--- a/_meta/BACKLOG.md
+++ b/_meta/BACKLOG.md
@@ -34,7 +34,7 @@ Lane = the WORKFLOW.md risk tier (`tiny` / `normal` / `full`).
 | ID-002 | Absorb skills/hooks developed in ops-toolkit into the kit (internal lane) | item e | SPEC-007 (PARKED; see its Parked note) | full | parked |
 | ID-003 | Deep orchestration scan of the copied repos vs our WORKFLOW (report + absorb plan) | item 2 | `docs/research/2026-05-20-orchestration-deep-scan.md` | normal | shipped (report delivered; recs folded into SPEC-006) |
 | ID-012 | Goal-loop fidelity (one theme, two parts). P1: stop-criteria in the `/spec` template (pin `## Verification` + `## Open questions` so specs are pointer-`/goal`-ready) -- build now, tiny lane. P2: QA gate around the autonomous loop (verify-into-loops + SPEC-006 completeness clauses + Reviewer 5) -- held until the pointer-`/goal` pattern has real runs | maintainer Q3 2026-05-20 + goal-readiness convergence 2026-05-21 | SPEC-012 | normal (P1) / full (P2) | P1 shipped (see CHANGELOG); P2 held |
-| ID-015 | AGENTS.md operating layer: tool-agnostic entrypoint + brownfield backfill lane + six-section goal projection + observable After-state spec section (ships ADR-0013) | ADR-0013 (hoangnb24/harness-experimental study, 2026-05-21) | SPEC-024 | full | executing (built + reviewed; ship next) |
+| ID-015 | AGENTS.md operating layer: tool-agnostic entrypoint + brownfield backfill lane + six-section goal projection + observable After-state spec section (ships ADR-0013) | ADR-0013 (hoangnb24/harness-experimental study, 2026-05-21) | SPEC-024 | full | shipped (v1.7.0; see CHANGELOG) |
 | ID-016 | Promote the "new SPEC -> BACKLOG status row" doc-impact check to a guard (a hook or a kit-health line); it was missed across multiple cycles | SPEC-025 retro 2026-05-21 (recurrence clears the PHILOSOPHY section 5 bar) | TBD | normal | queued |
 | ID-017 | Reconcile retro-file naming across the `retro` skill body + description, CLAUDE.md, and WORKFLOW.md to the repo's actual `RETRO-YYYY-MM-DD-<slug>.md` convention | SPEC-025 retro 2026-05-21 | TBD | tiny | queued |
 | ID-018 | install.sh prints blind `cp` tips for AGENTS.md and CLAUDE.md; harden both to copy-if-absent at the command level (`cp -n` or a guard) so the printed command matches its "if absent" prose, or record why prose-only suffices | SPEC-024 review 2026-05-21 (security lens) | TBD | tiny | queued |
diff --git a/docs/specs/SPEC-024-agents-md-operating-layer.md b/docs/specs/SPEC-024-agents-md-operating-layer.md
index b15cf44..6ec16e9 100644
--- a/docs/specs/SPEC-024-agents-md-operating-layer.md
+++ b/docs/specs/SPEC-024-agents-md-operating-layer.md
@@ -1,6 +1,6 @@
 # Spec: AGENTS.md operating layer + brownfield backfill lane
 Generated: 2026-05-21
-Status: VALIDATED
+Status: SHIPPED
 
 > Implements ADR-0013 (accepted 2026-05-21) and its After-state addendum. Backlog: ID-015 (full lane).
 > Validated 2026-05-21 via /user:spec-validate (5 lenses); revisions recorded in the Decision Log (DEC-004 corrected, DEC-005..DEC-008 added).

From 7acd2b25858002bcc55cef7fcbeff3f8ead3ca07 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Thu, 21 May 2026 23:37:21 +0700
Subject: [PATCH 18/56] docs(retro): agents-md operating-layer cycle retro

Captures the spec-validate phantom-fix catch, the execute/integration-checker presence-not-absence blind spot that missed the WORKFLOW restatement, and recurring worker shell/hook friction.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 ...RO-2026-05-21-agents-md-operating-layer.md | 36 +++++++++++++++++++
 1 file changed, 36 insertions(+)
 create mode 100644 docs/retro/RETRO-2026-05-21-agents-md-operating-layer.md

diff --git a/docs/retro/RETRO-2026-05-21-agents-md-operating-layer.md b/docs/retro/RETRO-2026-05-21-agents-md-operating-layer.md
new file mode 100644
index 0000000..104aa78
--- /dev/null
+++ b/docs/retro/RETRO-2026-05-21-agents-md-operating-layer.md
@@ -0,0 +1,36 @@
+# Retro: AGENTS.md operating-layer cycle
+Date: 2026-05-21
+Sprint: single-session cycle (spec-validate -> execute -> review-team -> docs -> ship)
+Spec: SPEC-024 (v1.7.0, PR #7)
+
+## Metrics
+- Tasks planned: 10, completed: 10, deferred: 0 (2 LOW review findings -> backlog ID-018/019)
+- Commits: 16 (build 15 + ship-state 1), atomic, all passed the commit-format hook
+- Files changed: 18 (+453 / -57)
+- Tests: test-meta 216 -> 241 (+25 asserts), test-hooks 92; both exit 0 throughout
+- Verification pipeline: 10/10 task-verifier PASS, 0 retries, 0 escalations; integration-checker PASS (10/10 wired, 6/6 chains)
+- Completeness log: clean
+- Key commits: 7c5015c (AGENTS.md), b7bb3ee (six-section projection), 6ec7880 (test asserts + install-merge guard), 3ba7022 (review fixes)
+
+## What worked
+- **Adversarial spec-validate paid for itself before any code.** It caught two real defects in the draft: (1) the phantom `install.sh` jq "fix" claimed done with no diff (DEC-004), and (2) an unsatisfiable TASK-001 AC (four AGENTS.md zones described, six demanded, "1:1 mapping" false by the spec's own diagram; DEC-005). Both would have produced a broken or dishonest build. Fixing the spec was cheap; fixing it post-execute would not have been.
+- **The honest "test + fix-if-surfaced" reframe of DEC-004 resolved cleanly.** Because the spec stopped claiming a fix and made TASK-009 own "add the test, apply a fix only if it fails", execute proved no bug existed and shipped a regression guard instead of a false changelog line. Honesty in the spec produced an honest changelog.
+- **Grep-based acceptance criteria made every task mechanically verifiable.** 10/10 PASS, 0 retries. Doc/config/bash tasks with `grep -q` ACs gave the task-verifier real teeth without a unit-test framework.
+- **The doc-impact map did its job at /docs.** It surfaced two companions the build missed (MANUAL `/user:assign` Reads/Writes; CLAUDE downstream-template note); the doc-verifier then confirmed the fixes (9 claims, 0 contradictions).
+
+## What hurt
+- **The execute pipeline AND the integration-checker both rubber-stamped the WORKFLOW read-order restatement.** TASK-002 was "replace, don't duplicate", but its worker left a numbered 1-4 read-list AND a "do not restate" pointer (the exact anti-pattern). The task-verifier passed it (checked the pointer existed + AGENTS.md-first; never checked the old list was gone). The integration-checker's seam-2 "no duplication" check also passed it (saw the "do not restate" sentence and stopped). Only the independent /review-team architecture lens caught it. Root cause: presence checks ("the pointer exists") do not catch absence-invariant failures ("the duplicate is gone"). Negative invariants need explicit negative assertions.
+- **Recurring shell/hook friction made every worker rediscover the same three workarounds.** fish `noclobber` aborted `>` redirects to existing temp files; the commit-format/commit-msg path mis-parsed heredoc `-m` messages as a 459-char subject; the safety-gate blocked `rm -f` of temp files. Each worker burned cycles relearning: use `>|`, `git commit -F`/Write-tool message files, and `mv` instead of `rm`. I started pre-warning these in later worker prompts, which helped.
+- **The orchestrator edited an acceptance criterion mid-execute.** When DEC-004 resolved as no-bug, I changed TASK-010's CHANGELOG expectation from "fix" to "coverage" during the build. Necessary and later recorded as a spec amendment, but mutating the contract mid-flight (rather than at a checkpoint) is a process smell the architecture review flagged.
+- **Retro-file naming is still three-way inconsistent** (this file's `RETRO-YYYY-MM-DD-<slug>.md`, a stray `RETRO-2026-05-21.md`, an old `v1.3-v1.5.md`; the retro skill body says `RETRO-[date].md`, WORKFLOW says `v<version>.md`). This is ID-017, still open.
+
+## Action items
+- [ ] Teach the task-verifier + integration-checker to check ABSENCE for replace/remove tasks (assert the replaced/removed content is gone, not just that the new pointer exists). The specific anti-drift test now exists; the general verifier gap does not. -- relates to ID-016 -- owner: Han -- next kit cycle
+- [ ] Pre-warn the shell/hook gotchas in the `commands/execute.md` worker template (fish `noclobber` -> `>|`; no heredoc commit `-m` -> `git commit -F`/Write; no `rm` -> `mv`). One block, saves every future worker the rediscovery. -- new backlog item -- owner: Han -- next kit cycle
+- [ ] Resolve ID-017 (retro-file naming): pick `RETRO-YYYY-MM-DD-<slug>.md`, reconcile the retro skill body + CLAUDE.md + WORKFLOW.md. -- already queued (ID-017) -- owner: Han
+- [ ] ID-018 (install tip `cp -n`) and ID-019 (demo SPEC-001 `## After state`) -- already queued from review -- owner: Han
+
+## Kit feedback
+- **Investigate the commit-message heredoc mis-parse.** Workers reported the commit-format/commit-msg hook reading a multi-line heredoc `-m` body as a single 459-char subject and rejecting it. If real, the hook should read only the first line as the subject. Worth a look (possible new backlog item); not yet confirmed as a kit bug vs heredoc misuse.
+- The commit-format hook correctly blocked spec-ID/phase-marker subjects (it caught my own "phase 1" subject). Working as intended.
+- The verification pipeline's blind spot (presence-not-absence) is the highest-signal kit finding this cycle: a "replace" task that left both copies passed two independent verifiers.

From 71e8f6177f8f6a14e1b7f27ebdea9bc22ff7dcf8 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Thu, 21 May 2026 23:38:53 +0700
Subject: [PATCH 19/56] docs(backlog): queue retro action items

ID-020 verifier absence-checks (the replace-task blind spot both verifiers missed), ID-021 reduce execute-worker shell/hook friction + investigate the heredoc commit mis-parse.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 _meta/BACKLOG.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/_meta/BACKLOG.md b/_meta/BACKLOG.md
index a324876..656229b 100644
--- a/_meta/BACKLOG.md
+++ b/_meta/BACKLOG.md
@@ -39,11 +39,14 @@ Lane = the WORKFLOW.md risk tier (`tiny` / `normal` / `full`).
 | ID-017 | Reconcile retro-file naming across the `retro` skill body + description, CLAUDE.md, and WORKFLOW.md to the repo's actual `RETRO-YYYY-MM-DD-<slug>.md` convention | SPEC-025 retro 2026-05-21 | TBD | tiny | queued |
 | ID-018 | install.sh prints blind `cp` tips for AGENTS.md and CLAUDE.md; harden both to copy-if-absent at the command level (`cp -n` or a guard) so the printed command matches its "if absent" prose, or record why prose-only suffices | SPEC-024 review 2026-05-21 (security lens) | TBD | tiny | queued |
 | ID-019 | Demo `examples/hello-spec/docs/specs/SPEC-001-version-flag.md` lacks a `## After state` section; add one so the demo is a complete exemplar of the post-SPEC-024 spec template | SPEC-024 review 2026-05-21 (test-coverage lens) | TBD | tiny | queued |
+| ID-020 | Verification-pipeline absence-checks: teach task-verifier + integration-checker to assert that removed/replaced content is GONE for "replace, don't duplicate" / "remove X" tasks, not just that the new artifact exists. A replace-task left both copies and passed BOTH verifiers this cycle; only the independent review caught it | SPEC-024 retro 2026-05-21 (pipeline blind spot) | TBD | normal | queued |
+| ID-021 | Reduce execute-worker shell/hook friction: pre-warn the recurring gotchas in `commands/execute.md`'s worker template (fish `noclobber` -> `>|`; no heredoc commit `-m` -> `git commit -F`; no `rm` -> `mv`), and investigate the commit-format/commit-msg heredoc mis-parse (it read a multi-line `-m` body as a 459-char subject) | SPEC-024 retro 2026-05-21 (worker friction) | TBD | normal | queued |
 Dependency notes:
 - ID-012 P1 (spec stop-criteria) shipped (SPEC-012, normal lane; see CHANGELOG); P2 (loop QA gate) is held until the pointer-`/goal` pattern has real runs; it will build on the now-shipped SPEC-006 completeness clauses + the existing verification pipeline.
 - ID-002 (internal absorption lane, SPEC-007) is parked; independent of the orchestration chain.
 - ID-016 / ID-017 came out of the SPEC-025 retro; both are kit-internal hygiene, independent of the SPEC-024 chain.
 - ID-018 / ID-019 came out of the SPEC-024 /review-team pass (deferred LOW findings); both are tiny-lane polish, independent of the SPEC-024 ship.
+- ID-020 / ID-021 came out of the SPEC-024 retro; ID-020 (verifier absence-checks) relates to ID-016 (both are guard/verifier hardening) and is the highest-signal kit finding of the cycle.
 
 ## Schema
 

From 17644bc3b9c83b6c81b11d6fb1a5d46c38a618f3 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Thu, 21 May 2026 23:44:35 +0700
Subject: [PATCH 20/56] chore: revert version bump, fold changes into
 Unreleased

The 1.6.0 -> 1.7.0 bump went against the kit convention (accumulate under CHANGELOG [Unreleased], cut versions separately) and had clobbered the prior ## [1.6.0] heading. Revert all three version surfaces to 1.6.0; restore the [1.6.0] release section; move the SPEC-024 notes into [Unreleased]. DEC-009.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 .claude-plugin/plugin.json                    |  2 +-
 CHANGELOG.md                                  | 19 +++++++------------
 VERSION                                       |  2 +-
 _meta/BACKLOG.md                              |  2 +-
 ...RO-2026-05-21-agents-md-operating-layer.md |  2 +-
 .../SPEC-024-agents-md-operating-layer.md     |  3 ++-
 tool.toml                                     |  2 +-
 7 files changed, 14 insertions(+), 18 deletions(-)

diff --git a/.claude-plugin/plugin.json b/.claude-plugin/plugin.json
index 78fa52a..9d9046e 100644
--- a/.claude-plugin/plugin.json
+++ b/.claude-plugin/plugin.json
@@ -1,6 +1,6 @@
 {
   "name": "dwarves-kit",
-  "version": "1.7.0",
+  "version": "1.6.0",
   "description": "Spec-driven Claude Code workflow with verification pipeline. Hooks, slash commands, and verification subagents.",
   "author": {
     "name": "Dwarves Foundation",
diff --git a/CHANGELOG.md b/CHANGELOG.md
index a4a5733..bc0f927 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -10,6 +10,12 @@ All notable changes to dwarves-kit are documented here.
 
 ### Added
 
+- **Kit-root `AGENTS.md`: a tool-agnostic operate-contract front door (SPEC-024).** A portable, four-zone operating layer (the same shape any agent runner can read) that fronts the kit's behavioral contract; `CLAUDE.md` and `WORKFLOW.md` now point at it so the cycle and house rules have one neutral entry point instead of being CLAUDE-only. Ships a downstream template at `examples/hello-spec/AGENTS.md` so a consuming project starts with the same front door. Source: SPEC-024.
+- **`backfill` brownfield lane (SPEC-024)**: an intake lane for adopting the kit into an existing repo (bring the operate-contract, specs convention, and guardrails onto code that predates them), recorded in `WORKFLOW.md` alongside the existing intake lanes.
+- **Six-section goal projection in `/user:assign`**: the goal draft `/user:assign` writes now projects the backlog item across six fixed sections (Context-to-read / Constraints / Operating rules / Validation loop / Done-when / Pause-if) sourced from the `AGENTS.md` zones + the active spec, so a handed-off goal carries fuller context into its lane.
+- **`## After state` section in the `/user:spec` template**: specs now scaffold the observable post-implementation end state, complementing the existing Verification / Open-questions stop-criteria.
+- **doc-impact-map rows for `AGENTS.md` and the top-level files**: the WORKFLOW doc-impact map (reviewed by `/user:ship` + `/user:retro`) now lists `AGENTS.md` and a "new top-level file" trigger so a change that touches them is flagged for a doc update.
+- **Regression coverage for the install settings-merge against a pre-existing third-party hook (test/guard, not a fix)**: `tests/test-meta.sh` gains a block asserting `install.sh`'s `settings.json` merge preserves an existing third-party hook entry. This is added coverage, not a bug fix: the current installer already unions third-party hooks correctly, so no installer change was needed. Plus review-driven anti-drift assertions (CLAUDE.md/WORKFLOW.md carry no read-order restatement) and hello-spec AGENTS.md zone pins. Meta suite to 241 asserts. Source: SPEC-024.
 - **`/user:ui-design` command + the downstream UI-design loop (SPEC-020)**: writes a structured `## UI design` brief (aesthetic-direction preamble + layout + states matrix + named viewports + a11y bars + 3-tier token ladder + voice) into the active spec (else the pre-spec brief), delegates generation to the external `frontend-design` skill (the kit ships no renderer), critiques via `/user:visual-team`, and runs a bounded auto-revise loop (E6 `<user-feedback>` injection-wrap, E7 unconditional accumulated-feedback re-send, terminate on SOLID / RECONSIDER / max-2 cap). Opt-in, report-only, downstream-facing (the PHILOSOPHY carve-out + kit-health allow-note now name both `/user:visual-team` and `/user:ui-design`). The kit is now 20 commands. Brief enriched per the 2026-05-21 deep scan (`docs/research/2026-05-21-ui-design-loop-deep-scan.md`): aesthetic-direction from `frontend-design`'s real input, token ladder + states matrix + a11y bars + voice adapted from `nextlevelbuilder/ui-ux-pro-max-skill` (its renderer / fonts / `.cjs`+`.py` tooling rejected per bash-over-binaries), loop shapes from gstack. Dogfooded through `/user:spec-validate` twice; the re-dogfood caught an unimplementable numeric stop (visual-team emits no combined score) and folded it to a SOLID-verdict stop. Credits: gstack (loop shapes), `frontend-design` (generator), `ui-ux-pro-max-skill` (brief sub-shapes). Source: SPEC-020.
 - **`/user:absorb` command + the absorption ritual (SPEC-004)**: a maintainer-only, proposal-only external-absorption audit that generalizes SPEC-002/SPEC-014's one-shot surveys into a recurring ritual. `docs/ABSORPTION.md` carries it: two lanes (Credits drift re-audit + a seed-rescan of the SPEC-014 survey set, scanning the interest areas workflow/agents/QA/UI), the adoption rubric (>=10), the gate, and the **human merge gate** (discovery + scoring + drafting are automatic; adopting a source or adding it to Credits is maintainer-approved, preserving "synthesize, don't originate"). `/user:absorb` writes a dated, ranked + capped, proposal-only report under `docs/absorption/` (HEAD-SHA baseline for since-last-run, an overflow appendix so a real ADOPT is never dropped, a `git status` self-check); QA/UI candidates needing binaries route to "recommend external". The kit is now 19 commands. Think+Design narrowed lane B to a seed-rescan (web-search discovery deferred as tool-weak). Also corrected the plugin manifests' stale hook/agent counts (now 14 hooks / 11 agents). Source: SPEC-004 + `docs/ABSORPTION.md`; DATA-not-instructions guard from ADR-0008.
 - **Reviewer 5 (Solution-Design & Extensibility Critic)** in `/user:spec-validate`: flags shallow or non-extensible designs, with a calibration clause (no false-positive storm) and a legacy-grace clause for specs predating the richer template. Source: SPEC-008; forked from `superpowers:brainstorming` ("design for isolation and clarity") + its spec-document-reviewer calibration. Not a runtime dependency.
@@ -42,18 +48,7 @@ All notable changes to dwarves-kit are documented here.
 - **`context-readiness.sh` + `session-state-save.sh` count fragility**: `find ... | grep -v` and `grep -c ... || echo 0` both mishandled the zero-match case under `set -e`/pipefail. In a source-file-free repo the hook aborted with no output; a spec with 0 done tasks rendered `tasks:0\n0/N` plus an `integer expected` error. Now uses `{ grep -v ... || true; }` and `grep -c ... || true`. Found during SPEC-010 execution (ID-013).
 - **`install.sh` never materialized the hooks `settings.json` references (SPEC-025)**: settings hard-code every hook (and the statusline) at `$HOME/.claude/dwarves-kit/hooks/<script>.sh`, but the bash installer only `chmod`'d the scripts at its own `$KIT_DIR/hooks/`, merged settings, and created `logs/`. That coincidence held only for the documented in-place clone (README Option 2: clone to `~/.claude/dwarves-kit`); from a dev checkout or CI clone elsewhere, `~/.claude/dwarves-kit/` got only `logs/` and all 14 hooks plus the statusline pointed at missing files, so a fresh session opened with `SessionStart ... No such file or directory` and every hook was silently dead. `install.sh` now links each `hooks/*.sh` into `~/.claude/dwarves-kit/hooks/` when the kit lives elsewhere (per-file symlinks, mirroring the existing `commands/` step, so dev edits stay live), detects the in-place layout and skips linking (a naive loop there deletes the real scripts and leaves broken self-referential symlinks, a regression the first cut shipped and the SDD pass then caught), and the uninstall removes only the symlinks it created. `tests/test-meta.sh` gains a 3-assertion guard (referenced scripts exist in `hooks/`; an isolated out-of-place install resolves every path; an in-place install keeps the scripts resolvable) so neither failure mode can regress past CI. Meta suite 213 -> 216.
 
-## [1.7.0] - 2026-05-21
-
-AGENTS.md operating-layer front door plus a brownfield backfill lane and a goal-projection enrichment (SPEC-024).
-
-### Added
-
-- **Kit-root `AGENTS.md`: a tool-agnostic operate-contract front door (SPEC-024).** A portable, four-zone operating layer (the same shape any agent runner can read) that fronts the kit's behavioral contract; `CLAUDE.md` and `WORKFLOW.md` now point at it so the cycle and house rules have one neutral entry point instead of being CLAUDE-only. Ships a downstream template at `examples/hello-spec/AGENTS.md` so a consuming project starts with the same front door. Source: SPEC-024.
-- **`backfill` brownfield lane (SPEC-024)**: an intake lane for adopting the kit into an existing repo (bring the operate-contract, specs convention, and guardrails onto code that predates them), recorded in `WORKFLOW.md` alongside the existing intake lanes.
-- **Six-section goal projection in `/user:assign`**: the goal draft `/user:assign` writes now projects the backlog item across six fixed sections, so a handed-off goal carries fuller context into its lane.
-- **`## After state` section in the `/user:spec` template**: specs now scaffold the post-implementation end state, complementing the existing Verification / Open-questions stop-criteria.
-- **doc-impact-map rows for `AGENTS.md` and the top-level files**: the WORKFLOW doc-impact map (reviewed by `/user:ship` + `/user:retro`) now lists `AGENTS.md` and the other root-level docs so a change that touches them is flagged for a doc update.
-- **Regression coverage for the install settings-merge against a pre-existing third-party hook (test/guard, not a fix)**: `tests/test-meta.sh` gains a block that asserts `install.sh`'s `settings.json` merge preserves an existing third-party hook entry. This is added coverage, not a bug fix: the current installer already preserves third-party hooks (the jq merge unions correctly), so no installer change was needed. The test pins that behavior so it cannot silently regress. Meta suite to 234 asserts.
+## [1.6.0] - 2026-05-20
 
 ### Added
 
diff --git a/VERSION b/VERSION
index bd8bf88..dc1e644 100644
--- a/VERSION
+++ b/VERSION
@@ -1 +1 @@
-1.7.0
+1.6.0
diff --git a/_meta/BACKLOG.md b/_meta/BACKLOG.md
index 656229b..b790ed8 100644
--- a/_meta/BACKLOG.md
+++ b/_meta/BACKLOG.md
@@ -34,7 +34,7 @@ Lane = the WORKFLOW.md risk tier (`tiny` / `normal` / `full`).
 | ID-002 | Absorb skills/hooks developed in ops-toolkit into the kit (internal lane) | item e | SPEC-007 (PARKED; see its Parked note) | full | parked |
 | ID-003 | Deep orchestration scan of the copied repos vs our WORKFLOW (report + absorb plan) | item 2 | `docs/research/2026-05-20-orchestration-deep-scan.md` | normal | shipped (report delivered; recs folded into SPEC-006) |
 | ID-012 | Goal-loop fidelity (one theme, two parts). P1: stop-criteria in the `/spec` template (pin `## Verification` + `## Open questions` so specs are pointer-`/goal`-ready) -- build now, tiny lane. P2: QA gate around the autonomous loop (verify-into-loops + SPEC-006 completeness clauses + Reviewer 5) -- held until the pointer-`/goal` pattern has real runs | maintainer Q3 2026-05-20 + goal-readiness convergence 2026-05-21 | SPEC-012 | normal (P1) / full (P2) | P1 shipped (see CHANGELOG); P2 held |
-| ID-015 | AGENTS.md operating layer: tool-agnostic entrypoint + brownfield backfill lane + six-section goal projection + observable After-state spec section (ships ADR-0013) | ADR-0013 (hoangnb24/harness-experimental study, 2026-05-21) | SPEC-024 | full | shipped (v1.7.0; see CHANGELOG) |
+| ID-015 | AGENTS.md operating layer: tool-agnostic entrypoint + brownfield backfill lane + six-section goal projection + observable After-state spec section (ships ADR-0013) | ADR-0013 (hoangnb24/harness-experimental study, 2026-05-21) | SPEC-024 | full | shipped (CHANGELOG [Unreleased], no version bump; PR #7) |
 | ID-016 | Promote the "new SPEC -> BACKLOG status row" doc-impact check to a guard (a hook or a kit-health line); it was missed across multiple cycles | SPEC-025 retro 2026-05-21 (recurrence clears the PHILOSOPHY section 5 bar) | TBD | normal | queued |
 | ID-017 | Reconcile retro-file naming across the `retro` skill body + description, CLAUDE.md, and WORKFLOW.md to the repo's actual `RETRO-YYYY-MM-DD-<slug>.md` convention | SPEC-025 retro 2026-05-21 | TBD | tiny | queued |
 | ID-018 | install.sh prints blind `cp` tips for AGENTS.md and CLAUDE.md; harden both to copy-if-absent at the command level (`cp -n` or a guard) so the printed command matches its "if absent" prose, or record why prose-only suffices | SPEC-024 review 2026-05-21 (security lens) | TBD | tiny | queued |
diff --git a/docs/retro/RETRO-2026-05-21-agents-md-operating-layer.md b/docs/retro/RETRO-2026-05-21-agents-md-operating-layer.md
index 104aa78..1e4b4b5 100644
--- a/docs/retro/RETRO-2026-05-21-agents-md-operating-layer.md
+++ b/docs/retro/RETRO-2026-05-21-agents-md-operating-layer.md
@@ -1,7 +1,7 @@
 # Retro: AGENTS.md operating-layer cycle
 Date: 2026-05-21
 Sprint: single-session cycle (spec-validate -> execute -> review-team -> docs -> ship)
-Spec: SPEC-024 (v1.7.0, PR #7)
+Spec: SPEC-024 (PR #7; CHANGELOG [Unreleased], no version bump)
 
 ## Metrics
 - Tasks planned: 10, completed: 10, deferred: 0 (2 LOW review findings -> backlog ID-018/019)
diff --git a/docs/specs/SPEC-024-agents-md-operating-layer.md b/docs/specs/SPEC-024-agents-md-operating-layer.md
index 6ec16e9..ec3c12a 100644
--- a/docs/specs/SPEC-024-agents-md-operating-layer.md
+++ b/docs/specs/SPEC-024-agents-md-operating-layer.md
@@ -90,7 +90,7 @@ Each task is atomic: implementable in one session, fits in ~50% of a context win
 - [x] TASK-007 (DONE 9fc3e25, verified): Update the `WORKFLOW.md` doc-impact map: add a "new top-level file" trigger row (the current bolded self-maintaining rule covers only a new top-level *dir*, so a top-level *file* like `AGENTS.md` slips through), add the `AGENTS.md` companion-doc row, and a `backfill`-lane reference. - AC: the map names `AGENTS.md` and carries a top-level-file trigger row.
 - [x] TASK-008 (DONE f51537e, verified): Update `README.md` "Project structure" and `docs/architecture.md` cross-refs to include `AGENTS.md` and the `backfill` lane. - AC: both name `AGENTS.md`.
 - [x] TASK-009 (DONE 6ec7880, verified; merge test passes on current install.sh -> no jq fix needed, DEC-004 resolved as no-bug): Update `tests/test-meta.sh` to assert `AGENTS.md` exists and carries the four portable zones, and that `assign.md` carries the six-section projection. Add a regression test that runs `install.sh` into a throwaway HOME whose `settings.json` already contains a third-party hook (the dogfood case), and asserts the user's hook survives the merge and the result is valid JSON. If that test fails on the current `install.sh`, apply the minimal jq fix to the merge/clean within this task. - AC: `bash tests/test-meta.sh` exercises the four-zone + projection checks AND the merge-with-existing-hooks case, and passes.
-- [x] TASK-010 (DONE 43fa85d, verified): Version bump (`VERSION`, `.claude-plugin/plugin.json`, `tool.toml`), `CHANGELOG.md` entry, and update the `install.sh` `AGENTS.md` tip. - AC: versions in sync; CHANGELOG records the feature and the install-merge regression test as coverage (not a fix; DEC-004 resolved as no-bug, see TASK-009).
+- [x] TASK-010 (DONE 43fa85d; version bump REVERTED post-ship per DEC-009): `CHANGELOG.md` entry and the `install.sh` `AGENTS.md` tip. - AC: CHANGELOG records the feature + the install-merge regression test as coverage (not a fix; DEC-004 resolved as no-bug, see TASK-009). The version bump originally done here was reverted to 1.6.0; the kit accumulates under CHANGELOG `[Unreleased]` and does not cut a version per spec (DEC-009).
 
 ## Acceptance Criteria (global)
 - [x] All tasks pass their individual acceptance criteria.
@@ -135,6 +135,7 @@ Observable; each bullet was false before this cycle and is a real check now (all
 - DEC-005: `AGENTS.md` carries **four** portable zones (read list, task loop, done-definition, Pause-if); the six-section `/goal` is a *composition* of those four plus the spec's `## Verification` and `## After state`, not a 1:1 mapping. Rationale: the Architecture diagram already shows two goal sections sourced from the spec, so the earlier "six contract zones in `AGENTS.md` / 1:1" framing (Interfaces, TASK-001 AC, TASK-009) was internally contradictory and unsatisfiable. Who: spec-validate Reviewer 5/4.
 - DEC-006: The operate-contract prose (ordered read-order + done-definition) lives in `WORKFLOW.md` today, not kit-root `CLAUDE.md`. TASK-002 therefore reconciles `WORKFLOW.md` (point at `AGENTS.md`), not just `CLAUDE.md`, and a failure-mode row now covers `AGENTS.md`↔`WORKFLOW.md` drift. Rationale: without this the dedup is a no-op on the named file and the real three-way overlap ships. Who: spec-validate Reviewer 3/5.
 - DEC-007: Edge Case 2 reframed to the tip model: `install.sh` never copies `AGENTS.md` (tip-only, like `CLAUDE.md`), so the "must not clobber" guard described a mechanism that does not exist. Rationale: accuracy; the real requirement is the tip wording. Who: spec-validate Reviewer 2.
+- DEC-009: No version bump for this spec. TASK-010 had bumped 1.6.0 -> 1.7.0, but the kit's convention is to accumulate changes under CHANGELOG `[Unreleased]` and cut a version separately (1.7.0 had also clobbered the prior `## [1.6.0]` heading). Reverted all three version surfaces to 1.6.0 and moved the SPEC-024 notes into `[Unreleased]`. Rationale: per-spec version bumps were not the established pattern; the bump also corrupted the changelog history. Who: maintainer call (2026-05-21).
 - DEC-008: TASK-007 adds a "new top-level file" trigger row to the doc-impact map. Rationale: the bolded self-maintaining rule is scoped to a new top-level *dir*; a top-level *file* like `AGENTS.md` was not actually covered, so line-71's "already requires" claim was false. Who: spec-validate Reviewer 4.
 
 ## Open questions
diff --git a/tool.toml b/tool.toml
index 5ae93ee..5decea8 100644
--- a/tool.toml
+++ b/tool.toml
@@ -4,7 +4,7 @@ language    = "bash"
 status      = "stable"
 entry       = "install.sh"
 created     = 2026-03-29
-version     = "1.7.0"
+version     = "1.6.0"
 
 # Consumer repos: any Dwarves project that adopts the spec-driven flow.
 # Not bound to a specific repo; the kit is per-machine, not per-project.

From de2b7fbfb353ab55c73889006da14325dddf5315 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Thu, 21 May 2026 23:55:35 +0700
Subject: [PATCH 21/56] docs: add orchestration manual (flows, loops, stop
 conditions)

New docs/ORCHESTRATION.md: the flow/loop view of the kit's orchestration layer. Catalogs 1 spine, 5 lanes, 3 bounded loops, 7 alt/branch flows, 8 opt-in side-flows, and the 4 hard stops; per-flow trigger/steps/stop/enforcer/branches; 6 ASCII diagrams (master lifecycle + per-loop). Cross-linked from WORKFLOW.md + README.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 README.md             |   1 +
 WORKFLOW.md           |   2 +
 docs/ORCHESTRATION.md | 372 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 375 insertions(+)
 create mode 100644 docs/ORCHESTRATION.md

diff --git a/README.md b/README.md
index 6a1615d..47f55a7 100644
--- a/README.md
+++ b/README.md
@@ -221,6 +221,7 @@ dwarves-kit/
   AGENTS.md                     Tool-agnostic operate-contract front door (any runtime reads it first)
   WORKFLOW.md                   The cycle, the risk-tier lanes, the gate at each boundary
   MANUAL.md                     Operator reference for the commands
+  docs/ORCHESTRATION.md         The flow/loop view: lanes, loops, triggers, stop conditions (ASCII diagrams)
   RUNBOOK.md                    Hook misbehavior diagnosis + recovery
   README.md / CONTRIBUTING.md / CHANGELOG.md / VERSION / LICENSE
   CLAUDE.md                     Project template; the Claude-Code layer on top of AGENTS.md
diff --git a/WORKFLOW.md b/WORKFLOW.md
index 45ff4fe..e1a16ca 100644
--- a/WORKFLOW.md
+++ b/WORKFLOW.md
@@ -5,6 +5,8 @@
 > It suggests and routes; it does not block. The only hard stops are the
 > safety-gate hook, the push-to-main blocker, the anti-rationalization Stop
 > hook, and the verification pipeline.
+> For the visual flow/loop view (every flow + alt-flow, its trigger, and its
+> stop condition, with ASCII diagrams), see `docs/ORCHESTRATION.md`.
 
 ## Required reading
 `AGENTS.md` is the front door and owns the read-order; it is the single source.
diff --git a/docs/ORCHESTRATION.md b/docs/ORCHESTRATION.md
new file mode 100644
index 0000000..c88bbdc
--- /dev/null
+++ b/docs/ORCHESTRATION.md
@@ -0,0 +1,372 @@
+# ORCHESTRATION.md: the flow/loop view
+
+> Reference manual for the dwarves-kit **orchestration layer**: the machinery that
+> moves one unit of work from intake to shipped. This is the flow-and-loop view.
+> For per-command operator detail read `MANUAL.md`; for the rules contract read
+> `WORKFLOW.md`; for component fit read `docs/architecture.md`; for why-decisions
+> read `docs/PHILOSOPHY.md` + `docs/decisions/`. This doc cross-links those; it
+> does not restate them.
+
+## At a glance
+
+| Kind | Count | Members |
+|---|---|---|
+| Backbone | 1 | the spine (intake -> shipped) |
+| Primary intake lanes | 5 | `tiny`, `normal`, `full`, `bug`, `backfill` |
+| Bounded loops (engines) | 3 | goal loop, debug loop, execute verification pipeline |
+| Alternate / branch flows | 7 | retry, escalate, ambiguous-spec, no-activator, idempotent re-run, DO-NOT-SHIP gate, completeness warn+log |
+| Opt-in side-flows | 8 | `/user:design`, `/user:devs-team`, `/user:visual-team`, `/user:ui-design`, `/user:test-plan`, `/user:review-team`, `/user:absorb`, `/user:kit-health` |
+| Hard stops (the only blockers) | 4 | safety-gate, push-to-main blocker, anti-rationalization, verification pipeline |
+
+Everything else **suggests and routes; it does not block**. The four hard stops are
+the only places the kit refuses to proceed. Keep that split in mind throughout.
+
+---
+
+## 1. The state model (what the flows move between)
+
+Three stores. Each flow reads and/or writes these; nothing is re-entered between phases.
+
+```text
+  _meta/BACKLOG.md            docs/specs/SPEC-NNN-<slug>.md      .claude/goals/<slug>.md
+  ┌─────────────────┐         ┌──────────────────────────┐      ┌──────────────────────┐
+  │ the Active queue│         │ the contract             │      │ ephemeral goal drafts│
+  │ ID-NNN rows     │ ──────▶ │ Status: DRAFT            │ ◀──▶ │ (gitignored,         │
+  │ status:         │  assign │        -> VALIDATED      │      │  per-machine)        │
+  │ queued/speccing/│         │        -> SHIPPED        │      │ one draft per ID     │
+  │ validated/      │         │ tasks, AC, Verification, │      └──────────────────────┘
+  │ executing/      │         │ After state, Open Qs     │        the built-in /goal owns
+  │ shipped (parked)│         └──────────────────────────┘        .claude/last-goal.md;
+  └─────────────────┘                                             the kit NEVER writes it
+```
+
+**Detector vs mutator** (load-bearing): `/user:start` and `/user:next` only **read and
+render** the queue + drafts. `/user:assign` is the **only mutator**: it writes a goal
+draft, flips a backlog status, and hands off. No other entry point mutates state on intake.
+
+---
+
+## 2. The spine (master lifecycle flowchart)
+
+How a committed backlog item becomes shipped work, end to end. This is the backbone;
+every lane is a longer or shorter walk along it.
+
+```text
+  session start
+       │
+       ▼
+  /user:start ........... DETECT: render the BACKLOG Active queue + active goal drafts
+       │                  (read-only; suggests the next command)
+       ▼
+  /user:assign ID-NNN ... MUTATE: goal-craft a draft (.claude/goals/<slug>.md),
+       │                  pick the lane from the item, detect the goal-loop activator,
+       │                  flip status (queued -> speccing | executing), hand off.
+       │                  Does NOT execute. Never writes last-goal.md.
+       ▼
+  ┌─────────────────── pick a lane (Section 3) ───────────────────┐
+  │  tiny      normal        full            bug         backfill  │
+  └───────┬──────┬─────────────┬──────────────┬─────────────┬─────┘
+          │      │             │              │             │
+          │      ▼             ▼              ▼             ▼
+          │  /user:spec    /user:think    /user:debug   review code,
+          │      │         /user:spec        │          write AGENTS.md
+          │      │         /user:spec-validate│          /CLAUDE.md/specs
+          │      │             │              │          (no app code)
+          │      ▼             ▼              │             │
+          │  /user:execute (verification pipeline, Section 5.3)      │
+          │      │             │              │             │
+          │      ▼             ▼              ▼             │
+          │  /user:review  /user:review-team  (root cause   │
+          │      │             │              recorded,     │
+          │      ▼             ▼              fix verified,  │
+          │  (normal)      /user:docs         human-confirm)│
+          │      │             │                            │
+          │      │             ▼                            │
+          └──────┴────────▶ /user:ship ◀────────────────────┘
+                              │   (ship gate: blocks on DO NOT SHIP;
+                              │    push-to-main blocker; flips spec -> SHIPPED;
+                              │    ID-NNN drops off the queue)
+                              ▼
+                           /user:retro (full lane) -> docs/retro/RETRO-YYYY-MM-DD-<slug>.md
+```
+
+Opt-in beats (Section 6) slot in along this path: `/user:design` between think and spec;
+`/user:devs-team` + `/user:visual-team` before the spec hardens; `/user:test-plan` before
+execute; `/user:ui-design` for downstream UI work.
+
+---
+
+## 3. Primary intake lanes (5)
+
+Pick a lane **before** you start. Smaller work skips ceremony. When in doubt between two,
+take the heavier one.
+
+| # | Lane | Trigger (when) | Path | Stop / exit |
+|---|---|---|---|---|
+| 1 | `tiny` | typo, copy, comment, one obvious edit | edit -> verify -> done. No spec. | the edit verifies |
+| 2 | `normal` | one bounded feature or fix | `/spec` -> `/execute` -> `/review` -> `/ship` | shipped, ID off queue |
+| 3 | `full` | touches auth/authz, hooks, data model, data loss, audit/security, an external provider, an API contract, a migration, or weakens validation | `/think` -> `/spec` -> `/spec-validate` -> `/execute` -> `/review-team` -> `/docs` -> `/ship` -> `/retro` | shipped + retro written |
+| 4 | `bug` | a defect, regression, or failing test (not a new feature) | `/debug` (root cause first) -> `/review` | root cause recorded, fix verified, human-confirmed |
+| 5 | `backfill` | brownfield: adopt the kit onto existing code | review the code, write the operating-layer docs (AGENTS.md / CLAUDE.md / specs). `/spec` optional. | docs written; **no app-behavior change, no app-code edits** |
+
+```text
+                       is it a defect / regression / failing test ?
+                                   │ yes            │ no
+                                   ▼                ▼
+                                 bug          new work on an existing repo
+                                              with no operate-layer docs ?
+                                                   │ yes        │ no
+                                                   ▼            ▼
+                                                backfill    how big / how risky ?
+                                                            ├─ trivial edit ....... tiny
+                                                            ├─ one bounded change . normal
+                                                            └─ risk-list match .... full
+```
+
+The `full` trigger list is a hard tripwire: anything on it uses `full` unless you explicitly
+narrow the scope and say why.
+
+---
+
+## 4. The bounded-loop principle
+
+The kit ships **bounded in-session loops** and declines **unbounded outer loops**. A bounded
+loop continues *within* the current session under a model-evaluated stop condition plus the
+safety subset; an unbounded loop spawns *new* sessions without one (that is autonomous-runtime
+territory: GSD v2 / OMC, out of scope). All three engines below are bounded.
+
+---
+
+## 5. The three bounded loops (engines)
+
+### 5.1 Goal loop
+
+A continuation that keeps the current session working a single objective until a verifiable
+stop holds. Wired from the backlog by `/user:assign`, activated by whatever loop primitive is
+present.
+
+- **Trigger**: an objective handed to an activator: the built-in `/goal`, the `ralph-loop`
+  plugin, or the `goal-craft` skill. `/user:assign` writes the draft and surfaces its body;
+  it does **not** start the loop itself (activator-agnostic).
+- **Enforcer**: the **anti-rationalization Stop hook** (blocks premature "done"), plus the
+  rest of the safety subset (verification pipeline, push-to-main blocker).
+- **Stop condition**: the objective's `## Verification` command(s) pass AND the done-definition
+  holds (AGENTS.md "Done means" + the spec's `## After state`). On a blocker it cannot resolve,
+  it appends a named note to the spec's `## Open questions` and stops (no churn).
+- **Branches**: no-activator (degrades to a plain reusable draft file, Section 7.4);
+  blocker-hit (write Open-questions note, stop).
+
+```text
+   activator starts the objective
+            │
+            ▼
+   ┌───▶ do the next increment ──▶ run ## Verification
+   │            ▲                        │
+   │            │                  pass? │
+   │            │              ┌─────────┴─────────┐
+   │            │           no │                   │ yes
+   │            │              ▼                   ▼
+   │            │     anti-rationalization     ALL done? ──no──┐
+   │            │     blocks "done";           │ yes           │
+   │            └─────  keep working ◀─────────┘               │
+   │                                                            │
+   │   hit a blocker you can't resolve?                         ▼
+   └── write a note to spec ## Open questions ─────────────▶  STOP
+```
+
+### 5.2 Debug loop (`/user:debug`, the `bug` lane)
+
+A systematic four-phase loop under one iron law. Off-cycle: a bug-lane entry point, not a
+linear phase.
+
+- **Trigger**: `/user:debug` on a defect, regression, or failing test.
+- **Iron law**: **NO FIX WITHOUT A RECORDED ROOT CAUSE.** Evidence accrues in an append-only
+  ledger `.claude/debug/<slug>.md` whose `## Root cause` heading is the contract.
+- **Enforcer**: the **guess-fix guard** (a gated mode of the anti-rationalization hook): it
+  blocks a fix/done claim while an open debug ledger still has an empty `## Root cause`. Silent
+  in non-debug sessions.
+- **Stop condition**: root cause recorded + fix verified + **human-confirmed**.
+- **Branches**: regression -> `git bisect`; failing-test-first -> routed into the execute
+  verification pipeline (5.3); the **3-fix architecture wall** (after 3 failed fixes, stop and
+  reconsider the design rather than keep patching).
+
+```text
+   /user:debug
+       │
+       ▼
+   Phase 1: Root cause ───▶ Phase 2: Pattern ───▶ Phase 3: Hypothesis ───▶ Phase 4: Implementation
+       │  (ledger              (reproduce,            (predict, then          (apply the fix)
+       │  ## Root cause)       narrow; bisect          test the guess)             │
+       │                       if regression)              │                       ▼
+       │                                                   │                  verified? ──no──┐
+   guess-fix guard: a fix/done claim is BLOCKED            │                       │ yes      │
+   while ## Root cause is empty ◀──────────────────────────┘                       ▼          │
+                                                                            human-confirm     │
+   3 failed fixes in a row ──▶ STOP: architecture wall (reconsider design)         │          │
+                                                                                   ▼          │
+                                                                                 DONE ◀───────┘ (loop Phase 3-4)
+```
+
+### 5.3 Execute verification pipeline (the build engine)
+
+The core build loop: `/user:execute` dispatches one worker per task, verifies each in a fresh
+context, retries fixable failures, and checks cross-task wiring at the end. Self-reported
+"done" from a worker is never proof; the verifier is.
+
+- **Trigger**: `/user:execute` on a `VALIDATED`/`APPROVED` spec on a feature branch.
+- **Enforcer**: the **verification pipeline itself** is a hard stop (it gates each task).
+- **Stop condition**: every task PASS **and** the integration-checker PASS (multi-task specs).
+- **Branches**: `PASS` (advance), `FAIL:fixable` (retry via fix-agent, **max 2**), `FAIL:escalate`
+  or retries exhausted (stop -> human). Single-task specs skip the integration check.
+
+```text
+   /user:execute  (record pre-build base ref)
+        │
+        ▼
+   ┌── for each task in phase ──────────────────────────────────────────┐
+   │     worker subagent (fresh context) ──▶ task-verifier (read-only)   │
+   │                                              │                       │
+   │                          ┌───────────────────┼───────────────────┐  │
+   │                       PASS              FAIL:fixable        FAIL:escalate
+   │                          │                   │                    │  │
+   │                          │                   ▼                    │  │
+   │                          │            fix-agent (scoped)          │  │
+   │                          │            re-verify  ▲                │  │
+   │                          │            retry < 2 ─┘                │  │
+   │                          │            retries == 2 ──────────────▶│  │
+   │                          ▼                                        ▼  │
+   │                   mark task done                          ESCALATE to human
+   └──────────┬─────────────────────────────────────────────────────────┘
+              │ all tasks PASS
+              ▼
+   phase checkpoint (human: continue / review / stop)
+              │
+              ▼
+   integration-checker (read-only, diffs whole build from base ref)
+              │
+        ┌─────┼───────────────┐
+      PASS  FAIL:fixable   FAIL:escalate
+        │     │ (fix-agent,     │
+        │     │  reuse cap)     ▼
+        ▼     ▼            ESCALATE
+      build complete ◀── re-check
+```
+
+---
+
+## 6. Opt-in side-flows (8)
+
+Advisory, never blocking. They enrich a lane but are not required by any.
+
+| # | Flow | Trigger | Writes to | Stop |
+|---|---|---|---|---|
+| 1 | `/user:design` | between `/think` and `/spec`, when the solution needs working out | `docs/specs/DECISION-BRIEF.md` (folded into the spec by `/spec`) | solution agreed per section |
+| 2 | `/user:devs-team` | before the spec hardens; 5 engineering lenses | `## Design critique` in the active spec (else the brief) | verdict recorded |
+| 3 | `/user:visual-team` | a visual/UI design exists (downstream) | `## Visual critique` in the active spec (else brief, else inline) | verdict recorded |
+| 4 | `/user:ui-design` | downstream UI work, after `/design` | `## UI design` in the spec; generates via `frontend-design`; critiques via `/visual-team` | SOLID/RECONSIDER verdict or max-2 revise |
+| 5 | `/user:test-plan` | before `/execute`; derive a coverage matrix | `## Test plan` in the spec (consumed by `/execute`) | matrix written |
+| 6 | `/user:review-team` | PR-grade review; 3 lenses (security/architecture/test-coverage) in parallel | `REVIEW.md` (+ `TODOS.md`) | SHIP / FIX THEN SHIP / DO NOT SHIP |
+| 7 | `/user:absorb` | maintainer-only external-absorption audit | dated report under `docs/absorption/` | proposal-only report (human merge gate) |
+| 8 | `/user:kit-health` | maintainer self-assessment vs PHILOSOPHY, before tagging | report (stdout) | assessment rendered |
+
+All of these write **into the active spec** when the output binds to a spec (replace-not-stack),
+so a later reader and an earlier writer never split across two specs.
+
+---
+
+## 7. Alternate / branch flows (7)
+
+The edges that fire when the happy path does not hold.
+
+### 7.1 Retry (fixable failure)
+A `task-verifier` (or `integration-checker`) `FAIL:fixable` dispatches a scoped **fix-agent**,
+then re-verifies. Cap: **2 retries**. Rationale: 1-2 cycles catch import/assertion/off-by-one
+bugs; 3+ means a design problem, not a code bug.
+
+### 7.2 Escalate (unfixable or exhausted)
+`FAIL:escalate`, or retries hitting the cap, **stops the loop and hands to the human** with the
+full context (task, every verifier report, every fix attempt). Escalate verdicts are never
+auto-retried; they need judgment by definition.
+
+### 7.3 Ambiguous spec
+Spec detection is branch-aware. When more than one non-`SHIPPED`/`PARKED` spec is active and the
+branch slug does not disambiguate, the detectors emit `spec:ambiguous(...)` and **ask** rather
+than silently pick. Resolve by branch or by naming the spec.
+
+### 7.4 No activator installed
+`/user:assign` detects the goal-loop activator (`/goal` -> `ralph-loop` -> `goal-craft`). If none
+is installed it **degrades gracefully**: the draft is left as a plain reusable file you paste
+wherever. Only one-step activation is lost; the draft still works.
+
+### 7.5 Idempotent re-run
+Re-running `/user:assign` for an ID that already has a `.claude/goals/<slug>.md` **re-surfaces the
+existing draft** instead of duplicating it or double-advancing status. The filesystem is the
+source of truth.
+
+### 7.6 DO-NOT-SHIP gate
+`/user:ship` reads `REVIEW.md` first. `DO NOT SHIP` -> **stop**, fix first. `FIX THEN SHIP` ->
+apply fixes, then ship. No `REVIEW.md` -> **warn and ask** (run `/review`, run `/review-team`,
+or ship anyway); never silently skipped.
+
+### 7.7 Completeness warn + log (not a block)
+Two self-checks run during Build/Reflect and **warn + log** to
+`~/.claude/dwarves-kit/logs/completeness.log` without blocking: **decision-translation** (each
+Build-decision is referenced by a task/AC) and **doc-update** (a change touching X moves its
+companion docs per the WORKFLOW doc-impact map). `/user:ship` and `/user:retro` review that log
+at the gate. Hard blocks stay reserved for the safety subset.
+
+---
+
+## 8. The four hard stops (the only blockers)
+
+Everything above suggests or warns. These four refuse to proceed, because the cost of the
+mistake is irreversible:
+
+| Hard stop | Fires on | Mechanism |
+|---|---|---|
+| safety-gate | destructive Bash (`rm -rf`, `DROP TABLE`, `git reset --hard`, `kubectl delete`; build-artifact allowlist exempt) | PreToolUse hook, exit 2 |
+| push-to-main blocker | a push to `main`/`master`/protected | PreToolUse hook, exit 2 |
+| anti-rationalization | premature "done" claim; phantom-impl stub in the diff; guess-fix while `## Root cause` empty | Stop hook |
+| verification pipeline | a task whose acceptance criteria are unmet or whose tests did not actually run | `/execute` gate (worker -> verifier -> fix -> escalate) |
+
+---
+
+## 9. Quick reference
+
+### 9.1 Trigger -> flow -> stop -> enforcer
+
+| Trigger | Starts | Stop condition | Enforcer |
+|---|---|---|---|
+| `/user:start` | render queue + drafts | output rendered | none (detector) |
+| `/user:assign ID-NNN` | goal draft + lane routing | draft written, status flipped, handed off | none (mutator; idempotent) |
+| `/user:think` | decision brief | brief written (if BUILD) | advisory |
+| `/user:spec` | spec scaffold | spec exists, `Status: DRAFT` | spec-drift-guard hook |
+| `/user:spec-validate` | 5-lens adversarial review | `Status: VALIDATED` | advisory (full lane) |
+| `/user:execute` | verification pipeline | all tasks + integration PASS | verification pipeline (hard) |
+| `/user:debug` | 4-phase debug loop | root cause + fix verified + human-confirmed | iron law + guess-fix guard |
+| `/user:review[-team]` | review | verdict recorded in `REVIEW.md` | advisory |
+| `/user:docs` | doc sync + doc-verifier | docs match code | advisory |
+| `/user:ship` | ship pipeline | tagged/PR; spec `SHIPPED`; ID off queue | ship gate + push-to-main blocker (hard) |
+| `/user:retro` | retrospective | `docs/retro/RETRO-<date>-<slug>.md` written | advisory |
+| a `/goal` activator | goal loop | `## Verification` passes + done-definition | anti-rationalization (hard) |
+
+### 9.2 Stop-condition index (where each loop terminates)
+
+- **Goal loop**: `## Verification` command(s) pass and the done-definition holds; or a blocker
+  note is written to `## Open questions` and it stops.
+- **Debug loop**: `## Root cause` recorded, fix verified, human-confirmed; or the 3-fix
+  architecture wall halts it for a design rethink.
+- **Execute pipeline**: every task PASS and integration-checker PASS; or `FAIL:escalate` /
+  2 exhausted retries -> human.
+- **ui-design loop**: a SOLID or RECONSIDER verdict, or the max-2 revise cap.
+- **Lanes**: `tiny` when the edit verifies; `normal`/`full` when shipped (ID off queue);
+  `bug` at human-confirm; `backfill` when the docs are written with no app-behavior change.
+
+---
+
+## See also
+- `WORKFLOW.md` - the rules contract this manual visualizes (the cycle table, the lane table, the doc-impact map).
+- `MANUAL.md` - per-command operator detail (reads/writes/gotchas for each `/user:*`).
+- `docs/architecture.md` - component fit and the state model.
+- `docs/PHILOSOPHY.md` - why these boundaries exist (the bounded/unbounded loop note, the hard-stop reservation).
+- `AGENTS.md` - the tool-agnostic operate-contract the goal loop projects from.

From f8551c192653858f12b6bf197abdaaa55a8ce8d2 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 00:12:24 +0700
Subject: [PATCH 22/56] docs: add operator playbook (interaction scenarios)

New docs/PLAYBOOK.md: scenario -> trigger -> response -> orchestration hook for the 5 entry scenarios (what's-next, apply-SDD, autonomous-full-flow, iteration-heavy phase, vague brief). Leads with the hooks-vs-commands-vs-skills layering (only hooks auto-fire) and the BACKLOG-ID-first gap + the freeform->ID bridge. Includes an autonomy-dial and a triggers cheat-sheet.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 README.md        |   1 +
 docs/PLAYBOOK.md | 268 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 269 insertions(+)
 create mode 100644 docs/PLAYBOOK.md

diff --git a/README.md b/README.md
index 47f55a7..9294b8d 100644
--- a/README.md
+++ b/README.md
@@ -222,6 +222,7 @@ dwarves-kit/
   WORKFLOW.md                   The cycle, the risk-tier lanes, the gate at each boundary
   MANUAL.md                     Operator reference for the commands
   docs/ORCHESTRATION.md         The flow/loop view: lanes, loops, triggers, stop conditions (ASCII diagrams)
+  docs/PLAYBOOK.md              The interaction view: what you say -> what happens (operator scenarios)
   RUNBOOK.md                    Hook misbehavior diagnosis + recovery
   README.md / CONTRIBUTING.md / CHANGELOG.md / VERSION / LICENSE
   CLAUDE.md                     Project template; the Claude-Code layer on top of AGENTS.md
diff --git a/docs/PLAYBOOK.md b/docs/PLAYBOOK.md
new file mode 100644
index 0000000..d310910
--- /dev/null
+++ b/docs/PLAYBOOK.md
@@ -0,0 +1,268 @@
+# PLAYBOOK.md: operator scenarios (what you say -> what happens)
+
+> How to drive the kit from natural language. Scenario -> trigger phrase ->
+> response -> orchestration hook. This is the interaction view. For the flow/loop
+> internals read `docs/ORCHESTRATION.md`; for per-command detail read `MANUAL.md`;
+> for the rules contract read `WORKFLOW.md`.
+>
+> Worked examples use the maintainer's own phrasings; the behavior is operator-generic.
+
+## The one thing to understand first: three layers, only one is automatic
+
+Your sentence does not mechanically trigger a flow. There are three layers and only
+the first fires on its own:
+
+| Layer | Fires how | Examples |
+|---|---|---|
+| **Hooks** | **Automatic**, on Claude Code events | `context-readiness` (SessionStart suggestion), `safety-gate`, `anti-rationalization`, `spec-drift-guard`, `push-to-main` |
+| **`/user:*` commands** | **Invoked** - you type `/user:x`, OR Claude reads your intent and runs it | `/user:start`, `/user:assign`, `/user:spec`, `/user:execute`, `/user:ship` |
+| **Skills** | **Invoked** - Claude recognizes the situation and loads the skill | `goal-craft`, `superpowers:brainstorming`, `content-spec` |
+
+So when you say "apply SDD," no hook fires on the word "SDD." What happens is **Claude
+interprets** the phrase and **invokes** the right command/skill. The kit's hooks then act
+as guardrails around whatever runs. Keep this split in mind for every scenario below.
+
+A second load-bearing fact: the kit's orchestration is **BACKLOG-ID-first**. `/user:assign`
+takes an `ID-NNN`. A freeform intent with no ID ("apply SDD to this feature", a vague brief)
+has **no auto-path today**; Claude bridges it manually (Section 8). A real freeform front
+door is proposed in Section 9 (SPEC-026 / ID-022).
+
+---
+
+## Scenario 1: session start, "what's next / what's left"
+
+**Context.** Fresh session, you want orientation before doing anything.
+
+**What you say.** "what's next", "what's left to do", "where were we", or `/user:start`
+(`--brief` for one line, `--full` for the task checklist + recent commits).
+
+**Automatic vs interpreted.**
+- *Automatic*: the `context-readiness` hook already injected a one-line `next:` suggestion into
+  Claude's context when the session started. You may see Claude reference it unprompted.
+- *Interpreted*: Claude reads "what's next" and runs the `/user:start` behavior (a **detector**):
+  render the `_meta/BACKLOG.md` Active queue and the active `.claude/goals/` drafts. Read-only.
+
+**Resulting flow.** Detector only, no mutation:
+```text
+  "what's next" -> /user:start -> renders:
+     - BACKLOG Active queue (open ID-NNN + status: queued/speccing/validated/executing)
+     - active goal drafts in .claude/goals/
+     - the suggested next command
+```
+
+**Decision / approval points.** None (nothing is changed). You then choose the next move:
+`/user:assign ID-NNN` to start an item, or `/user:next` to pick up the next undone task of the
+already-active spec.
+
+**How to continue.** Say `assign ID-007` (start that item) or `next` (continue the active spec).
+If the queue is empty or you have a new idea, jump to Scenario 2 or 5.
+
+---
+
+## Scenario 2: mid-brainstorm, "apply SDD framework on this feature X"
+
+**Context.** You are brainstorming a feature. There is **no BACKLOG ID** for it yet.
+
+**What you say.** "apply SDD to feature X", "let's spec this", "run the spec-driven flow on X".
+
+**Automatic vs interpreted.**
+- *Not a keyword trigger.* No hook watches for "SDD". Nothing auto-fires.
+- *Interpreted*: Claude maps "SDD" to **the spec-driven lane** (normal or full) and proposes it.
+  "SDD" is a keyword **for Claude to interpret**, not a mechanical trigger.
+
+**Resulting flow (today).** Because there is no ID, Claude runs the **freeform -> ID bridge**
+(Section 8), then routes into the lane. By default Claude does **not** auto-run the whole flow;
+it sets up and starts the first step, checking the lane + scope with you:
+```text
+  "apply SDD to X"
+     -> Claude picks the lane (normal vs full) and confirms scope with you
+     -> bridge: crystallize (/user:think if fuzzy) -> write a BACKLOG row (new ID) -> /user:assign ID
+     -> /user:spec  (-> /user:spec-validate on full)
+     -> /user:execute (verification pipeline)
+     -> /user:review[-team] -> /user:docs -> /user:ship -> /user:retro
+```
+
+**Decision / approval points.** Lane + scope confirmation before the spec; spec approval; the
+spec-validate verdict (full lane); execute phase checkpoints. Claude stops at each unless you
+pre-authorize autonomy (Scenario 3).
+
+**Honest caveat.** "Apply SDD" does not auto-launch the full orchestration. Claude interprets it
+and drives it, with the hooks as guardrails. If you want "apply SDD" to be a real one-shot front
+door, that is the SPEC-026 enhancement (Section 9).
+
+---
+
+## Scenario 3: "run the full flow, do not interrupt, your call"
+
+**Context.** You trust the call and want autonomy end to end.
+
+**What you say (pick the autonomy level explicitly).**
+- "run the full lane autonomously; only stop at hard stops" -> autonomous up to the outward-facing step.
+- "run it all the way to a PR, your call" -> fully autonomous to merge-ready (push + PR included).
+- Or set a goal loop: `/goal <objective>` then "run it" (the bounded in-session loop).
+
+**Automatic vs interpreted.**
+- *Interpreted*: Claude drives the lane end to end without pausing at the **advisory** checkpoints.
+- *Automatic*: in a `/goal` loop the **anti-rationalization Stop hook** keeps the session working
+  until the stop condition holds; the **4 hard stops** still gate every step.
+
+**Resulting flow.**
+```text
+  /spec -> /spec-validate -> /execute (auto worker->verifier->fix<=2->integration; escalate on fail)
+        -> /review[-team] -> /docs -> /ship
+  (advisory phase checkpoints are skipped; the loop runs continuously)
+```
+
+**Where it STILL stops (always, even autonomous).**
+- The 4 hard stops: `safety-gate` (destructive Bash), `push-to-main` blocker,
+  `anti-rationalization` (premature/false "done"), the verification pipeline (a task that fails).
+- Outward-facing irreversible steps: the push and the PR. Claude confirms these **unless** you
+  said "all the way to a PR." That is the one place "your call" still asks once, by policy.
+
+**Decision / approval points.** Only the above. Everything advisory is auto-advanced.
+
+**The exact phrase to grant maximum autonomy:** *"Run the full lane autonomously, including the
+push and PR; only stop at the safety hard-stops or a real blocker."*
+
+---
+
+## Scenario 4: an iteration-heavy phase (discuss solution, revisit design)
+
+**Context.** You want to iterate on one phase (usually solution design) before committing to a spec.
+This is the **opposite** of Scenario 3: maximal checkpoints, no autonomy.
+
+**What you say.** "let's discuss the solution", "iterate on the design", "get back on the design
+for X", "explore approaches for X", or `/user:design`.
+
+**Automatic vs interpreted.**
+- *Interpreted*: Claude invokes `/user:design` (the opt-in interactive solution-design beat) and/or
+  `/user:devs-team` (5-lens engineering critique). For open-ended exploration, `superpowers:brainstorming`.
+
+**Resulting flow (a human-in-the-loop loop).**
+```text
+  /user:think (optional, if the idea needs challenging: 6 forcing questions)
+     -> /user:design  ── proposes 2-3 approaches, ONE question at a time,
+     │                   holds for your approval PER SECTION, appends the
+     │                   agreed Solution to docs/specs/DECISION-BRIEF.md
+     │   <iterate: you redirect, it revises, re-presents> ◀──┐
+     └───────────────────────────────────────────────────────┘
+     -> (optional) /user:devs-team  ── critique appended to the spec/brief
+     -> /user:spec  ── folds the DECISION-BRIEF Solution into the spec
+```
+
+**Stop condition.** The loop continues until **you approve** the solution section by section.
+It never auto-advances; every section pauses for you. `/user:design` is explicitly the
+"ran without my feedback" antidote.
+
+**Decision / approval points.** Per section in `/design`; the critique verdict (SOLID / REVISE /
+RECONSIDER); spec approval. To leave the loop: "the design is good, write the spec."
+
+---
+
+## Scenario 5: a vague / ambiguous goal
+
+**Context.** You have a fuzzy idea, not yet crystallized into something buildable.
+
+**What you say.** "I have a rough idea about X", "help me figure out what to build for X",
+"here's a vague brief: ...", or `/goal <fuzzy intent>` (then `goal-craft` sharpens it).
+
+**Automatic vs interpreted.**
+- *Interpreted*: Claude **interviews / grills** you. There is **no "goal-griller" skill**; the
+  grilling is `/user:think` (6 forcing questions) and/or `superpowers:brainstorming` (intent +
+  requirements exploration). The `goal-craft` skill **sharpens** a fuzzy intent into an
+  outcome-shaped `/goal` (verification, scope fence, termination-on-blocker).
+
+**Resulting flow.**
+```text
+  vague brief
+     -> Claude interviews: /user:think  and/or  superpowers:brainstorming
+        (challenge the idea, surface requirements, name the real outcome)
+     -> crystallize into a clear objective
+     -> [YOU APPROVE the crystallized objective]
+     -> bridge: write a BACKLOG row (new ID) -> /user:assign ID -> lane (Scenario 2's flow)
+```
+
+**Will Claude auto-hook it into the SDD orchestration after you approve?** Yes, **on your "go"** -
+Claude chains it: write the BACKLOG row, `/user:assign` the new ID, and start the lane's first
+command. But understand this is **Claude chaining the steps**, not the kit auto-wiring a freeform
+brief. And Claude will ask **how autonomous** to be from there (ties to Scenario 3).
+
+**Decision / approval points.** Claude will **not** write the spec or start the lane until you
+approve the crystallized objective. That approval gate is deliberate: a vague brief turned
+straight into a spec is how scope drift starts.
+
+---
+
+## 8. The freeform -> ID bridge (the step the kit does not auto-do)
+
+Scenarios 2 and 5 are freeform (no ID). The orchestration is ID-first, so Claude bridges:
+
+```text
+  freeform intent
+     1. crystallize   -> /user:think or brainstorming until the outcome is clear
+     2. allocate      -> pick the next ID-NNN, write a row into _meta/BACKLOG.md Active queue
+                         (Title, Source, Target artifact, Lane, Status: queued)
+     3. assign        -> /user:assign ID-NNN  (writes the goal draft, flips status, routes)
+     4. run the lane  -> Scenario 2's flow
+```
+
+This bridge is **manual today** (Claude runs it). It preserves ID-first traceability: even an
+ad-hoc idea gets a backlog row and an ID, so nothing ships untracked. The cost is the bookkeeping
+detour every time. Section 9 proposes removing it.
+
+---
+
+## 9. Pre-authorization phrases (autonomy dial)
+
+How much to say to set the autonomy level. Say one of these and Claude calibrates:
+
+| You say | Autonomy | Claude stops at |
+|---|---|---|
+| "propose it, don't run anything" | none | after planning; waits for go |
+| "run it, check with me at each phase" | low (default) | every advisory phase checkpoint |
+| "run the lane, only stop at hard stops" | high | the 4 hard stops + the push/PR (outward-facing) |
+| "run it all the way to a PR, your call" | max | only the 4 hard stops + a real blocker |
+
+The 4 hard stops (`safety-gate`, `push-to-main`, `anti-rationalization`, the verification pipeline)
+are **never** waived by any autonomy level.
+
+---
+
+## 10. Cheat-sheet: what you say -> what happens
+
+| You say | Claude invokes | Fires automatically | Stops at |
+|---|---|---|---|
+| "what's next / what's left" | `/user:start` (detector) | context-readiness suggestion | nothing (read-only) |
+| "assign ID-007" / "start ID-007" | `/user:assign ID-007` | (none) | hands off to the lane |
+| "apply SDD to X" (no ID) | bridge -> `/user:spec` lane | spec-drift-guard once a spec exists | lane + scope confirm, then per-phase |
+| "discuss / iterate the design" | `/user:design` (+ `/user:devs-team`) | (none) | every section (human-in-loop) |
+| "vague idea about X" | `/user:think` / brainstorming | (none) | your approval of the objective |
+| "run the full lane, your call" | the lane, autonomously | anti-rationalization (in a /goal loop) | hard stops + push/PR |
+| "fix this bug / it regressed" | `/user:debug` (bug lane) | guess-fix guard | root cause + human-confirm |
+| "review this" / "ship it" | `/user:review[-team]` / `/user:ship` | ship gate, push-to-main | DO-NOT-SHIP verdict; the push/PR |
+
+---
+
+## 11. Proposed enhancement: a real freeform front door
+
+Scenarios 2 and 5 expose a real gap: freeform intent ("apply SDD to X", a vague brief) has no
+auto-path; Claude bridges it by hand every time (Section 8). SPEC-024 deliberately deferred the
+"freeform griller entry" to keep BACKLOG-ID-first canonical.
+
+**Proposed**: extend `/user:assign` (the one mutator) to accept **freeform intent** in addition
+to `ID-NNN`. The freeform path runs the crystallize step, auto-allocates the next ID, writes the
+BACKLOG row, then proceeds exactly as ID-first, so "apply SDD to X" becomes a genuine one-shot
+front door without losing ID traceability.
+
+- Spec: `docs/specs/SPEC-026-freeform-front-door.md` (DRAFT).
+- Backlog: ID-022.
+
+Until that ships, the bridge in Section 8 is the supported path.
+
+---
+
+## See also
+- `docs/ORCHESTRATION.md` - the flow/loop view (lanes, loops, triggers, stop conditions, ASCII diagrams).
+- `WORKFLOW.md` - the rules contract (the cycle, the lanes, the gates).
+- `MANUAL.md` - per-command operator detail.
+- `AGENTS.md` - the operate-contract the goal loop projects from.

From e9d156a4750f14e0c85cd22aa32660019961e4af Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 00:12:24 +0700
Subject: [PATCH 23/56] docs(spec): draft freeform front door enhancement

SPEC-026 (DRAFT) + BACKLOG ID-022: extend /user:assign to accept freeform intent (not only ID-NNN), auto-allocating an ID + BACKLOG row so freeform intake is native. Unparks the SPEC-024-deferred griller entry; preserves ID-first traceability via an approve-before-allocate + row-before-draft invariant.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 _meta/BACKLOG.md                           |   2 +
 docs/specs/SPEC-026-freeform-front-door.md | 148 +++++++++++++++++++++
 2 files changed, 150 insertions(+)
 create mode 100644 docs/specs/SPEC-026-freeform-front-door.md

diff --git a/_meta/BACKLOG.md b/_meta/BACKLOG.md
index b790ed8..60cfce4 100644
--- a/_meta/BACKLOG.md
+++ b/_meta/BACKLOG.md
@@ -41,12 +41,14 @@ Lane = the WORKFLOW.md risk tier (`tiny` / `normal` / `full`).
 | ID-019 | Demo `examples/hello-spec/docs/specs/SPEC-001-version-flag.md` lacks a `## After state` section; add one so the demo is a complete exemplar of the post-SPEC-024 spec template | SPEC-024 review 2026-05-21 (test-coverage lens) | TBD | tiny | queued |
 | ID-020 | Verification-pipeline absence-checks: teach task-verifier + integration-checker to assert that removed/replaced content is GONE for "replace, don't duplicate" / "remove X" tasks, not just that the new artifact exists. A replace-task left both copies and passed BOTH verifiers this cycle; only the independent review caught it | SPEC-024 retro 2026-05-21 (pipeline blind spot) | TBD | normal | queued |
 | ID-021 | Reduce execute-worker shell/hook friction: pre-warn the recurring gotchas in `commands/execute.md`'s worker template (fish `noclobber` -> `>|`; no heredoc commit `-m` -> `git commit -F`; no `rm` -> `mv`), and investigate the commit-format/commit-msg heredoc mis-parse (it read a multi-line `-m` body as a 459-char subject) | SPEC-024 retro 2026-05-21 (worker friction) | TBD | normal | queued |
+| ID-022 | Freeform front door: extend `/user:assign` to accept freeform intent (not only `ID-NNN`), auto-allocating an ID + BACKLOG row so "apply SDD to X" / a vague brief is a native one-shot intake (unparks the SPEC-024-deferred griller entry; preserves ID-first traceability) | PLAYBOOK.md scenarios 2026-05-22 | SPEC-026 | normal | speccing |
 Dependency notes:
 - ID-012 P1 (spec stop-criteria) shipped (SPEC-012, normal lane; see CHANGELOG); P2 (loop QA gate) is held until the pointer-`/goal` pattern has real runs; it will build on the now-shipped SPEC-006 completeness clauses + the existing verification pipeline.
 - ID-002 (internal absorption lane, SPEC-007) is parked; independent of the orchestration chain.
 - ID-016 / ID-017 came out of the SPEC-025 retro; both are kit-internal hygiene, independent of the SPEC-024 chain.
 - ID-018 / ID-019 came out of the SPEC-024 /review-team pass (deferred LOW findings); both are tiny-lane polish, independent of the SPEC-024 ship.
 - ID-020 / ID-021 came out of the SPEC-024 retro; ID-020 (verifier absence-checks) relates to ID-016 (both are guard/verifier hardening) and is the highest-signal kit finding of the cycle.
+- ID-022 came out of writing `docs/PLAYBOOK.md` (scenarios S2/S5): the freeform-intent gap. It unparks the SPEC-024-deferred griller entry; SPEC-026 drafts it. Independent of the other chains.
 
 ## Schema
 
diff --git a/docs/specs/SPEC-026-freeform-front-door.md b/docs/specs/SPEC-026-freeform-front-door.md
new file mode 100644
index 0000000..de93f3a
--- /dev/null
+++ b/docs/specs/SPEC-026-freeform-front-door.md
@@ -0,0 +1,148 @@
+# Spec: freeform front door (intent -> ID -> lane, no manual bookkeeping)
+Generated: 2026-05-22
+Status: DRAFT
+
+> Source: the PLAYBOOK.md scenarios (S2 "apply SDD to X", S5 vague brief) and SPEC-024's
+> deferred "freeform griller entry". Backlog: ID-022.
+
+## Problem
+The kit's orchestration is BACKLOG-ID-first: `/user:assign` takes an `ID-NNN`. A freeform
+intent with no ID ("apply SDD to this feature", a vague brief) has no front door. Today Claude
+bridges it by hand every time (`docs/PLAYBOOK.md` Section 8): crystallize -> write a BACKLOG row
+-> `/user:assign ID` -> lane. That bridge is correct but is unmanaged: it lives only in Claude's
+interpretation, so it is inconsistent run to run, and it is the friction SPEC-024 named when it
+deferred the "freeform griller entry" (Out of Scope, that cycle's scope call).
+
+Consequence: every ad-hoc idea pays a bookkeeping detour before the lane can run, and the most
+common way an operator actually starts work (freeform intent in chat) is the one path the kit
+does not support natively.
+
+## Solution
+
+### Approaches considered
+- **A: Extend `/user:assign` to accept freeform intent as well as `ID-NNN` (chosen).** The one
+  mutator gains a freeform path that crystallizes, auto-allocates the next ID, writes the BACKLOG
+  row, then proceeds exactly as the ID-first path. Tradeoff: `/assign`'s contract widens; the
+  crystallize step makes it non-trivially longer than today's lookup.
+- **B: A new `/user:griller` (or `/user:intake`) command** dedicated to freeform capture, separate
+  from `/assign`. Rejected: a second intake mutator splits the detector/mutator model (the kit has
+  exactly one mutator by design, SPEC-006) and duplicates the routing logic.
+- **C: A skill/keyword recognizer** (Claude detects "apply SDD / build X" and runs the bridge with
+  no command change). Rejected as the primary mechanism: it leaves the front door purely in model
+  interpretation (the exact fragility this spec exists to remove), and "guardrails over guidance"
+  prefers an invoked command over a prose convention. (A light recognizer convention can still
+  point at `/assign <freeform>`; see Out of Scope.)
+
+### Chosen approach + why
+Widen `/user:assign` so its argument is either an existing `ID-NNN` (today's path, unchanged) or
+freeform intent text. On the freeform path it: (1) crystallizes the intent (challenge + scope via
+the existing `/user:think` / brainstorming behavior, only as far as needed to name an outcome and
+a lane), (2) allocates the next `ID-NNN` and appends a BACKLOG Active-queue row (`Status: queued`),
+(3) then runs the identical ID-first path (goal draft, lane pick, activator detect, hand-off). One
+mutator, one routing path, ID traceability preserved (the ID is created on the fly, never skipped).
+
+### Extensibility & boundaries
+- **Load-bearing dimension: intake shape.** Today: one shape (ID). After: two shapes (ID,
+  freeform) behind one entry. A future third shape (e.g. an imported issue) is another branch in
+  the same resolver, not a new command.
+- **Units (independently testable):** the intent resolver (ID vs freeform), the ID allocator +
+  BACKLOG writer, the crystallize gate, the unchanged ID-first tail. Each describable in <=3
+  sentences.
+
+### Architecture
+```text
+  /user:assign <arg>
+       │
+       ▼
+   resolve arg ──┬── looks like ID-NNN ──▶ (today's path, unchanged) ──┐
+                 │                                                       │
+                 └── freeform text ──▶ crystallize (think/brainstorm)   │
+                                       └▶ allocate next ID + write       │
+                                          BACKLOG row (queued)           │
+                                          └▶ [human approves objective]──┤
+                                                                         ▼
+                                              goal draft -> lane pick -> activator -> hand off
+```
+
+## Technical Design
+
+### Interfaces (I/O contract)
+- **`commands/assign.md`** consumes: `$ARGUMENTS` = either `ID-NNN` OR freeform intent text.
+  Produces (freeform path): a new `_meta/BACKLOG.md` row with an allocated ID + `Status: queued`,
+  then the existing `.claude/goals/<slug>.md` draft. Invariant: the freeform path MUST write the
+  BACKLOG row before the goal draft (ID traceability first); it MUST pause for human approval of
+  the crystallized objective before allocating, so a vague brief never auto-creates a row.
+- Invariant: the ID-first path (an `ID-NNN` argument) is byte-for-byte unchanged.
+
+### Data model changes
+None to the schemas. The BACKLOG Active-queue row shape (SPEC-005 schema) is reused; the freeform
+path just writes one. `Source` column records "freeform intake (date)".
+
+### API changes (the `/user:assign` contract)
+`$ARGUMENTS` widens from "an `ID-NNN`" to "an `ID-NNN` or freeform intent". Detection rule:
+matches `^ID-[0-9]+$` (after trim) -> ID path; else -> freeform path.
+
+### UI changes
+None. The kit ships no UI.
+
+### Infrastructure changes
+- `commands/assign.md`: the resolver + freeform path + the approval gate.
+- `tests/test-meta.sh`: assert `assign.md` documents both paths and the "row before draft" +
+  "approve before allocate" invariants.
+- `docs/PLAYBOOK.md`: replace Section 8's "manual bridge" note with the native path once shipped.
+- `WORKFLOW.md` `## The spine`: note `/assign` accepts freeform, not only an ID.
+
+## Task Breakdown
+### Phase 1: Resolver + freeform path
+- [ ] TASK-001: Add the arg resolver to `commands/assign.md` (`ID-NNN` vs freeform) with the ID
+  path unchanged. - AC: `assign.md` documents the two-shape argument + the `^ID-[0-9]+$` rule.
+- [ ] TASK-002: Specify the freeform path: crystallize -> approval gate -> allocate next ID +
+  write BACKLOG row -> existing ID-first tail. - AC: `assign.md` carries the ordered freeform
+  steps and both invariants (row-before-draft, approve-before-allocate).
+
+### Phase 2: Guards + docs
+- [ ] TASK-003: `tests/test-meta.sh` assertions for the two-path contract + invariants. - AC:
+  `bash tests/test-meta.sh` exercises them and passes.
+- [ ] TASK-004: Update `docs/PLAYBOOK.md` (S2/S5/S8 -> native path) and `WORKFLOW.md` spine note.
+  - AC: PLAYBOOK no longer calls the bridge "manual today"; WORKFLOW spine says `/assign` accepts
+  freeform.
+
+## After state
+Observable; each bullet is false now and a real check once shipped.
+- [ ] `/user:assign "apply SDD to X"` (freeform) crystallizes, allocates an ID, writes a BACKLOG
+  row, and routes into the lane, with one approval gate. (Today: `/assign` only accepts `ID-NNN`.)
+- [ ] `/user:assign ID-007` behaves exactly as before (no regression on the ID path).
+- [ ] `docs/PLAYBOOK.md` Section 8 describes a native front door, not a manual bridge.
+- [ ] `tests/test-meta.sh` pins the two-path contract + the row-before-draft + approve-before-allocate invariants.
+
+## Acceptance Criteria (global)
+- [ ] All tasks pass their individual acceptance criteria.
+- [ ] `/assign` accepts freeform OR `ID-NNN`; freeform writes a tracked BACKLOG row before any draft; the ID path is unchanged.
+- [ ] No regressions: `bash tests/test-meta.sh && bash tests/test-hooks.sh` both exit 0.
+
+## Verification
+`bash tests/test-meta.sh && bash tests/test-hooks.sh` exit 0 AND `commands/assign.md` documents both the `ID-NNN` and freeform paths AND the row-before-draft + approve-before-allocate invariants are present.
+
+## Edge Cases
+1. Ambiguous arg (looks like an ID but is not in the queue): treat as ID path, report "unknown id" (today's behavior), do NOT silently create a freeform row from a typo'd ID.
+2. Freeform intent that is still too vague to name an outcome: the crystallize step loops (think/brainstorming) and does NOT allocate an ID until the objective is approved (no half-baked rows).
+3. Duplicate freeform intent (a row already exists for the same idea): surface the existing row instead of allocating a second ID (mirror the SPEC-005 idempotency edge).
+
+## Failure modes
+| Failure class | Detection signal | Mitigation / recovery |
+|---|---|---|
+| Freeform path auto-creates rows from vague briefs | BACKLOG fills with half-baked queued rows | the approve-before-allocate invariant: no row until the objective is approved |
+| ID path regresses behind the new resolver | an `ID-NNN` assign behaves differently | the ID path is unchanged + a no-regression AC + meta assertion |
+| Untracked freeform work | a lane runs with no BACKLOG row | the row-before-draft invariant: the row is written before the goal draft |
+
+## Out of Scope
+- **A prose keyword-recognizer that auto-runs `/assign` without the operator invoking it** (approach C as the primary mechanism): the front door is an invoked command, not a model convention. A light "you can run `/assign <freeform>`" pointer is fine; auto-execution on a detected phrase is not.
+- **Removing BACKLOG-ID-first**: the ID stays canonical; this spec only changes WHEN the ID is allocated (on the fly for freeform), not WHETHER.
+- **Autonomy level of the subsequent lane**: governed by the operator's pre-authorization (PLAYBOOK Section 9), not by this front door.
+
+## Decision Log
+- DEC-001: Extend `/assign` (approach A) rather than add a second intake command (B) or rely on a keyword recognizer (C). Rationale: one mutator (SPEC-006), guardrails over guidance, ID traceability preserved. Who: drafted from the PLAYBOOK scenarios 2026-05-22.
+- DEC-002: Freeform path pauses for human approval of the crystallized objective BEFORE allocating an ID. Rationale: a vague brief turned straight into a tracked row + spec is how scope drift starts (PLAYBOOK S5). Who: same.
+
+## Open questions
+(none; a /goal loop appends here if it hits a decision this spec does not cover, then stops)

From 4c7e42da84884d4e0880cdf148c399e014075fd2 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 00:22:07 +0700
Subject: [PATCH 24/56] docs: add operating-layer vision + SDLC state machine

Design-first north-star for the natural-language operating layer: vision, the SDLC modeled as a formal state machine (states, guarded transitions, terminals, hard-stops-as-guards, 2 sub-machines), the full 15-scenario catalog (your 5 + 10 new), and a gap analysis. New backlog rows ID-023 (mid-flight spec amend), ID-024 (context-switch + abandon terminal), ID-025 (re-open-shipped) trace here; SPEC-026 is the first implementing spec.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 README.md                      |   1 +
 _meta/BACKLOG.md               |   4 +
 docs/PLAYBOOK.md               |   1 +
 docs/operating-layer-vision.md | 196 +++++++++++++++++++++++++++++++++
 4 files changed, 202 insertions(+)
 create mode 100644 docs/operating-layer-vision.md

diff --git a/README.md b/README.md
index 9294b8d..9df70f4 100644
--- a/README.md
+++ b/README.md
@@ -223,6 +223,7 @@ dwarves-kit/
   MANUAL.md                     Operator reference for the commands
   docs/ORCHESTRATION.md         The flow/loop view: lanes, loops, triggers, stop conditions (ASCII diagrams)
   docs/PLAYBOOK.md              The interaction view: what you say -> what happens (operator scenarios)
+  docs/operating-layer-vision.md  Design-first vision + the SDLC state machine (north-star for the operating-layer specs)
   RUNBOOK.md                    Hook misbehavior diagnosis + recovery
   README.md / CONTRIBUTING.md / CHANGELOG.md / VERSION / LICENSE
   CLAUDE.md                     Project template; the Claude-Code layer on top of AGENTS.md
diff --git a/_meta/BACKLOG.md b/_meta/BACKLOG.md
index 60cfce4..ab54c6f 100644
--- a/_meta/BACKLOG.md
+++ b/_meta/BACKLOG.md
@@ -42,6 +42,9 @@ Lane = the WORKFLOW.md risk tier (`tiny` / `normal` / `full`).
 | ID-020 | Verification-pipeline absence-checks: teach task-verifier + integration-checker to assert that removed/replaced content is GONE for "replace, don't duplicate" / "remove X" tasks, not just that the new artifact exists. A replace-task left both copies and passed BOTH verifiers this cycle; only the independent review caught it | SPEC-024 retro 2026-05-21 (pipeline blind spot) | TBD | normal | queued |
 | ID-021 | Reduce execute-worker shell/hook friction: pre-warn the recurring gotchas in `commands/execute.md`'s worker template (fish `noclobber` -> `>|`; no heredoc commit `-m` -> `git commit -F`; no `rm` -> `mv`), and investigate the commit-format/commit-msg heredoc mis-parse (it read a multi-line `-m` body as a 459-char subject) | SPEC-024 retro 2026-05-21 (worker friction) | TBD | normal | queued |
 | ID-022 | Freeform front door: extend `/user:assign` to accept freeform intent (not only `ID-NNN`), auto-allocating an ID + BACKLOG row so "apply SDD to X" / a vague brief is a native one-shot intake (unparks the SPEC-024-deferred griller entry; preserves ID-first traceability) | PLAYBOOK.md scenarios 2026-05-22 | SPEC-026 | normal | speccing |
+| ID-023 | Mid-flight spec amend: a path to add scope to a VALIDATED / in-build spec without restarting the lane (state machine transition BUILDING -> SPECIFYING) | operating-layer-vision gap analysis 2026-05-22 | TBD | normal | queued |
+| ID-024 | Context-switch affordance across specs (worktree-per-spec) + an explicit ABANDONED terminal (state hygiene; vs the existing `parked`) | operating-layer-vision gap analysis 2026-05-22 | TBD | normal | queued |
+| ID-025 | Re-open-shipped convention: a follow-up spec spun from a SHIPPED one (state machine transition SHIPPED -> TRIAGING) | operating-layer-vision gap analysis 2026-05-22 | TBD | tiny | queued |
 Dependency notes:
 - ID-012 P1 (spec stop-criteria) shipped (SPEC-012, normal lane; see CHANGELOG); P2 (loop QA gate) is held until the pointer-`/goal` pattern has real runs; it will build on the now-shipped SPEC-006 completeness clauses + the existing verification pipeline.
 - ID-002 (internal absorption lane, SPEC-007) is parked; independent of the orchestration chain.
@@ -49,6 +52,7 @@ Dependency notes:
 - ID-018 / ID-019 came out of the SPEC-024 /review-team pass (deferred LOW findings); both are tiny-lane polish, independent of the SPEC-024 ship.
 - ID-020 / ID-021 came out of the SPEC-024 retro; ID-020 (verifier absence-checks) relates to ID-016 (both are guard/verifier hardening) and is the highest-signal kit finding of the cycle.
 - ID-022 came out of writing `docs/PLAYBOOK.md` (scenarios S2/S5): the freeform-intent gap. It unparks the SPEC-024-deferred griller entry; SPEC-026 drafts it. Independent of the other chains.
+- ID-023 / ID-024 / ID-025 came out of `docs/operating-layer-vision.md` (the SDLC state-machine gap analysis): the transitions with no clean path today. ID-022 is the highest-priority of the operating-layer set; these three follow it. All trace to the vision doc as their north-star.
 
 ## Schema
 
diff --git a/docs/PLAYBOOK.md b/docs/PLAYBOOK.md
index d310910..374522e 100644
--- a/docs/PLAYBOOK.md
+++ b/docs/PLAYBOOK.md
@@ -262,6 +262,7 @@ Until that ships, the bridge in Section 8 is the supported path.
 ---
 
 ## See also
+- `docs/operating-layer-vision.md` - the design-first vision + the SDLC state machine this playbook projects (the formal model behind these scenarios).
 - `docs/ORCHESTRATION.md` - the flow/loop view (lanes, loops, triggers, stop conditions, ASCII diagrams).
 - `WORKFLOW.md` - the rules contract (the cycle, the lanes, the gates).
 - `MANUAL.md` - per-command operator detail.
diff --git a/docs/operating-layer-vision.md b/docs/operating-layer-vision.md
new file mode 100644
index 0000000..cbaf73d
--- /dev/null
+++ b/docs/operating-layer-vision.md
@@ -0,0 +1,196 @@
+# Operating-layer vision + SDLC state machine
+
+> Design-first north-star for the kit's natural-language operating layer. Vision and
+> model, not implementation. The implementing specs (SPEC-026 +) trace here. Operator
+> behavior derives from this; `docs/PLAYBOOK.md` is the operator-facing projection of
+> the state machine below, `docs/ORCHESTRATION.md` is the flow/loop view, `WORKFLOW.md`
+> is the rules contract, `docs/PHILOSOPHY.md` is the why.
+
+## 1. Vision
+
+Today the kit is driven through an **interpret-and-bridge** layer: the operator types
+intent in chat, Claude maps it to a `/user:*` command or skill, and the hooks act as
+guardrails. That works, but the operator cannot *see* the machine: there is no explicit
+notion of "what state am I in, what transitions are available from here, what guards each
+one, and how do I trigger it." The mapping lives only in Claude's interpretation, so it is
+inconsistent run to run and invisible to the operator.
+
+**The vision:** make the operating layer a **legible state machine** the operator can drive
+from natural language. At any moment the operator (and Claude) can answer four questions
+without guessing:
+
+1. **Where am I?** (the current state)
+2. **Where can I go?** (the available transitions)
+3. **What does each cost / require?** (the guard + the stop condition)
+4. **How do I trigger it?** (the phrase or command)
+
+The machine does not replace the interpret layer; it gives that layer a **declared model**
+to interpret against, so the same intent produces the same transition every time, and the
+operator can learn the map instead of re-deriving it.
+
+## 2. Principles (inherited + new)
+
+Inherited from `docs/PHILOSOPHY.md` and `WORKFLOW.md`:
+- **Guardrails over guidance.** Transitions are guarded by hooks where the cost is irreversible; everything else suggests and routes.
+- **BACKLOG-ID-first.** Every unit of work has an ID before it ships; freeform intent is bridged to an ID (SPEC-026 makes that bridge native).
+- **Bounded loops.** Sub-machines (build, debug) terminate on a model-evaluated stop, never an unbounded outer driver.
+- **Detector vs mutator.** Reading state never changes it; exactly one mutator advances it.
+
+New UX invariants this vision adds:
+- **State is always answerable.** Claude can always name the current state and the legal next transitions.
+- **Guards are explicit, not implicit.** A transition that cannot fire says *why* (the guard that blocks it), never silently stalls.
+- **The 4 hard stops are transition guards, not surprises.** They are drawn into the machine, so an operator running autonomously knows exactly the edges that will pause for them.
+
+## 3. The SDLC state machine
+
+### 3.1 States
+
+| State | Meaning | Entry | Exit |
+|---|---|---|---|
+| `IDLE` | no active unit of work | session start; an item shipped/abandoned | intake |
+| `TRIAGING` | intake: intent -> lane + (eventually) an ID | `/assign`, `/think`, "apply SDD", a vague brief | lane chosen |
+| `DESIGNING` | solution exploration (iterative) | full lane, or "let's design" | solution approved |
+| `SPECIFYING` | the spec is being written | `/spec` | spec `DRAFT` exists |
+| `VALIDATING` | adversarial spec review | `/spec-validate` | `VALIDATED` or NEEDS REVISION |
+| `BUILDING` | execution sub-machine (worker -> verifier -> fix -> integration) | `/execute`, `/next` | all tasks + integration PASS |
+| `REVIEWING` | code review | `/review`, `/review-team` | verdict recorded |
+| `DOCUMENTING` | doc sync + doc-verifier | `/docs` | docs match code |
+| `SHIPPING` | ship pipeline | `/ship` | tagged/PR; spec `SHIPPED` |
+| `REFLECTING` | retrospective | `/retro` | retro written |
+| `DEBUGGING` | off-cycle debug sub-machine (iron law) | `/debug` | root cause + fix + human-confirm |
+| `BLOCKED` | meta-state: parked, awaiting a human | "park", "I'm stuck", a hard stop, escalate | unblock / abandon |
+| `SHIPPED` | terminal: the item is done | `/ship` completes | (re-open -> TRIAGING) |
+| `ABANDONED` | terminal: the item is dropped | "kill it" | none |
+
+### 3.2 Master diagram
+
+```text
+                          ┌────────────────────────── re-open ("follow-up") ──────────────────────────┐
+                          ▼                                                                            │
+   ┌──────┐  intake     ┌──────────┐  lane=full      ┌───────────┐  approved   ┌────────────┐         │
+   │ IDLE │ ──────────▶ │ TRIAGING │ ─────────────▶  │ DESIGNING │ ──────────▶ │ SPECIFYING │         │
+   └──────┘             └────┬─────┘                 │ ⇄ iterate │             └─────┬──────┘         │
+      ▲                      │ lane=normal           └───────────┘                   │ DRAFT          │
+      │ shipped              │ (skip design)                                          ▼                │
+      │                      ├──────────────────────────────────────────────▶  VALIDATING            │
+      │                      │ lane=tiny: edit->verify->done (no spec)               │ VALIDATED       │
+      │                      │ lane=bug ─────────────▶ DEBUGGING                      │  ▲ NEEDS        │
+      │                      │ lane=backfill: docs only, no app code                  │  │ REVISION     │
+      │                      ▼                                                        ▼  │              │
+      │            ┌──────────────────────────────────────────────────────────▶ BUILDING ────────────┘
+      │            │  guard: all tasks PASS + integration PASS                       │ ⇄ retry (fix<=2)
+      │            │  guard (hard): verification pipeline, anti-rationalization      │ escalate
+      │            ▼                                                                  ▼
+      │        REVIEWING ◀── FIX THEN SHIP / DO NOT SHIP (loop back to SPECIFYING/BUILDING)
+      │            │ SHIP / fixes applied
+      │            ▼
+      │       DOCUMENTING ──▶ SHIPPING ──▶ REFLECTING ──▶ (SHIPPED) ──▶ IDLE
+      │                       │  guard (hard): DO-NOT-SHIP verdict, push-to-main blocker
+      │                       │
+   (any state) ──"park"/"stuck"/escalate──▶ BLOCKED ──resume──▶ (prior state)
+   (any state) ──"kill it"──▶ ABANDONED (terminal)
+   (any state) ──bug found──▶ DEBUGGING ──root cause+fix+confirm──▶ (prior state)
+```
+
+### 3.3 Transition table (the contract)
+
+| From | Trigger (phrase / command) | Guard | To |
+|---|---|---|---|
+| IDLE | "what's next" then pick; `/assign ID`; "apply SDD X"; vague brief | none | TRIAGING |
+| TRIAGING | lane = full | scope confirmed | DESIGNING |
+| TRIAGING | lane = normal | scope confirmed | SPECIFYING |
+| TRIAGING | lane = tiny | trivial edit | BUILDING (no spec) |
+| TRIAGING | lane = bug | a defect | DEBUGGING |
+| DESIGNING | "iterate", redirect | per-section approval pending | DESIGNING |
+| DESIGNING | "design is good, write the spec" | solution approved | SPECIFYING |
+| SPECIFYING | `/spec` done | spec `DRAFT` exists | VALIDATING (full) / BUILDING (normal) |
+| VALIDATING | `/spec-validate` verdict | VALIDATED | BUILDING |
+| VALIDATING | NEEDS REVISION | revisions required | SPECIFYING |
+| BUILDING | task FAIL:fixable | retries < 2 | BUILDING (fix-agent) |
+| BUILDING | task FAIL:escalate / retries == 2 | unfixable | BLOCKED |
+| BUILDING | all tasks done | **all PASS + integration PASS** (hard) | REVIEWING |
+| REVIEWING | verdict SHIP / FIX-applied | not DO-NOT-SHIP | DOCUMENTING |
+| REVIEWING | FIX THEN SHIP / DO NOT SHIP | findings open | SPECIFYING / BUILDING |
+| DOCUMENTING | `/docs` done | doc-verifier PASS | SHIPPING |
+| SHIPPING | `/ship` | **not DO-NOT-SHIP, not push-to-main** (hard) | REFLECTING |
+| REFLECTING | `/retro` done | retro written | SHIPPED -> IDLE |
+| any | "park" / "I'm stuck" | a blocker exists | BLOCKED |
+| any | "kill it, not worth it" | operator confirms | ABANDONED |
+| any | a bug surfaces | a defect | DEBUGGING |
+| SHIPPED | "the shipped X needs a follow-up" | none | TRIAGING (new spec) |
+
+### 3.4 Sub-machines
+
+- **BUILDING** expands to: `worker -> task-verifier -> {PASS | FAIL:fixable -> fix-agent (<=2) | FAIL:escalate} -> integration-checker`. See `docs/ORCHESTRATION.md` 5.3.
+- **DEBUGGING** expands to: `Phase 1 Root cause -> Phase 2 Pattern -> Phase 3 Hypothesis -> Phase 4 Implementation`, under the iron law (no fix without a recorded root cause), guarded by the guess-fix guard. See `docs/ORCHESTRATION.md` 5.2.
+
+### 3.5 Hard stops as guards (the only blockers)
+
+| Hard stop | Guards the transition | Effect |
+|---|---|---|
+| safety-gate | any transition running destructive Bash | blocks the command |
+| push-to-main | SHIPPING -> (the push) | blocks the push |
+| anti-rationalization | BUILDING -> REVIEWING; any -> "done" | blocks premature/false done |
+| verification pipeline | BUILDING -> REVIEWING | blocks if a task fails |
+
+Everything else is advisory: it suggests the transition, it does not block it.
+
+## 4. Scenario catalog (all 15, mapped to transitions)
+
+The five from the original playbook, plus ten that complete the SDLC. Each is "trigger -> from-state -> guard -> to-state".
+
+| # | Scenario | Trigger | From -> To | Notes |
+|---|---|---|---|---|
+| 1 | What's next / left | "what's next" | IDLE -> IDLE (detector) | renders queue; no transition until you pick |
+| 2 | Apply SDD to a feature | "apply SDD to X" | IDLE -> TRIAGING -> SPECIFYING | freeform; bridged to an ID (SPEC-026) |
+| 3 | Autonomous full flow | "run the lane, your call" | TRIAGING -> ... -> SHIPPED | advisory checkpoints skipped; hard stops remain |
+| 4 | Iterate the design | "discuss / revisit design" | DESIGNING ⇄ DESIGNING | human-in-loop, per-section |
+| 5 | Vague brief | "rough idea about X" | IDLE -> TRIAGING | interview (`/think` + brainstorming) before any row |
+| 6 | Resume after interruption | "where were we" | (any) -> same | session-state restore + `/start`; mostly supported |
+| 7 | Mid-flight scope change | "also do Y" | BUILDING -> SPECIFYING | **gap**: amend the active spec mid-build |
+| 8 | Blocked / park | "park this", "I'm stuck" | (any) -> BLOCKED | supported (parked status + Open questions) |
+| 9 | Context switch across specs | "switch to the other feature" | (any) -> (other worktree) | **gap**: worktree-per-spec is the model, no switch command |
+| 10 | Urgent hotfix | "prod is down, fix X now" | IDLE -> DEBUGGING | bug lane exists; **minor gap**: a declared fast path |
+| 11 | Review someone else's work | "review this PR" | IDLE -> REVIEWING | supported (`/review` on any diff); note the base ref |
+| 12 | Abandon | "kill this" | (any) -> ABANDONED | **minor gap**: no explicit abandon terminal (vs park) |
+| 13 | Re-open shipped | "follow-up on shipped X" | SHIPPED -> TRIAGING | **gap**: convention for a follow-up spec |
+| 14 | Status during a run | "how's it going" | (running) -> same | report progress; no state change |
+| 15 | Knowledge capture | "save this learning" | (any) -> same | side-effect (`/learned` + skills); no state change |
+
+## 5. Gap analysis (what has no clean path today -> implementing specs)
+
+Most transitions already have a path. The real gaps, in priority order:
+
+| Gap | Scenario | Why it is a gap | Proposed |
+|---|---|---|---|
+| Freeform front door | 2, 5 | `/assign` is ID-only; freeform is bridged by hand | **SPEC-026 / ID-022** (drafted) |
+| Mid-flight spec amend | 7 | no path to amend a `VALIDATED`/building spec without restarting | **ID-023** (new) |
+| Context switch across specs | 9 | worktree-per-spec is the model but no switch affordance | **ID-024** (new) |
+| Re-open shipped | 13 | a `SHIPPED` spec has no follow-up convention | **ID-025** (new) |
+| Abandon terminal | 12 | only `parked` exists; no explicit drop | folded into ID-024 (state hygiene) or a tiny lane |
+
+Scenarios 6, 8, 10, 11, 14, 15 are supported or near-supported today; the doc records them so the machine is complete, no new spec required.
+
+## 6. How this gets built (the kickoff path, design-first)
+
+The kit dogfoods itself. The order:
+
+```text
+  1. THIS doc (vision + state machine)        <- design-first, done now
+  2. /user:assign ID-022   -> SPEC-026 (freeform front door): /spec-validate (drafted) -> /execute
+  3. /user:assign ID-023   -> mid-flight spec amend (spec it, then build)
+  4. /user:assign ID-024   -> context-switch affordance (+ abandon terminal)
+  5. /user:assign ID-025   -> re-open-shipped convention
+  6. align docs/PLAYBOOK.md to the shipped state machine (regenerate the scenario cards from it)
+```
+
+Each row runs the normal/full lane (Section 3). The state machine in this doc is the
+acceptance reference: a transition is "done" when an operator can trigger it by phrase and
+the guard/stop behaves as the table says.
+
+## See also
+- `docs/PLAYBOOK.md` - the operator-facing projection (what you say -> what happens).
+- `docs/ORCHESTRATION.md` - the flow/loop view (the sub-machines in detail).
+- `WORKFLOW.md` - the rules contract.
+- `docs/PHILOSOPHY.md` - the why behind the guards and the bounded loops.
+- `docs/specs/SPEC-026-freeform-front-door.md` - the first implementing spec.

From 5c42c4beee03a026b75b5b325fb40c9ef29841ae Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 00:33:57 +0700
Subject: [PATCH 25/56] docs(spec): validate freeform front door spec

spec-validate (5 lenses): NEEDS REVISION -> revised. Delegate crystallize to /user:think (keep /assign a light mutator, DEC-003); sanitize freeform input (pipe-escape + slug hardening, DEC-004); atomic ID allocation + collision check (DEC-005); dedup-by-slug + concurrency edge cases. Status DRAFT -> VALIDATED; ID-022 -> validated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 _meta/BACKLOG.md                           |  2 +-
 docs/specs/SPEC-026-freeform-front-door.md | 99 +++++++++++++++-------
 2 files changed, 71 insertions(+), 30 deletions(-)

diff --git a/_meta/BACKLOG.md b/_meta/BACKLOG.md
index ab54c6f..f53aa13 100644
--- a/_meta/BACKLOG.md
+++ b/_meta/BACKLOG.md
@@ -41,7 +41,7 @@ Lane = the WORKFLOW.md risk tier (`tiny` / `normal` / `full`).
 | ID-019 | Demo `examples/hello-spec/docs/specs/SPEC-001-version-flag.md` lacks a `## After state` section; add one so the demo is a complete exemplar of the post-SPEC-024 spec template | SPEC-024 review 2026-05-21 (test-coverage lens) | TBD | tiny | queued |
 | ID-020 | Verification-pipeline absence-checks: teach task-verifier + integration-checker to assert that removed/replaced content is GONE for "replace, don't duplicate" / "remove X" tasks, not just that the new artifact exists. A replace-task left both copies and passed BOTH verifiers this cycle; only the independent review caught it | SPEC-024 retro 2026-05-21 (pipeline blind spot) | TBD | normal | queued |
 | ID-021 | Reduce execute-worker shell/hook friction: pre-warn the recurring gotchas in `commands/execute.md`'s worker template (fish `noclobber` -> `>|`; no heredoc commit `-m` -> `git commit -F`; no `rm` -> `mv`), and investigate the commit-format/commit-msg heredoc mis-parse (it read a multi-line `-m` body as a 459-char subject) | SPEC-024 retro 2026-05-21 (worker friction) | TBD | normal | queued |
-| ID-022 | Freeform front door: extend `/user:assign` to accept freeform intent (not only `ID-NNN`), auto-allocating an ID + BACKLOG row so "apply SDD to X" / a vague brief is a native one-shot intake (unparks the SPEC-024-deferred griller entry; preserves ID-first traceability) | PLAYBOOK.md scenarios 2026-05-22 | SPEC-026 | normal | speccing |
+| ID-022 | Freeform front door: extend `/user:assign` to accept freeform intent (not only `ID-NNN`), auto-allocating an ID + BACKLOG row so "apply SDD to X" / a vague brief is a native one-shot intake (unparks the SPEC-024-deferred griller entry; preserves ID-first traceability) | PLAYBOOK.md scenarios 2026-05-22 | SPEC-026 | normal | validated |
 | ID-023 | Mid-flight spec amend: a path to add scope to a VALIDATED / in-build spec without restarting the lane (state machine transition BUILDING -> SPECIFYING) | operating-layer-vision gap analysis 2026-05-22 | TBD | normal | queued |
 | ID-024 | Context-switch affordance across specs (worktree-per-spec) + an explicit ABANDONED terminal (state hygiene; vs the existing `parked`) | operating-layer-vision gap analysis 2026-05-22 | TBD | normal | queued |
 | ID-025 | Re-open-shipped convention: a follow-up spec spun from a SHIPPED one (state machine transition SHIPPED -> TRIAGING) | operating-layer-vision gap analysis 2026-05-22 | TBD | tiny | queued |
diff --git a/docs/specs/SPEC-026-freeform-front-door.md b/docs/specs/SPEC-026-freeform-front-door.md
index de93f3a..3157360 100644
--- a/docs/specs/SPEC-026-freeform-front-door.md
+++ b/docs/specs/SPEC-026-freeform-front-door.md
@@ -1,9 +1,11 @@
 # Spec: freeform front door (intent -> ID -> lane, no manual bookkeeping)
 Generated: 2026-05-22
-Status: DRAFT
+Status: VALIDATED
 
 > Source: the PLAYBOOK.md scenarios (S2 "apply SDD to X", S5 vague brief) and SPEC-024's
 > deferred "freeform griller entry". Backlog: ID-022.
+> Validated 2026-05-22 via /user:spec-validate (5 lenses); NEEDS REVISION -> revised
+> (DEC-003 delegate-to-think, DEC-004 sanitize, DEC-005 atomic-allocate; dedup + concurrency edges added).
 
 ## Problem
 The kit's orchestration is BACKLOG-ID-first: `/user:assign` takes an `ID-NNN`. A freeform
@@ -35,11 +37,18 @@ does not support natively.
 
 ### Chosen approach + why
 Widen `/user:assign` so its argument is either an existing `ID-NNN` (today's path, unchanged) or
-freeform intent text. On the freeform path it: (1) crystallizes the intent (challenge + scope via
-the existing `/user:think` / brainstorming behavior, only as far as needed to name an outcome and
-a lane), (2) allocates the next `ID-NNN` and appends a BACKLOG Active-queue row (`Status: queued`),
-(3) then runs the identical ID-first path (goal draft, lane pick, activator detect, hand-off). One
-mutator, one routing path, ID traceability preserved (the ID is created on the fly, never skipped).
+freeform intent text. On the freeform path it: (1) **delegates** crystallization to `/user:think`
+(the existing idea-griller) rather than absorbing the interview, so `/assign` stays the quick
+mutator SPEC-006 defined (it does not run a multi-turn interview itself); `/think` returns a
+crystallized objective + a lane, (2) `/assign` then does only its mutation tail: sanitize the
+intent, allocate the next `ID-NNN`, append a BACKLOG Active-queue row (`Status: queued`), (3) then
+runs the identical ID-first path (goal draft, lane pick, activator detect, hand-off). One mutator,
+one routing path, ID traceability preserved (the ID is created on the fly, never skipped), and the
+interactive part lives in `/think` where it belongs (DEC-003).
+
+Boundary note: `/assign` orchestrates the freeform path but the **interview is delegated**, not
+embedded. `/assign`'s own work stays "allocate + route", consistent with the detector/mutator
+split (SPEC-006).
 
 ### Extensibility & boundaries
 - **Load-bearing dimension: intake shape.** Today: one shape (ID). After: two shapes (ID,
@@ -56,8 +65,8 @@ mutator, one routing path, ID traceability preserved (the ID is created on the f
        ▼
    resolve arg ──┬── looks like ID-NNN ──▶ (today's path, unchanged) ──┐
                  │                                                       │
-                 └── freeform text ──▶ crystallize (think/brainstorm)   │
-                                       └▶ allocate next ID + write       │
+                 └── freeform text ──▶ crystallize via /user:think   │
+                                       └▶ approve, sanitize, allocate, write       │
                                           BACKLOG row (queued)           │
                                           └▶ [human approves objective]──┤
                                                                          ▼
@@ -69,14 +78,26 @@ mutator, one routing path, ID traceability preserved (the ID is created on the f
 ### Interfaces (I/O contract)
 - **`commands/assign.md`** consumes: `$ARGUMENTS` = either `ID-NNN` OR freeform intent text.
   Produces (freeform path): a new `_meta/BACKLOG.md` row with an allocated ID + `Status: queued`,
-  then the existing `.claude/goals/<slug>.md` draft. Invariant: the freeform path MUST write the
-  BACKLOG row before the goal draft (ID traceability first); it MUST pause for human approval of
-  the crystallized objective before allocating, so a vague brief never auto-creates a row.
+  then the existing `.claude/goals/<slug>.md` draft.
+- Invariant (delegation): the freeform path delegates the interview to `/user:think`; `/assign`
+  itself does not run a multi-turn interview (DEC-003), so the mutator stays light.
+- Invariant (approve-before-allocate): the freeform path MUST pause for human approval of the
+  crystallized objective before it allocates an ID, so a vague brief never auto-creates a row.
+- Invariant (row-before-draft): the BACKLOG row is written before the goal draft (ID traceability first).
+- Invariant (sanitization, DEC-004): freeform intent is sanitized before it touches a file. Table
+  cells escape `|` (and newlines) so a freeform string cannot break the `_meta/BACKLOG.md` pipe
+  table; the derived slug is reduced to `[a-z0-9-]+` (no `/`, no `..`) so it cannot traverse out of
+  `.claude/goals/`.
+- Invariant (allocate-atomically, DEC-005): the ID is allocated by re-reading the current max in
+  the same step that writes the row, and a post-write collision check (two equal IDs) fails loud
+  rather than silently colliding (mirrors the existing SPEC/ADR dup-number guard).
 - Invariant: the ID-first path (an `ID-NNN` argument) is byte-for-byte unchanged.
 
 ### Data model changes
 None to the schemas. The BACKLOG Active-queue row shape (SPEC-005 schema) is reused; the freeform
-path just writes one. `Source` column records "freeform intake (date)".
+path just writes one (sanitized, per the sanitization invariant). `Source` column records
+"freeform intake (date)". The derived slug follows the existing kebab convention but is hardened to
+`[a-z0-9-]+`.
 
 ### API changes (the `/user:assign` contract)
 `$ARGUMENTS` widens from "an `ID-NNN`" to "an `ID-NNN` or freeform intent". Detection rule:
@@ -86,9 +107,10 @@ matches `^ID-[0-9]+$` (after trim) -> ID path; else -> freeform path.
 None. The kit ships no UI.
 
 ### Infrastructure changes
-- `commands/assign.md`: the resolver + freeform path + the approval gate.
-- `tests/test-meta.sh`: assert `assign.md` documents both paths and the "row before draft" +
-  "approve before allocate" invariants.
+- `commands/assign.md`: the resolver + freeform path (delegating crystallize to `/user:think`) +
+  the approval gate + sanitization + atomic allocation.
+- `tests/test-meta.sh`: assert `assign.md` documents both paths, the `/user:think` delegation, and
+  the four invariants (row-before-draft, approve-before-allocate, sanitization, atomic-allocate).
 - `docs/PLAYBOOK.md`: replace Section 8's "manual bridge" note with the native path once shipped.
 - `WORKFLOW.md` `## The spine`: note `/assign` accepts freeform, not only an ID.
 
@@ -96,16 +118,25 @@ None. The kit ships no UI.
 ### Phase 1: Resolver + freeform path
 - [ ] TASK-001: Add the arg resolver to `commands/assign.md` (`ID-NNN` vs freeform) with the ID
   path unchanged. - AC: `assign.md` documents the two-shape argument + the `^ID-[0-9]+$` rule.
-- [ ] TASK-002: Specify the freeform path: crystallize -> approval gate -> allocate next ID +
-  write BACKLOG row -> existing ID-first tail. - AC: `assign.md` carries the ordered freeform
-  steps and both invariants (row-before-draft, approve-before-allocate).
-
-### Phase 2: Guards + docs
-- [ ] TASK-003: `tests/test-meta.sh` assertions for the two-path contract + invariants. - AC:
-  `bash tests/test-meta.sh` exercises them and passes.
-- [ ] TASK-004: Update `docs/PLAYBOOK.md` (S2/S5/S8 -> native path) and `WORKFLOW.md` spine note.
-  - AC: PLAYBOOK no longer calls the bridge "manual today"; WORKFLOW spine says `/assign` accepts
-  freeform.
+- [ ] TASK-002: Specify the freeform path: **delegate crystallize to `/user:think`** -> approval
+  gate -> sanitize -> allocate ID + write BACKLOG row -> existing ID-first tail. The interview is
+  delegated, not embedded (DEC-003). - AC: `assign.md` carries the ordered freeform steps, names the
+  `/user:think` delegation, and the row-before-draft + approve-before-allocate invariants.
+
+### Phase 2: Hardening
+- [ ] TASK-003: Sanitization + concurrency guard in `commands/assign.md`'s freeform path: escape
+  `|`/newlines in the BACKLOG row cells (table integrity); reduce the slug to `[a-z0-9-]+` (no `/`,
+  no `..`); allocate the ID by re-reading max in the write step + a post-write equal-ID collision
+  check that fails loud. - AC: `assign.md` documents the sanitization + atomic-allocate invariants
+  (DEC-004, DEC-005).
+
+### Phase 3: Guards + docs
+- [ ] TASK-004: `tests/test-meta.sh` assertions: `assign.md` documents both paths, the delegation,
+  and all four invariants (row-before-draft, approve-before-allocate, sanitization, atomic-allocate).
+  - AC: `bash tests/test-meta.sh` exercises them and passes.
+- [ ] TASK-005: Update `docs/PLAYBOOK.md` (S2/S5/S8 -> native path) and the `WORKFLOW.md` spine
+  note. - AC: PLAYBOOK no longer calls the bridge "manual today"; the WORKFLOW spine says `/assign`
+  accepts freeform.
 
 ## After state
 Observable; each bullet is false now and a real check once shipped.
@@ -113,20 +144,24 @@ Observable; each bullet is false now and a real check once shipped.
   row, and routes into the lane, with one approval gate. (Today: `/assign` only accepts `ID-NNN`.)
 - [ ] `/user:assign ID-007` behaves exactly as before (no regression on the ID path).
 - [ ] `docs/PLAYBOOK.md` Section 8 describes a native front door, not a manual bridge.
-- [ ] `tests/test-meta.sh` pins the two-path contract + the row-before-draft + approve-before-allocate invariants.
+- [ ] The freeform path delegates the interview to `/user:think` (the mutator does not embed it).
+- [ ] A freeform intent containing `|` does not corrupt the `_meta/BACKLOG.md` table, and a slug-hostile intent (e.g. `../x`) cannot write outside `.claude/goals/`.
+- [ ] `tests/test-meta.sh` pins the two-path contract + all four invariants (row-before-draft, approve-before-allocate, sanitization, atomic-allocate).
 
 ## Acceptance Criteria (global)
 - [ ] All tasks pass their individual acceptance criteria.
-- [ ] `/assign` accepts freeform OR `ID-NNN`; freeform writes a tracked BACKLOG row before any draft; the ID path is unchanged.
+- [ ] `/assign` accepts freeform OR `ID-NNN`; freeform delegates the interview to `/user:think`, sanitizes input, writes a tracked BACKLOG row before any draft; the ID path is unchanged.
 - [ ] No regressions: `bash tests/test-meta.sh && bash tests/test-hooks.sh` both exit 0.
 
 ## Verification
-`bash tests/test-meta.sh && bash tests/test-hooks.sh` exit 0 AND `commands/assign.md` documents both the `ID-NNN` and freeform paths AND the row-before-draft + approve-before-allocate invariants are present.
+`bash tests/test-meta.sh && bash tests/test-hooks.sh` exit 0 AND `commands/assign.md` documents both the `ID-NNN` and freeform paths, the `/user:think` delegation, and all four invariants (row-before-draft, approve-before-allocate, sanitization, atomic-allocate).
 
 ## Edge Cases
 1. Ambiguous arg (looks like an ID but is not in the queue): treat as ID path, report "unknown id" (today's behavior), do NOT silently create a freeform row from a typo'd ID.
 2. Freeform intent that is still too vague to name an outcome: the crystallize step loops (think/brainstorming) and does NOT allocate an ID until the objective is approved (no half-baked rows).
-3. Duplicate freeform intent (a row already exists for the same idea): surface the existing row instead of allocating a second ID (mirror the SPEC-005 idempotency edge).
+3. Duplicate freeform intent (a row already exists for the same idea): detection is by **slug match after crystallize** (the crystallized objective produces a slug; if that slug already has a row/draft, surface it instead of allocating a second ID). Semantic dedup (two differently-worded briefs for the same idea) is **best-effort**: on a near-match slug, ask rather than silently merge or duplicate. (Mirrors the SPEC-005 filesystem-is-truth idempotency, made explicit for freeform.)
+4. Concurrent freeform allocation (two sessions allocate at once): both re-read max in the write step; the post-write equal-ID collision check fails loud (the operator re-runs), rather than two rows silently sharing an ID.
+5. Pipe / newline in the freeform intent: sanitized (escaped) before the BACKLOG row is written, so the markdown table stays well-formed.
 
 ## Failure modes
 | Failure class | Detection signal | Mitigation / recovery |
@@ -134,6 +169,9 @@ Observable; each bullet is false now and a real check once shipped.
 | Freeform path auto-creates rows from vague briefs | BACKLOG fills with half-baked queued rows | the approve-before-allocate invariant: no row until the objective is approved |
 | ID path regresses behind the new resolver | an `ID-NNN` assign behaves differently | the ID path is unchanged + a no-regression AC + meta assertion |
 | Untracked freeform work | a lane runs with no BACKLOG row | the row-before-draft invariant: the row is written before the goal draft |
+| Freeform `|`/newline corrupts the BACKLOG pipe table | the Active-queue table renders broken; a row has wrong columns | the sanitization invariant (DEC-004): escape `|`/newlines in row cells before writing |
+| Slug path-traversal from freeform (`../`, `/`) | a draft is written outside `.claude/goals/` | the sanitization invariant (DEC-004): slug reduced to `[a-z0-9-]+` |
+| Concurrent allocation picks the same ID | two Active-queue rows share an ID (caught at CI by the dup-number guard, but late) | the atomic-allocate invariant (DEC-005): re-read max in the write step + a loud post-write collision check |
 
 ## Out of Scope
 - **A prose keyword-recognizer that auto-runs `/assign` without the operator invoking it** (approach C as the primary mechanism): the front door is an invoked command, not a model convention. A light "you can run `/assign <freeform>`" pointer is fine; auto-execution on a detected phrase is not.
@@ -143,6 +181,9 @@ Observable; each bullet is false now and a real check once shipped.
 ## Decision Log
 - DEC-001: Extend `/assign` (approach A) rather than add a second intake command (B) or rely on a keyword recognizer (C). Rationale: one mutator (SPEC-006), guardrails over guidance, ID traceability preserved. Who: drafted from the PLAYBOOK scenarios 2026-05-22.
 - DEC-002: Freeform path pauses for human approval of the crystallized objective BEFORE allocating an ID. Rationale: a vague brief turned straight into a tracked row + spec is how scope drift starts (PLAYBOOK S5). Who: same.
+- DEC-003: The freeform path DELEGATES crystallize to `/user:think`; `/assign` does not embed a multi-turn interview. Rationale: SPEC-006 defines `/assign` as a light mutator-dispatcher that "does not execute"; absorbing an interactive interview would break that boundary. The interview lives in `/think`; `/assign` keeps only allocate + route. Who: spec-validate Reviewer 5 (2026-05-22).
+- DEC-004: Freeform intent is sanitized before it touches a file: escape `|`/newlines in BACKLOG table cells; reduce the slug to `[a-z0-9-]+` (no `/`, no `..`). Rationale: a freeform string is operator-controlled text written into a pipe-delimited markdown table and a filesystem path; without escaping it can corrupt the queue table or traverse out of `.claude/goals/`. Who: spec-validate Reviewer 1 (2026-05-22).
+- DEC-005: The ID is allocated by re-reading the current max in the same step that writes the row, with a loud post-write equal-ID collision check. Rationale: two concurrent freeform allocations would otherwise pick the same `max+1` and collide (the CI dup-number guard catches it, but late). Who: spec-validate Reviewer 2 (2026-05-22).
 
 ## Open questions
 (none; a /goal loop appends here if it hits a decision this spec does not cover, then stops)

From d920df3808a794ff1b88e10d331e54d05f030fe5 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 00:36:27 +0700
Subject: [PATCH 26/56] feat(assign): accept freeform intent, delegate
 crystallize to think

Widen /user:assign so $ARGUMENTS is either an ID-NNN (matches
^ID-[0-9]+$, the unchanged ID-first path) or freeform intent text.
The freeform path delegates the interview to /user:think, gates on
human approval of the crystallized objective before allocating an ID
(approve-before-allocate), sanitizes input (escape | and newlines in
BACKLOG cells; reduce the slug to [a-z0-9-]+ so it cannot traverse
out of .claude/goals/), atomically allocates the next ID by
re-reading max in the write step with a loud post-write collision
check, writes the BACKLOG row before the goal draft (row-before-draft),
then rejoins the existing Steps 4-6 tail. /assign stays a light
mutator-dispatcher: it does not embed the interview.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 commands/assign.md | 35 ++++++++++++++++++++++++++++++++---
 1 file changed, 32 insertions(+), 3 deletions(-)

diff --git a/commands/assign.md b/commands/assign.md
index 3d491fd..fe9b406 100644
--- a/commands/assign.md
+++ b/commands/assign.md
@@ -6,9 +6,34 @@ You are a goal dispatcher. Turn a committed backlog item into a runnable goal dr
 
 ## Process
 
-### Step 1: Resolve the item
+### Step 1: Resolve the argument (two shapes)
 
-Read `$ARGUMENTS` for the `ID-NNN`. Find that row in the `_meta/BACKLOG.md` Active queue (the Schema section there defines the columns). If the id is absent, say so, list the open queue ids, and stop.
+`$ARGUMENTS` is one of two shapes. Trim it, then branch:
+
+- **ID shape**: the trimmed argument matches `^ID-[0-9]+$`. Take the **ID-first path** below (Step 1b onward), unchanged. This is the path that has always existed.
+- **Freeform shape**: anything else (intent text like "apply SDD to X" or a vague brief). Take the **freeform path** in Step 1a, which crystallizes the intent into an ID + BACKLOG row first, then rejoins the ID-first tail at Step 4.
+
+A future third intake shape (e.g. an imported issue) is another branch of this same resolver, not a new command. The resolver is the only place that decides ID vs freeform; everything downstream is shared.
+
+### Step 1a: Freeform path (delegate, gate, sanitize, allocate, write)
+
+When the argument is freeform intent (does NOT match `^ID-[0-9]+$`), run these ordered steps. `/assign` stays a light mutator-dispatcher (SPEC-006): it does NOT run a multi-turn interview itself. Two invariants govern this path, and both come before any file is written:
+
+- **approve-before-allocate**: pause for human approval of the crystallized objective BEFORE allocating an ID. A vague brief never auto-creates a row (DEC-002).
+- **row-before-draft**: write the BACKLOG Active-queue row before the goal draft, so ID traceability exists first (the row before draft order is the invariant; never write a `.claude/goals/` draft for an un-rowed intent).
+
+1. **Delegate crystallize to `/user:think`.** Hand the freeform intent to `/user:think` (the existing idea-griller). It runs the interview and returns a crystallized objective + a lane. `/assign` does NOT embed that interview (DEC-003); it only consumes `/think`'s result. If the intent is too vague to name an outcome, `/think` loops; do not allocate anything until it converges.
+2. **Approval gate (approve-before-allocate).** Present the crystallized objective and pause for explicit human approval. Until the human approves, allocate nothing and write nothing. This is the gate that keeps half-baked rows out of the queue.
+3. **Dedup by slug.** Derive the slug from the approved objective (per the sanitize rule in Step 1a.4). If a `.claude/goals/<slug>.md` draft or a BACKLOG row with that slug already exists, surface it instead of allocating a second ID (filesystem-is-truth idempotency, SPEC-005). On a near-match slug, ask rather than silently merge or duplicate.
+4. **Sanitize (DEC-004).** Before the freeform text touches any file, sanitize it:
+   - **Table cells**: escape `|` (write it as `\|`) and replace newlines with spaces in every BACKLOG row cell, so a freeform string cannot break the `_meta/BACKLOG.md` pipe table.
+   - **Slug**: reduce the derived slug to `[a-z0-9-]+` only. Lowercase, replace spaces with `-`, then strip `/`, `..`, and anything outside `[a-z0-9-]`, so the slug cannot traverse out of `.claude/goals/`. The kebab convention is unchanged; it is just hardened to `[a-z0-9-]+`.
+5. **Atomic allocate + write the row (DEC-005, row-before-draft).** Allocate the next `ID-NNN` by **re-reading the current max ID** in `_meta/BACKLOG.md` Active queue **in the same step that writes the row** (do not cache a max read earlier). The new ID is `max + 1`, zero-padded. Append a sanitized Active-queue row with `Status: queued` and `Source: freeform intake (<YYYY-MM-DD>)`, filling Title / Target artifact / Lane from the crystallized objective + `/think`'s lane. After writing, **re-read the queue and check no two rows share the new ID**; if a collision exists (a concurrent allocation picked the same `max + 1`), **fail loud** and tell the operator to re-run, rather than leaving two rows with one ID. This mirrors the existing SPEC/ADR dup-number guard.
+6. **Rejoin the ID-first tail.** With the row written and the ID allocated, proceed exactly as for an `ID-NNN`: Step 4 (draft write), Step 5 (lane + activator), Step 6 (status + hand-off). The freeform path adds no new tail; it reuses the ID path's.
+
+### Step 1b: Resolve the item (ID-first path)
+
+Read `$ARGUMENTS` for the `ID-NNN`. Find that row in the `_meta/BACKLOG.md` Active queue (the Schema section there defines the columns). If the id is absent, say so, list the open queue ids, and stop. (An argument that looks like an ID but is not in the queue is a typo'd ID, not a freeform intent: report "unknown id", do NOT silently create a freeform row from it.)
 
 ### Step 2: Idempotency check
 
@@ -64,7 +89,11 @@ created: <YYYY-MM-DD>
 - **Re-run for the same id**: re-surface the existing draft; do not duplicate or double-advance status (idempotent).
 - **No activator installed**: the draft still works as a plain file; only one-step activation is lost.
 - **Queued item with no spec, normal/full lane**: hand off to `/user:spec` first.
+- **Freeform intent too vague**: `/user:think` loops until the objective is named; no ID is allocated until the approval gate passes (no half-baked rows).
+- **Duplicate freeform intent**: dedup by slug after crystallize; surface the existing row/draft instead of allocating a second ID, and ask on a near-match rather than silently merge.
+- **Concurrent freeform allocation**: both sessions re-read max in the write step; the post-write equal-ID collision check fails loud and the operator re-runs.
+- **`|` or newline in freeform intent**: sanitized (cells escape `|`, newlines become spaces) before the row is written, so the BACKLOG pipe table stays well-formed.
 
 ## What this command does NOT do
 
-It does not execute the task, does not write `.claude/last-goal.md`, and does not hard-gate. It is the mutator that sets up a goal; the lane's commands do the work and `/user:start`/`/user:next` only render. Source: SPEC-006; dispatcher pattern from `commands/next.md` + CCGS `/start`; goal breakdown from the `goal-craft` skill; draft store from ADR-0011.
+It does not execute the task, does not write `.claude/last-goal.md`, and does not hard-gate. On the freeform path it does NOT embed a multi-turn interview either: the crystallize step is delegated to `/user:think`, and `/assign` keeps only allocate + route (DEC-003). It is the mutator that sets up a goal; the lane's commands do the work and `/user:start`/`/user:next` only render. Source: SPEC-006; SPEC-026 (the freeform front door + its four invariants: row-before-draft, approve-before-allocate, sanitize, atomic-allocate); dispatcher pattern from `commands/next.md` + CCGS `/start`; goal breakdown from the `goal-craft` skill; draft store from ADR-0011.

From 1e59de8732d6f1eb7407594e2c75d2a11170105e Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 00:38:47 +0700
Subject: [PATCH 27/56] test(meta): pin the freeform front door contract in
 assign.md

Add a SPEC-026 assertion block to tests/test-meta.sh that pins the
freeform front door contract in commands/assign.md so a wording flip
on any of the documented intake paths, the /user:think delegation, or
the four invariants fails CI.

Pins (grep -qF):
- the two-shape resolver: ^ID-[0-9]+$ and freeform
- the delegation: /user:think (interview delegated, not embedded; DEC-003)
- the four invariants: row-before-draft, approve-before-allocate,
  sanitize, atomic-allocate (plus the collision guard wording)
- the slug hardening: [a-z0-9-] (sanitized slug charset; DEC-004)

Total assert count rises from 241 to 250.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 tests/test-meta.sh | 54 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)

diff --git a/tests/test-meta.sh b/tests/test-meta.sh
index d9d0be9..0bcff96 100755
--- a/tests/test-meta.sh
+++ b/tests/test-meta.sh
@@ -327,6 +327,60 @@ else
 fi
 rm -rf "$MERGE_HOME"
 
+# ============================================================
+echo ""
+echo "=== Freeform front door (SPEC-026) ==="
+# ============================================================
+# Pin the SPEC-026 contract in commands/assign.md so a wording flip on any of
+# the intake paths, the /user:think delegation, or the four invariants fails CI.
+# All literals exist in assign.md today; this guards them from silent drift.
+# ASSIGN_MD is set in the SPEC-024 block above.
+
+# Two-shape resolver: the ID-first regex AND the freeform branch must both be named.
+for LITERAL in '^ID-[0-9]+$' 'freeform'; do
+  TOTAL=$((TOTAL + 1))
+  if grep -qF "$LITERAL" "$ASSIGN_MD" 2>/dev/null; then
+    echo -e "  ${GREEN}PASS${NC} assign.md documents the '$LITERAL' intake shape (SPEC-026)"
+    PASS=$((PASS + 1))
+  else
+    echo -e "  ${RED}FAIL${NC} assign.md lost the '$LITERAL' intake shape (resolver drift)"
+    FAIL=$((FAIL + 1))
+  fi
+done
+
+# Delegation: the crystallize interview is delegated to /user:think, not embedded (DEC-003).
+TOTAL=$((TOTAL + 1))
+if grep -qF '/user:think' "$ASSIGN_MD" 2>/dev/null; then
+  echo -e "  ${GREEN}PASS${NC} assign.md delegates crystallize to /user:think (SPEC-026 DEC-003)"
+  PASS=$((PASS + 1))
+else
+  echo -e "  ${RED}FAIL${NC} assign.md lost the /user:think delegation (interview embedded?)"
+  FAIL=$((FAIL + 1))
+fi
+
+# The four invariants. atomic-allocate is pinned via BOTH its named marker and the
+# 'collision' guard wording, since both literals are load-bearing in assign.md.
+for INVARIANT in 'row-before-draft' 'approve-before-allocate' 'sanitize' 'atomic-allocate' 'collision'; do
+  TOTAL=$((TOTAL + 1))
+  if grep -qF "$INVARIANT" "$ASSIGN_MD" 2>/dev/null; then
+    echo -e "  ${GREEN}PASS${NC} assign.md pins the '$INVARIANT' invariant (SPEC-026)"
+    PASS=$((PASS + 1))
+  else
+    echo -e "  ${RED}FAIL${NC} assign.md lost the '$INVARIANT' invariant (contract drift)"
+    FAIL=$((FAIL + 1))
+  fi
+done
+
+# Slug hardening: the sanitized slug charset must stay pinned (path-traversal guard).
+TOTAL=$((TOTAL + 1))
+if grep -qF '[a-z0-9-]' "$ASSIGN_MD" 2>/dev/null; then
+  echo -e "  ${GREEN}PASS${NC} assign.md pins the '[a-z0-9-]' slug charset (SPEC-026 DEC-004)"
+  PASS=$((PASS + 1))
+else
+  echo -e "  ${RED}FAIL${NC} assign.md lost the '[a-z0-9-]' slug charset (slug hardening drift)"
+  FAIL=$((FAIL + 1))
+fi
+
 # ============================================================
 echo ""
 echo "=== Agent files ==="

From 893753e10514447e5d1b16507d5714026e7dbede Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 00:41:53 +0700
Subject: [PATCH 28/56] docs: describe the freeform front door as native
 (shipped)

PLAYBOOK S2/S5/S8 and the layer-2 intro now describe /user:assign's
native freeform path (delegate crystallize to /user:think, approve,
sanitize, atomic-allocate the ID + BACKLOG row, then route) instead of
a hand-run bridge. Section 8 no longer calls it "manual today"; it is
the native path /assign runs internally. Section 11 marks SPEC-026 /
ID-022 shipped and drops the "until that ships, use the bridge" line.
The 3-layer model and the honest "invoked command, not an auto-keyword"
point are kept. WORKFLOW.md ## The spine gains a Freeform-front-door
note (no numbered AGENTS/CLAUDE read-order restatement).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 WORKFLOW.md      |   2 +-
 docs/PLAYBOOK.md | 107 ++++++++++++++++++++++++++++-------------------
 2 files changed, 66 insertions(+), 43 deletions(-)

diff --git a/WORKFLOW.md b/WORKFLOW.md
index e1a16ca..1e05381 100644
--- a/WORKFLOW.md
+++ b/WORKFLOW.md
@@ -73,7 +73,7 @@ session start
                           queue (CHANGELOG is the canonical shipped record).
 ```
 
-**Detector/mutator split.** `/user:start` and `/user:next` only read and render; `/user:assign` is the only mutator. **Activator-agnostic.** `/user:assign` writes only the `.claude/goals/<slug>.md` draft and surfaces its body; activation (starting the loop) is done by whatever primitive is present (the built-in `/goal`, the `ralph-loop` plugin, or the `goal-craft` skill). The kit NEVER writes `.claude/last-goal.md`; if no activator exists, the draft is a plain reusable file. **"Even the goal loop follows WORKFLOW"** is delivered honestly: the safety subset is hard-enforced by existing hooks (anti-rationalization, the verification pipeline, the push-to-main blocker); decision/doc completeness is warned + logged to `~/.claude/dwarves-kit/logs/completeness.log` and reviewed at `/user:ship` + `/user:retro`, not hard-blocked mid-loop (PHILOSOPHY rejects hard-gating process completeness).
+**Freeform front door.** `/user:assign` accepts freeform intent, not only an `ID-NNN`. Given freeform, it delegates the interview to `/user:think`, pauses for approval, then allocates the ID + BACKLOG row before routing as usual; the ID-first path is unchanged. **Detector/mutator split.** `/user:start` and `/user:next` only read and render; `/user:assign` is the only mutator. **Activator-agnostic.** `/user:assign` writes only the `.claude/goals/<slug>.md` draft and surfaces its body; activation (starting the loop) is done by whatever primitive is present (the built-in `/goal`, the `ralph-loop` plugin, or the `goal-craft` skill). The kit NEVER writes `.claude/last-goal.md`; if no activator exists, the draft is a plain reusable file. **"Even the goal loop follows WORKFLOW"** is delivered honestly: the safety subset is hard-enforced by existing hooks (anti-rationalization, the verification pipeline, the push-to-main blocker); decision/doc completeness is warned + logged to `~/.claude/dwarves-kit/logs/completeness.log` and reviewed at `/user:ship` + `/user:retro`, not hard-blocked mid-loop (PHILOSOPHY rejects hard-gating process completeness).
 
 ## Completion contract
 The done-definition is canonical in `AGENTS.md` zone 3 ("Done means"); do not
diff --git a/docs/PLAYBOOK.md b/docs/PLAYBOOK.md
index 374522e..4ef7574 100644
--- a/docs/PLAYBOOK.md
+++ b/docs/PLAYBOOK.md
@@ -23,9 +23,11 @@ interprets** the phrase and **invokes** the right command/skill. The kit's hooks
 as guardrails around whatever runs. Keep this split in mind for every scenario below.
 
 A second load-bearing fact: the kit's orchestration is **BACKLOG-ID-first**. `/user:assign`
-takes an `ID-NNN`. A freeform intent with no ID ("apply SDD to this feature", a vague brief)
-has **no auto-path today**; Claude bridges it manually (Section 8). A real freeform front
-door is proposed in Section 9 (SPEC-026 / ID-022).
+takes an `ID-NNN` **or** freeform intent ("apply SDD to this feature", a vague brief). When you
+hand it freeform, `/user:assign` runs the freeform front door natively (Section 8): it delegates
+the interview to `/user:think`, waits for your approval, then allocates the ID + BACKLOG row
+before routing into the lane. The ID stays canonical; it is just minted on the fly. This shipped
+in SPEC-026 / ID-022 (Section 11).
 
 ---
 
@@ -70,25 +72,29 @@ If the queue is empty or you have a new idea, jump to Scenario 2 or 5.
 - *Interpreted*: Claude maps "SDD" to **the spec-driven lane** (normal or full) and proposes it.
   "SDD" is a keyword **for Claude to interpret**, not a mechanical trigger.
 
-**Resulting flow (today).** Because there is no ID, Claude runs the **freeform -> ID bridge**
-(Section 8), then routes into the lane. By default Claude does **not** auto-run the whole flow;
-it sets up and starts the first step, checking the lane + scope with you:
+**Resulting flow.** Because there is no ID, Claude invokes `/user:assign` with the freeform
+intent. `/user:assign` runs the **native freeform front door** (Section 8): it delegates the
+interview to `/user:think`, then allocates the ID + BACKLOG row, then routes into the lane. By
+default Claude does **not** auto-run the whole flow; it sets up and starts the first step,
+checking the lane + scope with you:
 ```text
   "apply SDD to X"
      -> Claude picks the lane (normal vs full) and confirms scope with you
-     -> bridge: crystallize (/user:think if fuzzy) -> write a BACKLOG row (new ID) -> /user:assign ID
+     -> /user:assign "apply SDD to X"   (freeform front door, Section 8):
+          delegate crystallize to /user:think -> [you approve] -> allocate ID + BACKLOG row
      -> /user:spec  (-> /user:spec-validate on full)
      -> /user:execute (verification pipeline)
      -> /user:review[-team] -> /user:docs -> /user:ship -> /user:retro
 ```
 
-**Decision / approval points.** Lane + scope confirmation before the spec; spec approval; the
-spec-validate verdict (full lane); execute phase checkpoints. Claude stops at each unless you
-pre-authorize autonomy (Scenario 3).
+**Decision / approval points.** Lane + scope confirmation before the spec; the approve-before-allocate
+gate inside the freeform front door; spec approval; the spec-validate verdict (full lane); execute
+phase checkpoints. Claude stops at each unless you pre-authorize autonomy (Scenario 3).
 
-**Honest caveat.** "Apply SDD" does not auto-launch the full orchestration. Claude interprets it
-and drives it, with the hooks as guardrails. If you want "apply SDD" to be a real one-shot front
-door, that is the SPEC-026 enhancement (Section 9).
+**Honest caveat.** "Apply SDD" does not auto-launch the full orchestration off a keyword. Claude
+still **interprets** the phrase and **invokes** `/user:assign` with it; the front door is a real
+invoked command (SPEC-026, Section 11), not an auto-firing keyword. From there the hooks act as
+guardrails.
 
 ---
 
@@ -175,17 +181,19 @@ RECONSIDER); spec approval. To leave the loop: "the design is good, write the sp
 **Resulting flow.**
 ```text
   vague brief
-     -> Claude interviews: /user:think  and/or  superpowers:brainstorming
-        (challenge the idea, surface requirements, name the real outcome)
+     -> /user:assign "<vague brief>"   (native freeform front door, Section 8):
+          delegate crystallize to /user:think and/or superpowers:brainstorming
+          (challenge the idea, surface requirements, name the real outcome)
      -> crystallize into a clear objective
-     -> [YOU APPROVE the crystallized objective]
-     -> bridge: write a BACKLOG row (new ID) -> /user:assign ID -> lane (Scenario 2's flow)
+     -> [YOU APPROVE the crystallized objective]   (approve-before-allocate gate)
+     -> allocate ID + BACKLOG row -> route into the lane (Scenario 2's flow)
 ```
 
 **Will Claude auto-hook it into the SDD orchestration after you approve?** Yes, **on your "go"** -
-Claude chains it: write the BACKLOG row, `/user:assign` the new ID, and start the lane's first
-command. But understand this is **Claude chaining the steps**, not the kit auto-wiring a freeform
-brief. And Claude will ask **how autonomous** to be from there (ties to Scenario 3).
+`/user:assign`'s freeform path allocates the ID + BACKLOG row and starts the lane's first command.
+This is the kit's native front door now, not Claude bridging by hand; the front door is still an
+**invoked command** (you, or Claude on your behalf, invoke `/user:assign`), not an auto-firing
+keyword. Claude will ask **how autonomous** to be from there (ties to Scenario 3).
 
 **Decision / approval points.** Claude will **not** write the spec or start the lane until you
 approve the crystallized objective. That approval gate is deliberate: a vague brief turned
@@ -193,22 +201,33 @@ straight into a spec is how scope drift starts.
 
 ---
 
-## 8. The freeform -> ID bridge (the step the kit does not auto-do)
+## 8. The freeform -> ID front door (what `/user:assign` does internally)
 
-Scenarios 2 and 5 are freeform (no ID). The orchestration is ID-first, so Claude bridges:
+Scenarios 2 and 5 are freeform (no ID). The orchestration is ID-first, so `/user:assign` mints the
+ID for you. Hand it freeform intent instead of an `ID-NNN` and its freeform path runs:
 
 ```text
-  freeform intent
-     1. crystallize   -> /user:think or brainstorming until the outcome is clear
-     2. allocate      -> pick the next ID-NNN, write a row into _meta/BACKLOG.md Active queue
-                         (Title, Source, Target artifact, Lane, Status: queued)
-     3. assign        -> /user:assign ID-NNN  (writes the goal draft, flips status, routes)
-     4. run the lane  -> Scenario 2's flow
+  /user:assign "<freeform intent>"
+     1. delegate crystallize -> /user:think (the idea-griller) runs the interview and
+                                returns a crystallized objective + a lane. /assign does NOT
+                                embed the interview; it consumes /think's result.
+     2. approve              -> pause for your approval of the crystallized objective
+                                (approve-before-allocate: a vague brief never auto-creates a row)
+     3. sanitize + allocate  -> sanitize the intent (escape `|`/newlines for the table cells;
+                                reduce the slug to [a-z0-9-]+), re-read the current max ID and
+                                write the next ID-NNN row into _meta/BACKLOG.md Active queue in
+                                the same step (Title, Source: freeform intake (date), Target
+                                artifact, Lane, Status: queued), with a loud equal-ID collision
+                                check (atomic-allocate)
+     4. rejoin the ID tail   -> write the goal draft, pick the lane, route (Scenario 2's flow),
+                                exactly as for an ID-NNN argument
 ```
 
-This bridge is **manual today** (Claude runs it). It preserves ID-first traceability: even an
-ad-hoc idea gets a backlog row and an ID, so nothing ships untracked. The cost is the bookkeeping
-detour every time. Section 9 proposes removing it.
+This is the **native freeform front door** (`/user:assign` runs it; you do not bookkeep by hand).
+It preserves ID-first traceability: even an ad-hoc idea gets a BACKLOG row and an ID before any
+draft is written, so nothing ships untracked. It is still an **invoked command**, not an
+auto-firing keyword: Claude interprets "apply SDD to X" and invokes `/user:assign` with it; the
+kit does not watch for the phrase. Shipped in SPEC-026 / ID-022 (Section 11).
 
 ---
 
@@ -234,7 +253,7 @@ are **never** waived by any autonomy level.
 |---|---|---|---|
 | "what's next / what's left" | `/user:start` (detector) | context-readiness suggestion | nothing (read-only) |
 | "assign ID-007" / "start ID-007" | `/user:assign ID-007` | (none) | hands off to the lane |
-| "apply SDD to X" (no ID) | bridge -> `/user:spec` lane | spec-drift-guard once a spec exists | lane + scope confirm, then per-phase |
+| "apply SDD to X" (no ID) | `/user:assign "<freeform>"` (freeform front door) -> `/user:spec` lane | spec-drift-guard once a spec exists | approve-before-allocate gate, lane + scope confirm, then per-phase |
 | "discuss / iterate the design" | `/user:design` (+ `/user:devs-team`) | (none) | every section (human-in-loop) |
 | "vague idea about X" | `/user:think` / brainstorming | (none) | your approval of the objective |
 | "run the full lane, your call" | the lane, autonomously | anti-rationalization (in a /goal loop) | hard stops + push/PR |
@@ -243,21 +262,25 @@ are **never** waived by any autonomy level.
 
 ---
 
-## 11. Proposed enhancement: a real freeform front door
+## 11. The freeform front door (shipped)
 
-Scenarios 2 and 5 expose a real gap: freeform intent ("apply SDD to X", a vague brief) has no
-auto-path; Claude bridges it by hand every time (Section 8). SPEC-024 deliberately deferred the
-"freeform griller entry" to keep BACKLOG-ID-first canonical.
+Scenarios 2 and 5 used to expose a real gap: freeform intent ("apply SDD to X", a vague brief) had
+no auto-path, so Claude bridged it by hand every run. SPEC-024 deferred the "freeform griller entry"
+to keep BACKLOG-ID-first canonical; SPEC-026 / ID-022 then closed the gap and **shipped**.
 
-**Proposed**: extend `/user:assign` (the one mutator) to accept **freeform intent** in addition
-to `ID-NNN`. The freeform path runs the crystallize step, auto-allocates the next ID, writes the
-BACKLOG row, then proceeds exactly as ID-first, so "apply SDD to X" becomes a genuine one-shot
-front door without losing ID traceability.
+**Shipped**: `/user:assign` (the one mutator) now accepts **freeform intent** in addition to
+`ID-NNN`. The freeform path delegates crystallization to `/user:think`, pauses for your approval,
+sanitizes the input, atomically allocates the next ID, writes the BACKLOG row, then proceeds
+exactly as ID-first, so "apply SDD to X" is a genuine one-shot front door without losing ID
+traceability.
 
-- Spec: `docs/specs/SPEC-026-freeform-front-door.md` (DRAFT).
+- Spec: `docs/specs/SPEC-026-freeform-front-door.md` (VALIDATED, shipped).
 - Backlog: ID-022.
+- Command: `commands/assign.md` (the resolver + freeform path).
 
-Until that ships, the bridge in Section 8 is the supported path.
+The four invariants the front door upholds: delegate-to-`/user:think`, approve-before-allocate,
+sanitize, atomic-allocate. The native path is Section 8. It stays an invoked command, not an
+auto-firing keyword.
 
 ---
 

From c746e23213fcefbb6c1cd6393ca07be6529b55a0 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 00:43:25 +0700
Subject: [PATCH 29/56] docs(spec): mark freeform front door tasks verified

TASK-001..005 passed task-verifier; SPEC-026 checklist updated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 docs/specs/SPEC-026-freeform-front-door.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/specs/SPEC-026-freeform-front-door.md b/docs/specs/SPEC-026-freeform-front-door.md
index 3157360..2950764 100644
--- a/docs/specs/SPEC-026-freeform-front-door.md
+++ b/docs/specs/SPEC-026-freeform-front-door.md
@@ -116,25 +116,25 @@ None. The kit ships no UI.
 
 ## Task Breakdown
 ### Phase 1: Resolver + freeform path
-- [ ] TASK-001: Add the arg resolver to `commands/assign.md` (`ID-NNN` vs freeform) with the ID
+- [x] TASK-001 (DONE d920df3, verified): Add the arg resolver to `commands/assign.md` (`ID-NNN` vs freeform) with the ID
   path unchanged. - AC: `assign.md` documents the two-shape argument + the `^ID-[0-9]+$` rule.
-- [ ] TASK-002: Specify the freeform path: **delegate crystallize to `/user:think`** -> approval
+- [x] TASK-002 (DONE d920df3, verified): Specify the freeform path: **delegate crystallize to `/user:think`** -> approval
   gate -> sanitize -> allocate ID + write BACKLOG row -> existing ID-first tail. The interview is
   delegated, not embedded (DEC-003). - AC: `assign.md` carries the ordered freeform steps, names the
   `/user:think` delegation, and the row-before-draft + approve-before-allocate invariants.
 
 ### Phase 2: Hardening
-- [ ] TASK-003: Sanitization + concurrency guard in `commands/assign.md`'s freeform path: escape
+- [x] TASK-003 (DONE d920df3, verified): Sanitization + concurrency guard in `commands/assign.md`'s freeform path: escape
   `|`/newlines in the BACKLOG row cells (table integrity); reduce the slug to `[a-z0-9-]+` (no `/`,
   no `..`); allocate the ID by re-reading max in the write step + a post-write equal-ID collision
   check that fails loud. - AC: `assign.md` documents the sanitization + atomic-allocate invariants
   (DEC-004, DEC-005).
 
 ### Phase 3: Guards + docs
-- [ ] TASK-004: `tests/test-meta.sh` assertions: `assign.md` documents both paths, the delegation,
+- [x] TASK-004 (DONE 1e59de8, verified): `tests/test-meta.sh` assertions: `assign.md` documents both paths, the delegation,
   and all four invariants (row-before-draft, approve-before-allocate, sanitization, atomic-allocate).
   - AC: `bash tests/test-meta.sh` exercises them and passes.
-- [ ] TASK-005: Update `docs/PLAYBOOK.md` (S2/S5/S8 -> native path) and the `WORKFLOW.md` spine
+- [x] TASK-005 (DONE 893753e, verified): Update `docs/PLAYBOOK.md` (S2/S5/S8 -> native path) and the `WORKFLOW.md` spine
   note. - AC: PLAYBOOK no longer calls the bridge "manual today"; the WORKFLOW spine says `/assign`
   accepts freeform.
 

From e865a57f340ed636ffd15aa5b2616367ecd1cfe7 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 00:52:40 +0700
Subject: [PATCH 30/56] docs: reflect the freeform front door in MANUAL and
 ORCHESTRATION

Doc-impact drift caught in a follow-up audit: SPEC-026 changed commands/assign.md but TASK-005 only updated PLAYBOOK + WORKFLOW; MANUAL.md (the commands/* companion per the doc-impact map) and ORCHESTRATION.md (the flow view) still described /assign as ID-only. Both now document the freeform path (delegate to /think, approve-before-allocate, sanitize, allocate ID + row).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 MANUAL.md             | 8 ++++----
 docs/ORCHESTRATION.md | 7 +++++--
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/MANUAL.md b/MANUAL.md
index 3d14202..0034918 100644
--- a/MANUAL.md
+++ b/MANUAL.md
@@ -60,10 +60,10 @@ Opt-in lane between `/user:spec-validate` and `/user:execute`. Reads the active
 ### `/user:assign`
 
 **Phase:** orchestrate (backlog item -> goal draft -> lane)
-**Reads:** `_meta/BACKLOG.md` Active queue (by `ID-NNN`), the item's Lane column, `AGENTS.md` zones (the projection source for the six-section goal) + the active spec's `## Verification` / `## After state`
-**Writes:** `.claude/goals/<slug>.md` (the SPEC-005 draft contract; never `.claude/last-goal.md`), a six-section operating directive (Context-to-read / Constraints / Operating rules / Validation loop / Done-when / Pause-if)
-**When to invoke:** you picked an item from "what's left?" and want it scoped into a goal and routed into the right lane
-**Common gotcha:** it is a mutator-dispatcher: it sets up the goal and hands off, it does NOT execute. It detects the goal-loop activator (built-in `/goal`, `ralph-loop`, or `goal-craft`) and degrades to a plain draft file if none is installed. Idempotent per id. Source: SPEC-006 + ADR-0011.
+**Reads:** `$ARGUMENTS` = either an `ID-NNN` (today's path) OR **freeform intent** (anything not matching `^ID-[0-9]+$`, e.g. "apply SDD to X"); `_meta/BACKLOG.md` Active queue, the item's Lane column, `AGENTS.md` zones (the projection source for the six-section goal) + the active spec's `## Verification` / `## After state`. Freeform delegates the crystallize interview to `/user:think`.
+**Writes:** `.claude/goals/<slug>.md` (the SPEC-005 draft contract; never `.claude/last-goal.md`), a six-section operating directive (Context-to-read / Constraints / Operating rules / Validation loop / Done-when / Pause-if). On the **freeform path** it first writes a new sanitized `_meta/BACKLOG.md` row with a freshly allocated ID (row-before-draft, approve-before-allocate).
+**When to invoke:** you picked an `ID-NNN` from "what's left?", OR you have a freeform feature idea / vague brief with no ID yet, and want it scoped into a goal and routed into the right lane.
+**Common gotcha:** it is a mutator-dispatcher: it sets up the goal and hands off, it does NOT execute. The freeform path **delegates** the interview to `/user:think` (it does not embed one). It detects the goal-loop activator (built-in `/goal`, `ralph-loop`, or `goal-craft`) and degrades to a plain draft file if none is installed. Idempotent per id (and per slug for freeform). Source: SPEC-006 + ADR-0011; freeform front door SPEC-026.
 
 ### `/user:spec`
 
diff --git a/docs/ORCHESTRATION.md b/docs/ORCHESTRATION.md
index c88bbdc..da82603 100644
--- a/docs/ORCHESTRATION.md
+++ b/docs/ORCHESTRATION.md
@@ -43,6 +43,9 @@ Three stores. Each flow reads and/or writes these; nothing is re-entered between
 **Detector vs mutator** (load-bearing): `/user:start` and `/user:next` only **read and
 render** the queue + drafts. `/user:assign` is the **only mutator**: it writes a goal
 draft, flips a backlog status, and hands off. No other entry point mutates state on intake.
+`/user:assign` accepts either an `ID-NNN` or **freeform intent** (the freeform front door,
+SPEC-026): given freeform it delegates the crystallize interview to `/user:think`, then
+allocates the ID + BACKLOG row (approve-before-allocate, sanitized) before routing as usual.
 
 ---
 
@@ -58,7 +61,7 @@ every lane is a longer or shorter walk along it.
   /user:start ........... DETECT: render the BACKLOG Active queue + active goal drafts
        │                  (read-only; suggests the next command)
        ▼
-  /user:assign ID-NNN ... MUTATE: goal-craft a draft (.claude/goals/<slug>.md),
+  /user:assign <ID|free> .. MUTATE: goal-craft a draft (.claude/goals/<slug>.md);
        │                  pick the lane from the item, detect the goal-loop activator,
        │                  flip status (queued -> speccing | executing), hand off.
        │                  Does NOT execute. Never writes last-goal.md.
@@ -338,7 +341,7 @@ mistake is irreversible:
 | Trigger | Starts | Stop condition | Enforcer |
 |---|---|---|---|
 | `/user:start` | render queue + drafts | output rendered | none (detector) |
-| `/user:assign ID-NNN` | goal draft + lane routing | draft written, status flipped, handed off | none (mutator; idempotent) |
+| `/user:assign <ID-NNN or freeform>` | goal draft + lane routing (freeform: delegate crystallize to `/user:think`, then allocate ID + BACKLOG row) | draft written, status flipped, handed off | none (mutator; idempotent; freeform gated by approve-before-allocate) |
 | `/user:think` | decision brief | brief written (if BUILD) | advisory |
 | `/user:spec` | spec scaffold | spec exists, `Status: DRAFT` | spec-drift-guard hook |
 | `/user:spec-validate` | 5-lens adversarial review | `Status: VALIDATED` | advisory (full lane) |

From 28aab3392f786d0f9bea294ab2dba607b56a3c67 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 01:06:52 +0700
Subject: [PATCH 31/56] docs(spec): add SPEC-027 mid-flight spec amend
 (VALIDATED)

Declares the BUILDING -> SPECIFYING -> BUILDING amend micro-loop
(operating-layer-vision Scenario 7 / ID-023). Approach A: convention +
recorded ## Amendments checkpoint, no new command or hook. spec-validate
passed after fixing the resume claim (lead with /next, not /execute;
DEC-006) and documenting the un-recorded-amend limitation (DEC-007).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 _meta/BACKLOG.md                             |   2 +-
 docs/specs/SPEC-027-mid-flight-spec-amend.md | 140 +++++++++++++++++++
 2 files changed, 141 insertions(+), 1 deletion(-)
 create mode 100644 docs/specs/SPEC-027-mid-flight-spec-amend.md

diff --git a/_meta/BACKLOG.md b/_meta/BACKLOG.md
index f53aa13..9c30419 100644
--- a/_meta/BACKLOG.md
+++ b/_meta/BACKLOG.md
@@ -42,7 +42,7 @@ Lane = the WORKFLOW.md risk tier (`tiny` / `normal` / `full`).
 | ID-020 | Verification-pipeline absence-checks: teach task-verifier + integration-checker to assert that removed/replaced content is GONE for "replace, don't duplicate" / "remove X" tasks, not just that the new artifact exists. A replace-task left both copies and passed BOTH verifiers this cycle; only the independent review caught it | SPEC-024 retro 2026-05-21 (pipeline blind spot) | TBD | normal | queued |
 | ID-021 | Reduce execute-worker shell/hook friction: pre-warn the recurring gotchas in `commands/execute.md`'s worker template (fish `noclobber` -> `>|`; no heredoc commit `-m` -> `git commit -F`; no `rm` -> `mv`), and investigate the commit-format/commit-msg heredoc mis-parse (it read a multi-line `-m` body as a 459-char subject) | SPEC-024 retro 2026-05-21 (worker friction) | TBD | normal | queued |
 | ID-022 | Freeform front door: extend `/user:assign` to accept freeform intent (not only `ID-NNN`), auto-allocating an ID + BACKLOG row so "apply SDD to X" / a vague brief is a native one-shot intake (unparks the SPEC-024-deferred griller entry; preserves ID-first traceability) | PLAYBOOK.md scenarios 2026-05-22 | SPEC-026 | normal | validated |
-| ID-023 | Mid-flight spec amend: a path to add scope to a VALIDATED / in-build spec without restarting the lane (state machine transition BUILDING -> SPECIFYING) | operating-layer-vision gap analysis 2026-05-22 | TBD | normal | queued |
+| ID-023 | Mid-flight spec amend: a path to add scope to a VALIDATED / in-build spec without restarting the lane (state machine transition BUILDING -> SPECIFYING) | operating-layer-vision gap analysis 2026-05-22 | SPEC-027 | normal | executing |
 | ID-024 | Context-switch affordance across specs (worktree-per-spec) + an explicit ABANDONED terminal (state hygiene; vs the existing `parked`) | operating-layer-vision gap analysis 2026-05-22 | TBD | normal | queued |
 | ID-025 | Re-open-shipped convention: a follow-up spec spun from a SHIPPED one (state machine transition SHIPPED -> TRIAGING) | operating-layer-vision gap analysis 2026-05-22 | TBD | tiny | queued |
 Dependency notes:
diff --git a/docs/specs/SPEC-027-mid-flight-spec-amend.md b/docs/specs/SPEC-027-mid-flight-spec-amend.md
new file mode 100644
index 0000000..c6c0ab2
--- /dev/null
+++ b/docs/specs/SPEC-027-mid-flight-spec-amend.md
@@ -0,0 +1,140 @@
+# Spec: Mid-flight spec amend (BUILDING -> SPECIFYING -> BUILDING)
+Generated: 2026-05-22
+Status: VALIDATED
+
+Source: `docs/operating-layer-vision.md` Scenario 7 + the Section 5 gap-analysis row "Mid-flight spec amend"; ID-023. Goal draft: `.claude/goals/mid-flight-spec-amend.md`.
+
+## Problem
+
+You are mid-`/user:execute` on a `VALIDATED` spec (state `BUILDING`). Partway through, the work reveals scope that must be added now: "also do Y." Today there is **no declared path** for this:
+
+- `commands/execute.md` says outright *"Do NOT modify the spec without asking"* and *"Spec ambiguity discovered: Stop and ask user to clarify."* It tells you to stop, not how to add scope and continue.
+- The only ways forward are both bad: silently mutate the spec mid-build (the SPEC-024 retro flagged exactly this, editing a TASK's acceptance criterion mid-execute, as a *"process smell"*, recorded only after the fact), or restart the lane (`/spec` -> `/spec-validate` from scratch), which throws away the completed-task state.
+- `docs/operating-layer-vision.md` §3.3 has no `BUILDING -> SPECIFYING` row for an amend; Scenario 7 is listed as a **gap** and is not rendered in `docs/PLAYBOOK.md` or `docs/ORCHESTRATION.md`.
+
+The state machine is therefore not legible at this point: an operator mid-build cannot answer "where can I go from here, what does it cost, how do I trigger it" for the add-scope case.
+
+## Solution
+
+### Approaches considered
+- **A (chosen): convention + a recorded checkpoint.** Declare the `BUILDING -> SPECIFYING -> BUILDING` micro-loop across the model + rules + operator docs; reword `execute.md`'s "don't modify the spec" anti-pattern into "amend at a checkpoint via the declared path, never silently"; add an **optional, on-demand `## Amendments`** provenance section to the spec template; pin the convention with a `tests/test-meta.sh` assertion. No new command, no new hook. Tradeoff: discoverability rests on the docs + Claude's interpret layer, not a one-word command; mitigated because the amend is rare (~2 occurrences) and the trigger phrase ("also do Y") is natural.
+- **B: a new `/user:amend` command.** One-step, discoverable, repeatable. Rejected for v1: a new command costs `MANUAL.md` + README command table + `.claude-plugin/plugin.json` + `marketplace.json` + `test-meta` frontmatter checks, and PHILOSOPHY rejects unearned commands ("no speculative features"; "earn the abstraction"). A twice-used path does not yet clear that bar. Revisit if amends become frequent.
+- **C: a hook-enforced amend ritual.** A `PreToolUse` hook that blocks edits to a `VALIDATED` spec mid-build unless an `## Amendments` entry exists. Rejected: PHILOSOPHY explicitly reserves hard blocks for the safety subset and rejects hard-gating *process* completeness (the completeness clauses warn+log, never block). An amend is a process step, not a safety boundary.
+
+### Chosen approach + why
+Approach A. It closes the gap with the kit's own grain: a **declared, recorded** path instead of a silent mutation, enforced by convention + one structural test rather than a new command or a hard gate. It composes with what exists: amend-the-spec-first means the new scope's source files become "known" to `spec-drift-guard` (which already greps the union of active specs and skips `.md`), and `/user:next` already resumes by picking the next undone (`- [ ]`) task and skipping `- [x]` done rows, so a resumed build runs only the amended tasks. (`/user:execute` re-parses and re-presents the *whole* plan, so resume after an amend leads with `/user:next`, not a fresh `/user:execute`; see DEC-006.) B and C add weight the evidence does not yet justify.
+
+### The amend micro-loop (the declared transition)
+
+```
+  BUILDING (mid /user:execute, spec is VALIDATED)
+     │  trigger: operator says "also do Y" / "amend the spec to add Y"
+     ▼
+  [GUARD] at a task checkpoint only:
+     - the in-flight task is verified + committed (or no task is in flight)
+     - completed - [x] tasks are FROZEN (never re-opened by an amend)
+     ▼
+  SPECIFYING (amend, not restart)
+     1. append new TASK-NNN rows (new phase or appended) to ## Task Breakdown
+     2. update ## After state / ## Acceptance Criteria / ## Verification for the DELTA only
+     3. record an ## Amendments entry (date | what | why | at which checkpoint | new task ids)
+     4. re-validate the DELTA only (full lane: /spec-validate on the new tasks; normal lane: advisory)
+     ▼  Status STAYS VALIDATED (no drop to DRAFT -> that would be a lane restart)
+  BUILDING (resume)
+     - /user:next picks the next undone - [ ] task (skips - [x] done rows) -> runs only the amended tasks
+       (/user:execute re-lists the FULL plan; resume after an amend leads with /user:next. See DEC-006.)
+```
+
+Key invariants:
+- **No lane restart.** `Status:` never drops back to `DRAFT`. The amend is an in-place delta on a `VALIDATED` spec; only the delta is (re-)validated. This is the literal "without restarting the lane" requirement of ID-023.
+- **Completed work is frozen.** An amend may only *add* scope (tasks / AC / after-state bullets). It must not silently rewrite a checked-off task's acceptance criterion; changing an already-DONE contract is a separate, heavier decision (re-open / re-spec), not an amend.
+- **Recorded at a checkpoint, not mid-worker.** The `## Amendments` entry is the provenance the SPEC-024 retro asked for, captured *at the moment of the change* and *between tasks*, not reconstructed afterward.
+
+### `## Amendments` section shape (optional, on-demand)
+
+Added only when an amend happens (like `## Failure modes` / `## Open questions`, never an empty scaffold):
+
+```markdown
+## Amendments
+- AMEND-001: 2026-05-22 | added <one-line scope> | why: <reason> | at TASK-007 checkpoint | new tasks: TASK-009..TASK-010 | re-validated: delta-only (advisory, normal lane)
+```
+
+### Source-of-truth placement (avoid duplicating the procedure)
+- **`WORKFLOW.md`** (the rules contract) is the **canonical** home of the amend rule (when you may amend, the checkpoint guard, the recorded entry, resume).
+- `docs/operating-layer-vision.md` §3.3 gains the `BUILDING -> SPECIFYING` transition **row** (the model) and §5 marks the gap **closed** (-> SPEC-027), mirroring how ID-022 -> SPEC-026 was recorded.
+- `docs/PLAYBOOK.md` renders Scenario 7 as an operator card (what you say -> what happens); `docs/ORCHESTRATION.md` renders the loop view. Both **point at** WORKFLOW.md, they do not restate the rule.
+- `commands/execute.md` rewords the anti-pattern + the ambiguity branch to point at the declared path.
+- `commands/spec.md` documents the optional on-demand `## Amendments` section.
+
+## Technical Design
+
+### Interfaces (I/O contract)
+- **Consumes:** an active `docs/specs/SPEC-NNN-<slug>.md` with `Status: VALIDATED` and a `## Task Breakdown` containing both `- [x]` (done) and `- [ ]` (pending) rows; the existing branch-aware active-spec detection (SPEC-005) that `/user:execute` and `/user:next` already use.
+- **Produces:** the same spec file, amended in place: new `- [ ]` TASK rows, an `## Amendments` log entry, delta updates to `## After state` / `## Acceptance Criteria` / `## Verification`. No new files, no schema migration.
+- **Invariants:** `Status:` stays `VALIDATED` across an amend; `- [x]` rows are byte-for-byte unchanged by an amend; an amend only appends scope.
+
+### Data model changes
+None. The spec template gains one **optional** documented section (`## Amendments`); no required field, no parser change. `spec-drift-guard.sh` is unchanged (it already skips `.md` and greps the active-spec union).
+
+### API / UI / Infrastructure changes
+None. This is a convention + doc + one meta-test change. No hook code, no new command, no `settings.json`/`hooks.json` wiring.
+
+## Task Breakdown
+
+### Phase 1: Declare the transition (model + rules)
+- [ ] TASK-001: Add the `BUILDING -> SPECIFYING` amend row to `docs/operating-layer-vision.md` §3.3 transition table (trigger "also do Y" / "amend"; guard "at a task checkpoint, completed tasks frozen, Status stays VALIDATED"; To `SPECIFYING`), and add the return `SPECIFYING -> BUILDING (resume)` semantics note. Mark §5 gap-analysis row "Mid-flight spec amend" as closed (-> SPEC-027), matching the ID-022 -> SPEC-026 style., AC: `rg "BUILDING -> SPECIFYING" docs/operating-layer-vision.md` matches in §3.3; §5 row references SPEC-027; no other §5 row altered.
+- [ ] TASK-002: Add the canonical mid-flight amend rule to `WORKFLOW.md` (the micro-loop, the checkpoint guard, the frozen-completed-tasks + add-only invariants, Status-stays-VALIDATED, resume via `/next` picking the next undone row, the `## Amendments` record)., AC: `WORKFLOW.md` contains a "mid-flight" / "amend" subsection naming all four invariants; `bash tests/test-meta.sh` still green.
+
+### Phase 2: Wire the operator-facing surfaces (projections)
+- [ ] TASK-003: Reword `commands/execute.md` so the anti-pattern "Do NOT modify the spec without asking" and the "Spec ambiguity discovered" / "Task is too large" branches point at the declared amend path (amend at a checkpoint, record in `## Amendments`, resume), instead of only "stop and ask". The no-silent-mutation rule is preserved (amend != silent edit)., AC: `commands/execute.md` references "amend" + "checkpoint"; it no longer instructs a flat "do NOT modify the spec" with no escape hatch; existing tests green.
+- [ ] TASK-004: Document the optional on-demand `## Amendments` section in `commands/spec.md`'s template prose (alongside the other optional sections), with the `AMEND-NNN: date | what | why | at checkpoint | new tasks | re-validated` shape., AC: `commands/spec.md` mentions `## Amendments` and the AMEND entry shape; the section is described as optional/on-demand (no empty scaffold).
+- [ ] TASK-005: Render Scenario 7 in `docs/PLAYBOOK.md` (operator card: "also do Y" mid-build -> the amend micro-loop) and add the loop view to `docs/ORCHESTRATION.md`; both point at WORKFLOW.md as canonical, not restating the rule., AC: `rg -i "also do|mid-flight|amend" docs/PLAYBOOK.md docs/ORCHESTRATION.md` matches in both; both link/point to WORKFLOW.md.
+
+### Phase 3: Pin the convention
+- [ ] TASK-006: Add a `tests/test-meta.sh` section "Mid-flight amend convention" asserting (a) `commands/execute.md` references the amend path (`amend` + `checkpoint`), (b) `WORKFLOW.md` documents the mid-flight amend rule, (c) `commands/spec.md` documents the `## Amendments` section, (d) `docs/operating-layer-vision.md` has the `BUILDING -> SPECIFYING` transition row., AC: the four new assertions PASS; `bash tests/test-meta.sh` exits 0 with an increased total.
+
+## After state
+The definition-of-done picture. Each is false now, true after, and checkable.
+- [ ] An operator mid-build can find a declared "add scope without restarting" path. (Today: `commands/execute.md` says only "do NOT modify the spec" / "stop and ask".) Checkable: `rg -i "amend" commands/execute.md WORKFLOW.md` returns the declared path.
+- [ ] `docs/operating-layer-vision.md` §3.3 has a `BUILDING -> SPECIFYING` row and §5 marks the mid-flight gap closed. (Today: no row; §5 says "**ID-023** (new)".) Checkable: `rg "BUILDING -> SPECIFYING" docs/operating-layer-vision.md`.
+- [ ] The spec template documents an optional `## Amendments` provenance section. (Today: no amend convention exists.) Checkable: `rg "## Amendments" commands/spec.md`.
+- [ ] Scenario 7 is rendered for operators in both PLAYBOOK and ORCHESTRATION. (Today: absent from both.) Checkable: `rg -i "also do|mid-flight|amend" docs/PLAYBOOK.md docs/ORCHESTRATION.md` matches in both.
+- [ ] The convention is regression-pinned. (Today: nothing tests it.) Checkable: `bash tests/test-meta.sh` includes and PASSes the "Mid-flight amend convention" assertions.
+
+## Acceptance Criteria (global)
+- [ ] All tasks pass their individual acceptance criteria
+- [ ] `bash tests/test-meta.sh && bash tests/test-hooks.sh` green (no regressions)
+- [ ] The amend rule is canonical in exactly one place (WORKFLOW.md); other docs point at it, not restate it (no source-of-truth duplication)
+
+## Verification
+```bash
+bash tests/test-meta.sh && bash tests/test-hooks.sh
+```
+Plus the After-state greps above (each is a real command).
+
+## Edge Cases
+1. **Amend requested mid-worker (a task is in flight).** The guard defers: finish + verify + commit the in-flight task to reach a checkpoint, then amend. The amend never interrupts a running worker.
+2. **Amend that rewrites an already-DONE task's AC** (not just adds). Out of scope for an amend; that is a re-open/re-spec decision. The convention says amends are add-only; a contract change to completed work pauses for a human (Pause-if: risk-classification / architecture).
+3. **Amend on a `DRAFT` (not yet validated) spec.** No amend needed: just edit the DRAFT directly before `/spec-validate`. The amend path is specifically for `VALIDATED`/building specs.
+4. **Amend on a single-task spec.** Allowed; the new tasks make it multi-task, so the Step-4 integration-checker now applies on resume. The operator should expect the integration check.
+5. **Concurrent specs (multi-spec worktrees).** The amend targets the branch-aware active spec (SPEC-005 detection), so an amend cannot land in the wrong spec.
+6. **Amend done without recording the `## Amendments` entry.** Not detected in v1: Approach A is convention, not enforcement (B/C were rejected deliberately). This is a known limitation, mirroring `spec-drift-guard`'s documented grep limitation (the kit's honesty rule: never over-claim enforcement). The completeness clauses already reviewed at `/user:ship` + `/user:retro` are where an un-recorded amend would surface informally; wiring it into the completeness log is a future option, not v1. See DEC-007.
+
+## Out of Scope
+- A `/user:amend` slash command (Approach B), revisit only if amends become frequent.
+- Any hook enforcement of the amend ritual (Approach C), PHILOSOPHY rejects hard-gating process.
+- Re-opening / re-spec of a `SHIPPED` spec, that is ID-025 (SHIPPED -> TRIAGING), a separate item.
+- Context-switch across specs / the ABANDONED terminal, that is ID-024, a separate item.
+- Rewriting completed (`- [x]`) tasks' contracts, a heavier decision than an add-only amend.
+
+## Decision Log
+- DEC-001: Convention + recorded checkpoint (Approach A), not a new command (B) or a hook (C). Rationale: matches PHILOSOPHY (earn the abstraction; warn-not-block; no speculative features); the amend has ~2 real uses. Alternatives B/C rejected as unearned weight / a hard gate on process. (Confirmed with the maintainer at the design checkpoint.)
+- DEC-002: `Status:` stays `VALIDATED` across an amend; only the delta is (re-)validated. Rationale: dropping to `DRAFT` *is* a lane restart, the exact thing ID-023 removes. Alternative (reset to DRAFT) rejected.
+- DEC-003: Amends are **add-only**; completed `- [x]` tasks are frozen. Rationale: rewriting a done contract mid-build is the "process smell" the SPEC-024 retro flagged; keep that on the heavier re-open path. Alternative (allow arbitrary mid-build edits) rejected.
+- DEC-004: `## Amendments` is an **optional, on-demand** section, not added to every spec. Rationale: PHILOSOPHY "no empty scaffolds"; mirrors `## Failure modes` / `## Open questions`. Alternative (always-present section) rejected.
+- DEC-005: The amend rule is canonical in `WORKFLOW.md`; the vision doc carries the model row, PLAYBOOK/ORCHESTRATION/execute point at it. Rationale: source-of-truth hierarchy; avoid the four-copies drift the kit repeatedly fights. (Build:) the meta-test asserts the rule is present, not duplicated.
+- DEC-006 (validation): Resume after an amend leads with `/user:next`, not a fresh `/user:execute`. Rationale: spec-validate verified that `/user:next` skips `- [x]` done rows and picks the next undone task, while `/user:execute` re-parses and re-presents the *whole* plan; only `/user:next` makes the "completed work frozen, only new tasks run" promise hold. Found by Reviewer 2/3; the original draft over-claimed `/user:execute`.
+- DEC-007 (validation): An amend done without an `## Amendments` entry is undetected in v1 (documented as a known limitation, Edge case 6), not enforced. Rationale: honesty rule (never over-claim portable enforcement); B/C were rejected, so v1 has no guard. Wiring it into the completeness log is a future option. Found by Reviewer 2.
+
+## Open questions
+(none; a /goal loop appends here if it hits a decision this spec does not cover, then stops)

From 17e411ac7b4022248cf2262730139a70b1afca81 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 01:08:12 +0700
Subject: [PATCH 32/56] docs(vision): add the mid-flight amend transition to
 the state machine
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Add the BUILDING -> SPECIFYING amend row and the SPECIFYING -> BUILDING
(resume) return row to the §3.3 transition table, distinct from the
existing validation/review revision paths. Mark the §5 gap-analysis row
"Mid-flight spec amend" closed, referencing the now-validated implementing
spec, mirroring how the freeform front door row references its spec.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 docs/operating-layer-vision.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/docs/operating-layer-vision.md b/docs/operating-layer-vision.md
index cbaf73d..533d7f5 100644
--- a/docs/operating-layer-vision.md
+++ b/docs/operating-layer-vision.md
@@ -108,6 +108,8 @@ New UX invariants this vision adds:
 | VALIDATING | NEEDS REVISION | revisions required | SPECIFYING |
 | BUILDING | task FAIL:fixable | retries < 2 | BUILDING (fix-agent) |
 | BUILDING | task FAIL:escalate / retries == 2 | unfixable | BLOCKED |
+| BUILDING | "also do Y" / "amend the spec" | at a task checkpoint; completed tasks frozen; Status stays VALIDATED | SPECIFYING (amend, not restart) |
+| SPECIFYING | resume via `/next` | amend recorded | BUILDING (resume) |
 | BUILDING | all tasks done | **all PASS + integration PASS** (hard) | REVIEWING |
 | REVIEWING | verdict SHIP / FIX-applied | not DO-NOT-SHIP | DOCUMENTING |
 | REVIEWING | FIX THEN SHIP / DO NOT SHIP | findings open | SPECIFYING / BUILDING |
@@ -164,7 +166,7 @@ Most transitions already have a path. The real gaps, in priority order:
 | Gap | Scenario | Why it is a gap | Proposed |
 |---|---|---|---|
 | Freeform front door | 2, 5 | `/assign` is ID-only; freeform is bridged by hand | **SPEC-026 / ID-022** (drafted) |
-| Mid-flight spec amend | 7 | no path to amend a `VALIDATED`/building spec without restarting | **ID-023** (new) |
+| Mid-flight spec amend | 7 | no path to amend a `VALIDATED`/building spec without restarting | **SPEC-027 / ID-023** (validated) |
 | Context switch across specs | 9 | worktree-per-spec is the model but no switch affordance | **ID-024** (new) |
 | Re-open shipped | 13 | a `SHIPPED` spec has no follow-up convention | **ID-025** (new) |
 | Abandon terminal | 12 | only `parked` exists; no explicit drop | folded into ID-024 (state hygiene) or a tiny lane |

From 704092b6d81cf4c97274a02117ba930e200b24fd Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 01:10:03 +0700
Subject: [PATCH 33/56] docs(workflow): add the canonical mid-flight amend rule

Add a `## Mid-flight amend` section to WORKFLOW.md as the single source of
truth for the BUILDING -> SPECIFYING -> BUILDING amend path. Names the four
invariants: no lane restart (Status stays VALIDATED, delta-only re-validation),
completed work frozen (add-only, `- [x]` rows unchanged), recorded at a
checkpoint via the `## Amendments` entry, and resume via `/user:next` (not a
fresh `/user:execute`). Other docs point here rather than restating the rule.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 WORKFLOW.md | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/WORKFLOW.md b/WORKFLOW.md
index 1e05381..2d54118 100644
--- a/WORKFLOW.md
+++ b/WORKFLOW.md
@@ -75,6 +75,35 @@ session start
 
 **Freeform front door.** `/user:assign` accepts freeform intent, not only an `ID-NNN`. Given freeform, it delegates the interview to `/user:think`, pauses for approval, then allocates the ID + BACKLOG row before routing as usual; the ID-first path is unchanged. **Detector/mutator split.** `/user:start` and `/user:next` only read and render; `/user:assign` is the only mutator. **Activator-agnostic.** `/user:assign` writes only the `.claude/goals/<slug>.md` draft and surfaces its body; activation (starting the loop) is done by whatever primitive is present (the built-in `/goal`, the `ralph-loop` plugin, or the `goal-craft` skill). The kit NEVER writes `.claude/last-goal.md`; if no activator exists, the draft is a plain reusable file. **"Even the goal loop follows WORKFLOW"** is delivered honestly: the safety subset is hard-enforced by existing hooks (anti-rationalization, the verification pipeline, the push-to-main blocker); decision/doc completeness is warned + logged to `~/.claude/dwarves-kit/logs/completeness.log` and reviewed at `/user:ship` + `/user:retro`, not hard-blocked mid-loop (PHILOSOPHY rejects hard-gating process completeness).
 
+## Mid-flight amend
+Canonical rule. You are mid-`/user:execute` on a `VALIDATED` spec and the work
+reveals scope that must be added now ("also do Y"). Amend the spec in place;
+do not restart the lane and do not silently mutate it. Other docs
+(PLAYBOOK, ORCHESTRATION, `commands/execute.md`) point here; they do not restate
+this rule. The state-model row (`BUILDING -> SPECIFYING -> BUILDING`) lives in
+`docs/operating-layer-vision.md` §3.3; this section carries the operational rule.
+
+The amend is governed by four invariants:
+
+- **No lane restart.** `Status:` stays `VALIDATED` across an amend; only the
+  DELTA is (re-)validated (full lane: `/spec-validate` on the new tasks; normal
+  lane: advisory). Dropping back to `DRAFT` would be a lane restart, the exact
+  thing this path removes.
+- **Completed work is frozen (add-only).** An amend may only ADD scope (new
+  `- [ ]` tasks, new acceptance criteria, new after-state bullets). It must not
+  rewrite an already-done (`- [x]`) task's contract; the `- [x]` rows are
+  byte-for-byte unchanged. Rewriting a done task is the heavier re-open / re-spec
+  path, not an amend.
+- **Recorded at a checkpoint, not mid-worker.** The amend happens between tasks:
+  the in-flight task is verified and committed first (or no task is in flight).
+  Record it as an entry in the spec's `## Amendments` section (optional,
+  on-demand; see `commands/spec.md`), one line per amend:
+  `AMEND-NNN: date | what | why | at which checkpoint | new tasks | re-validated`.
+- **Resume leads with `/user:next`, not a fresh `/user:execute`.** `/next` picks
+  the next undone `- [ ]` task and skips `- [x]` done rows, so resume runs only
+  the amended tasks. `/execute` re-parses and re-presents the whole plan, so it
+  is the wrong door after an amend.
+
 ## Completion contract
 The done-definition is canonical in `AGENTS.md` zone 3 ("Done means"); do not
 restate it here. In the kit, the task-verifier is what proves "done" (self-reported

From 7c6eaaeb95e181ac817e55c68444eaf19250bc1e Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 01:12:06 +0700
Subject: [PATCH 34/56] docs(execute): point the no-silent-mutation rule at the
 amend path

Reword the "do NOT modify the spec" anti-pattern and the
ambiguity / oversize error-handling branches so they point at the
declared mid-flight amend path (pause at a checkpoint, append tasks,
record ## Amendments, resume with /user:next) instead of only
"stop and ask". Keeps the no-silent-mutation guard: an amend is a
deliberate, recorded, checkpoint action, not a silent edit; a silent
rewrite of done (- [x]) tasks stays forbidden. Points at WORKFLOW.md
"## Mid-flight amend" (canonical); does not restate the rule.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 commands/execute.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/commands/execute.md b/commands/execute.md
index 04a4a52..8aff916 100644
--- a/commands/execute.md
+++ b/commands/execute.md
@@ -233,8 +233,8 @@ After all phases complete:
 
 - **Worker fails to complete**: Run task-verifier anyway on whatever exists. The verifier determines if partial work is salvageable (FAIL:fixable) or needs human input (FAIL:escalate).
 - **Tests break during execution**: task-verifier catches this. If fixable, fix-agent handles it. If not, escalate.
-- **Spec ambiguity discovered**: Stop and ask user to clarify. Do not guess. Do not dispatch fix-agent for spec problems.
-- **Task is too large**: Split it into subtasks, confirm with user, then dispatch.
+- **Spec ambiguity discovered**: If it is a genuine contradiction (the spec disagrees with itself), stop and ask the user to clarify. Do not guess. Do not dispatch fix-agent for spec problems. If instead the work reveals scope that must be ADDED now ("also do Y"), that is the declared mid-flight amend path, not an ambiguity: amend at a checkpoint (append `- [ ]` tasks, record an `## Amendments` entry) and resume with `/user:next` (see WORKFLOW.md "## Mid-flight amend").
+- **Task is too large**: Split it into subtasks. If the split stays within the task's declared scope, confirm with user, then dispatch. If splitting means ADDING scope beyond the spec, route it through the mid-flight amend path (amend at a checkpoint, then resume with `/user:next`; see WORKFLOW.md "## Mid-flight amend").
 - **fix-agent reports it cannot fix an issue**: Escalate immediately. Don't retry with the same fix-agent.
 
 ## Anti-patterns to avoid
@@ -243,6 +243,6 @@ After all phases complete:
 - Do NOT skip verification. Every task goes through task-verifier, even if the worker says "all criteria met."
 - Do NOT skip the phase checkpoint. The user must approve before the next phase.
 - Do NOT auto-fix failing tests without the verification pipeline.
-- Do NOT modify the spec without asking.
+- Do NOT silently mutate the spec mid-build. An amend is not a silent edit: when the work reveals scope that must be added now, take the declared mid-flight amend path (pause at a task checkpoint, append new `- [ ]` tasks, record an `## Amendments` entry, resume with `/user:next`). See WORKFLOW.md "## Mid-flight amend". A silent rewrite of done (`- [x]`) tasks is still forbidden.
 - Do NOT retry FAIL:escalate verdicts. They need human judgment by definition.
 - Do NOT dispatch fix-agent for more than 2 issues at once. If the verifier found 5+, the task needs re-implementation, not patching. Escalate.

From c11b7af0f65df83864e31b8e6d85783fd0961202 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 01:12:07 +0700
Subject: [PATCH 35/56] docs(spec): document the optional Amendments section in
 the template

Add an optional, on-demand `## Amendments` section to the spec template
in commands/spec.md, modeled on the `## Failure modes` optional-section
convention (never an empty scaffold in a fresh spec). Includes the
AMEND-NNN entry shape and points at WORKFLOW.md as the owner of the
amend rule.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 commands/spec.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/commands/spec.md b/commands/spec.md
index fb8066e..8723add 100644
--- a/commands/spec.md
+++ b/commands/spec.md
@@ -149,6 +149,10 @@ Optional; expected for full-lane specs that touch an external provider, data los
 ## Decision Log
 - DEC-001: [decision], [rationale], [alternatives rejected]
 
+## Amendments
+Optional; added only when a mid-flight amend happens (like `## Failure modes` / `## Open questions`, never an empty scaffold in a fresh spec). A running provenance log of mid-build scope additions. `WORKFLOW.md` owns the amend rule (when you may amend, the checkpoint guard, resume); this section is just the recorded entry. Entry shape:
+- AMEND-NNN: [date] | [what scope was added] | why: [reason] | at [TASK-NNN] checkpoint | new tasks: [TASK-NNN..TASK-NNN] | re-validated: [delta-only (advisory / full lane)]
+
 ## Open questions
 (none; a /goal loop appends here if it hits a decision this spec does not cover, then stops)
 ```

From 2db948f187f87489c8dc9f2205094e7d457c1865 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 01:12:56 +0700
Subject: [PATCH 36/56] docs: render the mid-flight scope-change scenario for
 operators

Add Scenario 7 ("also do Y" mid-build) as an operator card in
docs/PLAYBOOK.md and as an amend micro-loop view in
docs/ORCHESTRATION.md. Both projections point at WORKFLOW.md
"## Mid-flight amend" as canonical and do not restate the four
invariants, per DEC-005 (avoid source-of-truth duplication).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 docs/ORCHESTRATION.md | 33 +++++++++++++++++++++++++++++++++
 docs/PLAYBOOK.md      | 37 +++++++++++++++++++++++++++++++++++++
 2 files changed, 70 insertions(+)

diff --git a/docs/ORCHESTRATION.md b/docs/ORCHESTRATION.md
index da82603..90c59b4 100644
--- a/docs/ORCHESTRATION.md
+++ b/docs/ORCHESTRATION.md
@@ -255,6 +255,39 @@ context, retries fixable failures, and checks cross-task wiring at the end. Self
       build complete ◀── re-check
 ```
 
+### 5.4 Mid-flight amend micro-loop (BUILDING -> checkpoint -> amend -> resume)
+
+A side excursion off the execute pipeline, not a fourth engine: when work mid-build reveals
+scope that must be added now ("also do Y"), the operator amends the active `VALIDATED` spec in
+place and resumes, **without** restarting the lane. The rule (the checkpoint guard, the
+add-only + frozen-completed-tasks + Status-stays-VALIDATED invariants, resume-via-`/next`) is
+canonical in **`WORKFLOW.md` "## Mid-flight amend"**; the state-model row
+(`BUILDING -> SPECIFYING -> BUILDING`) lives in `docs/operating-layer-vision.md` §3.3. This
+section only draws the loop; it does not restate the four invariants.
+
+- **Trigger**: operator says "also do Y" / "amend the spec" mid-`/user:execute`.
+- **Guard**: amend only at a task checkpoint (the in-flight task verified + committed first,
+  or none in flight); completed `- [x]` tasks are frozen.
+- **Resume**: `/user:next` (picks the next undone `- [ ]` row, skips done rows), **not** a fresh
+  `/user:execute` (which re-presents the whole plan). See `WORKFLOW.md` for why.
+
+```text
+   BUILDING (mid /user:execute, spec is VALIDATED)
+        │  trigger: "also do Y"
+        ▼
+   reach a task checkpoint  ──────────────────────────────┐
+   (in-flight task verified + committed; - [x] frozen)     │ not at a checkpoint yet?
+        │                                                  │ finish the in-flight task first
+        ▼                                                  └──────────────────────────────┘
+   SPECIFYING (amend, not restart)
+        - append new - [ ] TASK rows; delta After-state / AC / Verification
+        - record an ## Amendments entry
+        - re-validate the DELTA only (full: /spec-validate; normal: advisory)
+        │  Status STAYS VALIDATED (no drop to DRAFT)
+        ▼
+   /user:next  ──▶  BUILDING (resume; runs only the amended tasks)
+```
+
 ---
 
 ## 6. Opt-in side-flows (8)
diff --git a/docs/PLAYBOOK.md b/docs/PLAYBOOK.md
index 4ef7574..74a6e9e 100644
--- a/docs/PLAYBOOK.md
+++ b/docs/PLAYBOOK.md
@@ -201,6 +201,43 @@ straight into a spec is how scope drift starts.
 
 ---
 
+## Scenario 7: mid-build, "also do Y" (a mid-flight scope change)
+
+**Context.** You are mid-`/user:execute` on a `VALIDATED` spec (state BUILDING). Partway through,
+the work reveals scope that must be added now. You do **not** want to restart the lane or throw
+away the tasks already done.
+
+**What you say.** "also do Y", "while you're in here, add Z", "amend the spec to cover Y".
+
+**Automatic vs interpreted.**
+- *Not a keyword trigger.* No hook watches for "also do". Nothing auto-fires.
+- *Interpreted*: Claude recognizes this as a **mid-flight amend** and runs the declared
+  amend micro-loop (BUILDING -> SPECIFYING -> BUILDING) instead of silently editing the spec or
+  starting over. The full rule (when you may amend, the checkpoint guard, the recorded entry,
+  how to resume) is canonical in **`WORKFLOW.md` "## Mid-flight amend"**; this card is only the
+  what-you-say -> what-happens projection of it.
+
+**Resulting flow.** Amend in place at a checkpoint, then resume; the spec stays `VALIDATED`:
+```text
+  "also do Y"  (mid /user:execute, spec is VALIDATED)
+     -> reach a task checkpoint (finish + verify + commit the in-flight task first)
+     -> amend the spec: append new - [ ] TASK rows + delta the After-state / AC / Verification
+        (completed - [x] tasks are NOT touched), record an ## Amendments entry
+     -> re-validate the DELTA only (full lane: /spec-validate on the new tasks; normal: advisory)
+     -> /user:next  resumes, picking the next undone - [ ] task (skips the done rows)
+```
+
+**Decision / approval points.** You confirm the added scope before the amend lands; the
+delta-only re-validation verdict (full lane). The Status never drops to `DRAFT` (that would be a
+lane restart), and an amend only **adds** scope: rewriting an already-done task's contract is a
+heavier re-open decision, not an amend. See `WORKFLOW.md` "## Mid-flight amend" for the four
+invariants in full.
+
+**How to continue.** Say `next` to resume on the amended tasks. To add yet more scope later,
+repeat: each amend appends a fresh `## Amendments` line. To leave the build instead, `ship` it.
+
+---
+
 ## 8. The freeform -> ID front door (what `/user:assign` does internally)
 
 Scenarios 2 and 5 are freeform (no ID). The orchestration is ID-first, so `/user:assign` mints the

From e2a9b5c2e50fe0147a5346a648fcf6c783342e98 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 01:15:26 +0700
Subject: [PATCH 37/56] test(meta): pin the mid-flight amend convention

Add a "Mid-flight amend convention" section to tests/test-meta.sh with four
assertions guarding the BUILDING -> SPECIFYING -> BUILDING amend convention
across its four surfaces: execute.md (amend + checkpoint), WORKFLOW.md (the
canonical "Mid-flight amend" rule), spec.md (the optional "## Amendments"
section), and operating-layer-vision.md (the BUILDING -> SPECIFYING transition
row in the table). Total assertions 250 -> 254.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 tests/test-meta.sh | 52 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)

diff --git a/tests/test-meta.sh b/tests/test-meta.sh
index 0bcff96..e0d72bb 100755
--- a/tests/test-meta.sh
+++ b/tests/test-meta.sh
@@ -1049,6 +1049,58 @@ else
   FAIL=$((FAIL + 1))
 fi
 
+# ============================================================
+echo ""
+echo "=== Mid-flight amend convention (SPEC-027) ==="
+# ============================================================
+# Pin the BUILDING -> SPECIFYING -> BUILDING amend convention across its four
+# surfaces so a wording flip on any of them fails CI. WORKFLOW.md is the canonical
+# home of the rule; the other three are projections/the model row that point at it.
+
+# (a) execute.md reroutes the "don't modify the spec" anti-pattern to the declared
+# amend path: it must reference BOTH "amend" and "checkpoint".
+TOTAL=$((TOTAL + 1))
+if grep -qF 'amend' "$KIT_DIR/commands/execute.md" 2>/dev/null \
+   && grep -qF 'checkpoint' "$KIT_DIR/commands/execute.md" 2>/dev/null; then
+  echo -e "  ${GREEN}PASS${NC} execute.md references the amend path (amend + checkpoint) (SPEC-027)"
+  PASS=$((PASS + 1))
+else
+  echo -e "  ${RED}FAIL${NC} execute.md lost the amend path (needs amend + checkpoint)"
+  FAIL=$((FAIL + 1))
+fi
+
+# (b) WORKFLOW.md is the canonical home: it must carry the "Mid-flight amend" rule.
+TOTAL=$((TOTAL + 1))
+if grep -qF 'Mid-flight amend' "$KIT_DIR/WORKFLOW.md" 2>/dev/null; then
+  echo -e "  ${GREEN}PASS${NC} WORKFLOW.md documents the Mid-flight amend rule (SPEC-027, canonical)"
+  PASS=$((PASS + 1))
+else
+  echo -e "  ${RED}FAIL${NC} WORKFLOW.md lost the Mid-flight amend rule"
+  FAIL=$((FAIL + 1))
+fi
+
+# (c) spec.md documents the optional on-demand "## Amendments" provenance section.
+TOTAL=$((TOTAL + 1))
+if grep -qF '## Amendments' "$KIT_DIR/commands/spec.md" 2>/dev/null; then
+  echo -e "  ${GREEN}PASS${NC} spec.md documents the '## Amendments' section (SPEC-027)"
+  PASS=$((PASS + 1))
+else
+  echo -e "  ${RED}FAIL${NC} spec.md lost the '## Amendments' section"
+  FAIL=$((FAIL + 1))
+fi
+
+# (d) operating-layer-vision.md carries the BUILDING -> SPECIFYING amend transition
+# row in §3.3. Pin the whole row (From cell BUILDING, the amend trigger, To cell
+# SPECIFYING) so the model stays legible; brittle-proofed via the full-row regex.
+TOTAL=$((TOTAL + 1))
+if grep -qE '\| BUILDING \|.*amend the spec.*\| SPECIFYING' "$KIT_DIR/docs/operating-layer-vision.md" 2>/dev/null; then
+  echo -e "  ${GREEN}PASS${NC} operating-layer-vision.md has the BUILDING -> SPECIFYING amend row (SPEC-027)"
+  PASS=$((PASS + 1))
+else
+  echo -e "  ${RED}FAIL${NC} operating-layer-vision.md lost the BUILDING -> SPECIFYING amend transition row"
+  FAIL=$((FAIL + 1))
+fi
+
 # ============================================================
 echo ""
 echo "=== Results ==="

From 3287581289e6c336fe46a12e683363ea0cd3598a Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 01:16:51 +0700
Subject: [PATCH 38/56] docs(spec): mark SPEC-027 tasks verified

All six tasks passed the verification pipeline (task-verifier PASS each).
test-meta 254/254, test-hooks 92/92.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 docs/specs/SPEC-027-mid-flight-spec-amend.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/docs/specs/SPEC-027-mid-flight-spec-amend.md b/docs/specs/SPEC-027-mid-flight-spec-amend.md
index c6c0ab2..c498292 100644
--- a/docs/specs/SPEC-027-mid-flight-spec-amend.md
+++ b/docs/specs/SPEC-027-mid-flight-spec-amend.md
@@ -82,16 +82,16 @@ None. This is a convention + doc + one meta-test change. No hook code, no new co
 ## Task Breakdown
 
 ### Phase 1: Declare the transition (model + rules)
-- [ ] TASK-001: Add the `BUILDING -> SPECIFYING` amend row to `docs/operating-layer-vision.md` §3.3 transition table (trigger "also do Y" / "amend"; guard "at a task checkpoint, completed tasks frozen, Status stays VALIDATED"; To `SPECIFYING`), and add the return `SPECIFYING -> BUILDING (resume)` semantics note. Mark §5 gap-analysis row "Mid-flight spec amend" as closed (-> SPEC-027), matching the ID-022 -> SPEC-026 style., AC: `rg "BUILDING -> SPECIFYING" docs/operating-layer-vision.md` matches in §3.3; §5 row references SPEC-027; no other §5 row altered.
-- [ ] TASK-002: Add the canonical mid-flight amend rule to `WORKFLOW.md` (the micro-loop, the checkpoint guard, the frozen-completed-tasks + add-only invariants, Status-stays-VALIDATED, resume via `/next` picking the next undone row, the `## Amendments` record)., AC: `WORKFLOW.md` contains a "mid-flight" / "amend" subsection naming all four invariants; `bash tests/test-meta.sh` still green.
+- [x] TASK-001 (DONE: 17e411a, verified): Add the `BUILDING -> SPECIFYING` amend row to `docs/operating-layer-vision.md` §3.3 transition table (trigger "also do Y" / "amend"; guard "at a task checkpoint, completed tasks frozen, Status stays VALIDATED"; To `SPECIFYING`), and add the return `SPECIFYING -> BUILDING (resume)` semantics note. Mark §5 gap-analysis row "Mid-flight spec amend" as closed (-> SPEC-027), matching the ID-022 -> SPEC-026 style., AC: `rg "BUILDING -> SPECIFYING" docs/operating-layer-vision.md` matches in §3.3; §5 row references SPEC-027; no other §5 row altered.
+- [x] TASK-002 (DONE: 704092b, verified): Add the canonical mid-flight amend rule to `WORKFLOW.md` (the micro-loop, the checkpoint guard, the frozen-completed-tasks + add-only invariants, Status-stays-VALIDATED, resume via `/next` picking the next undone row, the `## Amendments` record)., AC: `WORKFLOW.md` contains a "mid-flight" / "amend" subsection naming all four invariants; `bash tests/test-meta.sh` still green.
 
 ### Phase 2: Wire the operator-facing surfaces (projections)
-- [ ] TASK-003: Reword `commands/execute.md` so the anti-pattern "Do NOT modify the spec without asking" and the "Spec ambiguity discovered" / "Task is too large" branches point at the declared amend path (amend at a checkpoint, record in `## Amendments`, resume), instead of only "stop and ask". The no-silent-mutation rule is preserved (amend != silent edit)., AC: `commands/execute.md` references "amend" + "checkpoint"; it no longer instructs a flat "do NOT modify the spec" with no escape hatch; existing tests green.
-- [ ] TASK-004: Document the optional on-demand `## Amendments` section in `commands/spec.md`'s template prose (alongside the other optional sections), with the `AMEND-NNN: date | what | why | at checkpoint | new tasks | re-validated` shape., AC: `commands/spec.md` mentions `## Amendments` and the AMEND entry shape; the section is described as optional/on-demand (no empty scaffold).
-- [ ] TASK-005: Render Scenario 7 in `docs/PLAYBOOK.md` (operator card: "also do Y" mid-build -> the amend micro-loop) and add the loop view to `docs/ORCHESTRATION.md`; both point at WORKFLOW.md as canonical, not restating the rule., AC: `rg -i "also do|mid-flight|amend" docs/PLAYBOOK.md docs/ORCHESTRATION.md` matches in both; both link/point to WORKFLOW.md.
+- [x] TASK-003 (DONE: 7c6eaae, verified): Reword `commands/execute.md` so the anti-pattern "Do NOT modify the spec without asking" and the "Spec ambiguity discovered" / "Task is too large" branches point at the declared amend path (amend at a checkpoint, record in `## Amendments`, resume), instead of only "stop and ask". The no-silent-mutation rule is preserved (amend != silent edit)., AC: `commands/execute.md` references "amend" + "checkpoint"; it no longer instructs a flat "do NOT modify the spec" with no escape hatch; existing tests green.
+- [x] TASK-004 (DONE: c11b7af, verified): Document the optional on-demand `## Amendments` section in `commands/spec.md`'s template prose (alongside the other optional sections), with the `AMEND-NNN: date | what | why | at checkpoint | new tasks | re-validated` shape., AC: `commands/spec.md` mentions `## Amendments` and the AMEND entry shape; the section is described as optional/on-demand (no empty scaffold).
+- [x] TASK-005 (DONE: 2db948f, verified): Render Scenario 7 in `docs/PLAYBOOK.md` (operator card: "also do Y" mid-build -> the amend micro-loop) and add the loop view to `docs/ORCHESTRATION.md`; both point at WORKFLOW.md as canonical, not restating the rule., AC: `rg -i "also do|mid-flight|amend" docs/PLAYBOOK.md docs/ORCHESTRATION.md` matches in both; both link/point to WORKFLOW.md.
 
 ### Phase 3: Pin the convention
-- [ ] TASK-006: Add a `tests/test-meta.sh` section "Mid-flight amend convention" asserting (a) `commands/execute.md` references the amend path (`amend` + `checkpoint`), (b) `WORKFLOW.md` documents the mid-flight amend rule, (c) `commands/spec.md` documents the `## Amendments` section, (d) `docs/operating-layer-vision.md` has the `BUILDING -> SPECIFYING` transition row., AC: the four new assertions PASS; `bash tests/test-meta.sh` exits 0 with an increased total.
+- [x] TASK-006 (DONE: e2a9b5c, verified): Add a `tests/test-meta.sh` section "Mid-flight amend convention" asserting (a) `commands/execute.md` references the amend path (`amend` + `checkpoint`), (b) `WORKFLOW.md` documents the mid-flight amend rule, (c) `commands/spec.md` documents the `## Amendments` section, (d) `docs/operating-layer-vision.md` has the `BUILDING -> SPECIFYING` transition row., AC: the four new assertions PASS; `bash tests/test-meta.sh` exits 0 with an increased total.
 
 ## After state
 The definition-of-done picture. Each is false now, true after, and checkable.

From 1be1b5431ed0692eb4a7c03dd7a6491218a13423 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 01:22:29 +0700
Subject: [PATCH 39/56] fix(workflow): require operator approval before a
 mid-flight amend

Review (HIGH) found the amend path had no human-approval gate in the canonical
WORKFLOW rule or the autonomous /execute orchestrator, while the PLAYBOOK card
already promised one. An autonomous loop could self-expand scope, contradicting
AGENTS.md zone-4 Pause-if and the kit's anti-scope-drift stance. Folded
operator-approval into invariant #3 (kept four invariants, no fifth) and added
"confirm the added scope with the user" to execute.md's two amend branches.
Recorded as SPEC-027 DEC-008. test-meta 254/254, test-hooks 92/92.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 WORKFLOW.md                                  | 11 +++++++----
 commands/execute.md                          |  4 ++--
 docs/specs/SPEC-027-mid-flight-spec-amend.md |  1 +
 3 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/WORKFLOW.md b/WORKFLOW.md
index 2d54118..bcf0bb7 100644
--- a/WORKFLOW.md
+++ b/WORKFLOW.md
@@ -94,10 +94,13 @@ The amend is governed by four invariants:
   rewrite an already-done (`- [x]`) task's contract; the `- [x]` rows are
   byte-for-byte unchanged. Rewriting a done task is the heavier re-open / re-spec
   path, not an amend.
-- **Recorded at a checkpoint, not mid-worker.** The amend happens between tasks:
-  the in-flight task is verified and committed first (or no task is in flight).
-  Record it as an entry in the spec's `## Amendments` section (optional,
-  on-demand; see `commands/spec.md`), one line per amend:
+- **Recorded at a checkpoint, operator-approved, not mid-worker.** The amend
+  happens between tasks: the in-flight task is verified and committed first (or no
+  task is in flight). Adding scope is an operator decision, never the loop's: an
+  autonomous `/user:execute` pauses for the operator to approve the added scope
+  before the amend lands (this is AGENTS.md zone 4 "Pause if", a scope / risk /
+  architecture change). Then record it as an entry in the spec's `## Amendments`
+  section (optional, on-demand; see `commands/spec.md`), one line per amend:
   `AMEND-NNN: date | what | why | at which checkpoint | new tasks | re-validated`.
 - **Resume leads with `/user:next`, not a fresh `/user:execute`.** `/next` picks
   the next undone `- [ ]` task and skips `- [x]` done rows, so resume runs only
diff --git a/commands/execute.md b/commands/execute.md
index 8aff916..84b85d8 100644
--- a/commands/execute.md
+++ b/commands/execute.md
@@ -233,8 +233,8 @@ After all phases complete:
 
 - **Worker fails to complete**: Run task-verifier anyway on whatever exists. The verifier determines if partial work is salvageable (FAIL:fixable) or needs human input (FAIL:escalate).
 - **Tests break during execution**: task-verifier catches this. If fixable, fix-agent handles it. If not, escalate.
-- **Spec ambiguity discovered**: If it is a genuine contradiction (the spec disagrees with itself), stop and ask the user to clarify. Do not guess. Do not dispatch fix-agent for spec problems. If instead the work reveals scope that must be ADDED now ("also do Y"), that is the declared mid-flight amend path, not an ambiguity: amend at a checkpoint (append `- [ ]` tasks, record an `## Amendments` entry) and resume with `/user:next` (see WORKFLOW.md "## Mid-flight amend").
-- **Task is too large**: Split it into subtasks. If the split stays within the task's declared scope, confirm with user, then dispatch. If splitting means ADDING scope beyond the spec, route it through the mid-flight amend path (amend at a checkpoint, then resume with `/user:next`; see WORKFLOW.md "## Mid-flight amend").
+- **Spec ambiguity discovered**: If it is a genuine contradiction (the spec disagrees with itself), stop and ask the user to clarify. Do not guess. Do not dispatch fix-agent for spec problems. If instead the work reveals scope that must be ADDED now ("also do Y"), that is the declared mid-flight amend path, not an ambiguity: confirm the added scope with the user first (adding scope is not the loop's call), then amend at a checkpoint (append `- [ ]` tasks, record an `## Amendments` entry) and resume with `/user:next` (see WORKFLOW.md "## Mid-flight amend").
+- **Task is too large**: Split it into subtasks. If the split stays within the task's declared scope, confirm with user, then dispatch. If splitting means ADDING scope beyond the spec, confirm the added scope with the user, then route it through the mid-flight amend path (amend at a checkpoint, then resume with `/user:next`; see WORKFLOW.md "## Mid-flight amend").
 - **fix-agent reports it cannot fix an issue**: Escalate immediately. Don't retry with the same fix-agent.
 
 ## Anti-patterns to avoid
diff --git a/docs/specs/SPEC-027-mid-flight-spec-amend.md b/docs/specs/SPEC-027-mid-flight-spec-amend.md
index c498292..ae0b743 100644
--- a/docs/specs/SPEC-027-mid-flight-spec-amend.md
+++ b/docs/specs/SPEC-027-mid-flight-spec-amend.md
@@ -135,6 +135,7 @@ Plus the After-state greps above (each is a real command).
 - DEC-005: The amend rule is canonical in `WORKFLOW.md`; the vision doc carries the model row, PLAYBOOK/ORCHESTRATION/execute point at it. Rationale: source-of-truth hierarchy; avoid the four-copies drift the kit repeatedly fights. (Build:) the meta-test asserts the rule is present, not duplicated.
 - DEC-006 (validation): Resume after an amend leads with `/user:next`, not a fresh `/user:execute`. Rationale: spec-validate verified that `/user:next` skips `- [x]` done rows and picks the next undone task, while `/user:execute` re-parses and re-presents the *whole* plan; only `/user:next` makes the "completed work frozen, only new tasks run" promise hold. Found by Reviewer 2/3; the original draft over-claimed `/user:execute`.
 - DEC-007 (validation): An amend done without an `## Amendments` entry is undetected in v1 (documented as a known limitation, Edge case 6), not enforced. Rationale: honesty rule (never over-claim portable enforcement); B/C were rejected, so v1 has no guard. Wiring it into the completeness log is a future option. Found by Reviewer 2.
+- DEC-008 (review): The amend requires explicit operator approval of the added scope before it lands, folded into invariant #3 ("operator-approved" in the checkpoint guard) in WORKFLOW.md + commands/execute.md. Rationale: `/user:execute` runs autonomously; adding scope is an AGENTS.md zone-4 Pause-if (scope/architecture) decision, never the loop's. The PLAYBOOK card already promised the gate; the canonical rule + the orchestrator had omitted it (a cross-surface inconsistency + an unguarded self-amend path). Found by `/user:review` (HIGH); folded into #3 to keep "four invariants" intact (no fifth).
 
 ## Open questions
 (none; a /goal loop appends here if it hits a decision this spec does not cover, then stops)

From 4b591db17a75bb54f1656fb90d7dea033fe342f5 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 01:25:40 +0700
Subject: [PATCH 40/56] docs: record the mid-flight amend convention in MANUAL
 and CHANGELOG

MANUAL: note the optional ## Amendments section under /user:spec, the
operator-approved amend path under /user:execute, and that /user:next is the
resume door after an amend. CHANGELOG [Unreleased]: the SPEC-027 entry.
doc-verifier PASS (14/14 claims). test-meta 254/254, test-hooks 92/92.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 CHANGELOG.md | 1 +
 MANUAL.md    | 5 +++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index bc0f927..66813a4 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -10,6 +10,7 @@ All notable changes to dwarves-kit are documented here.
 
 ### Added
 
+- **Mid-flight spec amend: a declared path to add scope to a building spec without restarting the lane (SPEC-027).** Closes the `BUILDING -> SPECIFYING -> BUILDING` gap (operating-layer-vision Scenario 7): when a build reveals scope that must be added now ("also do Y"), the operator approves, then the spec is amended in place at a task checkpoint instead of being silently mutated or the lane restarted. Convention, not a new command or hook (Approach A; PHILOSOPHY "earn the abstraction"). Four invariants, canonical in `WORKFLOW.md` "## Mid-flight amend": no lane restart (`Status:` stays VALIDATED, only the delta re-validated), add-only (completed `- [x]` tasks frozen), recorded + operator-approved at a checkpoint (an optional on-demand `## Amendments` provenance section in the spec), resume with `/user:next` (not a fresh `/user:execute`). Projected into `docs/operating-layer-vision.md` §3.3 (the transition row) + §5 (gap closed), `commands/execute.md` (the no-silent-mutation anti-pattern now points at the amend path), `commands/spec.md` (the `## Amendments` template section), `docs/PLAYBOOK.md` Scenario 7 + `docs/ORCHESTRATION.md` 5.4 (operator projections), and pinned by 4 `tests/test-meta.sh` assertions (meta suite to 254). `/user:review` (HIGH) added the operator-approval gate the canonical rule had omitted (DEC-008). Source: SPEC-027 / ID-023.
 - **Kit-root `AGENTS.md`: a tool-agnostic operate-contract front door (SPEC-024).** A portable, four-zone operating layer (the same shape any agent runner can read) that fronts the kit's behavioral contract; `CLAUDE.md` and `WORKFLOW.md` now point at it so the cycle and house rules have one neutral entry point instead of being CLAUDE-only. Ships a downstream template at `examples/hello-spec/AGENTS.md` so a consuming project starts with the same front door. Source: SPEC-024.
 - **`backfill` brownfield lane (SPEC-024)**: an intake lane for adopting the kit into an existing repo (bring the operate-contract, specs convention, and guardrails onto code that predates them), recorded in `WORKFLOW.md` alongside the existing intake lanes.
 - **Six-section goal projection in `/user:assign`**: the goal draft `/user:assign` writes now projects the backlog item across six fixed sections (Context-to-read / Constraints / Operating rules / Validation loop / Done-when / Pause-if) sourced from the `AGENTS.md` zones + the active spec, so a handed-off goal carries fuller context into its lane.
diff --git a/MANUAL.md b/MANUAL.md
index 0034918..cce6188 100644
--- a/MANUAL.md
+++ b/MANUAL.md
@@ -72,7 +72,7 @@ Opt-in lane between `/user:spec-validate` and `/user:execute`. Reads the active
 **Writes:** `docs/specs/SPEC-NNN-<slug>.md` (Status: DRAFT), `docs/research/{stack,features,architecture,pitfalls}.md`
 **When to invoke:** after `/think`, or directly if the work is well-scoped already
 **Common gotcha:** the research agents are parallel-dispatched via Task tool. If your Claude Code is older than v2.0.60, they fall back to inline research and the run is slower.
-**Template sections:** the generated spec scaffolds Solution depth (approaches / chosen + why / extensibility, SPEC-008), plus an optional `### Interfaces (I/O contract)` under Technical Design and an optional `## Failure modes` table (SPEC-009). Both optional sections are lane-scoped; Reviewers 2 and 5 check them when present. It also pins `## Verification` (the command(s) that prove the spec done) and `## Open questions` (the blocker landing zone a `/goal` loop appends to), so a validated spec is natively pointer-`/goal`-ready (SPEC-012 P1).
+**Template sections:** the generated spec scaffolds Solution depth (approaches / chosen + why / extensibility, SPEC-008), plus an optional `### Interfaces (I/O contract)` under Technical Design and an optional `## Failure modes` table (SPEC-009). Both optional sections are lane-scoped; Reviewers 2 and 5 check them when present. It also pins `## Verification` (the command(s) that prove the spec done) and `## Open questions` (the blocker landing zone a `/goal` loop appends to), so a validated spec is natively pointer-`/goal`-ready (SPEC-012 P1). An optional, on-demand `## Amendments` section (added only when a mid-flight amend happens, never an empty scaffold) records add-scope provenance during a build (SPEC-027).
 
 ### `/user:spec-validate`
 
@@ -90,6 +90,7 @@ Opt-in lane between `/user:spec-validate` and `/user:execute`. Reads the active
 **Dispatches:** worker subagent per task, then task-verifier, then fix-agent on FAIL:fixable (retry max 2)
 **When to invoke:** when handing off to a contractor OR running the kit on yourself end-to-end
 **Common gotcha:** verification adds ~2x token cost per task. Worth it for the FAIL:fixable catch rate; budget accordingly. Each worker first expands its task into bite-sized verify-each-step increments (TDD when a unit test fits; grep/bash/test-suite verify for doc and config tasks) before coding.
+**Mid-flight amend:** if a build reveals scope that must be added now ("also do Y"), do not silently edit the spec or restart the lane. With your approval, amend at a task checkpoint (append `- [ ]` tasks, record an `## Amendments` entry, Status stays VALIDATED) and resume with `/user:next`. The canonical rule is WORKFLOW.md "## Mid-flight amend"; the operator card is PLAYBOOK Scenario 7 (SPEC-027).
 
 ### `/user:next`
 
@@ -97,7 +98,7 @@ Opt-in lane between `/user:spec-validate` and `/user:execute`. Reads the active
 **Reads:** `docs/specs/SPEC-NNN-<slug>.md`
 **Writes:** code, tests; you drive the verification yourself
 **When to invoke:** when you want hands-on control or the next task needs subtle judgment that the verification pipeline might over-correct on
-**Common gotcha:** picks the next unchecked task only. To skip a task or pick a specific one, edit SPEC.md task ordering first.
+**Common gotcha:** picks the next unchecked task only. To skip a task or pick a specific one, edit SPEC.md task ordering first. This unchecked-only behavior is also why `/user:next` (not a fresh `/user:execute`) is the way to resume after a mid-flight amend: it runs the newly appended tasks and skips the done rows (SPEC-027).
 
 ### `/user:debug`
 

From 3ac30dcbc0a9341c98d10d6c28084b7055c49b27 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 08:08:54 +0700
Subject: [PATCH 41/56] chore: mark SPEC-027 shipped, drop ID-023 off the
 active queue

Local ship (no version bump / tag / PR per the maintainer's call; the branch
stacks on unmerged PR #7 and 1.6.0 is cut-but-untagged). SPEC-027 -> SHIPPED;
ID-023 -> shipped in CHANGELOG [Unreleased]. CHANGELOG is the shipped record.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 _meta/BACKLOG.md                             | 2 +-
 docs/specs/SPEC-027-mid-flight-spec-amend.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/_meta/BACKLOG.md b/_meta/BACKLOG.md
index 9c30419..6f9bbf0 100644
--- a/_meta/BACKLOG.md
+++ b/_meta/BACKLOG.md
@@ -42,7 +42,7 @@ Lane = the WORKFLOW.md risk tier (`tiny` / `normal` / `full`).
 | ID-020 | Verification-pipeline absence-checks: teach task-verifier + integration-checker to assert that removed/replaced content is GONE for "replace, don't duplicate" / "remove X" tasks, not just that the new artifact exists. A replace-task left both copies and passed BOTH verifiers this cycle; only the independent review caught it | SPEC-024 retro 2026-05-21 (pipeline blind spot) | TBD | normal | queued |
 | ID-021 | Reduce execute-worker shell/hook friction: pre-warn the recurring gotchas in `commands/execute.md`'s worker template (fish `noclobber` -> `>|`; no heredoc commit `-m` -> `git commit -F`; no `rm` -> `mv`), and investigate the commit-format/commit-msg heredoc mis-parse (it read a multi-line `-m` body as a 459-char subject) | SPEC-024 retro 2026-05-21 (worker friction) | TBD | normal | queued |
 | ID-022 | Freeform front door: extend `/user:assign` to accept freeform intent (not only `ID-NNN`), auto-allocating an ID + BACKLOG row so "apply SDD to X" / a vague brief is a native one-shot intake (unparks the SPEC-024-deferred griller entry; preserves ID-first traceability) | PLAYBOOK.md scenarios 2026-05-22 | SPEC-026 | normal | validated |
-| ID-023 | Mid-flight spec amend: a path to add scope to a VALIDATED / in-build spec without restarting the lane (state machine transition BUILDING -> SPECIFYING) | operating-layer-vision gap analysis 2026-05-22 | SPEC-027 | normal | executing |
+| ID-023 | Mid-flight spec amend: a path to add scope to a VALIDATED / in-build spec without restarting the lane (state machine transition BUILDING -> SPECIFYING) | operating-layer-vision gap analysis 2026-05-22 | SPEC-027 | normal | shipped (CHANGELOG [Unreleased]; local branch feat/mid-flight-amend, no tag/PR) |
 | ID-024 | Context-switch affordance across specs (worktree-per-spec) + an explicit ABANDONED terminal (state hygiene; vs the existing `parked`) | operating-layer-vision gap analysis 2026-05-22 | TBD | normal | queued |
 | ID-025 | Re-open-shipped convention: a follow-up spec spun from a SHIPPED one (state machine transition SHIPPED -> TRIAGING) | operating-layer-vision gap analysis 2026-05-22 | TBD | tiny | queued |
 Dependency notes:
diff --git a/docs/specs/SPEC-027-mid-flight-spec-amend.md b/docs/specs/SPEC-027-mid-flight-spec-amend.md
index ae0b743..ab94d67 100644
--- a/docs/specs/SPEC-027-mid-flight-spec-amend.md
+++ b/docs/specs/SPEC-027-mid-flight-spec-amend.md
@@ -1,6 +1,6 @@
 # Spec: Mid-flight spec amend (BUILDING -> SPECIFYING -> BUILDING)
 Generated: 2026-05-22
-Status: VALIDATED
+Status: SHIPPED
 
 Source: `docs/operating-layer-vision.md` Scenario 7 + the Section 5 gap-analysis row "Mid-flight spec amend"; ID-023. Goal draft: `.claude/goals/mid-flight-spec-amend.md`.
 

From 438724fb8160407404e1728db626455638e784c4 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 08:10:06 +0700
Subject: [PATCH 42/56] docs(retro): capture the SPEC-027 mid-flight amend
 cycle

Signal: spec-validate caught a correctness defect pre-build; review caught a
missing approval-gate the verifiers passed (ID-020 pattern recurs); release
hygiene is a two-cycle recurrence (SPEC-018 + SPEC-027) and the top finding.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 .../RETRO-2026-05-22-mid-flight-spec-amend.md | 34 +++++++++++++++++++
 1 file changed, 34 insertions(+)
 create mode 100644 docs/retro/RETRO-2026-05-22-mid-flight-spec-amend.md

diff --git a/docs/retro/RETRO-2026-05-22-mid-flight-spec-amend.md b/docs/retro/RETRO-2026-05-22-mid-flight-spec-amend.md
new file mode 100644
index 0000000..a5fc18c
--- /dev/null
+++ b/docs/retro/RETRO-2026-05-22-mid-flight-spec-amend.md
@@ -0,0 +1,34 @@
+# Retro: SPEC-027 mid-flight spec amend (ID-023)
+Date: 2026-05-22
+Sprint: single-session dogfood, assign -> spec -> spec-validate -> execute -> review -> docs -> ship
+
+## Metrics
+- Tasks planned: 6, completed: 6, deferred: 0 (the 6 TASK rows; the 8 open `- [ ]` are After-state / global-AC observable checkboxes, not tasks).
+- Commits: 10 (9 build + 1 ship bookkeeping), atomic, conventional, no spec IDs in subjects.
+- Files changed: 11 (+177 / -14). No new command, no new hook (Approach A: convention).
+- Verification: 6/6 task-verifier PASS, 0 retries, 0 escalations; integration-checker PASS; doc-verifier PASS (14/14 claims). meta 254/254 (+4), hooks 92/92.
+- Completeness log: clean. Doc-impact: every applicable companion moved (MANUAL + test-meta; no new command, so no README-table / plugin / marketplace impact).
+
+## Key commits
+- `17e411a` vision transition row; `704092b` canonical WORKFLOW rule; `7c6eaae` execute reword; `e2a9b5c` meta-test pin; `1be1b54` the review-driven approval-gate fix (DEC-008).
+
+## What worked
+- **`/spec-validate` caught a real correctness defect before any build.** The draft claimed resume "via `/user:execute` (or `/user:next`)"; verifying against `commands/next.md` vs `commands/execute.md` showed only `/next` skips `[x]` done rows. Fixed as DEC-006 in the spec, so the build inherited the correct rule. The adversarial pass earned its keep on the kit's own spec, not just downstream ones.
+- **`/review` caught a HIGH that all the verifiers passed.** task-verifier (x6) and integration-checker both PASSED the build, yet the independent review found the amend path had no operator-approval gate in the canonical rule or the autonomous orchestrator, while the PLAYBOOK card already promised one (a cross-surface inconsistency + an unguarded autonomous self-amend). This is the SPEC-024 retro's ID-020 finding recurring: review catches a class the verifiers structurally cannot (a missing guard is not a failed assertion).
+- **Source-of-truth discipline held across 5 independent workers.** DEC-005 (canonical rule in WORKFLOW.md, every other surface points at it) survived 5 separate worker subagents writing 6 surfaces; the integration-checker confirmed zero four-copies drift. The "point, do not restate" instruction in each worker prompt was the load-bearing control.
+- **Disjoint-file parallel dispatch in Phase 2 was clean and faster.** TASK-003/004/005 touched execute.md / spec.md / (PLAYBOOK+ORCHESTRATION) with no overlap; dispatched in parallel, no conflicts, each verified independently.
+
+## What hurt
+- **Release-hygiene tangle (recurring).** `VERSION`/`plugin.json` = 1.6.0 but latest tag = v1.5.1 (cut-but-untagged), `[Unreleased]` mixes unmerged PR #7 (SPEC-024) with this cycle, branch 42 commits ahead of `master`. The `/ship` gate had to stop and ask because the state was ambiguous. The SPEC-018 (placement-ui-design) retro flagged this exact failure mode; its recurrence clears the PHILOSOPHY section-5 bar for a guard.
+- **The kit prescribes sequential worker dispatch.** `commands/execute.md` says "Execute them one at a time (sequential dispatch; parallel dispatch is a future upgrade)." Phase 2's three tasks were disjoint-file independent, so sequential was pure latency tax; the dogfood deviated to parallel. The instruction does not acknowledge the safe disjoint-file case.
+- **The spec underspecified the human-approval gate.** SPEC-027's "Key invariants" listed four and omitted operator-approval; the PLAYBOOK worker reasonably added "you confirm the added scope," which is what created the cross-surface inconsistency review then caught. The autonomy-gate angle (does this let an autonomous loop make a scope/architecture decision unattended?) was not in any `/spec-validate` lens's focus for a docs/convention spec, so validate missed it and review had to.
+- **Auto-format left punctuation artifacts.** The slop-cleaner / auto-format converted the spec's TASK-line ` — AC:` em dashes to `., AC:`, producing slightly clumsy "style., AC:" reads. Cosmetic, in the spec doc only, but the em-dash replacement does not always pick graceful punctuation.
+
+## Action items
+- [ ] **Release-hygiene guard (recurring, clears the PHILOSOPHY bar).** A `kit-health` line or a hook that flags `VERSION` cut-but-untagged AND `[Unreleased]` spanning more than one spec. Relates to the ID-016 guard-promotion theme. -> propose BACKLOG.
+- [ ] **execute.md: acknowledge disjoint-file parallel dispatch.** Note that independent, non-overlapping-file tasks may be dispatched in parallel; sequential is the safe default only for shared-file or dependent tasks. -> propose BACKLOG (tiny) or a one-line execute.md edit.
+- [ ] **spec-validate: an autonomy/approval-gate lens.** For any spec whose behavior runs inside an autonomous loop (`/execute`, `/goal`), a check: "does this let the loop make a scope / architecture / risk decision without a human gate?" Would have caught this cycle's HIGH at validate, not review. -> propose BACKLOG.
+- [ ] **LOW: PLAYBOOK scenario-numbering reconcile** (already in TODOS.md): no Scenario 6 card; sections 8-11 use bare numbers on the scenario axis. -> propose BACKLOG (tiny).
+
+## Kit feedback
+The dogfood is positive evidence for the layered verification design: each layer caught a different bug class. `/spec-validate` caught a factual-correctness claim (the resume command); `/review` caught a missing guard + cross-surface inconsistency that the per-task and integration verifiers structurally cannot see; `doc-verifier` confirmed the docs matched the shipped code. The "review catches what verifiers miss" pattern recurred (ID-020 from the SPEC-024 retro), strengthening that backlog item. The one process smell the kit's own machinery did NOT prevent was release hygiene, which is now a two-cycle recurrence and the highest-signal kit finding here.

From 1c3b9131b8db5d476668561e9d44f665b62d6229 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 08:15:16 +0700
Subject: [PATCH 43/56] chore(backlog): add ID-026..028 from the SPEC-027 retro

ID-026 release-hygiene guard (recurring, top signal), ID-027 spec-validate
autonomy-gate lens, ID-028 execute.md disjoint-file parallel-dispatch note.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 _meta/BACKLOG.md | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/_meta/BACKLOG.md b/_meta/BACKLOG.md
index 6f9bbf0..bdc6343 100644
--- a/_meta/BACKLOG.md
+++ b/_meta/BACKLOG.md
@@ -45,6 +45,9 @@ Lane = the WORKFLOW.md risk tier (`tiny` / `normal` / `full`).
 | ID-023 | Mid-flight spec amend: a path to add scope to a VALIDATED / in-build spec without restarting the lane (state machine transition BUILDING -> SPECIFYING) | operating-layer-vision gap analysis 2026-05-22 | SPEC-027 | normal | shipped (CHANGELOG [Unreleased]; local branch feat/mid-flight-amend, no tag/PR) |
 | ID-024 | Context-switch affordance across specs (worktree-per-spec) + an explicit ABANDONED terminal (state hygiene; vs the existing `parked`) | operating-layer-vision gap analysis 2026-05-22 | TBD | normal | queued |
 | ID-025 | Re-open-shipped convention: a follow-up spec spun from a SHIPPED one (state machine transition SHIPPED -> TRIAGING) | operating-layer-vision gap analysis 2026-05-22 | TBD | tiny | queued |
+| ID-026 | Release-hygiene guard: a kit-health line or hook that flags VERSION cut-but-untagged AND `[Unreleased]` spanning more than one spec (catch the messy-release state before it compounds) | SPEC-027 retro 2026-05-22 (recurrence of the SPEC-018 finding; clears the PHILOSOPHY section-5 bar) | TBD | normal | queued |
+| ID-027 | spec-validate autonomy-gate lens: for a spec whose behavior runs inside an autonomous loop (`/execute`, `/goal`), check whether it lets the loop make a scope / architecture / risk decision without a human gate | SPEC-027 retro 2026-05-22 (the cycle's HIGH was caught at review, should have been catchable at validate) | TBD | normal | queued |
+| ID-028 | execute.md: acknowledge disjoint-file parallel worker dispatch (sequential is the safe default only for shared-file or dependent tasks; independent non-overlapping tasks may parallelize) | SPEC-027 retro 2026-05-22 | TBD | tiny | queued |
 Dependency notes:
 - ID-012 P1 (spec stop-criteria) shipped (SPEC-012, normal lane; see CHANGELOG); P2 (loop QA gate) is held until the pointer-`/goal` pattern has real runs; it will build on the now-shipped SPEC-006 completeness clauses + the existing verification pipeline.
 - ID-002 (internal absorption lane, SPEC-007) is parked; independent of the orchestration chain.
@@ -52,7 +55,8 @@ Dependency notes:
 - ID-018 / ID-019 came out of the SPEC-024 /review-team pass (deferred LOW findings); both are tiny-lane polish, independent of the SPEC-024 ship.
 - ID-020 / ID-021 came out of the SPEC-024 retro; ID-020 (verifier absence-checks) relates to ID-016 (both are guard/verifier hardening) and is the highest-signal kit finding of the cycle.
 - ID-022 came out of writing `docs/PLAYBOOK.md` (scenarios S2/S5): the freeform-intent gap. It unparks the SPEC-024-deferred griller entry; SPEC-026 drafts it. Independent of the other chains.
-- ID-023 / ID-024 / ID-025 came out of `docs/operating-layer-vision.md` (the SDLC state-machine gap analysis): the transitions with no clean path today. ID-022 is the highest-priority of the operating-layer set; these three follow it. All trace to the vision doc as their north-star.
+- ID-023 / ID-024 / ID-025 came out of `docs/operating-layer-vision.md` (the SDLC state-machine gap analysis): the transitions with no clean path today. ID-022 is the highest-priority of the operating-layer set; these three follow it. All trace to the vision doc as their north-star. ID-023 shipped (SPEC-027); ID-024 / ID-025 remain queued.
+- ID-026 / ID-027 / ID-028 came out of the SPEC-027 retro. ID-026 (release-hygiene guard) is a two-cycle recurrence (SPEC-018 + SPEC-027) and relates to the ID-016 / ID-020 guard-promotion theme; it is the highest-signal of the three. ID-027 (validate autonomy-gate lens) would have caught the SPEC-027 review HIGH earlier. ID-028 is tiny-lane polish.
 
 ## Schema
 

From 50dbd61444f55a6699f4f24b842cebbdf76bdc57 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 08:24:54 +0700
Subject: [PATCH 44/56] docs(spec): add SPEC-028 release-hygiene guard
 (VALIDATED)

A warn-only guard (ship Step 4 + a kit-health line) for the phantom cut:
VERSION/plugin.json naming a version with no matching git tag, with
[Unreleased] piling on top (the SPEC-018 + SPEC-027 recurrence). Approach A
(warn, no hook, no hard test); normal lane. spec-validate passed; folded the
whitespace-strip + identical-shape-across-surfaces anti-drift note (DEC-005).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 _meta/BACKLOG.md                             |   2 +-
 docs/specs/SPEC-028-release-hygiene-guard.md | 107 +++++++++++++++++++
 2 files changed, 108 insertions(+), 1 deletion(-)
 create mode 100644 docs/specs/SPEC-028-release-hygiene-guard.md

diff --git a/_meta/BACKLOG.md b/_meta/BACKLOG.md
index bdc6343..e424bf7 100644
--- a/_meta/BACKLOG.md
+++ b/_meta/BACKLOG.md
@@ -45,7 +45,7 @@ Lane = the WORKFLOW.md risk tier (`tiny` / `normal` / `full`).
 | ID-023 | Mid-flight spec amend: a path to add scope to a VALIDATED / in-build spec without restarting the lane (state machine transition BUILDING -> SPECIFYING) | operating-layer-vision gap analysis 2026-05-22 | SPEC-027 | normal | shipped (CHANGELOG [Unreleased]; local branch feat/mid-flight-amend, no tag/PR) |
 | ID-024 | Context-switch affordance across specs (worktree-per-spec) + an explicit ABANDONED terminal (state hygiene; vs the existing `parked`) | operating-layer-vision gap analysis 2026-05-22 | TBD | normal | queued |
 | ID-025 | Re-open-shipped convention: a follow-up spec spun from a SHIPPED one (state machine transition SHIPPED -> TRIAGING) | operating-layer-vision gap analysis 2026-05-22 | TBD | tiny | queued |
-| ID-026 | Release-hygiene guard: a kit-health line or hook that flags VERSION cut-but-untagged AND `[Unreleased]` spanning more than one spec (catch the messy-release state before it compounds) | SPEC-027 retro 2026-05-22 (recurrence of the SPEC-018 finding; clears the PHILOSOPHY section-5 bar) | TBD | normal | queued |
+| ID-026 | Release-hygiene guard: a kit-health line or hook that flags VERSION cut-but-untagged AND `[Unreleased]` spanning more than one spec (catch the messy-release state before it compounds) | SPEC-027 retro 2026-05-22 (recurrence of the SPEC-018 finding; clears the PHILOSOPHY section-5 bar) | SPEC-028 | normal | executing |
 | ID-027 | spec-validate autonomy-gate lens: for a spec whose behavior runs inside an autonomous loop (`/execute`, `/goal`), check whether it lets the loop make a scope / architecture / risk decision without a human gate | SPEC-027 retro 2026-05-22 (the cycle's HIGH was caught at review, should have been catchable at validate) | TBD | normal | queued |
 | ID-028 | execute.md: acknowledge disjoint-file parallel worker dispatch (sequential is the safe default only for shared-file or dependent tasks; independent non-overlapping tasks may parallelize) | SPEC-027 retro 2026-05-22 | TBD | tiny | queued |
 Dependency notes:
diff --git a/docs/specs/SPEC-028-release-hygiene-guard.md b/docs/specs/SPEC-028-release-hygiene-guard.md
new file mode 100644
index 0000000..54cdba6
--- /dev/null
+++ b/docs/specs/SPEC-028-release-hygiene-guard.md
@@ -0,0 +1,107 @@
+# Spec: Release-hygiene guard (phantom-cut warn)
+Generated: 2026-05-22
+Status: VALIDATED
+
+Source: SPEC-027 retro ("What hurt": the release-hygiene tangle) + SPEC-018 retro (the first occurrence); ID-026. Goal draft: `.claude/goals/release-hygiene-guard.md`. A two-cycle recurrence, which clears the PHILOSOPHY section-5 bar for promoting a check.
+
+## Problem
+The kit can drift into a messy-release state that no surface flags. Concretely (the live state at this spec's writing): `VERSION` and `.claude-plugin/plugin.json` say `1.6.0`, but the latest git tag is `v1.5.1`, so `1.6.0` is a **phantom cut**: the files name a version that was never released. Meanwhile `CHANGELOG.md` `[Unreleased]` keeps accumulating new specs on top of that untagged cut, so it is unclear what `1.6.0` even contains or whether it shipped. This exact state was flagged by the SPEC-018 retro (placement-ui-design: "1.6.0 cut in CHANGELOG + VERSION + plugin.json but never tagged") and recurred in the SPEC-027 retro. Nothing in the kit notices it.
+
+The kit deliberately does "no version bump" ships that let `[Unreleased]` accumulate (e.g. PR #7), so accumulation alone is **not** the bug. The bug is the **phantom cut**: a version named in the files with no matching git tag, with work piling above it.
+
+## Solution
+
+### Approaches considered
+- **A (chosen): warn-only surfaces (ship + kit-health), no hook, no hard test.** Add a release-hygiene warn at `/user:ship`'s version step (where the bump/tag decision is made) plus a `kit-health` check line (run before tagging). Both detect the phantom cut (the files' version not present in `git tag`) and warn; neither blocks. A `tests/test-meta.sh` assertion pins that both surfaces carry the check (not that the repo is currently clean). Tradeoff: a warn can be ignored, but PHILOSOPHY reserves hard blocks for the safety subset and the completeness clauses are already warn+log; this matches that grain.
+- **B: a `tests/test-meta.sh` hard assertion that `VERSION` is tagged.** Rejected: "VERSION named but not yet tagged" is a **legitimate transient** during a release (bump files, commit, then tag), and CI frequently does not fetch tags (shallow clone), so a hard assertion would go red in the normal flow. The wrong tool for a transient-legitimate state.
+- **C: a hook.** Rejected: PHILOSOPHY disfavors hard-gating process; the trigger is fuzzy (on a version-file edit? on a tag?); and the untagged-cut state is a legitimate transient. A hook is overkill for a maintainer-facing heads-up.
+
+### Chosen approach + why
+Approach A. The maintainer at `/user:ship` is exactly who needs the warning at exactly the moment it matters (the version/tag decision); `kit-health` is the documented "run before tagging" self-assessment, so it catches the state even between ships. Both are warn-only, matching the existing completeness-clause grain (warn + report, never block). The meta-test pins the surfaces structurally without asserting a transient-sensitive runtime condition.
+
+### Detection logic (the shared signal)
+- **Phantom cut (primary, mechanical):** read the version from `VERSION` (and `.claude-plugin/plugin.json`; a meta-test already asserts they match), stripping whitespace (`tr -d '[:space:]' < VERSION`, matching the existing meta-test's read so a trailing newline cannot break the tag pattern). If `git tag -l "v<version>"` is empty, the files name an untagged version: warn. (During a real release this fires between the version-bump commit and the tag; the warn is then a correct "remember to tag" nudge.)
+- **Identical shape across both surfaces (anti-drift):** ship and kit-health implement this exact check shape (the whitespace-strip, the `git tag -l "v<version>"` test, and the graceful degrade below). The meta-test pins each surface's *presence*; this spec pins their *shape*, so the two inlined copies (DEC-003) cannot diverge in edge behavior.
+- **Accumulation context (soft):** when the phantom cut fires AND `CHANGELOG.md` `[Unreleased]` is non-empty, the warn adds "work is accumulating above an untagged cut" so the maintainer sees the compounding, not just the missing tag.
+- **Graceful degrade:** if there is no `.git` (e.g. the installed copy under `~/.claude/dwarves-kit`), no `VERSION`, or `git` is unavailable, the check is a no-op (it is repo-scoped; it must not error in a non-repo context).
+
+### Extensibility & boundaries
+- Load-bearing dimension: the number of warn surfaces. Today two (ship, kit-health). The detection is small enough (a `git tag -l` comparison + a CHANGELOG grep) that it is inlined at each surface rather than abstracted (the kit's rule: no shared helper until the same code appears three times). A third surface is the trigger to extract a shared `checks/`-style helper, recorded here so the decision is not re-litigated.
+- Boundaries: each surface is independently testable (ship.md carries the warn instruction; kit-health.md carries the check bash); the meta-test pins each.
+
+### Architecture
+```text
+  VERSION / plugin.json = X
+        │
+        ▼
+  git tag -l "vX"  ──── empty? ────▶ PHANTOM CUT
+        │ present                         │ + CHANGELOG [Unreleased] non-empty?
+        ▼                                 ▼
+   (clean, silent)                  WARN (maintainer-facing, never block):
+                                    "vX is not tagged; [Unreleased] is piling on top"
+        ▲                                 ▲
+        │                                 │
+   surface 1: /user:ship Step 4     surface 2: kit-health line (repo-scoped, before tagging)
+```
+
+## Technical Design
+
+### Interfaces (I/O contract)
+- **Consumes:** `VERSION`, `.claude-plugin/plugin.json` (`.version`), `git tag -l`, `CHANGELOG.md` (`[Unreleased]` section). Read-only.
+- **Produces:** a maintainer-facing warning string at `/user:ship` and a `kit-health` report line. No file writes, no block, no exit-code failure from the check itself.
+- **Invariants:** the check never blocks a ship and never errors outside a git repo / without a VERSION file (graceful no-op). The meta-test asserts the surfaces carry the check, never that the working tree is currently tag-clean (that would be transient-sensitive).
+
+### Data model / API / UI changes
+None. Two command-prompt edits (`commands/ship.md`, `commands/kit-health.md`) + meta-test assertions. No new file, no new command, no hook, no `settings.json` wiring.
+
+## Task Breakdown
+
+### Phase 1: The ship-time warn (primary surface)
+- [ ] TASK-001: Add a release-hygiene warn to `commands/ship.md` at the version step (Step 4, or a Step 4a adjacent to it). It instructs: read the version from `VERSION`, check `git tag -l "v<version>"`; if absent, WARN (phantom cut) and, if `CHANGELOG.md` `[Unreleased]` is non-empty, add the accumulation note. Warn-only, report to the maintainer, never block; degrade silently outside a git repo / without VERSION. Mirror the Step 1b "warn, not block" voice. AC: `commands/ship.md` contains the release-hygiene warn naming the `git tag` phantom-cut check and the "warn, not block" stance; `bash tests/test-meta.sh` green.
+
+### Phase 2: The kit-health line (secondary surface)
+- [ ] TASK-002: Add a release-hygiene check to `commands/kit-health.md` Step 1 (a new numbered bash check). Repo-scoped: compare `VERSION` to `git tag -l "v$(cat VERSION)"`; print a clean/PHANTOM-CUT line; degrade gracefully (no `.git`, no `VERSION`, no `git` => skip with a note, never error). AC: `commands/kit-health.md` Step 1 has the check; it is repo-scoped and degrades gracefully (no hard error in a non-repo context); `bash tests/test-meta.sh` green.
+
+### Phase 3: Pin the surfaces
+- [ ] TASK-003: Add a `tests/test-meta.sh` section "Release-hygiene guard" with assertions that (a) `commands/ship.md` carries the release-hygiene warn (greps for the phantom-cut / `git tag` check + the warn-not-block stance) and (b) `commands/kit-health.md` carries the phantom-cut check. The assertions pin the LOGIC's presence, NOT that the repo is currently tag-clean. AC: the new assertions PASS; `bash tests/test-meta.sh` exits 0 with an increased total.
+
+## After state
+- [ ] `/user:ship` warns when `VERSION` names an untagged version. (Today: ship bumps/changelogs with no tag-state check; the 1.6.0 phantom cut went unnoticed across two cycles.) Checkable: `rg -i "tag|phantom|release.hygiene" commands/ship.md` returns the warn.
+- [ ] `kit-health` reports the phantom-cut state, repo-scoped + graceful. (Today: kit-health checks file count / hooks / descriptions, never the version/tag relationship.) Checkable: `rg -i "tag|phantom|VERSION" commands/kit-health.md` returns the check.
+- [ ] The two surfaces are regression-pinned. (Today: nothing tests for them.) Checkable: `bash tests/test-meta.sh` includes and PASSes the "Release-hygiene guard" assertions.
+- [ ] The check never blocks and never errors outside a repo. Checkable: the spec's Edge cases 2 + 3 hold by inspection of the added logic.
+
+## Acceptance Criteria (global)
+- [ ] All tasks pass their individual acceptance criteria.
+- [ ] `bash tests/test-meta.sh && bash tests/test-hooks.sh` green (no regressions).
+- [ ] The guard is warn-only on both surfaces (no block, no test that fails on a legitimate untagged-transient).
+
+## Verification
+```bash
+bash tests/test-meta.sh && bash tests/test-hooks.sh
+```
+Plus the After-state greps above (each a real command).
+
+## Edge Cases
+1. **Legitimate untagged transient (mid-release).** VERSION bumped + committed, tag not yet created: the warn fires, which is correct (it nudges "tag it"). It must NOT be a hard failure (that is why B was rejected).
+2. **Not a git repo / installed copy.** kit-health's bash also runs against `~/.claude/dwarves-kit` (no `.git`): the check degrades to a skip-with-note, never errors.
+3. **No `VERSION` file / `git` absent.** Skip silently; the check is best-effort.
+4. **Clean state (VERSION tagged, `[Unreleased]` empty or normal):** no warn; the surface stays quiet.
+5. **VERSION and plugin.json disagree.** Out of scope here: a separate meta-test already asserts `plugin.json` == `VERSION`; this guard reads `VERSION` as the source.
+
+## Out of Scope
+- Fixing the current `1.6.0` phantom cut or tagging anything (that is a release-cleanup action, not this guard).
+- Bumping/auto-tagging versions or changing the ship version logic (this only adds a warn).
+- A hard `tests/test-meta.sh` "must be tagged" assertion (Approach B, rejected: transient-legitimate).
+- A hook (Approach C, rejected).
+- A shared detection helper (deferred until a third surface needs it; see Extensibility).
+
+## Decision Log
+- DEC-001: Warn-only surfaces (ship + kit-health), Approach A; not a hard meta-test (B) or a hook (C). Rationale: matches the completeness-clause warn grain; B breaks on the legitimate untagged-transient + shallow-CI; C hard-gates process (PHILOSOPHY). Alternatives B/C rejected. (Confirmed with the maintainer at the surface checkpoint.)
+- DEC-002: The detectable signal is the phantom cut (VERSION version not in `git tag`), not `[Unreleased]` accumulation. Rationale: the kit deliberately accumulates `[Unreleased]` across "no version bump" ships, so accumulation alone is not a bug; the untagged cut is. Accumulation is a soft context line layered on the phantom-cut warn.
+- DEC-003: Detection is inlined at each surface, not a shared helper. Rationale: PHILOSOPHY "no premature abstraction" (two uses < three); a third surface triggers extraction. The meta-test pins both inlined copies against drift.
+- DEC-004: The meta-test pins the surfaces' presence, never a tag-clean working tree. Rationale: a runtime tag assertion is transient-sensitive (Edge case 1) and CI-fragile (shallow clone).
+- DEC-005 (validation): The check shape is pinned once in "Detection logic" (whitespace-strip the version, `git tag -l "v<version>"`, graceful no-git/no-VERSION degrade) and both surfaces implement it identically. Rationale: spec-validate Reviewer 5 noted the two inlined copies (DEC-003) could drift in edge handling while the meta-test only pins presence; pinning the shape in the spec closes that gap without a premature shared helper. Also folds R1/R2 robustness (quote + whitespace-strip the VERSION read).
+
+## Open questions
+(none; a /goal loop appends here if it hits a decision this spec does not cover, then stops)

From 0958107cf7fe09e3aa4ff76bb1ca197b785e3b8c Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 08:26:18 +0700
Subject: [PATCH 45/56] docs(kit-health): add a phantom-version-cut check

Add a repo-scoped release-hygiene check to kit-health Step 1: warn when
VERSION names a version with no matching git tag (a phantom cut), with a
soft note when CHANGELOG [Unreleased] is accumulating above it. Degrades
to a no-op outside a git repo, without VERSION, or without git; never
errors and never exits non-zero. Shape matches the ship.md surface
(whitespace-strip, git tag -l "v$VER", graceful degrade).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 commands/kit-health.md | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/commands/kit-health.md b/commands/kit-health.md
index 667d215..31be774 100644
--- a/commands/kit-health.md
+++ b/commands/kit-health.md
@@ -78,6 +78,22 @@ echo "Source citations in README: $CREDITS"
 # 10. TODOs/FIXMEs in hook scripts
 TODOS=$(grep -r "TODO\|FIXME" ~/.claude/dwarves-kit/hooks/ 2>/dev/null | wc -l | tr -d ' ')
 echo "TODOs/FIXMEs in hooks: $TODOS"
+
+# 11. Release hygiene: a phantom version cut (VERSION names an untagged version)
+# Repo-scoped: reads the repo's .git + VERSION via the current working dir (git
+# tags + CHANGELOG live in the repo, not the installed copy under ~/.claude). The
+# guard degrades to a no-op outside a git repo / without VERSION, never errors.
+if [ -f VERSION ] && git rev-parse --git-dir >/dev/null 2>&1; then
+  VER=$(tr -d '[:space:]' < VERSION)
+  if [ -n "$VER" ] && [ -z "$(git tag -l "v$VER")" ]; then
+    echo "  [WARN] release hygiene: VERSION is $VER but tag v$VER does not exist (phantom cut)"
+    grep -q '^## \[Unreleased\]' CHANGELOG.md 2>/dev/null && echo "         and CHANGELOG [Unreleased] is accumulating above it"
+  else
+    echo "  release hygiene: ok (v$VER tagged, or clean)"
+  fi
+else
+  echo "  release hygiene: skipped (not in the kit repo / no VERSION)"
+fi
 ```
 
 ### Step 2: Present health report

From 0a27c62ab5051f5e8f96b406d90dcbbe00f48521 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 08:26:20 +0700
Subject: [PATCH 46/56] docs(ship): warn on a phantom version cut at the
 version step

Add a Step 4a release-hygiene warn to commands/ship.md. It detects the
phantom cut (VERSION names a version with no matching git tag), warns the
maintainer, and adds an accumulation note when CHANGELOG [Unreleased] is
non-empty. Warn-only, never blocks; degrades silently outside a git repo
or without VERSION. Shares the exact check shape with kit-health per
DEC-005 so the two inlined copies cannot drift.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 commands/ship.md | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/commands/ship.md b/commands/ship.md
index 654adeb..dc7c5a8 100644
--- a/commands/ship.md
+++ b/commands/ship.md
@@ -55,6 +55,28 @@ If a version file exists:
 
 If no version file exists: skip this step silently.
 
+### Step 4a: Release-hygiene warn (warn, not block)
+
+At the version/tag decision, check for a **phantom cut**: `VERSION` names a version that has no matching git tag. Run the check, then REPORT to the maintainer; do NOT block the ship. Mirror the Step 1b "warn, not block" voice: hard blocks stay reserved for the REVIEW.md DO-NOT-SHIP verdict and the safety gates.
+
+Use this exact check shape (kit-health's check uses the same shape; they must not drift):
+
+```bash
+# Graceful degrade: no .git, no VERSION, or no git => silent no-op, never error/block.
+if [ -d .git ] && [ -f VERSION ] && command -v git >/dev/null 2>&1; then
+  VER=$(tr -d '[:space:]' < VERSION)   # strip whitespace so a trailing newline cannot break the pattern
+  if [ -z "$(git tag -l "v$VER")" ]; then
+    echo "WARN release-hygiene: phantom cut. VERSION names v$VER but no matching git tag exists."
+    # Accumulation context: [Unreleased] non-empty => work piling above an untagged cut.
+    if [ -f CHANGELOG.md ] && awk '/## \[Unreleased\]/{f=1;next} /^## /{f=0} f && NF{print}' CHANGELOG.md | grep -q .; then
+      echo "         work is accumulating above an untagged cut v$VER."
+    fi
+  fi
+fi
+```
+
+If the phantom cut fires, surface it as a heads-up at the version step (remember to tag `v$VER`), then continue. During a real release this fires between the version-bump commit and the tag, where the warn is the correct "remember to tag" nudge. Source: SPEC-028 (DEC-001 warn-only, DEC-005 shared shape).
+
 ### Step 5: Generate changelog entry
 
 If `CHANGELOG.md` exists (or the project follows a changelog convention):

From abea75d8d9b816f25cf1b424b0d44f91f120ea15 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 08:28:16 +0700
Subject: [PATCH 47/56] fix(ship): align the release-hygiene degrade guard with
 kit-health

ship.md used `[ -d .git ]` which is false in a git worktree (where .git is a file). Replace with `git rev-parse --git-dir` to match kit-health check 11 exactly, per DEC-005.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 commands/ship.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/commands/ship.md b/commands/ship.md
index dc7c5a8..b6f437f 100644
--- a/commands/ship.md
+++ b/commands/ship.md
@@ -62,8 +62,8 @@ At the version/tag decision, check for a **phantom cut**: `VERSION` names a vers
 Use this exact check shape (kit-health's check uses the same shape; they must not drift):
 
 ```bash
-# Graceful degrade: no .git, no VERSION, or no git => silent no-op, never error/block.
-if [ -d .git ] && [ -f VERSION ] && command -v git >/dev/null 2>&1; then
+# Graceful degrade: no VERSION or not in a git repo => silent no-op, never error or block.
+if [ -f VERSION ] && git rev-parse --git-dir >/dev/null 2>&1; then
   VER=$(tr -d '[:space:]' < VERSION)   # strip whitespace so a trailing newline cannot break the pattern
   if [ -z "$(git tag -l "v$VER")" ]; then
     echo "WARN release-hygiene: phantom cut. VERSION names v$VER but no matching git tag exists."

From 2b9f2a177f5a64503836a60e07a9892ed000a915 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 08:30:02 +0700
Subject: [PATCH 48/56] test(meta): pin the release-hygiene guard surfaces

Add a "Release-hygiene guard" section to tests/test-meta.sh with two
assertions that pin the PRESENCE of the phantom-cut warn on its two
surfaces (ship.md, kit-health.md), not that the working tree is
currently tag-clean (DEC-004: a runtime tag-state assertion would go
red on the legitimate untagged-transient and on shallow CI clones).

Total 254 -> 256, all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 tests/test-meta.sh | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/tests/test-meta.sh b/tests/test-meta.sh
index e0d72bb..67045be 100755
--- a/tests/test-meta.sh
+++ b/tests/test-meta.sh
@@ -1101,6 +1101,39 @@ else
   FAIL=$((FAIL + 1))
 fi
 
+# ============================================================
+echo ""
+echo "=== Release-hygiene guard (SPEC-028) ==="
+# ============================================================
+# Pin the PRESENCE of the phantom-cut warn on its two surfaces so a deletion or a
+# wording flip fails CI. DEC-004: assert the surfaces carry the check, NEVER that
+# the working tree is currently tag-clean ("VERSION named but untagged" is a
+# legitimate transient during a release and CI often does not fetch tags). So we
+# grep the command-prompt files; we never run the phantom-cut check against the repo.
+
+# (a) ship.md (Step 4a) carries the phantom-cut / git-tag check AND the warn-not-block stance.
+TOTAL=$((TOTAL + 1))
+if grep -qF 'git tag -l' "$KIT_DIR/commands/ship.md" 2>/dev/null \
+   && grep -qiF 'phantom' "$KIT_DIR/commands/ship.md" 2>/dev/null \
+   && grep -qiF 'warn, not block' "$KIT_DIR/commands/ship.md" 2>/dev/null; then
+  echo -e "  ${GREEN}PASS${NC} ship.md carries the release-hygiene warn (phantom-cut git-tag check + warn-not-block) (SPEC-028)"
+  PASS=$((PASS + 1))
+else
+  echo -e "  ${RED}FAIL${NC} ship.md lost the release-hygiene warn (needs git-tag phantom-cut check + warn-not-block stance)"
+  FAIL=$((FAIL + 1))
+fi
+
+# (b) kit-health.md carries the phantom-cut check.
+TOTAL=$((TOTAL + 1))
+if grep -qF 'git tag -l' "$KIT_DIR/commands/kit-health.md" 2>/dev/null \
+   && grep -qiF 'phantom' "$KIT_DIR/commands/kit-health.md" 2>/dev/null; then
+  echo -e "  ${GREEN}PASS${NC} kit-health.md carries the phantom-cut check (git-tag check + phantom) (SPEC-028)"
+  PASS=$((PASS + 1))
+else
+  echo -e "  ${RED}FAIL${NC} kit-health.md lost the phantom-cut check (needs git-tag check + phantom)"
+  FAIL=$((FAIL + 1))
+fi
+
 # ============================================================
 echo ""
 echo "=== Results ==="

From c68f818bb62b6978284a743a8f49f0ac2737504f Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 08:31:21 +0700
Subject: [PATCH 49/56] docs(spec): mark SPEC-028 tasks verified

3/3 tasks passed the pipeline (TASK-001 needed 1 fix-agent retry to align the
degrade guard with kit-health per DEC-005). test-meta 256/256, test-hooks 92/92.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 docs/specs/SPEC-028-release-hygiene-guard.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/specs/SPEC-028-release-hygiene-guard.md b/docs/specs/SPEC-028-release-hygiene-guard.md
index 54cdba6..b1df71c 100644
--- a/docs/specs/SPEC-028-release-hygiene-guard.md
+++ b/docs/specs/SPEC-028-release-hygiene-guard.md
@@ -57,13 +57,13 @@ None. Two command-prompt edits (`commands/ship.md`, `commands/kit-health.md`) +
 ## Task Breakdown
 
 ### Phase 1: The ship-time warn (primary surface)
-- [ ] TASK-001: Add a release-hygiene warn to `commands/ship.md` at the version step (Step 4, or a Step 4a adjacent to it). It instructs: read the version from `VERSION`, check `git tag -l "v<version>"`; if absent, WARN (phantom cut) and, if `CHANGELOG.md` `[Unreleased]` is non-empty, add the accumulation note. Warn-only, report to the maintainer, never block; degrade silently outside a git repo / without VERSION. Mirror the Step 1b "warn, not block" voice. AC: `commands/ship.md` contains the release-hygiene warn naming the `git tag` phantom-cut check and the "warn, not block" stance; `bash tests/test-meta.sh` green.
+- [x] TASK-001 (DONE: 0a27c62 + fix abea75d, verified): Add a release-hygiene warn to `commands/ship.md` at the version step (Step 4, or a Step 4a adjacent to it). It instructs: read the version from `VERSION`, check `git tag -l "v<version>"`; if absent, WARN (phantom cut) and, if `CHANGELOG.md` `[Unreleased]` is non-empty, add the accumulation note. Warn-only, report to the maintainer, never block; degrade silently outside a git repo / without VERSION. Mirror the Step 1b "warn, not block" voice. AC: `commands/ship.md` contains the release-hygiene warn naming the `git tag` phantom-cut check and the "warn, not block" stance; `bash tests/test-meta.sh` green.
 
 ### Phase 2: The kit-health line (secondary surface)
-- [ ] TASK-002: Add a release-hygiene check to `commands/kit-health.md` Step 1 (a new numbered bash check). Repo-scoped: compare `VERSION` to `git tag -l "v$(cat VERSION)"`; print a clean/PHANTOM-CUT line; degrade gracefully (no `.git`, no `VERSION`, no `git` => skip with a note, never error). AC: `commands/kit-health.md` Step 1 has the check; it is repo-scoped and degrades gracefully (no hard error in a non-repo context); `bash tests/test-meta.sh` green.
+- [x] TASK-002 (DONE: 0958107, verified): Add a release-hygiene check to `commands/kit-health.md` Step 1 (a new numbered bash check). Repo-scoped: compare `VERSION` to `git tag -l "v$(cat VERSION)"`; print a clean/PHANTOM-CUT line; degrade gracefully (no `.git`, no `VERSION`, no `git` => skip with a note, never error). AC: `commands/kit-health.md` Step 1 has the check; it is repo-scoped and degrades gracefully (no hard error in a non-repo context); `bash tests/test-meta.sh` green.
 
 ### Phase 3: Pin the surfaces
-- [ ] TASK-003: Add a `tests/test-meta.sh` section "Release-hygiene guard" with assertions that (a) `commands/ship.md` carries the release-hygiene warn (greps for the phantom-cut / `git tag` check + the warn-not-block stance) and (b) `commands/kit-health.md` carries the phantom-cut check. The assertions pin the LOGIC's presence, NOT that the repo is currently tag-clean. AC: the new assertions PASS; `bash tests/test-meta.sh` exits 0 with an increased total.
+- [x] TASK-003 (DONE: 2b9f2a1, verified): Add a `tests/test-meta.sh` section "Release-hygiene guard" with assertions that (a) `commands/ship.md` carries the release-hygiene warn (greps for the phantom-cut / `git tag` check + the warn-not-block stance) and (b) `commands/kit-health.md` carries the phantom-cut check. The assertions pin the LOGIC's presence, NOT that the repo is currently tag-clean. AC: the new assertions PASS; `bash tests/test-meta.sh` exits 0 with an increased total.
 
 ## After state
 - [ ] `/user:ship` warns when `VERSION` names an untagged version. (Today: ship bumps/changelogs with no tag-state check; the 1.6.0 phantom cut went unnoticed across two cycles.) Checkable: `rg -i "tag|phantom|release.hygiene" commands/ship.md` returns the warn.

From 5ebdbeccacc03640a48f63b089b56b1ef99830ec Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 08:34:21 +0700
Subject: [PATCH 50/56] fix(ship): match the empty-VERSION guard with
 kit-health

Inner condition was missing the [ -n "$VER" ] guard, causing
ship.md to run `git tag -l "v"` and spuriously fire the phantom-cut
warn when VERSION is empty or whitespace-only. Brings the condition
into exact parity with kit-health.md check 11 (SPEC-028 DEC-005).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 commands/ship.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/commands/ship.md b/commands/ship.md
index b6f437f..110cb91 100644
--- a/commands/ship.md
+++ b/commands/ship.md
@@ -65,7 +65,7 @@ Use this exact check shape (kit-health's check uses the same shape; they must no
 # Graceful degrade: no VERSION or not in a git repo => silent no-op, never error or block.
 if [ -f VERSION ] && git rev-parse --git-dir >/dev/null 2>&1; then
   VER=$(tr -d '[:space:]' < VERSION)   # strip whitespace so a trailing newline cannot break the pattern
-  if [ -z "$(git tag -l "v$VER")" ]; then
+  if [ -n "$VER" ] && [ -z "$(git tag -l "v$VER")" ]; then
     echo "WARN release-hygiene: phantom cut. VERSION names v$VER but no matching git tag exists."
     # Accumulation context: [Unreleased] non-empty => work piling above an untagged cut.
     if [ -f CHANGELOG.md ] && awk '/## \[Unreleased\]/{f=1;next} /^## /{f=0} f && NF{print}' CHANGELOG.md | grep -q .; then

From b5ddf9faebf79182bc87317a37cad8c4ae4bf503 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 08:37:59 +0700
Subject: [PATCH 51/56] fix(kit-health): match the [Unreleased] accumulation
 check with ship

Review (MEDIUM) found the soft accumulation-context signal had drifted: kit-health
used grep heading-exists (over-fires) while ship.md used the spec's awk non-empty
test. Aligned kit-health to the same awk; the line is now byte-identical across both
surfaces. Recorded as SPEC-028 DEC-006. Smoke: WARNs correctly, exit 0. meta 256/256.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 commands/kit-health.md                       | 5 ++++-
 docs/specs/SPEC-028-release-hygiene-guard.md | 1 +
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/commands/kit-health.md b/commands/kit-health.md
index 31be774..9365292 100644
--- a/commands/kit-health.md
+++ b/commands/kit-health.md
@@ -87,7 +87,10 @@ if [ -f VERSION ] && git rev-parse --git-dir >/dev/null 2>&1; then
   VER=$(tr -d '[:space:]' < VERSION)
   if [ -n "$VER" ] && [ -z "$(git tag -l "v$VER")" ]; then
     echo "  [WARN] release hygiene: VERSION is $VER but tag v$VER does not exist (phantom cut)"
-    grep -q '^## \[Unreleased\]' CHANGELOG.md 2>/dev/null && echo "         and CHANGELOG [Unreleased] is accumulating above it"
+    # Accumulation context: [Unreleased] NON-empty => work piling above an untagged cut (same awk as ship.md, DEC-006).
+    if [ -f CHANGELOG.md ] && awk '/## \[Unreleased\]/{f=1;next} /^## /{f=0} f && NF{print}' CHANGELOG.md | grep -q .; then
+      echo "         and CHANGELOG [Unreleased] is accumulating above it"
+    fi
   else
     echo "  release hygiene: ok (v$VER tagged, or clean)"
   fi
diff --git a/docs/specs/SPEC-028-release-hygiene-guard.md b/docs/specs/SPEC-028-release-hygiene-guard.md
index b1df71c..867d715 100644
--- a/docs/specs/SPEC-028-release-hygiene-guard.md
+++ b/docs/specs/SPEC-028-release-hygiene-guard.md
@@ -101,6 +101,7 @@ Plus the After-state greps above (each a real command).
 - DEC-002: The detectable signal is the phantom cut (VERSION version not in `git tag`), not `[Unreleased]` accumulation. Rationale: the kit deliberately accumulates `[Unreleased]` across "no version bump" ships, so accumulation alone is not a bug; the untagged cut is. Accumulation is a soft context line layered on the phantom-cut warn.
 - DEC-003: Detection is inlined at each surface, not a shared helper. Rationale: PHILOSOPHY "no premature abstraction" (two uses < three); a third surface triggers extraction. The meta-test pins both inlined copies against drift.
 - DEC-004: The meta-test pins the surfaces' presence, never a tag-clean working tree. Rationale: a runtime tag assertion is transient-sensitive (Edge case 1) and CI-fragile (shallow clone).
+- DEC-006 (review): The accumulation-context check (the `[Unreleased]` non-empty test) is part of the shared shape, not just the three structural lines. Found by `/user:review` (MEDIUM): kit-health used `grep -q '^## \[Unreleased\]'` (heading-exists, over-fires), while ship.md used the spec's awk non-empty test; the two soft-signal copies had drifted (the integration-checker compared only the three structural lines DEC-005 enumerated, so it did not catch this). Aligned kit-health to ship.md's awk. Lesson folded forward: "identical shape" includes the accumulation check.
 - DEC-005 (validation): The check shape is pinned once in "Detection logic" (whitespace-strip the version, `git tag -l "v<version>"`, graceful no-git/no-VERSION degrade) and both surfaces implement it identically. Rationale: spec-validate Reviewer 5 noted the two inlined copies (DEC-003) could drift in edge handling while the meta-test only pins presence; pinning the shape in the spec closes that gap without a premature shared helper. Also folds R1/R2 robustness (quote + whitespace-strip the VERSION read).
 
 ## Open questions

From f18c8c304c706b6cfe2aa3879584be56fb2cb0ae Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 08:40:28 +0700
Subject: [PATCH 52/56] docs: record the release-hygiene guard in MANUAL and
 CHANGELOG

MANUAL: the Step-4a release-hygiene warn under /user:ship, and the phantom-cut
check under /user:kit-health's Reads. CHANGELOG [Unreleased]: the SPEC-028 entry.
doc-verifier PASS (16/16). meta 256/256, hooks 92/92.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 CHANGELOG.md | 1 +
 MANUAL.md    | 5 +++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 66813a4..4816870 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -10,6 +10,7 @@ All notable changes to dwarves-kit are documented here.
 
 ### Added
 
+- **Release-hygiene guard: a warn on a phantom version cut (SPEC-028).** Closes the recurring messy-release state the SPEC-018 + SPEC-027 retros both flagged (`VERSION`/`plugin.json` cut to a version that was never tagged, with `[Unreleased]` piling on top). Two warn-only surfaces detect the **phantom cut** (`VERSION` names a version with no matching `git tag`), plus a soft heads-up when `[Unreleased]` is non-empty above it: a `/user:ship` Step-4a warn (at the version/tag decision) and a `commands/kit-health.md` check (run before tagging, repo-scoped, degrades to a no-op outside the repo). Warn-only, never blocks (Approach A; PHILOSOPHY reserves hard blocks for safety, and an untagged cut is a legitimate release transient, so a hard `tests/test-meta.sh` tag assertion was rejected). Both surfaces share one pinned check shape (whitespace-strip + `git tag -l "v$VER"` + a `git rev-parse --git-dir` degrade guard + the same `[Unreleased]` non-empty awk), pinned by 2 `tests/test-meta.sh` assertions (meta suite to 256). The verification pipeline earned its keep here: task-verifier caught one DEC-005 guard drift, the integration-checker caught a second, and `/user:review` caught a third (the soft accumulation-signal drift), all on the guard's own two copies (DEC-005/DEC-006). Source: SPEC-028 / ID-026.
 - **Mid-flight spec amend: a declared path to add scope to a building spec without restarting the lane (SPEC-027).** Closes the `BUILDING -> SPECIFYING -> BUILDING` gap (operating-layer-vision Scenario 7): when a build reveals scope that must be added now ("also do Y"), the operator approves, then the spec is amended in place at a task checkpoint instead of being silently mutated or the lane restarted. Convention, not a new command or hook (Approach A; PHILOSOPHY "earn the abstraction"). Four invariants, canonical in `WORKFLOW.md` "## Mid-flight amend": no lane restart (`Status:` stays VALIDATED, only the delta re-validated), add-only (completed `- [x]` tasks frozen), recorded + operator-approved at a checkpoint (an optional on-demand `## Amendments` provenance section in the spec), resume with `/user:next` (not a fresh `/user:execute`). Projected into `docs/operating-layer-vision.md` §3.3 (the transition row) + §5 (gap closed), `commands/execute.md` (the no-silent-mutation anti-pattern now points at the amend path), `commands/spec.md` (the `## Amendments` template section), `docs/PLAYBOOK.md` Scenario 7 + `docs/ORCHESTRATION.md` 5.4 (operator projections), and pinned by 4 `tests/test-meta.sh` assertions (meta suite to 254). `/user:review` (HIGH) added the operator-approval gate the canonical rule had omitted (DEC-008). Source: SPEC-027 / ID-023.
 - **Kit-root `AGENTS.md`: a tool-agnostic operate-contract front door (SPEC-024).** A portable, four-zone operating layer (the same shape any agent runner can read) that fronts the kit's behavioral contract; `CLAUDE.md` and `WORKFLOW.md` now point at it so the cycle and house rules have one neutral entry point instead of being CLAUDE-only. Ships a downstream template at `examples/hello-spec/AGENTS.md` so a consuming project starts with the same front door. Source: SPEC-024.
 - **`backfill` brownfield lane (SPEC-024)**: an intake lane for adopting the kit into an existing repo (bring the operate-contract, specs convention, and guardrails onto code that predates them), recorded in `WORKFLOW.md` alongside the existing intake lanes.
diff --git a/MANUAL.md b/MANUAL.md
index cce6188..5dca3d5 100644
--- a/MANUAL.md
+++ b/MANUAL.md
@@ -140,6 +140,7 @@ Opt-in lane between `/user:spec-validate` and `/user:execute`. Reads the active
 **Writes:** bumped `VERSION`, new `CHANGELOG.md` entry, git tag, PR via `gh`
 **When to invoke:** review is green and docs are synced
 **Common gotcha:** blocks if any REVIEW*.md verdict is FIX-REQUIRED. Use `/review-team` and `responding-to-review` to triage before re-running ship.
+**Release-hygiene warn (Step 4a):** at the version step it warns (never blocks) on a phantom cut, `VERSION` naming a version with no matching git tag, with a heads-up when `[Unreleased]` is accumulating above it. Warn-only; tag `v<version>` or confirm intentional, then continue (SPEC-028).
 
 ### `/user:retro`
 
@@ -160,9 +161,9 @@ Opt-in lane between `/user:spec-validate` and `/user:execute`. Reads the active
 ### `/user:kit-health`
 
 **Phase:** self-assessment against PHILOSOPHY.md
-**Reads:** the kit itself (file counts, hook performance, settings validity, source citations)
+**Reads:** the kit itself (file counts, hook performance, settings validity, source citations), plus a repo-scoped release-hygiene check (a phantom cut: `VERSION` names a version with no matching git tag; degrades to a no-op outside the repo)
 **Writes:** verdict in chat (`SHIP / FIX-REQUIRED / REJECT`)
-**When to invoke:** maintainer-only, before tagging a release of the kit
+**When to invoke:** maintainer-only, before tagging a release of the kit (the release-hygiene check is exactly the "before tagging" guard)
 **Common gotcha:** the rejection-first verdict will REJECT on real violations. Do not soften the criteria; address them.
 
 ## Hooks (no invocation)

From 112635add1f31e813e20bc2ee4b8e315d2187ea6 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 08:41:13 +0700
Subject: [PATCH 53/56] chore: mark SPEC-028 shipped, drop ID-026 off the
 active queue

Local ship (no version bump / tag / PR; same call as the SPEC-027 cycle, and the
new guard itself warns the release state is a phantom cut). SPEC-028 -> SHIPPED;
ID-026 -> shipped in CHANGELOG [Unreleased]. The guard self-fired on this ship.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 _meta/BACKLOG.md                             | 2 +-
 docs/specs/SPEC-028-release-hygiene-guard.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/_meta/BACKLOG.md b/_meta/BACKLOG.md
index e424bf7..d385d49 100644
--- a/_meta/BACKLOG.md
+++ b/_meta/BACKLOG.md
@@ -45,7 +45,7 @@ Lane = the WORKFLOW.md risk tier (`tiny` / `normal` / `full`).
 | ID-023 | Mid-flight spec amend: a path to add scope to a VALIDATED / in-build spec without restarting the lane (state machine transition BUILDING -> SPECIFYING) | operating-layer-vision gap analysis 2026-05-22 | SPEC-027 | normal | shipped (CHANGELOG [Unreleased]; local branch feat/mid-flight-amend, no tag/PR) |
 | ID-024 | Context-switch affordance across specs (worktree-per-spec) + an explicit ABANDONED terminal (state hygiene; vs the existing `parked`) | operating-layer-vision gap analysis 2026-05-22 | TBD | normal | queued |
 | ID-025 | Re-open-shipped convention: a follow-up spec spun from a SHIPPED one (state machine transition SHIPPED -> TRIAGING) | operating-layer-vision gap analysis 2026-05-22 | TBD | tiny | queued |
-| ID-026 | Release-hygiene guard: a kit-health line or hook that flags VERSION cut-but-untagged AND `[Unreleased]` spanning more than one spec (catch the messy-release state before it compounds) | SPEC-027 retro 2026-05-22 (recurrence of the SPEC-018 finding; clears the PHILOSOPHY section-5 bar) | SPEC-028 | normal | executing |
+| ID-026 | Release-hygiene guard: a kit-health line or hook that flags VERSION cut-but-untagged AND `[Unreleased]` spanning more than one spec (catch the messy-release state before it compounds) | SPEC-027 retro 2026-05-22 (recurrence of the SPEC-018 finding; clears the PHILOSOPHY section-5 bar) | SPEC-028 | normal | shipped (CHANGELOG [Unreleased]; local branch feat/release-hygiene-guard, no tag/PR; the guard self-fired on its own ship) |
 | ID-027 | spec-validate autonomy-gate lens: for a spec whose behavior runs inside an autonomous loop (`/execute`, `/goal`), check whether it lets the loop make a scope / architecture / risk decision without a human gate | SPEC-027 retro 2026-05-22 (the cycle's HIGH was caught at review, should have been catchable at validate) | TBD | normal | queued |
 | ID-028 | execute.md: acknowledge disjoint-file parallel worker dispatch (sequential is the safe default only for shared-file or dependent tasks; independent non-overlapping tasks may parallelize) | SPEC-027 retro 2026-05-22 | TBD | tiny | queued |
 Dependency notes:
diff --git a/docs/specs/SPEC-028-release-hygiene-guard.md b/docs/specs/SPEC-028-release-hygiene-guard.md
index 867d715..989afd2 100644
--- a/docs/specs/SPEC-028-release-hygiene-guard.md
+++ b/docs/specs/SPEC-028-release-hygiene-guard.md
@@ -1,6 +1,6 @@
 # Spec: Release-hygiene guard (phantom-cut warn)
 Generated: 2026-05-22
-Status: VALIDATED
+Status: SHIPPED
 
 Source: SPEC-027 retro ("What hurt": the release-hygiene tangle) + SPEC-018 retro (the first occurrence); ID-026. Goal draft: `.claude/goals/release-hygiene-guard.md`. A two-cycle recurrence, which clears the PHILOSOPHY section-5 bar for promoting a check.
 

From e4ef22f33d1b1d9016b7777774b4b6bd447dfe6b Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 08:42:28 +0700
Subject: [PATCH 54/56] docs(retro): capture the SPEC-028 release-hygiene guard
 cycle

Signal: three verification layers caught three distinct drifts of one DEC-005
invariant (the guard's own two inlined copies); the guard self-fired on its own
ship. Tension surfaced: "no premature abstraction (3x)" is wrong for
identical-by-contract logic. The phantom cut + 3-deep stack remain to be resolved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 .../RETRO-2026-05-22-release-hygiene-guard.md | 33 +++++++++++++++++++
 1 file changed, 33 insertions(+)
 create mode 100644 docs/retro/RETRO-2026-05-22-release-hygiene-guard.md

diff --git a/docs/retro/RETRO-2026-05-22-release-hygiene-guard.md b/docs/retro/RETRO-2026-05-22-release-hygiene-guard.md
new file mode 100644
index 0000000..04f7efb
--- /dev/null
+++ b/docs/retro/RETRO-2026-05-22-release-hygiene-guard.md
@@ -0,0 +1,33 @@
+# Retro: SPEC-028 release-hygiene guard (ID-026)
+Date: 2026-05-22
+Sprint: single-session dogfood, assign -> spec (surface decision) -> spec-validate -> execute -> review -> docs -> ship. Normal lane (the surface choice was warn, not hook).
+
+## Metrics
+- Tasks planned: 3, completed: 3, deferred: 0.
+- Commits: 9 (3 surface/test edits, 3 fix commits, spec-progress + docs + ship-bookkeeping). Atomic, conventional.
+- Files changed: 7 (+84 / -7). No new command, no new hook, no new file except the spec (Approach A: warn-only convention on two existing surfaces).
+- Verification: 3/3 task-verifier PASS (TASK-001 needed 1 fix retry); integration-checker FAIL:fixable then PASS (1 fix); doc-verifier PASS (16/16). meta 256/256 (+2), hooks 92/92.
+- Completeness log: clean. Doc-impact: MANUAL (/ship + /kit-health) + CHANGELOG moved; no new command so no plugin/marketplace/README-table impact.
+
+## Key commits
+- `0958107` kit-health check; `0a27c62` ship warn; `2b9f2a1` meta-test pin; `abea75d` + `5ebdbec` + `b5ddf9f` the three DEC-005 alignment fixes (one per verification layer).
+
+## What worked
+- **Surface-decision-first kept the build minimal and set the lane correctly.** Deciding the detection surface in `/spec` (before writing the spec) determined the lane (warn surfaces -> normal, not full) and the research genuinely sharpened the design: a `tests/test-meta.sh` hard assertion was rejected because "VERSION untagged" is a legitimate release transient + CI shallow-clone fragile, and the real signal is the phantom cut (untagged version), not `[Unreleased]` accumulation (which the kit does deliberately). The spec was right-sized as a result.
+- **The three verification layers each caught a DIFFERENT drift of the SAME invariant.** DEC-005 said the two inlined copies (ship.md + kit-health.md) must be byte-identical. task-verifier caught the outer-guard drift (`[ -d .git ]` vs `git rev-parse --git-dir`, breaks in worktrees); integration-checker caught the inner-guard drift (the empty-VERSION condition); `/user:review` caught the soft accumulation-signal drift (grep-heading-exists vs awk-non-empty). Three layers, three distinct drifts, on the guard's own two copies. This is the strongest evidence yet for the layered pipeline AND a live demonstration of why identical-by-contract duplication is dangerous.
+- **The guard self-fired on its own ship.** At `/user:ship` Step 4a the new warn correctly detected the live phantom cut (`v1.6.0` untagged, `[Unreleased]` accumulating). End-to-end dogfood proof, in situ, warn-only (it reported, did not block).
+- **Pinning the check shape in the spec (DEC-005) gave the verifiers something concrete to check.** Without the "identical shape" contract written down, the three drifts would have passed silently.
+
+## What hurt
+- **DEC-003's "inline until 3 uses" produced 2 copies that drifted 3 times in ONE cycle.** The kit's "no premature abstraction (extract at the third occurrence)" rule said keep the check inline at two surfaces. Those two copies then diverged three separate times, each needing a fix round. For logic that MUST be byte-identical across surfaces (identical-by-contract), "wait for the third use" is the wrong heuristic: the cost is paid in drift, not in a premature helper. Partial root cause is mine: my worker prompts were asymmetric (TASK-002 got the exact bash block, TASK-001 got a narrative "check shape"), which seeded the first drift.
+- **The 3-deep branch stack deepened the very tangle this spec targets.** Building the release-hygiene guard added a third stacked branch (agents-md -> mid-flight -> release-hygiene) on top of the untagged `1.6.0`, making the release state messier, the exact thing the guard now warns about. The guard detects the mess; it does not clean it up.
+- **The phantom cut is still unfixed.** `v1.6.0` remains untagged with three specs' worth of `[Unreleased]` above it across three local branches. The guard now nags about it on every ship, which is correct, but the cleanup (integrate the stack, decide the version, tag) is owed and only grows.
+
+## Action items
+- [ ] **Carve out "identical-by-contract" from the no-premature-abstraction rule.** When two copies of a snippet MUST stay byte-identical (a shared check across surfaces), single-source them (a shared snippet/helper) or pin the exact block, even at two uses; do not wait for the third. Evidence: 3 drifts in one cycle. -> propose BACKLOG (a CLAUDE.md code-quality nuance) + relates to the ID-016/ID-020/ID-026 guard-theme.
+- [ ] **Worker-prompt discipline for identical-output tasks.** When two worker tasks must produce identical logic, give BOTH workers the exact same block (not one exact + one narrative). A one-line note in `commands/execute.md`'s worker-dispatch guidance. -> propose BACKLOG (tiny), relates to ID-028.
+- [ ] **Resolve the release state the guard is now nagging about.** Integrate the 3-branch stack (agents-md / mid-flight / release-hygiene), decide the version, and tag, so `v1.6.0` (or its successor) stops being a phantom cut. Needs the maintainer's integration decision. -> propose BACKLOG (the real cleanup the guard points at).
+- [ ] **ID-024 (context-switch / worktree-per-spec) is reinforced** by the 3-deep stack pain; bump its priority consideration. -> already queued (ID-024), note the reinforcement.
+
+## Kit feedback
+The layered verification design is strongly validated this cycle: three independent layers caught three distinct drifts of one invariant that any single layer would have missed (the per-task verifier could not see the cross-surface inner-guard drift; the integration-checker's scoped structural comparison did not flag the soft accumulation-signal drift; review caught that). The same cycle exposed a tension in the kit's own rules: "no premature abstraction (3x)" is wrong for identical-by-contract logic, where duplication drifts faster than it accumulates uses. The fix is not to drop the rule but to carve out the identical-by-contract case. Secondary: the kit cheerfully let a 3-deep branch stack form on an untagged version; the new guard now warns about the end state, but nothing discourages the stack from forming (ID-024 territory).

From 366547e43a39374b76dda28e19e9b1405916afa0 Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 08:58:31 +0700
Subject: [PATCH 55/56] chore(backlog): add ID-029..030 from the SPEC-028 retro

ID-029 carve out identical-by-contract from the no-premature-abstraction rule
(DEC-003's two copies drifted three times in one cycle); ID-030 worker-prompt
symmetry for identical-output tasks. The release-state cleanup stays a maintainer
decision (recorded in the retro), not a backlog row.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 _meta/BACKLOG.md | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/_meta/BACKLOG.md b/_meta/BACKLOG.md
index d385d49..7fda0af 100644
--- a/_meta/BACKLOG.md
+++ b/_meta/BACKLOG.md
@@ -48,6 +48,8 @@ Lane = the WORKFLOW.md risk tier (`tiny` / `normal` / `full`).
 | ID-026 | Release-hygiene guard: a kit-health line or hook that flags VERSION cut-but-untagged AND `[Unreleased]` spanning more than one spec (catch the messy-release state before it compounds) | SPEC-027 retro 2026-05-22 (recurrence of the SPEC-018 finding; clears the PHILOSOPHY section-5 bar) | SPEC-028 | normal | shipped (CHANGELOG [Unreleased]; local branch feat/release-hygiene-guard, no tag/PR; the guard self-fired on its own ship) |
 | ID-027 | spec-validate autonomy-gate lens: for a spec whose behavior runs inside an autonomous loop (`/execute`, `/goal`), check whether it lets the loop make a scope / architecture / risk decision without a human gate | SPEC-027 retro 2026-05-22 (the cycle's HIGH was caught at review, should have been catchable at validate) | TBD | normal | queued |
 | ID-028 | execute.md: acknowledge disjoint-file parallel worker dispatch (sequential is the safe default only for shared-file or dependent tasks; independent non-overlapping tasks may parallelize) | SPEC-027 retro 2026-05-22 | TBD | tiny | queued |
+| ID-029 | Carve out "identical-by-contract" from the no-premature-abstraction (3x) rule: when two snippet copies MUST stay byte-identical (a shared check across surfaces), single-source or pin the exact block even at two uses, do not wait for the third | SPEC-028 retro 2026-05-22 (DEC-003's two inlined copies drifted three times in one cycle) | TBD | normal | queued |
+| ID-030 | execute.md worker-dispatch guidance: when two tasks must produce identical logic, give both workers the exact same block, not one exact + one narrative (the asymmetry that seeded the SPEC-028 drift) | SPEC-028 retro 2026-05-22 | TBD | tiny | queued |
 Dependency notes:
 - ID-012 P1 (spec stop-criteria) shipped (SPEC-012, normal lane; see CHANGELOG); P2 (loop QA gate) is held until the pointer-`/goal` pattern has real runs; it will build on the now-shipped SPEC-006 completeness clauses + the existing verification pipeline.
 - ID-002 (internal absorption lane, SPEC-007) is parked; independent of the orchestration chain.
@@ -56,7 +58,8 @@ Dependency notes:
 - ID-020 / ID-021 came out of the SPEC-024 retro; ID-020 (verifier absence-checks) relates to ID-016 (both are guard/verifier hardening) and is the highest-signal kit finding of the cycle.
 - ID-022 came out of writing `docs/PLAYBOOK.md` (scenarios S2/S5): the freeform-intent gap. It unparks the SPEC-024-deferred griller entry; SPEC-026 drafts it. Independent of the other chains.
 - ID-023 / ID-024 / ID-025 came out of `docs/operating-layer-vision.md` (the SDLC state-machine gap analysis): the transitions with no clean path today. ID-022 is the highest-priority of the operating-layer set; these three follow it. All trace to the vision doc as their north-star. ID-023 shipped (SPEC-027); ID-024 / ID-025 remain queued.
-- ID-026 / ID-027 / ID-028 came out of the SPEC-027 retro. ID-026 (release-hygiene guard) is a two-cycle recurrence (SPEC-018 + SPEC-027) and relates to the ID-016 / ID-020 guard-promotion theme; it is the highest-signal of the three. ID-027 (validate autonomy-gate lens) would have caught the SPEC-027 review HIGH earlier. ID-028 is tiny-lane polish.
+- ID-026 / ID-027 / ID-028 came out of the SPEC-027 retro. ID-026 (release-hygiene guard) is a two-cycle recurrence (SPEC-018 + SPEC-027) and relates to the ID-016 / ID-020 guard-promotion theme; it is the highest-signal of the three. ID-027 (validate autonomy-gate lens) would have caught the SPEC-027 review HIGH earlier. ID-028 is tiny-lane polish. ID-026 shipped (SPEC-028); ID-027 / ID-028 remain queued.
+- ID-029 / ID-030 came out of the SPEC-028 retro (DEC-003's two inlined check copies drifted three times in one cycle, each caught at a different verification layer). ID-029 (identical-by-contract abstraction carve-out) is the substantive insight and relates to the kit's code-quality rules; ID-030 (worker-prompt symmetry) is tiny-lane polish and relates to ID-028. The third retro action item, resolving the release-state phantom cut (integrate the agents-md -> mid-flight -> release-hygiene stack and tag), is deliberately NOT a backlog row: it is a maintainer integration decision, recorded in the SPEC-028 retro.
 
 ## Schema
 

From 3a5e816fd11b7b17bc733569e20193168006cc5f Mon Sep 17 00:00:00 2001
From: Han Ngo <nntruonghan@gmail.com>
Date: Fri, 22 May 2026 09:06:35 +0700
Subject: [PATCH 56/56] chore(release): cut the 1.6.0 CHANGELOG section

Finalize [Unreleased] -> [1.6.0] - 2026-05-22 (VERSION/plugin.json already 1.6.0).
Resolves the phantom-cut content; the SPEC-028 guard's warn clears once v1.6.0 is
tagged on master after this PR merges.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 CHANGELOG.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 4816870..519ac3a 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -4,6 +4,8 @@ All notable changes to dwarves-kit are documented here.
 
 ## [Unreleased]
 
+## [1.6.0] - 2026-05-22
+
 ### Changed
 
 - **Dropped hand-maintained component counts and stripped spec IDs from the WORKFLOW contract.** The exact `N hooks / N commands / N agents / N skill` strings are removed from every live surface (`.claude-plugin/{plugin,marketplace}.json`, `README.md`, `MANUAL.md`, `CLAUDE.md`, the `docs/architecture.md` component table, `tool.toml`) and replaced with qualitative phrasing; the `92`/`238` test-suite-total comments in `CLAUDE.md` are dropped too. Counts rot silently: this swept up live drift the guards never caught (`architecture.md` said 19 commands, `tool.toml` said 12 hooks / 18 commands / 9 agents). With nothing left to keep in sync, both count tests in `tests/test-meta.sh` are removed (the ID-013 component-count guard added earlier this cycle and the older command-count parity test); meta suite 238 -> 213. `WORKFLOW.md` no longer cites `SPEC-NNN`/`ADR-NNN` inline (the `docs/specs/SPEC-NNN-<slug>.md` filename pattern stays): rules are stated by concept with one provenance pointer to `docs/specs/` + `docs/decisions/`, and the count-sweep chore paragraph is replaced by a version-only note. SPEC-004 + SPEC-005 `Status:` reconciled VALIDATED -> SHIPPED (shipped alongside SPEC-006).