dwarvesf · tieubao · May 22, 2026 · May 21, 2026 · May 21, 2026 · May 21, 2026
diff --git a/AGENTS.md b/AGENTS.md
@@ -0,0 +1,89 @@
+# AGENTS.md: the operating layer
+
+> Tool-agnostic front door. Any agent runtime (Claude Code, Codex, Gemini, a
+> human) reads this first to learn how work is done in this repo. It carries the
+> portable operate-contract: what to read, how to run a unit of work, when work
+> is done, and when to stop and ask.
+
+## Enforcement boundary (read this first)
+
+**This file is a contract, not a guardrail.** Enforcement is Claude-Code-only: the
+hooks (safety-gate, push-to-main blocker, anti-rationalization Stop hook, the
+verification pipeline) are what actually block bad outcomes, and they run only
+under Claude Code. Under any other runtime (Codex, Gemini, a bare LLM)
+`AGENTS.md` is **advisory only**: it tells an agent what to do, but nothing
+enforces it. Do not assume the guardrails are portable. They are not, until the
+v3.x multi-runtime agent-hook work lands. See `docs/PHILOSOPHY.md` (honesty rule:
+never over-claim portable enforcement) and `CLAUDE.md` for the CC-specific layer
+(hooks, slash commands, plugin).
+
+The rest of this file is four portable zones. The goal-crafter
+(`commands/assign.md`) projects them into a six-section `/goal` (see "How a goal
+is composed" at the end).
+
+---
+
+## 1. Read in this order
+
+Orient before you touch anything. Read top to bottom; stop when you have enough.
+
+1. **AGENTS.md** (this file) - how work is done here; the operate-contract.
+2. **CLAUDE.md** - the Claude-Code layer: stack, structure, rules, hooks, commands, plugin.
+3. **docs/specs/SPEC-NNN-<slug>.md** - the active spec; the shared contract for the cycle. Read its `## Verification` and `## After state` before implementing.
+4. **docs/architecture.md** / **WORKFLOW.md** - reference, not required per task. Read `docs/architecture.md` for how the pieces fit; read `WORKFLOW.md` for the lanes and the gate at each phase boundary.
+
+## 2. Task loop
+
+How to do one unit of work. The smallest verifiable increment, verified, committed.
+
+1. **Size the lane.** Pick `tiny` / `normal` / `full` / `bug` / `backfill` per `WORKFLOW.md`. When in doubt between two lanes, take the heavier one.
+2. **Read the spec and its acceptance criteria.** For a spec-driven task: the active spec's task row, its AC, its `## Verification`, and its `## After state`. No spec (tiny lane): the one obvious edit.
+3. **Implement the smallest verifiable increment.** One logical change. No speculative features, no premature abstraction; clarity over cleverness.
+4. **Verify.** Run the spec's `## Verification` command (or the lane's check). Do not claim a result you did not run.
+5. **Commit.** Conventional commit, one logical change. No spec/ticket IDs in the subject line.
+
+If you cannot make progress, see zone 4 (Pause if) and stop with a named blocker note. Do not churn.
+
+## 3. Done means
+
+A task or goal is done only when **its acceptance criteria are met AND the
+verifier actually ran the check**, not when you claim they pass. Self-reported
+"done" is not proof.
+
+Concretely, done means: **acceptance criteria met, the check actually ran (not
+just asserted), review recorded + report written, and the final response says
+what changed and what was not attempted.** If you could not run the check, report
+that plainly; the anti-rationalization hook is the backstop for premature
+completion under Claude Code, but the honesty obligation is yours under any
+runtime.
+
+## 4. Pause if (ask a human)
+
+Stop and ask a human before acting on any of these. These are decisions with
+direction or irreversible cost that a goal loop must not make on its own.
+
+- **Architecture direction** - a change to how the pieces fit, a new component, an interface or data-model shape.
+- **Source-of-truth hierarchy** - which file or section is canonical when two disagree (for example, moving the operate-contract between `AGENTS.md`, `CLAUDE.md`, and `WORKFLOW.md`).
+- **Validation removal** - weakening, deleting, or bypassing a test, an assertion, a hook, or any guardrail.
+- **Risk-classification change** - moving work to a lighter lane, or narrowing a `full`-lane trigger (auth, authz, hooks, data model, data loss, audit/security, external provider, API contract, migration).
+- **Privacy / security** - secrets, credentials, access scope, anything that touches what data leaves the repo or who can reach it.
+
+When you pause: write the named blocker, state the decision you are not making and why, and stop.
+
+---
+
+## How a goal is composed (for `commands/assign.md`)
+
+The goal-crafter projects these four zones into a six-section `/goal`. The mapping
+is a **composition, not 1:1**: two of the six sections come from the active spec,
+not from this file. Keep the four zone names stable; renaming one without updating
+`commands/assign.md` breaks the projection.
+
+| `/goal` section | Source |
+|---|---|
+| Context-to-read | AGENTS.md zone 1 (Read in this order) |
+| Constraints | CLAUDE.md / AGENTS.md rules |
+| Operating rules | AGENTS.md zone 2 (Task loop) |
+| Validation loop | the active spec's `## Verification` |
+| Done-when | AGENTS.md zone 3 (Done means) + the active spec's `## After state` |
+| Pause-if | AGENTS.md zone 4 (Pause if) |
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -4,12 +4,22 @@ All notable changes to dwarves-kit are documented here.
 
 ## [Unreleased]
 
+## [1.6.0] - 2026-05-22
+
 ### Changed
 
 - **Dropped hand-maintained component counts and stripped spec IDs from the WORKFLOW contract.** The exact `N hooks / N commands / N agents / N skill` strings are removed from every live surface (`.claude-plugin/{plugin,marketplace}.json`, `README.md`, `MANUAL.md`, `CLAUDE.md`, the `docs/architecture.md` component table, `tool.toml`) and replaced with qualitative phrasing; the `92`/`238` test-suite-total comments in `CLAUDE.md` are dropped too. Counts rot silently: this swept up live drift the guards never caught (`architecture.md` said 19 commands, `tool.toml` said 12 hooks / 18 commands / 9 agents). With nothing left to keep in sync, both count tests in `tests/test-meta.sh` are removed (the ID-013 component-count guard added earlier this cycle and the older command-count parity test); meta suite 238 -> 213. `WORKFLOW.md` no longer cites `SPEC-NNN`/`ADR-NNN` inline (the `docs/specs/SPEC-NNN-<slug>.md` filename pattern stays): rules are stated by concept with one provenance pointer to `docs/specs/` + `docs/decisions/`, and the count-sweep chore paragraph is replaced by a version-only note. SPEC-004 + SPEC-005 `Status:` reconciled VALIDATED -> SHIPPED (shipped alongside SPEC-006).
 
 ### Added
 
+- **Release-hygiene guard: a warn on a phantom version cut (SPEC-028).** Closes the recurring messy-release state the SPEC-018 + SPEC-027 retros both flagged (`VERSION`/`plugin.json` cut to a version that was never tagged, with `[Unreleased]` piling on top). Two warn-only surfaces detect the **phantom cut** (`VERSION` names a version with no matching `git tag`), plus a soft heads-up when `[Unreleased]` is non-empty above it: a `/user:ship` Step-4a warn (at the version/tag decision) and a `commands/kit-health.md` check (run before tagging, repo-scoped, degrades to a no-op outside the repo). Warn-only, never blocks (Approach A; PHILOSOPHY reserves hard blocks for safety, and an untagged cut is a legitimate release transient, so a hard `tests/test-meta.sh` tag assertion was rejected). Both surfaces share one pinned check shape (whitespace-strip + `git tag -l "v$VER"` + a `git rev-parse --git-dir` degrade guard + the same `[Unreleased]` non-empty awk), pinned by 2 `tests/test-meta.sh` assertions (meta suite to 256). The verification pipeline earned its keep here: task-verifier caught one DEC-005 guard drift, the integration-checker caught a second, and `/user:review` caught a third (the soft accumulation-signal drift), all on the guard's own two copies (DEC-005/DEC-006). Source: SPEC-028 / ID-026.
+- **Mid-flight spec amend: a declared path to add scope to a building spec without restarting the lane (SPEC-027).** Closes the `BUILDING -> SPECIFYING -> BUILDING` gap (operating-layer-vision Scenario 7): when a build reveals scope that must be added now ("also do Y"), the operator approves, then the spec is amended in place at a task checkpoint instead of being silently mutated or the lane restarted. Convention, not a new command or hook (Approach A; PHILOSOPHY "earn the abstraction"). Four invariants, canonical in `WORKFLOW.md` "## Mid-flight amend": no lane restart (`Status:` stays VALIDATED, only the delta re-validated), add-only (completed `- [x]` tasks frozen), recorded + operator-approved at a checkpoint (an optional on-demand `## Amendments` provenance section in the spec), resume with `/user:next` (not a fresh `/user:execute`). Projected into `docs/operating-layer-vision.md` §3.3 (the transition row) + §5 (gap closed), `commands/execute.md` (the no-silent-mutation anti-pattern now points at the amend path), `commands/spec.md` (the `## Amendments` template section), `docs/PLAYBOOK.md` Scenario 7 + `docs/ORCHESTRATION.md` 5.4 (operator projections), and pinned by 4 `tests/test-meta.sh` assertions (meta suite to 254). `/user:review` (HIGH) added the operator-approval gate the canonical rule had omitted (DEC-008). Source: SPEC-027 / ID-023.
+- **Kit-root `AGENTS.md`: a tool-agnostic operate-contract front door (SPEC-024).** A portable, four-zone operating layer (the same shape any agent runner can read) that fronts the kit's behavioral contract; `CLAUDE.md` and `WORKFLOW.md` now point at it so the cycle and house rules have one neutral entry point instead of being CLAUDE-only. Ships a downstream template at `examples/hello-spec/AGENTS.md` so a consuming project starts with the same front door. Source: SPEC-024.
+- **`backfill` brownfield lane (SPEC-024)**: an intake lane for adopting the kit into an existing repo (bring the operate-contract, specs convention, and guardrails onto code that predates them), recorded in `WORKFLOW.md` alongside the existing intake lanes.
+- **Six-section goal projection in `/user:assign`**: the goal draft `/user:assign` writes now projects the backlog item across six fixed sections (Context-to-read / Constraints / Operating rules / Validation loop / Done-when / Pause-if) sourced from the `AGENTS.md` zones + the active spec, so a handed-off goal carries fuller context into its lane.
+- **`## After state` section in the `/user:spec` template**: specs now scaffold the observable post-implementation end state, complementing the existing Verification / Open-questions stop-criteria.
+- **doc-impact-map rows for `AGENTS.md` and the top-level files**: the WORKFLOW doc-impact map (reviewed by `/user:ship` + `/user:retro`) now lists `AGENTS.md` and a "new top-level file" trigger so a change that touches them is flagged for a doc update.
+- **Regression coverage for the install settings-merge against a pre-existing third-party hook (test/guard, not a fix)**: `tests/test-meta.sh` gains a block asserting `install.sh`'s `settings.json` merge preserves an existing third-party hook entry. This is added coverage, not a bug fix: the current installer already unions third-party hooks correctly, so no installer change was needed. Plus review-driven anti-drift assertions (CLAUDE.md/WORKFLOW.md carry no read-order restatement) and hello-spec AGENTS.md zone pins. Meta suite to 241 asserts. Source: SPEC-024.
 - **`/user:ui-design` command + the downstream UI-design loop (SPEC-020)**: writes a structured `## UI design` brief (aesthetic-direction preamble + layout + states matrix + named viewports + a11y bars + 3-tier token ladder + voice) into the active spec (else the pre-spec brief), delegates generation to the external `frontend-design` skill (the kit ships no renderer), critiques via `/user:visual-team`, and runs a bounded auto-revise loop (E6 `<user-feedback>` injection-wrap, E7 unconditional accumulated-feedback re-send, terminate on SOLID / RECONSIDER / max-2 cap). Opt-in, report-only, downstream-facing (the PHILOSOPHY carve-out + kit-health allow-note now name both `/user:visual-team` and `/user:ui-design`). The kit is now 20 commands. Brief enriched per the 2026-05-21 deep scan (`docs/research/2026-05-21-ui-design-loop-deep-scan.md`): aesthetic-direction from `frontend-design`'s real input, token ladder + states matrix + a11y bars + voice adapted from `nextlevelbuilder/ui-ux-pro-max-skill` (its renderer / fonts / `.cjs`+`.py` tooling rejected per bash-over-binaries), loop shapes from gstack. Dogfooded through `/user:spec-validate` twice; the re-dogfood caught an unimplementable numeric stop (visual-team emits no combined score) and folded it to a SOLID-verdict stop. Credits: gstack (loop shapes), `frontend-design` (generator), `ui-ux-pro-max-skill` (brief sub-shapes). Source: SPEC-020.
 - **`/user:absorb` command + the absorption ritual (SPEC-004)**: a maintainer-only, proposal-only external-absorption audit that generalizes SPEC-002/SPEC-014's one-shot surveys into a recurring ritual. `docs/ABSORPTION.md` carries it: two lanes (Credits drift re-audit + a seed-rescan of the SPEC-014 survey set, scanning the interest areas workflow/agents/QA/UI), the adoption rubric (>=10), the gate, and the **human merge gate** (discovery + scoring + drafting are automatic; adopting a source or adding it to Credits is maintainer-approved, preserving "synthesize, don't originate"). `/user:absorb` writes a dated, ranked + capped, proposal-only report under `docs/absorption/` (HEAD-SHA baseline for since-last-run, an overflow appendix so a real ADOPT is never dropped, a `git status` self-check); QA/UI candidates needing binaries route to "recommend external". The kit is now 19 commands. Think+Design narrowed lane B to a seed-rescan (web-search discovery deferred as tool-weak). Also corrected the plugin manifests' stale hook/agent counts (now 14 hooks / 11 agents). Source: SPEC-004 + `docs/ABSORPTION.md`; DATA-not-instructions guard from ADR-0008.
 - **Reviewer 5 (Solution-Design & Extensibility Critic)** in `/user:spec-validate`: flags shallow or non-extensible designs, with a calibration clause (no false-positive storm) and a legacy-grace clause for specs predating the richer template. Source: SPEC-008; forked from `superpowers:brainstorming` ("design for isolation and clarity") + its spec-document-reviewer calibration. Not a runtime dependency.
@@ -44,8 +54,6 @@ All notable changes to dwarves-kit are documented here.
 
 ## [1.6.0] - 2026-05-20
 
-Orchestration layer (SPEC-003) plus upstream-audit absorption and lineage hygiene (SPEC-002).
-
 ### Added
 
 - **`WORKFLOW.md`** (repo root): the agent-facing workflow contract. Names each lifecycle phase, routes work by risk tier (tiny / normal / full), and points at the existing guardrail that enforces each boundary. Delivered via the `CLAUDE.md` pointer (auto-loaded each session); it suggests and routes, it does not block. Downstream template ships at `examples/hello-spec/WORKFLOW.md` (`.planning/` path convention). Source: SPEC-003; harness-experimental intake model + the AGENTS.md pattern.

diff --git a/CLAUDE.md b/CLAUDE.md
@@ -57,12 +57,14 @@ The kit unified the spec location onto `docs/specs/SPEC-NNN-<slug>.md` for both
 
 ## Workflow
 
-The kit eats its own dog food. The full lifecycle, the risk-tier lanes, and the gate at each phase boundary live in one place: the [`WORKFLOW.md`](WORKFLOW.md) contract. Read it after this file. For kit-on-kit work, spec drafts live in `docs/specs/` (see Spec location above), not `.planning/`.
+The operate-contract (read-order, task loop, done-definition, the "Pause if" list) is canonical in [`AGENTS.md`](AGENTS.md), the tool-agnostic front door. Read it first; this file is the Claude-Code layer on top of it. Do not restate the operate-contract here; point at `AGENTS.md`.
+
+The kit eats its own dog food. The full lifecycle, the risk-tier lanes, and the gate at each phase boundary live in one place: the [`WORKFLOW.md`](WORKFLOW.md) contract. Read it after `AGENTS.md` and this file. For kit-on-kit work, spec drafts live in `docs/specs/` (see Spec location above), not `.planning/`.
 
 `/user:kit-health` is the maintainer-only self-assessment against PHILOSOPHY.md. Run it before tagging.
 
 Hooks fire on Claude Code events automatically; do not call them manually. Debug with `DWARVES_KIT_DEBUG=1`.
 
 ## Template for downstream projects
 
-If you are scaffolding a new project that will use dwarves-kit, do not copy THIS file. Use `examples/hello-spec/CLAUDE.md` as the template; it shows the full shape (Project, Tech Stack, Commands, Repository Structure, Code Quality Rules, Workflow, Spec Location) with realistic placeholder content.
+If you are scaffolding a new project that will use dwarves-kit, do not copy THIS file. Use `examples/hello-spec/CLAUDE.md` as the template; it shows the full shape (Project, Tech Stack, Commands, Repository Structure, Code Quality Rules, Workflow, Spec Location) with realistic placeholder content. The downstream front door is `examples/hello-spec/AGENTS.md` (the tool-agnostic operate-contract); copy it alongside, with `CLAUDE.md` as the Claude-Code layer on top.