From 1840325a415d219e53fd6a95461968ea716b5aec Mon Sep 17 00:00:00 2001 From: Han Ngo Date: Thu, 21 May 2026 22:42:52 +0700 Subject: [PATCH 01/56] docs(spec): add validated SPEC-024 + accept ADR-0013 AGENTS.md operating-layer cycle planning artifacts: - SPEC-024 drafted and validated via /user:spec-validate (5 lenses); revisions DEC-004 corrected + DEC-005..008 added (four-zone contract, install.sh jq ownership, WORKFLOW dedup, edge-case tip model, doc-map file-trigger row). - ADR-0013 flipped proposed -> accepted with the After-state addendum. - BACKLOG: ID-015 (this work), plus ID-016/017 from the SPEC-025 retro. Co-Authored-By: Claude Opus 4.7 (1M context) --- _meta/BACKLOG.md | 4 + .../0013-agents-md-operating-layer.md | 4 +- .../SPEC-024-agents-md-operating-layer.md | 141 ++++++++++++++++++ 3 files changed, 148 insertions(+), 1 deletion(-) create mode 100644 docs/specs/SPEC-024-agents-md-operating-layer.md diff --git a/_meta/BACKLOG.md b/_meta/BACKLOG.md index 5da6a94..2696e1c 100644 --- a/_meta/BACKLOG.md +++ b/_meta/BACKLOG.md @@ -34,9 +34,13 @@ Lane = the WORKFLOW.md risk tier (`tiny` / `normal` / `full`). | ID-002 | Absorb skills/hooks developed in ops-toolkit into the kit (internal lane) | item e | SPEC-007 (PARKED; see its Parked note) | full | parked | | ID-003 | Deep orchestration scan of the copied repos vs our WORKFLOW (report + absorb plan) | item 2 | `docs/research/2026-05-20-orchestration-deep-scan.md` | normal | shipped (report delivered; recs folded into SPEC-006) | | ID-012 | Goal-loop fidelity (one theme, two parts). P1: stop-criteria in the `/spec` template (pin `## Verification` + `## Open questions` so specs are pointer-`/goal`-ready) -- build now, tiny lane. P2: QA gate around the autonomous loop (verify-into-loops + SPEC-006 completeness clauses + Reviewer 5) -- held until the pointer-`/goal` pattern has real runs | maintainer Q3 2026-05-20 + goal-readiness convergence 2026-05-21 | SPEC-012 | normal (P1) / full (P2) | P1 shipped (see CHANGELOG); P2 held | +| ID-015 | AGENTS.md operating layer: tool-agnostic entrypoint + brownfield backfill lane + six-section goal projection + observable After-state spec section (ships ADR-0013) | ADR-0013 (hoangnb24/harness-experimental study, 2026-05-21) | SPEC-024 | full | speccing | +| ID-016 | Promote the "new SPEC -> BACKLOG status row" doc-impact check to a guard (a hook or a kit-health line); it was missed across multiple cycles | SPEC-025 retro 2026-05-21 (recurrence clears the PHILOSOPHY section 5 bar) | TBD | normal | queued | +| ID-017 | Reconcile retro-file naming across the `retro` skill body + description, CLAUDE.md, and WORKFLOW.md to the repo's actual `RETRO-YYYY-MM-DD-.md` convention | SPEC-025 retro 2026-05-21 | TBD | tiny | queued | Dependency notes: - ID-012 P1 (spec stop-criteria) shipped (SPEC-012, normal lane; see CHANGELOG); P2 (loop QA gate) is held until the pointer-`/goal` pattern has real runs; it will build on the now-shipped SPEC-006 completeness clauses + the existing verification pipeline. - ID-002 (internal absorption lane, SPEC-007) is parked; independent of the orchestration chain. +- ID-016 / ID-017 came out of the SPEC-025 retro; both are kit-internal hygiene, independent of the SPEC-024 chain. ## Schema diff --git a/docs/decisions/0013-agents-md-operating-layer.md b/docs/decisions/0013-agents-md-operating-layer.md index d870b75..2f565c9 100644 --- a/docs/decisions/0013-agents-md-operating-layer.md +++ b/docs/decisions/0013-agents-md-operating-layer.md @@ -1,6 +1,8 @@ # ADR-0013: AGENTS.md as the tool-agnostic operating-layer entrypoint -## Status: proposed (2026-05-21). +## Status: accepted (2026-05-21). Scope approved 2026-05-21; ships as backlog item ID-015 / SPEC-024 (full lane, all six decisions). Entry stays BACKLOG-ID-first; the freeform "griller" entry (a casual intent with no ID) is deferred per that scope call. + +**Addendum (2026-05-21):** SPEC-024 scope also adds an observable **After-state** section: a definition-of-done picture made of statements that are false-now / true-after and each checkable by a human or command. It lands as a `## After state` spec-template section (feeding `## Acceptance Criteria`) and is projected into the six-section goal's `Done-when`. The matching anatomy piece is already added to the personal `goal-craft` skill so it applies beyond this repo. Load-bearing rule: observable, not narrated, or it is fluff and gets cut (PHILOSOPHY: every file justifies itself). ## Context A study of `hoangnb24/harness-experimental` (OpenAI "harness engineering" framing) showed an operator writing a rich, multi-section `/goal` (Context to read first / Constraints / Operating rules / Validation loop / Done when / Pause if) and running it reliably against a brownfield repo. That altitude is not prompt skill. Each section is a pointer to a *named artifact* the harness installs into the consuming repo: an ordered source-of-truth list, a feature-intake/risk doc, a test matrix, a done-definition, and an "ask before" list, all anchored by a single agent entrypoint, `AGENTS.md`. The operator references structure the repo already carries; they do not invent it in the prompt. diff --git a/docs/specs/SPEC-024-agents-md-operating-layer.md b/docs/specs/SPEC-024-agents-md-operating-layer.md new file mode 100644 index 0000000..c338ad5 --- /dev/null +++ b/docs/specs/SPEC-024-agents-md-operating-layer.md @@ -0,0 +1,141 @@ +# Spec: AGENTS.md operating layer + brownfield backfill lane +Generated: 2026-05-21 +Status: VALIDATED + +> Implements ADR-0013 (accepted 2026-05-21) and its After-state addendum. Backlog: ID-015 (full lane). +> Validated 2026-05-21 via /user:spec-validate (5 lenses); revisions recorded in the Decision Log (DEC-004 corrected, DEC-005..DEC-008 added). +> Research note: the standard brownfield 4-agent research pass was skipped because the relevant +> surfaces were already mapped in the session that produced ADR-0013 (WORKFLOW.md, CLAUDE.md, +> commands/assign.md, commands/spec.md, install.sh, the WORKFLOW doc-impact map). No CONTEXT.md was +> regenerated; this spec + ADR-0013 are the implementation context. Regenerate CONTEXT.md at +> /user:execute time only if a worker needs more than the files named per task. + +## Problem +The kit out-enforces `hoangnb24/harness-experimental` (real hooks, verifier, push-blocker vs their advisory markdown) but under-ships the *legible in-repo operating layer a `/goal` can point at*: + +- The entrypoint is `CLAUDE.md` (+ `WORKFLOW.md`), which is Claude-Code-specific. There is no tool-agnostic front door. +- "Pause if / ask a human" is enforced by safety hooks but never *stated* as a directive a goal can mirror. +- There is no brownfield "review the codebase and backfill docs" entry; the cycle is greenfield-feature oriented (`/think -> /spec`). + +Consequence: operators hand-write the rich six-section `/goal` structure every time, with no in-repo referent, so they never reach the altitude shown in the trigger screenshot. The harness-experimental operator does not write that structure into the prompt; they reference structure the repo already carries (its `AGENTS.md` read-order, done-definition, and "ask before" list). We lack that carrier. + +## Solution + +### Approaches considered +- **A: `AGENTS.md` canonical operating layer; `CLAUDE.md` becomes a thin CC-specific pointer (chosen).** Tradeoff: some operate-contract prose moves out of CLAUDE.md; one-time churn. +- **B: Keep `CLAUDE.md` canonical, add `AGENTS.md` as a pointer to it.** Rejected: a Codex/Gemini agent reading `AGENTS.md` gets redirected to a CC-specific file; defeats portability. +- **C: Ship the full harness-experimental scaffold (empty `product/`, `stories/`, `TEST_MATRIX.md`).** Rejected: violates PHILOSOPHY "every file must justify its existence" + "no phantom features". Their product *is* the scaffold; ours is not. + +### Chosen approach + why +Adopt `AGENTS.md` as the tool-agnostic front door carrying the portable operate-contract (ordered read list, task loop, done-definition, explicit "Pause if" list). `CLAUDE.md` shrinks to the CC-only layer (hooks, slash commands, plugin) and points at `AGENTS.md` for the operate-contract. Add a `backfill` brownfield lane to `WORKFLOW.md`. Make the goal-crafter (`commands/assign.md`) emit the six-section operating directive, each section projecting an `AGENTS.md` artifact. Add an observable `## After state` section to the spec template and project it into the goal's `Done-when`. (ADR-0013 decisions 1, 2, 4, 5 + addendum. Decisions 3 and 6 are honored as Out-of-Scope guards below.) + +### Extensibility & boundaries +- **Load-bearing dimension: number of agent runtimes.** `AGENTS.md` is plain markdown any runtime reads. Enforcement stays Claude-Code-only (the hooks). Adding a runtime means it *reads* `AGENTS.md`; it does NOT inherit the guardrails until the v3.x agent-hook work lands. The spec must not over-claim portability of enforcement. +- **Units (each independently testable):** `AGENTS.md` (entrypoint), `CLAUDE.md` (CC layer), `WORKFLOW.md` (lanes), `commands/assign.md` (projection), `commands/spec.md` (after-state template). Each describable in <=3 sentences. + +### Architecture +``` +AGENTS.md (tool-agnostic front door) + | sections: ordered read list, task loop, done-definition, "Pause if" + | + +-- CLAUDE.md (CC-specific: hooks/commands/plugin) --points to--> AGENTS.md (no restating) + +-- WORKFLOW.md (lanes incl. new `backfill`) <-- AGENTS.md points here for lane selection + +-- commands/assign.md (goal-crafter) + | projects --> six-section /goal: + | Context-to-read <- AGENTS.md read list + | Constraints <- AGENTS.md/CLAUDE.md rules + | Operating rules <- AGENTS.md task loop + | Validation loop <- spec ## Verification + | Done-when <- AGENTS.md done-definition + spec ## After state + | Pause-if <- AGENTS.md "Pause if" list + +-- commands/spec.md template (new ## After state, feeds ## Acceptance Criteria) +``` + +## Technical Design + +### Interfaces (I/O contract) +- **`AGENTS.md`** consumes: nothing (it is the root). Produces: **four portable operate-contract zones** (ordered read list, task loop, done-definition, "Pause if" list). The goal-crafter *composes* the six-section `/goal` from these four zones plus the active spec's `## Verification` and `## After state` (and the CLAUDE.md rules), per the Architecture diagram above; the mapping is a composition, not 1:1. Invariant: the four zone names are stable; renaming one without updating `assign.md` breaks the projection. +- **`commands/assign.md`** consumes: the resolved `AGENTS.md` sections + the `_meta/BACKLOG.md` row for `ID-NNN`. Produces: a six-section operating directive in `.claude/goals/.md`. Invariant: BACKLOG-ID-first entry (no freeform-intent path in this spec). +- **`commands/spec.md`** template: gains a `## After state` block between `## Solution` and `## Acceptance Criteria`. Invariant: every After-state bullet is observable (checkable by a human or a command), never narrated prose. + +### Data model changes +- New file `AGENTS.md` at kit root and `examples/hello-spec/AGENTS.md`. +- `commands/spec.md` template gains a `## After state` section. + +### API changes (the goal-crafter contract) +- `commands/assign.md` output shape changes from a tight contract goal (outcome/verify/scope/blocker) to the six-section operating directive, each section pointing at `AGENTS.md`. `Done-when` includes the spec's observable After-state. + +### UI changes +None. The kit ships no UI. + +### Infrastructure changes +- `WORKFLOW.md`: add the `backfill` lane to the lane table + the cycle. Add an `AGENTS.md` companion-doc row to the doc-impact map. The existing bolded self-maintaining rule is scoped to a new top-level *dir*; `AGENTS.md` is a top-level *file*, so TASK-007 also adds a "new top-level file" trigger row so the map self-maintains for files too. +- `install.sh`: add the `AGENTS.md` copy tip alongside the existing `CLAUDE.md` tip. (Separately, this cycle's install dogfood exercised the settings merge/clean against a `settings.json` that already carried third-party hooks, the previously-untested path; TASK-009 owns it as a regression test and applies any jq fix the test surfaces. See DEC-004.) +- `tests/test-meta.sh`: assert `AGENTS.md` carries the four portable zones, and that `assign.md` carries the six-section projection. + +## Task Breakdown +Each task is atomic: implementable in one session, fits in ~50% of a context window. + +### Phase 1: Operating layer +- [ ] TASK-001: Write `AGENTS.md` at kit root: ordered read list, task loop, done-definition (which MUST include "review recorded + report written, and the final response says what changed and what was not attempted"), and an explicit "Pause if / ask a human" list (architecture direction, source-of-truth hierarchy, validation removal, risk-classification change, privacy/security). State plainly that enforcement is Claude-Code-only; under other runtimes `AGENTS.md` is advisory. - AC: file exists; `grep -q 'Pause if' AGENTS.md`; the four portable zones (read list, task loop, done-definition, Pause-if) are present and named so `assign.md` can project them into the six-section goal. +- [ ] TASK-002: Make the CC-layer docs point at `AGENTS.md` for the operate-contract (replace, don't duplicate). Two files: (a) kit-root `CLAUDE.md` (NOT `examples/hello-spec/CLAUDE.md`, which TASK-003 handles) gets/keeps a one-line pointer to `AGENTS.md`; (b) `WORKFLOW.md` is the file that actually carries the operate-contract prose today (`## Required reading` ordered list + `## Completion contract` done-definition), so reconcile it to point at `AGENTS.md` as the read-order/done source rather than restating it. Pick one direction (`AGENTS.md` canonical, `WORKFLOW.md` points) and apply it to both. - AC: neither `CLAUDE.md` nor `WORKFLOW.md` restates the ordered read-order + done-definition that now live in `AGENTS.md`; both reference `AGENTS.md`; `WORKFLOW.md`'s required-reading list names `AGENTS.md` first. +- [ ] TASK-003: Add `examples/hello-spec/AGENTS.md` as the downstream template (realistic placeholder content, same shape). - AC: file exists; the hello-spec README/template note references it. + +### Phase 2: Lanes + projection +- [ ] TASK-004: Add a `backfill` brownfield lane to `WORKFLOW.md` (lane table + cycle): review an existing codebase and write the operating-layer docs without changing application behavior; spec-optional, doc-output, no app-code edits. - AC: `grep -q backfill WORKFLOW.md`; lane row + a one-line description present. +- [ ] TASK-005: Make `commands/assign.md` emit the six-section operating directive, each section projecting an `AGENTS.md` artifact, with `Done-when` including the spec's observable After-state. - AC: `assign.md` documents the six-section projection and the After-state in `Done-when`; the contract-goal-only shape is replaced, not left alongside. +- [ ] TASK-006: Add `## After state` to the `commands/spec.md` template (between `## Solution` and `## Acceptance Criteria`), with the observable-not-narrated rule inline. - AC: `grep -q '## After state' commands/spec.md`; the rule text is present. + +### Phase 3: Sync (the doc-impact map) +- [ ] TASK-007: Update the `WORKFLOW.md` doc-impact map: add a "new top-level file" trigger row (the current bolded self-maintaining rule covers only a new top-level *dir*, so a top-level *file* like `AGENTS.md` slips through), add the `AGENTS.md` companion-doc row, and a `backfill`-lane reference. - AC: the map names `AGENTS.md` and carries a top-level-file trigger row. +- [ ] TASK-008: Update `README.md` "Project structure" and `docs/architecture.md` cross-refs to include `AGENTS.md` and the `backfill` lane. - AC: both name `AGENTS.md`. +- [ ] TASK-009: Update `tests/test-meta.sh` to assert `AGENTS.md` exists and carries the four portable zones, and that `assign.md` carries the six-section projection. Add a regression test that runs `install.sh` into a throwaway HOME whose `settings.json` already contains a third-party hook (the dogfood case), and asserts the user's hook survives the merge and the result is valid JSON. If that test fails on the current `install.sh`, apply the minimal jq fix to the merge/clean within this task. - AC: `bash tests/test-meta.sh` exercises the four-zone + projection checks AND the merge-with-existing-hooks case, and passes. +- [ ] TASK-010: Version bump (`VERSION`, `.claude-plugin/plugin.json`, `tool.toml`), `CHANGELOG.md` entry (include the install.sh jq fix as a fix line), and update the `install.sh` `AGENTS.md` tip. - AC: versions in sync; CHANGELOG records the feature + the install fix. + +## Acceptance Criteria (global) +- [ ] All tasks pass their individual acceptance criteria. +- [ ] `AGENTS.md` is the front door (root + examples) carrying the four portable zones; `CLAUDE.md` and `WORKFLOW.md` point at it (no restating); `backfill` lane present; `assign.md` projects the six sections; `spec.md` template has `## After state`. +- [ ] No regressions: `bash tests/test-meta.sh && bash tests/test-hooks.sh` both exit 0. + +## Verification +`bash tests/test-meta.sh && bash tests/test-hooks.sh` exit 0 AND `grep -q 'Pause if' AGENTS.md` AND `grep -q backfill WORKFLOW.md` AND `grep -q '## After state' commands/spec.md` + +## After state +Observable; each bullet is false today and a real check once shipped. +- [ ] `AGENTS.md` exists at the kit root and `grep -q 'Pause if' AGENTS.md` passes. (Today: no `AGENTS.md` anywhere in the kit.) +- [ ] `CLAUDE.md` points at `AGENTS.md` and no longer restates the read-order / done-definition / pause list. +- [ ] `WORKFLOW.md` lists a `backfill` lane. (Today: only `tiny` / `normal` / `full` / `bug`.) +- [ ] `commands/assign.md` produces a six-section directive (Context-to-read / Constraints / Operating rules / Validation loop / Done-when / Pause-if), not a one-line contract goal. +- [ ] `commands/spec.md` template contains a `## After state` section, and a newly generated spec carries observable after-state bullets. +- [ ] `examples/hello-spec/AGENTS.md` exists so a downstream repo inherits the front door. + +## Edge Cases +1. A non-CC runtime (Codex/Gemini) reads `AGENTS.md` but gets no hook enforcement: `AGENTS.md` must state enforcement is CC-only so no operator assumes the guardrails are portable. +2. A downstream repo already has its own `AGENTS.md`: `install.sh` does not copy `AGENTS.md` at all. Like `CLAUDE.md`, it only prints a copy *tip* (`install.sh:299`), so there is no clobber path to guard. The real requirement is that the printed tip read as copy-if-absent, not a blind overwrite (it must not imply the installer will replace an existing `AGENTS.md`). (Commands are symlinked and rules are copy-with-skip; `AGENTS.md` follows the tip model, not either of those.) +3. An After-state bullet that cannot be made checkable: `/spec-validate` and `/review` must flag it and require cutting or rewriting it (the observable-not-narrated rule is enforced at the gate, not just stated). + +## Failure modes +| Failure class | Detection signal | Mitigation / recovery | +|---|---|---| +| `AGENTS.md` ↔ `CLAUDE.md` / `WORKFLOW.md` drift (a CC-layer doc restates the contract) | doc-verifier or `/review` finds duplicated read-order/done/pause in `CLAUDE.md` or `WORKFLOW.md` | `AGENTS.md` is the single source; both `CLAUDE.md` and `WORKFLOW.md` point only (replace, don't duplicate) | +| Over-claiming portable enforcement | a reviewer reads "works with Codex" as "enforced under Codex" | `AGENTS.md` states advisory-only under non-CC runtimes; PHILOSOPHY honesty rule | +| After-state rots into aspirational fluff | bullets are not checkable on re-read | observable-not-narrated rule + the spec-validate/review gate cut non-checkable bullets | +| `backfill` lane silently edits app code | a backfill run changes behavior, not just docs | lane definition forbids app-behavior change; pause-if covers it | + +## Out of Scope +- **Freeform "griller" entry** (a casual intent with no BACKLOG ID): BACKLOG-ID-first stays canonical (the 2026-05-21 scope call). Why: preserves the detector/mutator + traceability discipline. +- **Portable enforcement / agent-hooks for Codex/Gemini** (ADR-0013 decision 6): deferred to the v3.x multi-runtime work. We add portable *guidance*, not portable *guardrails*. +- **Empty `product/` / `stories/` / `TEST_MATRIX.md` scaffolds** (ADR-0013 decision 3): created only when real content exists. + +## Decision Log +- DEC-001: `AGENTS.md` canonical, `CLAUDE.md` a thin pointer. Rationale: tool-agnostic front door + no duplication. Alternatives B (CLAUDE.md canonical) and C (full empty scaffold) rejected. Who: ADR-0013 (human-accepted). +- DEC-002: `backfill` is a new lane, not a reuse of `normal`. Rationale: brownfield doc-backfill has different inputs (existing code) and a no-app-change constraint. Who: ADR-0013 decision 4. +- DEC-003: After-state is a spec section + a `Done-when` projection target, governed by observable-not-narrated. Rationale: a checkable picture both human and agent verify; fluff is rejected by PHILOSOPHY. Who: ADR-0013 addendum (human-requested). +- DEC-004: The install dogfood for this cycle exercised the settings merge/clean against a `settings.json` that already carried third-party hooks, the previously-untested path. TASK-009 owns this end to end: it adds the regression test and applies any minimal jq fix the test surfaces. (Earlier framing claimed the bugs "were already fixed"; `install.sh` is unchanged on this branch, so the work is owned by TASK-009, not pre-done.) Rationale: a merge that drops or crashes on a real user's existing hooks blocks install. Who: auto (raised in dogfood), corrected in spec-validate. +- DEC-005: `AGENTS.md` carries **four** portable zones (read list, task loop, done-definition, Pause-if); the six-section `/goal` is a *composition* of those four plus the spec's `## Verification` and `## After state`, not a 1:1 mapping. Rationale: the Architecture diagram already shows two goal sections sourced from the spec, so the earlier "six contract zones in `AGENTS.md` / 1:1" framing (Interfaces, TASK-001 AC, TASK-009) was internally contradictory and unsatisfiable. Who: spec-validate Reviewer 5/4. +- DEC-006: The operate-contract prose (ordered read-order + done-definition) lives in `WORKFLOW.md` today, not kit-root `CLAUDE.md`. TASK-002 therefore reconciles `WORKFLOW.md` (point at `AGENTS.md`), not just `CLAUDE.md`, and a failure-mode row now covers `AGENTS.md`↔`WORKFLOW.md` drift. Rationale: without this the dedup is a no-op on the named file and the real three-way overlap ships. Who: spec-validate Reviewer 3/5. +- DEC-007: Edge Case 2 reframed to the tip model: `install.sh` never copies `AGENTS.md` (tip-only, like `CLAUDE.md`), so the "must not clobber" guard described a mechanism that does not exist. Rationale: accuracy; the real requirement is the tip wording. Who: spec-validate Reviewer 2. +- DEC-008: TASK-007 adds a "new top-level file" trigger row to the doc-impact map. Rationale: the bolded self-maintaining rule is scoped to a new top-level *dir*; a top-level *file* like `AGENTS.md` was not actually covered, so line-71's "already requires" claim was false. Who: spec-validate Reviewer 4. + +## Open questions +(none; a /goal loop appends here if it hits a decision this spec does not cover, then stops) From 7c5015c6110ef67384c075e0e8ffe5794aef837f Mon Sep 17 00:00:00 2001 From: Han Ngo Date: Thu, 21 May 2026 22:45:06 +0700 Subject: [PATCH 02/56] docs(agents): add AGENTS.md operating-layer front door Tool-agnostic front door carrying the four portable operate-contract zones (read-in-order list, task loop, done-definition, Pause-if list) that commands/assign.md projects into the six-section /goal. States plainly that enforcement is Claude-Code-only (the hooks); under other runtimes AGENTS.md is advisory only. Co-Authored-By: Claude Opus 4.7 (1M context) --- AGENTS.md | 89 +++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 89 insertions(+) create mode 100644 AGENTS.md diff --git a/AGENTS.md b/AGENTS.md new file mode 100644 index 0000000..6b4b19b --- /dev/null +++ b/AGENTS.md @@ -0,0 +1,89 @@ +# AGENTS.md: the operating layer + +> Tool-agnostic front door. Any agent runtime (Claude Code, Codex, Gemini, a +> human) reads this first to learn how work is done in this repo. It carries the +> portable operate-contract: what to read, how to run a unit of work, when work +> is done, and when to stop and ask. + +## Enforcement boundary (read this first) + +**This file is a contract, not a guardrail.** Enforcement is Claude-Code-only: the +hooks (safety-gate, push-to-main blocker, anti-rationalization Stop hook, the +verification pipeline) are what actually block bad outcomes, and they run only +under Claude Code. Under any other runtime (Codex, Gemini, a bare LLM) +`AGENTS.md` is **advisory only**: it tells an agent what to do, but nothing +enforces it. Do not assume the guardrails are portable. They are not, until the +v3.x multi-runtime agent-hook work lands. See `docs/PHILOSOPHY.md` (honesty rule: +never over-claim portable enforcement) and `CLAUDE.md` for the CC-specific layer +(hooks, slash commands, plugin). + +The rest of this file is four portable zones. The goal-crafter +(`commands/assign.md`) projects them into a six-section `/goal` (see "How a goal +is composed" at the end). + +--- + +## 1. Read in this order + +Orient before you touch anything. Read top to bottom; stop when you have enough. + +1. **AGENTS.md** (this file) - how work is done here; the operate-contract. +2. **CLAUDE.md** - the Claude-Code layer: stack, structure, rules, hooks, commands, plugin. +3. **docs/specs/SPEC-NNN-.md** - the active spec; the shared contract for the cycle. Read its `## Verification` and `## After state` before implementing. +4. **docs/architecture.md** / **WORKFLOW.md** - reference, not required per task. Read `docs/architecture.md` for how the pieces fit; read `WORKFLOW.md` for the lanes and the gate at each phase boundary. + +## 2. Task loop + +How to do one unit of work. The smallest verifiable increment, verified, committed. + +1. **Size the lane.** Pick `tiny` / `normal` / `full` / `bug` / `backfill` per `WORKFLOW.md`. When in doubt between two lanes, take the heavier one. +2. **Read the spec and its acceptance criteria.** For a spec-driven task: the active spec's task row, its AC, its `## Verification`, and its `## After state`. No spec (tiny lane): the one obvious edit. +3. **Implement the smallest verifiable increment.** One logical change. No speculative features, no premature abstraction; clarity over cleverness. +4. **Verify.** Run the spec's `## Verification` command (or the lane's check). Do not claim a result you did not run. +5. **Commit.** Conventional commit, one logical change. No spec/ticket IDs in the subject line. + +If you cannot make progress, see zone 4 (Pause if) and stop with a named blocker note. Do not churn. + +## 3. Done means + +A task or goal is done only when **its acceptance criteria are met AND the +verifier actually ran the check**, not when you claim they pass. Self-reported +"done" is not proof. + +Concretely, done means: **acceptance criteria met, the check actually ran (not +just asserted), review recorded + report written, and the final response says +what changed and what was not attempted.** If you could not run the check, report +that plainly; the anti-rationalization hook is the backstop for premature +completion under Claude Code, but the honesty obligation is yours under any +runtime. + +## 4. Pause if (ask a human) + +Stop and ask a human before acting on any of these. These are decisions with +direction or irreversible cost that a goal loop must not make on its own. + +- **Architecture direction** - a change to how the pieces fit, a new component, an interface or data-model shape. +- **Source-of-truth hierarchy** - which file or section is canonical when two disagree (for example, moving the operate-contract between `AGENTS.md`, `CLAUDE.md`, and `WORKFLOW.md`). +- **Validation removal** - weakening, deleting, or bypassing a test, an assertion, a hook, or any guardrail. +- **Risk-classification change** - moving work to a lighter lane, or narrowing a `full`-lane trigger (auth, authz, hooks, data model, data loss, audit/security, external provider, API contract, migration). +- **Privacy / security** - secrets, credentials, access scope, anything that touches what data leaves the repo or who can reach it. + +When you pause: write the named blocker, state the decision you are not making and why, and stop. + +--- + +## How a goal is composed (for `commands/assign.md`) + +The goal-crafter projects these four zones into a six-section `/goal`. The mapping +is a **composition, not 1:1**: two of the six sections come from the active spec, +not from this file. Keep the four zone names stable; renaming one without updating +`commands/assign.md` breaks the projection. + +| `/goal` section | Source | +|---|---| +| Context-to-read | AGENTS.md zone 1 (Read in this order) | +| Constraints | CLAUDE.md / AGENTS.md rules | +| Operating rules | AGENTS.md zone 2 (Task loop) | +| Validation loop | the active spec's `## Verification` | +| Done-when | AGENTS.md zone 3 (Done means) + the active spec's `## After state` | +| Pause-if | AGENTS.md zone 4 (Pause if) | From c343d9297e036c5e32939dd2ec7aebb34734a9b2 Mon Sep 17 00:00:00 2001 From: Han Ngo Date: Thu, 21 May 2026 22:48:04 +0700 Subject: [PATCH 03/56] docs(workflow): point CC-layer docs at AGENTS.md AGENTS.md is the canonical operate-contract front door. Remove the restated ordered read-order and core done-definition from WORKFLOW.md (now pointers to AGENTS.md zone 1 and zone 3) and add the AGENTS.md front-door pointer to CLAUDE.md. WORKFLOW.md required-reading now names AGENTS.md first; kit-specific completeness clauses are kept. Co-Authored-By: Claude Opus 4.7 (1M context) --- CLAUDE.md | 4 +++- WORKFLOW.md | 18 +++++++++++------- 2 files changed, 14 insertions(+), 8 deletions(-) diff --git a/CLAUDE.md b/CLAUDE.md index e93ec6a..9640766 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -57,7 +57,9 @@ The kit unified the spec location onto `docs/specs/SPEC-NNN-.md` for both ## Workflow -The kit eats its own dog food. The full lifecycle, the risk-tier lanes, and the gate at each phase boundary live in one place: the [`WORKFLOW.md`](WORKFLOW.md) contract. Read it after this file. For kit-on-kit work, spec drafts live in `docs/specs/` (see Spec location above), not `.planning/`. +The operate-contract (read-order, task loop, done-definition, the "Pause if" list) is canonical in [`AGENTS.md`](AGENTS.md), the tool-agnostic front door. Read it first; this file is the Claude-Code layer on top of it. Do not restate the operate-contract here; point at `AGENTS.md`. + +The kit eats its own dog food. The full lifecycle, the risk-tier lanes, and the gate at each phase boundary live in one place: the [`WORKFLOW.md`](WORKFLOW.md) contract. Read it after `AGENTS.md` and this file. For kit-on-kit work, spec drafts live in `docs/specs/` (see Spec location above), not `.planning/`. `/user:kit-health` is the maintainer-only self-assessment against PHILOSOPHY.md. Run it before tagging. diff --git a/WORKFLOW.md b/WORKFLOW.md index b2a890c..b672f77 100644 --- a/WORKFLOW.md +++ b/WORKFLOW.md @@ -7,9 +7,12 @@ > hook, and the verification pipeline. ## Required reading (in order) -1. CLAUDE.md - project context: stack, structure, rules -2. docs/specs/SPEC-NNN-.md - the active spec; the shared contract for the cycle -3. docs/architecture.md - how the pieces fit (reference; not required per task) +`AGENTS.md` is the front door and owns the read-order. Read it first, then this +list; do not restate the order here. See `AGENTS.md` zone 1 ("Read in this order"). +1. AGENTS.md - the operate-contract: read-order, task loop, done-definition, "Pause if" +2. CLAUDE.md - the Claude-Code layer: stack, structure, rules, hooks, commands, plugin +3. docs/specs/SPEC-NNN-.md - the active spec; the shared contract for the cycle +4. docs/architecture.md - how the pieces fit (reference; not required per task) ## Size the work first (risk-tiered intake) Pick a lane before you start. Smaller work skips ceremony. @@ -71,10 +74,11 @@ session start **Detector/mutator split.** `/user:start` and `/user:next` only read and render; `/user:assign` is the only mutator. **Activator-agnostic.** `/user:assign` writes only the `.claude/goals/.md` draft and surfaces its body; activation (starting the loop) is done by whatever primitive is present (the built-in `/goal`, the `ralph-loop` plugin, or the `goal-craft` skill). The kit NEVER writes `.claude/last-goal.md`; if no activator exists, the draft is a plain reusable file. **"Even the goal loop follows WORKFLOW"** is delivered honestly: the safety subset is hard-enforced by existing hooks (anti-rationalization, the verification pipeline, the push-to-main blocker); decision/doc completeness is warned + logged to `~/.claude/dwarves-kit/logs/completeness.log` and reviewed at `/user:ship` + `/user:retro`, not hard-blocked mid-loop (PHILOSOPHY rejects hard-gating process completeness). ## Completion contract -A task is done only when its acceptance criteria are met and the verifier has -actually run the tests, not when you claim they pass. If you cannot run the -check, report that plainly; the anti-rationalization hook is the backstop for -premature completion. Self-reported "done" is not proof; the task-verifier is. +The done-definition is canonical in `AGENTS.md` zone 3 ("Done means"); do not +restate it here. In the kit, the task-verifier is what proves "done" (self-reported +"done" is not proof), and the anti-rationalization hook is the backstop for +premature completion. The clauses below add kit-specific completeness checks on top +of that done-definition. ### Completeness clauses (warn + log, reviewed at ship) Two self-check clauses run during Build/Reflect. Both WARN and LOG to `~/.claude/dwarves-kit/logs/completeness.log` (the `spec-drift-guard` logging shape); neither hard-blocks. `/user:ship` and `/user:retro` review that log at the gate. Hard blocks stay reserved for the safety subset (PHILOSOPHY rejects hard-gating process completeness). From 5e4bfc1ea143b41fbb88cdfb2475e0e0b3348736 Mon Sep 17 00:00:00 2001 From: Han Ngo Date: Thu, 21 May 2026 22:50:41 +0700 Subject: [PATCH 04/56] docs(agents): add hello-spec AGENTS.md downstream template Mirror the kit-root AGENTS.md four-zone shape (Read in this order, Task loop, Done means, Pause if) as a downstream template for projects consuming dwarves-kit, voiced for the spm sample CLI rather than copied verbatim. Keeps the CC-only enforcement caveat and the goal-composition note. Reference it from the hello-spec README (front-of-reading-order). Co-Authored-By: Claude Opus 4.7 (1M context) --- examples/hello-spec/AGENTS.md | 87 +++++++++++++++++++++++++++++++++++ examples/hello-spec/README.md | 16 ++++--- 2 files changed, 97 insertions(+), 6 deletions(-) create mode 100644 examples/hello-spec/AGENTS.md diff --git a/examples/hello-spec/AGENTS.md b/examples/hello-spec/AGENTS.md new file mode 100644 index 0000000..1ede012 --- /dev/null +++ b/examples/hello-spec/AGENTS.md @@ -0,0 +1,87 @@ +# AGENTS.md: the operating layer (spm) + +> Tool-agnostic front door for the `spm` project. Any agent runtime (Claude Code, +> Codex, Gemini, a human) reads this first to learn how work is done in this repo. +> It carries the portable operate-contract: what to read, how to run a unit of +> work, when work is done, and when to stop and ask. This is the downstream +> template that ships with dwarves-kit; replace the `spm` specifics with your own. + +## Enforcement boundary (read this first) + +**This file is a contract, not a guardrail.** `spm` uses dwarves-kit, and the +kit's enforcement is Claude-Code-only: the hooks (safety-gate, push-to-main +blocker, anti-rationalization Stop hook, the verification pipeline) are what +actually block bad outcomes, and they run only under Claude Code. Under any other +runtime (Codex, Gemini, a bare LLM) this file is **advisory only**: it tells an +agent what to do, but nothing enforces it. Do not assume the guardrails are +portable. See `CLAUDE.md` for the Claude-Code layer (stack, rules, where the spec +lives). + +The rest of this file is four portable zones. A goal-crafter projects them into a +six-section operating directive (see "How a goal is composed" at the end). + +--- + +## 1. Read in this order + +Orient before you touch anything. Read top to bottom; stop when you have enough. + +1. **AGENTS.md** (this file) - how work is done here; the operate-contract. +2. **CLAUDE.md** - the project anchor: what `spm` is, the stack (Python 3.12 / uv / pytest / ruff), code-quality rules, where specs live. +3. **docs/specs/SPEC-NNN-.md** - the active spec; the shared contract for the cycle. Read its `## Verification` and `## After state` before implementing. (Today that is `docs/specs/SPEC-001-version-flag.md`.) +4. **WORKFLOW.md** - reference, not required per task. Read it for the lanes and the gate at each phase boundary. + +## 2. Task loop + +How to do one unit of work. The smallest verifiable increment, verified, committed. + +1. **Size the lane.** Pick `tiny` / `normal` / `full` / `bug` / `backfill` per `WORKFLOW.md`. When in doubt between two lanes, take the heavier one. +2. **Read the spec and its acceptance criteria.** For a spec-driven task: the active spec's task row, its AC, its `## Verification`, and its `## After state`. No spec (tiny lane): the one obvious edit. +3. **Implement the smallest verifiable increment.** One logical change. No speculative features (`spm` does install/freeze/list; new subcommands need a spec), no premature abstraction (no `BaseCommand` until there are 6 commands); clarity over cleverness. +4. **Verify.** Run the spec's `## Verification` command, or the lane's check: `uv run pytest && uv run ruff check .`. Do not claim a result you did not run. +5. **Commit.** Conventional commit, one logical change. No spec/ticket IDs in the subject line. + +If you cannot make progress, see zone 4 (Pause if) and stop with a named blocker note. Do not churn. + +## 3. Done means + +A task or goal is done only when **its acceptance criteria are met AND the +verifier actually ran the check**, not when you claim they pass. Self-reported +"done" is not proof. + +Concretely, done means: **acceptance criteria met, the check actually ran (not +just asserted), review recorded + report written, and the final response says +what changed and what was not attempted.** If you could not run the check, report +that plainly; under Claude Code the anti-rationalization hook is the backstop for +premature completion, but the honesty obligation is yours under any runtime. + +## 4. Pause if (ask a human) + +Stop and ask a human before acting on any of these. These are decisions with +direction or irreversible cost that a goal loop must not make on its own. + +- **Architecture direction** - a new subcommand, a new module, a change to the argparse dispatch shape, or any new public interface for the CLI. +- **Source-of-truth hierarchy** - which file or section is canonical when two disagree (for example, the operate-contract split across `AGENTS.md`, `CLAUDE.md`, and `WORKFLOW.md`). +- **Validation removal** - weakening, deleting, or skipping a test, a ruff rule, or any check; suppressing a `pip` subprocess error instead of propagating its exit code. +- **Risk-classification change** - moving work to a lighter lane, or narrowing a `full`-lane trigger (anything touching how `pip` is invoked, the wheel build, the published package, or `pyproject.toml` dependencies). +- **Privacy / security** - secrets, credentials, access scope, anything that touches what data leaves the repo or who can reach it. + +When you pause: write the named blocker, state the decision you are not making and why, and stop. + +--- + +## How a goal is composed + +A goal-crafter projects these four zones into a six-section operating directive. +The mapping is a **composition, not 1:1**: two of the six sections come from the +active spec, not from this file. Keep the four zone names stable; renaming one +breaks the projection. + +| Goal section | Source | +|---|---| +| Context-to-read | AGENTS.md zone 1 (Read in this order) | +| Constraints | CLAUDE.md / AGENTS.md rules | +| Operating rules | AGENTS.md zone 2 (Task loop) | +| Validation loop | the active spec's `## Verification` | +| Done-when | AGENTS.md zone 3 (Done means) + the active spec's `## After state` | +| Pause-if | AGENTS.md zone 4 (Pause if) | diff --git a/examples/hello-spec/README.md b/examples/hello-spec/README.md index 10b063a..143cf05 100644 --- a/examples/hello-spec/README.md +++ b/examples/hello-spec/README.md @@ -4,11 +4,14 @@ A small, self-contained example showing what dwarves-kit produces when you run i ## What this shows -A hypothetical Python CLI called `spm` ("simple package manager") needs a `--version` flag. This directory contains the three files dwarves-kit would generate or rely on to ship that change end-to-end. Each is a normal output, not a mock. +A hypothetical Python CLI called `spm` ("simple package manager") needs a `--version` flag. This directory contains the files dwarves-kit would generate or rely on to ship that change end-to-end. Each is a normal output, not a mock. -The point: see what `docs/specs/SPEC-001-version-flag.md` actually looks like; see how `CLAUDE.md` anchors a contractor's session; see how the kit's commands chain. +The point: see what `docs/specs/SPEC-001-version-flag.md` actually looks like; see how `AGENTS.md` is the tool-agnostic front door a downstream repo inherits; see how `CLAUDE.md` anchors a contractor's session; see how the kit's commands chain. -## The three files +## The files + +### `AGENTS.md` +The downstream front door. Tool-agnostic operating layer (read-order, task loop, done-definition, "Pause if" list) that any agent runtime reads first. This is the realistic-placeholder template a project inherits from dwarves-kit; `CLAUDE.md` sits under it as the Claude-Code-specific layer. Enforcement (hooks, verifier) is Claude-Code-only; under other runtimes `AGENTS.md` is advisory. ### `CLAUDE.md` Project anchor. Read by Claude Code at session start, by `/user:spec` for stack detection, by every worker subagent in `/user:execute`. Tells Claude what the project is, what the stack is, where the spec lives, and what code-quality rules apply. @@ -49,9 +52,10 @@ PASS -> next task; FAIL:fixable -> fix-agent (max 2 retries); FAIL:escalate -> h ## Reading order -1. Read `CLAUDE.md` first; it's the project anchor a contractor sees first. -2. Then `docs/specs/SPEC-001-version-flag.md`: the actual feature plan. -3. Notice: no `--version` code yet. The spec is the input to `/user:execute`. The example stops at "spec ready to build", not "feature implemented", to keep the example readable. +1. Read `AGENTS.md` first; it's the tool-agnostic front door (read-order, task loop, done-definition, Pause-if). +2. Then `CLAUDE.md`: the Claude-Code project anchor that sits under the front door. +3. Then `docs/specs/SPEC-001-version-flag.md`: the actual feature plan. +4. Notice: no `--version` code yet. The spec is the input to `/user:execute`. The example stops at "spec ready to build", not "feature implemented", to keep the example readable. --- From 81929c65ae6b1163ac551a9d4ee22834b2d944ed Mon Sep 17 00:00:00 2001 From: Han Ngo Date: Thu, 21 May 2026 22:52:52 +0700 Subject: [PATCH 05/56] docs(spec): mark operating-layer tasks verified TASK-001..003 (Phase 1) passed task-verifier; SPEC-024 checklist updated. Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/specs/SPEC-024-agents-md-operating-layer.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/specs/SPEC-024-agents-md-operating-layer.md b/docs/specs/SPEC-024-agents-md-operating-layer.md index c338ad5..e26879e 100644 --- a/docs/specs/SPEC-024-agents-md-operating-layer.md +++ b/docs/specs/SPEC-024-agents-md-operating-layer.md @@ -77,8 +77,8 @@ None. The kit ships no UI. Each task is atomic: implementable in one session, fits in ~50% of a context window. ### Phase 1: Operating layer -- [ ] TASK-001: Write `AGENTS.md` at kit root: ordered read list, task loop, done-definition (which MUST include "review recorded + report written, and the final response says what changed and what was not attempted"), and an explicit "Pause if / ask a human" list (architecture direction, source-of-truth hierarchy, validation removal, risk-classification change, privacy/security). State plainly that enforcement is Claude-Code-only; under other runtimes `AGENTS.md` is advisory. - AC: file exists; `grep -q 'Pause if' AGENTS.md`; the four portable zones (read list, task loop, done-definition, Pause-if) are present and named so `assign.md` can project them into the six-section goal. -- [ ] TASK-002: Make the CC-layer docs point at `AGENTS.md` for the operate-contract (replace, don't duplicate). Two files: (a) kit-root `CLAUDE.md` (NOT `examples/hello-spec/CLAUDE.md`, which TASK-003 handles) gets/keeps a one-line pointer to `AGENTS.md`; (b) `WORKFLOW.md` is the file that actually carries the operate-contract prose today (`## Required reading` ordered list + `## Completion contract` done-definition), so reconcile it to point at `AGENTS.md` as the read-order/done source rather than restating it. Pick one direction (`AGENTS.md` canonical, `WORKFLOW.md` points) and apply it to both. - AC: neither `CLAUDE.md` nor `WORKFLOW.md` restates the ordered read-order + done-definition that now live in `AGENTS.md`; both reference `AGENTS.md`; `WORKFLOW.md`'s required-reading list names `AGENTS.md` first. +- [x] TASK-001 (DONE 7c5015c, verified): Write `AGENTS.md` at kit root: ordered read list, task loop, done-definition (which MUST include "review recorded + report written, and the final response says what changed and what was not attempted"), and an explicit "Pause if / ask a human" list (architecture direction, source-of-truth hierarchy, validation removal, risk-classification change, privacy/security). State plainly that enforcement is Claude-Code-only; under other runtimes `AGENTS.md` is advisory. - AC: file exists; `grep -q 'Pause if' AGENTS.md`; the four portable zones (read list, task loop, done-definition, Pause-if) are present and named so `assign.md` can project them into the six-section goal. +- [x] TASK-002 (DONE c343d92, verified): Make the CC-layer docs point at `AGENTS.md` for the operate-contract (replace, don't duplicate). Two files: (a) kit-root `CLAUDE.md` (NOT `examples/hello-spec/CLAUDE.md`, which TASK-003 handles) gets/keeps a one-line pointer to `AGENTS.md`; (b) `WORKFLOW.md` is the file that actually carries the operate-contract prose today (`## Required reading` ordered list + `## Completion contract` done-definition), so reconcile it to point at `AGENTS.md` as the read-order/done source rather than restating it. Pick one direction (`AGENTS.md` canonical, `WORKFLOW.md` points) and apply it to both. - AC: neither `CLAUDE.md` nor `WORKFLOW.md` restates the ordered read-order + done-definition that now live in `AGENTS.md`; both reference `AGENTS.md`; `WORKFLOW.md`'s required-reading list names `AGENTS.md` first. - [ ] TASK-003: Add `examples/hello-spec/AGENTS.md` as the downstream template (realistic placeholder content, same shape). - AC: file exists; the hello-spec README/template note references it. ### Phase 2: Lanes + projection From f9c04cf2bda52bccb0044fa0d648a88886f0da86 Mon Sep 17 00:00:00 2001 From: Han Ngo Date: Thu, 21 May 2026 22:54:01 +0700 Subject: [PATCH 06/56] feat(workflow): add backfill brownfield lane Brownfield lane: review an existing codebase and write the operating-layer docs (AGENTS.md / CLAUDE.md / specs) without changing application behavior. Doc-output only, no app-code edits, /spec optional. Consistent with the lanes list in AGENTS.md zone 2. The lane explicitly forbids app-behavior change, which the "backfill lane silently edits app code" failure-mode row depends on. Co-Authored-By: Claude Opus 4.7 (1M context) --- WORKFLOW.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/WORKFLOW.md b/WORKFLOW.md index b672f77..8492aed 100644 --- a/WORKFLOW.md +++ b/WORKFLOW.md @@ -23,6 +23,7 @@ Pick a lane before you start. Smaller work skips ceremony. | normal | one bounded feature or fix | /spec, /execute, /review, /ship | | full | touches auth, authz, hooks, data model, data loss, audit/security, an external provider, an API contract, a migration, or weakens validation | /think, /spec, /spec-validate, /execute, /review-team, /docs, /ship, /retro | | bug | a defect, regression, or failing test (not a new feature) | /debug (root cause before any fix), then /review | +| backfill | brownfield: review an existing codebase and write the operating-layer docs (AGENTS.md / CLAUDE.md / specs) | review the code, write the docs. Doc-output only; no app-behavior change, no app-code edits. /spec optional. | When in doubt between two lanes, take the heavier one. Anything in the full-lane trigger list uses the full lane unless you explicitly narrow the scope and say why. @@ -62,7 +63,7 @@ session start for whatever goal-loop activator is present, hand off to the lane's first command. Mutator; does NOT execute, never writes last-goal.md. - -> the lane runs tiny | normal | full (see the cycle table above) + -> the lane runs tiny | normal | full | bug | backfill (see the lane table above) normal/full -> /user:spec -> /user:spec-validate -> /user:execute (verify pipeline) (opt-in: /user:devs-team + /user:visual-team before spec; /user:test-plan before execute; /user:ui-design for downstream UI work, after /user:design) From b7bb3ee547a2c31c9c263a2ee1d1ead994739580 Mon Sep 17 00:00:00 2001 From: Han Ngo Date: Thu, 21 May 2026 22:58:40 +0700 Subject: [PATCH 07/56] feat(assign): project six-section goal from AGENTS.md Replace the four-field contract-goal shape in Step 3 (Objective / Scope fence / Termination-on-blocker / Verification) with a six-section operating directive, each section projecting from an AGENTS.md zone or the active spec, matching AGENTS.md's "How a goal is composed" table: Context-to-read <- AGENTS.md zone 1 (Read in this order) Constraints <- AGENTS.md / CLAUDE.md rules + scope fence Operating rules <- AGENTS.md zone 2 (Task loop) Validation loop <- the active spec's ## Verification Done-when <- AGENTS.md zone 3 + the spec's ## After state Pause-if <- AGENTS.md zone 4 (Pause if) Done-when quotes the spec's observable ## After state bullets (falls back to AGENTS.md "Done means" when a spec lacks the section). Scope fence folds into Constraints; termination-on-blocker folds into Pause-if. Step 4 now points at the six-section directive. Spec-first opening move kept. Co-Authored-By: Claude Opus 4.7 (1M context) --- commands/assign.md | 19 +++++++++++-------- 1 file changed, 11 insertions(+), 8 deletions(-) diff --git a/commands/assign.md b/commands/assign.md index 3be3c95..3d491fd 100644 --- a/commands/assign.md +++ b/commands/assign.md @@ -14,15 +14,18 @@ Read `$ARGUMENTS` for the `ID-NNN`. Find that row in the `_meta/BACKLOG.md` Acti If a `.claude/goals/.md` already exists for this id (one draft per id), re-surface the existing draft instead of creating a duplicate, and skip to Step 5. Mirrors SPEC-005 edge 6: the filesystem is the source of truth. -### Step 3: Goal-craft the breakdown +### Step 3: Project the six-section operating directive -From the item's Title + Target artifact, craft a goal with: -- **Objective**: the outcome that makes the item done, artifact-shaped and verifiable. -- **Scope fence**: the files/dirs in play, plus an explicit `Not:` list of adjacent things to leave alone. -- **Termination-on-blocker**: if blocked, write a named blocker note and stop (no churn). -- **Verification**: the command(s) that prove it done. If the Target artifact is a SPEC with a `## Verification` section, point the goal at that doc (a pointer-`/goal`); otherwise name the real check. +From the item's Title + Target artifact, craft the goal as a **six-section operating directive**, not a one-line contract. Each section *projects* from an `AGENTS.md` zone or a section of the active spec; this is the projection contract, and it must match `AGENTS.md`'s "How a goal is composed" table (keep the two in sync, or the `AGENTS.md`↔`assign.md` drift failure-mode fires). The mapping is a composition, not 1:1: two of the six sections come from the active spec, not from `AGENTS.md`. -If the repo is spec-driven and the lane is normal/full, the goal is **spec-first**: its opening move is the lane's first command (`/user:spec`), not building code. This matches the `goal-craft` skill's spec-driven-repo rule. +1. **Context-to-read** <- `AGENTS.md` zone 1 ("Read in this order"). Carry the ordered read list (AGENTS.md, CLAUDE.md, the active `docs/specs/SPEC-NNN-.md`, then reference docs) so the loop orients before touching anything. +2. **Constraints** <- the `AGENTS.md` / `CLAUDE.md` rules plus the item's scope fence: the files/dirs in play and an explicit `Not:` list of adjacent things to leave alone. This is where the old scope fence lives now. +3. **Operating rules** <- `AGENTS.md` zone 2 ("Task loop"): size the lane, read the spec + its AC, implement the smallest verifiable increment, verify, commit (one logical change, no spec/ticket IDs in the subject). +4. **Validation loop** <- the active spec's `## Verification`. If the Target artifact is a SPEC, point the goal at that doc's `## Verification` (a pointer-`/goal`); otherwise name the real check command. Do not write "tests pass"; name the command. +5. **Done-when** <- `AGENTS.md` zone 3 ("Done means") + the active spec's `## After state` (the observable bullets). The goal is done only when its AC are met, the check actually ran, review is recorded + a report written, and the final response says what changed and what was not attempted. Quote the spec's `## After state` observable bullets here verbatim as the done-picture. If the spec lacks an `## After state` section, fall back to `AGENTS.md` "Done means" alone. +6. **Pause-if** <- `AGENTS.md` zone 4 ("Pause if"): stop and ask a human (with a named blocker note, no churn) on architecture direction, source-of-truth hierarchy, validation removal, risk-classification change, or privacy/security. This is where the old termination-on-blocker lives now. + +If the repo is spec-driven and the lane is normal/full, the directive is **spec-first**: its opening move is the lane's first command (`/user:spec`), not building code. This matches the `goal-craft` skill's spec-driven-repo rule. ### Step 4: Write the draft (the SPEC-005 contract / ADR-0011) @@ -36,7 +39,7 @@ target_spec: status: drafted created: --- - + ``` `.claude/` is gitignored (per-machine drafts). The filesystem (`ls .claude/goals/*.md`) is authoritative; if a `.claude/goals/INDEX.md` cache exists, rebuild its row from the files. Do NOT write `.claude/last-goal.md`: the built-in `/goal` owns that slot (ADR-0011). From 86e3aa0b4f02f4ffb83b7a592acb7ea27a892bd0 Mon Sep 17 00:00:00 2001 From: Han Ngo Date: Thu, 21 May 2026 23:01:04 +0700 Subject: [PATCH 08/56] feat(spec): add After state section to spec template Insert an observable definition-of-done section between ## Solution and ## Acceptance Criteria in the commands/spec.md template block. Each bullet is false-now/true-after and checkable by a human or a command. The inline rule states observable, not narrated, or it is fluff and gets cut, anchored to PHILOSOPHY's every-file-justifies-itself honesty rule. Co-Authored-By: Claude Opus 4.7 (1M context) --- commands/spec.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/commands/spec.md b/commands/spec.md index f8b442d..fb8066e 100644 --- a/commands/spec.md +++ b/commands/spec.md @@ -119,6 +119,12 @@ Each task must be atomic: implementable in one session, fits in 50% of a context ### Phase 3: Polish - [ ] TASK-004: [description] — [acceptance criteria] +## After state +The definition-of-done picture. Each bullet is false now and true after, and each is checkable by a human or a command. This feeds `## Acceptance Criteria` below and projects into the goal's `Done-when`. +Rule: observable, not narrated. If a bullet cannot be verified by reading a file, running a command, or seeing a state, it is fluff and gets cut (PHILOSOPHY: every file justifies its existence). Pair the after-state with a "(Today: ...)" current-state note where it sharpens the contrast. +- [ ] [observable end state]. (Today: [current state].) +- [ ] [observable end state, checkable by ``]. + ## Acceptance Criteria (global) - [ ] All tasks pass their individual acceptance criteria - [ ] Tests cover happy path + edge cases listed below From 8a1219178e7c6f6845c7b583a4df4ab09ec8ba44 Mon Sep 17 00:00:00 2001 From: Han Ngo Date: Thu, 21 May 2026 23:02:31 +0700 Subject: [PATCH 09/56] docs(spec): record verified lanes and projection progress TASK-004 backfill lane, TASK-005 six-section projection, TASK-006 After-state template: all passed task-verifier. Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/specs/SPEC-024-agents-md-operating-layer.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/specs/SPEC-024-agents-md-operating-layer.md b/docs/specs/SPEC-024-agents-md-operating-layer.md index e26879e..922190f 100644 --- a/docs/specs/SPEC-024-agents-md-operating-layer.md +++ b/docs/specs/SPEC-024-agents-md-operating-layer.md @@ -82,9 +82,9 @@ Each task is atomic: implementable in one session, fits in ~50% of a context win - [ ] TASK-003: Add `examples/hello-spec/AGENTS.md` as the downstream template (realistic placeholder content, same shape). - AC: file exists; the hello-spec README/template note references it. ### Phase 2: Lanes + projection -- [ ] TASK-004: Add a `backfill` brownfield lane to `WORKFLOW.md` (lane table + cycle): review an existing codebase and write the operating-layer docs without changing application behavior; spec-optional, doc-output, no app-code edits. - AC: `grep -q backfill WORKFLOW.md`; lane row + a one-line description present. -- [ ] TASK-005: Make `commands/assign.md` emit the six-section operating directive, each section projecting an `AGENTS.md` artifact, with `Done-when` including the spec's observable After-state. - AC: `assign.md` documents the six-section projection and the After-state in `Done-when`; the contract-goal-only shape is replaced, not left alongside. -- [ ] TASK-006: Add `## After state` to the `commands/spec.md` template (between `## Solution` and `## Acceptance Criteria`), with the observable-not-narrated rule inline. - AC: `grep -q '## After state' commands/spec.md`; the rule text is present. +- [x] TASK-004 (DONE f9c04cf, verified): Add a `backfill` brownfield lane to `WORKFLOW.md` (lane table + cycle): review an existing codebase and write the operating-layer docs without changing application behavior; spec-optional, doc-output, no app-code edits. - AC: `grep -q backfill WORKFLOW.md`; lane row + a one-line description present. +- [x] TASK-005 (DONE b7bb3ee, verified): Make `commands/assign.md` emit the six-section operating directive, each section projecting an `AGENTS.md` artifact, with `Done-when` including the spec's observable After-state. - AC: `assign.md` documents the six-section projection and the After-state in `Done-when`; the contract-goal-only shape is replaced, not left alongside. +- [x] TASK-006 (DONE 86e3aa0, verified): Add `## After state` to the `commands/spec.md` template (between `## Solution` and `## Acceptance Criteria`), with the observable-not-narrated rule inline. - AC: `grep -q '## After state' commands/spec.md`; the rule text is present. ### Phase 3: Sync (the doc-impact map) - [ ] TASK-007: Update the `WORKFLOW.md` doc-impact map: add a "new top-level file" trigger row (the current bolded self-maintaining rule covers only a new top-level *dir*, so a top-level *file* like `AGENTS.md` slips through), add the `AGENTS.md` companion-doc row, and a `backfill`-lane reference. - AC: the map names `AGENTS.md` and carries a top-level-file trigger row. From 9fc3e251ff2d8425518a45b7bada44c661c8007a Mon Sep 17 00:00:00 2001 From: Han Ngo Date: Thu, 21 May 2026 23:04:07 +0700 Subject: [PATCH 10/56] docs(workflow): map AGENTS.md and top-level-file impacts Extend the WORKFLOW.md doc-impact map for the operating-layer docs: - Add a "new top-level file under the kit root" trigger row, mirroring the existing top-level-dir row, so a top-level file like AGENTS.md no longer slips through the self-maintaining net. - Add an AGENTS.md companion-doc row (CLAUDE.md/WORKFLOW.md pointers, examples/hello-spec/AGENTS.md, commands/assign.md, tests/test-meta.sh). - Note that the backfill lane produces operating-layer docs, so a backfill run knows which docs it writes. - Widen the self-maintaining note to cover both dirs and files. Co-Authored-By: Claude Opus 4.7 (1M context) --- WORKFLOW.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/WORKFLOW.md b/WORKFLOW.md index 8492aed..9daac45 100644 --- a/WORKFLOW.md +++ b/WORKFLOW.md @@ -104,9 +104,13 @@ Per change-type, the companion docs that must move with it. This covers the enum | a new `docs/decisions/` ADR | README + `docs/architecture.md` cross-refs | | a new `docs/specs/SPEC-NNN` | `_meta/BACKLOG.md` status, the spec's `Status:` header | | **a new top-level dir under the kit root** | **this doc-impact map (WORKFLOW.md)**, README "Project structure", `docs/architecture.md` | +| **a new top-level file under the kit root** | **this doc-impact map (WORKFLOW.md)**, README "Project structure", `docs/architecture.md` | +| `AGENTS.md` (kit root) | `CLAUDE.md` + `WORKFLOW.md` pointers (must not drift), `examples/hello-spec/AGENTS.md` (downstream template), `commands/assign.md` (the six-section projection reads its zones), `tests/test-meta.sh` | | any shipped change (normal/full) | `CHANGELOG.md`, `VERSION`, `.claude-plugin/plugin.json` version, `tool.toml` version, `docs/retro/v.md` | -The bolded row is self-maintaining: adding a new top-level dir must update this map. +The bolded rows are self-maintaining: adding a new top-level dir or file must update this map. + +The `backfill` lane (see the lane table) produces operating-layer docs rather than touching a source path: a backfill run writes `AGENTS.md`, `CLAUDE.md`, and any specs for the reviewed codebase, so its companion docs are those it writes. **Version surfaces.** The version string is duplicated and must stay in sync: it lives in `VERSION` (the source of truth), `.claude-plugin/plugin.json`, and `tool.toml`. Bumping the version means updating those; `marketplace.json` inherits it via `"source": "."` and needs no bump. The kit does NOT keep component counts (`N hooks`, `N commands`, etc.) in prose: describe the component set qualitatively, never as a hand-maintained number that silently rots. From f51537e470c810e89610f46b219c2edfda46212c Mon Sep 17 00:00:00 2001 From: Han Ngo Date: Thu, 21 May 2026 23:06:15 +0700 Subject: [PATCH 11/56] docs: cross-reference AGENTS.md and backfill lane README "Project structure" now lists AGENTS.md (the tool-agnostic operate-contract front door) and WORKFLOW.md, and notes CLAUDE.md is the Claude-Code layer on top. README Workflow section names the backfill lane. docs/architecture.md data-flow intro cross-references AGENTS.md as the front door that CLAUDE.md and WORKFLOW.md point at, and names the backfill lane. These are structure listings and cross-references; the operate-contract itself stays in AGENTS.md (not restated here). Co-Authored-By: Claude Opus 4.7 (1M context) --- README.md | 6 +++++- docs/architecture.md | 2 +- 2 files changed, 6 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index ec08296..6a1615d 100644 --- a/README.md +++ b/README.md @@ -156,6 +156,8 @@ Don't run both install paths on the same machine -- hooks would register twice. /user:retro Retrospective (10 min, after shipping) ``` +Work is sized by risk lane before it starts (tiny / normal / full / bug, plus a `backfill` lane for brownfield: review an existing codebase and write the operating-layer docs without changing app behavior). The lanes, the gate at each phase boundary, and the operate-contract the agent follows live in [`AGENTS.md`](AGENTS.md) and [`WORKFLOW.md`](WORKFLOW.md). + ## Debug mode Set `DWARVES_KIT_DEBUG=1` to see what every hook is doing: @@ -216,10 +218,12 @@ These tools complement the kit but are installed separately: ``` dwarves-kit/ tool.toml Kit metadata (name, version, language=bash, deps) + AGENTS.md Tool-agnostic operate-contract front door (any runtime reads it first) + WORKFLOW.md The cycle, the risk-tier lanes, the gate at each boundary MANUAL.md Operator reference for the commands RUNBOOK.md Hook misbehavior diagnosis + recovery README.md / CONTRIBUTING.md / CHANGELOG.md / VERSION / LICENSE - CLAUDE.md Project template + CLAUDE.md Project template; the Claude-Code layer on top of AGENTS.md install.sh / settings.json Bash install path .claude-plugin/ Plugin install path (plugin.json, marketplace.json) .github/workflows/test.yml CI: macOS + Ubuntu test matrix diff --git a/docs/architecture.md b/docs/architecture.md index fba6b98..97aeb79 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -18,7 +18,7 @@ The kit is intentionally flat. Component dirs sit at the top of the repo, not ne ## Data flow through `docs/specs/SPEC-NNN-.md` -> The imperative companion to the descriptive map below is [`WORKFLOW.md`](../WORKFLOW.md) (repo root): the same lifecycle phrased as the contract an agent follows, with risk-tier lanes and the gate at each boundary. +> The front door to all of this is [`AGENTS.md`](../AGENTS.md) (repo root): the tool-agnostic operate-contract that any runtime reads first; `CLAUDE.md` and `WORKFLOW.md` point at it rather than restate it. The imperative companion to the descriptive map below is [`WORKFLOW.md`](../WORKFLOW.md) (repo root): the same lifecycle phrased as the contract an agent follows, with risk-tier lanes (including the `backfill` brownfield lane: review an existing codebase and write the operating-layer docs, doc-output only, no app-behavior change) and the gate at each boundary. `docs/specs/SPEC-NNN-.md` is the shared contract for the full lifecycle. It is the single source of truth that crosses command boundaries: From 6ec78805dac145f14eacd5b5f5da9f526d9f502e Mon Sep 17 00:00:00 2001 From: Han Ngo Date: Thu, 21 May 2026 23:09:25 +0700 Subject: [PATCH 12/56] test(meta): assert AGENTS.md zones, projection, and install merge Pin the cycle's structural outputs so a wording flip fails CI: - kit-root AGENTS.md: the four portable zones, the literal "Pause if", and the Claude-Code-only-enforcement statement. - commands/assign.md: the six-section /goal projection (writer side). - low-cost guards for hello-spec AGENTS.md and spec.md "## After state". Add the install merge-with-existing-hooks regression: seed a HOME whose settings.json already carries a third-party hook, run install.sh through the jq clean+merge path, and assert the hook survives, the JSON stays valid, and a dwarves-kit hook is merged in. The test passes against the current install.sh, so no jq fix was needed (DEC-004: bug not present). Co-Authored-By: Claude Opus 4.7 (1M context) --- tests/test-meta.sh | 144 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 144 insertions(+) diff --git a/tests/test-meta.sh b/tests/test-meta.sh index f0d62fd..1597a8e 100755 --- a/tests/test-meta.sh +++ b/tests/test-meta.sh @@ -149,6 +149,150 @@ INPLACE_BROKEN=$(for f in "$INPLACE_HOME/.claude/dwarves-kit/hooks/"*.sh; do [ - assert_eq "in-place install keeps hook scripts resolvable (broken: ${INPLACE_BROKEN:-none})" "" "$INPLACE_BROKEN" rm -rf "$INPLACE_HOME" +# ============================================================ +echo "" +echo "=== AGENTS.md operating layer (SPEC-024) ===" +# ============================================================ +# Part A: pin the cycle's structural outputs so a wording flip fails CI. +# 1. kit-root AGENTS.md exists + carries the four portable zones + the literal +# "Pause if" + the CC-only-enforcement statement. +# 2. commands/assign.md carries the six-section /goal projection (the writer +# side of the AGENTS.md->assign.md projection). +# 3. Low-cost regression guards for TASK-003 (hello-spec AGENTS.md) and +# TASK-006 (spec.md "## After state"). + +AGENTS_MD="$KIT_DIR/AGENTS.md" +TOTAL=$((TOTAL + 1)) +if [ -f "$AGENTS_MD" ]; then + echo -e " ${GREEN}PASS${NC} AGENTS.md exists at kit root (SPEC-024)" + PASS=$((PASS + 1)) +else + echo -e " ${RED}FAIL${NC} AGENTS.md missing at kit root" + FAIL=$((FAIL + 1)) +fi + +# The four portable zones (DEC-005). Pin the heading literals, not prose. +for ZONE in "## 1. Read in this order" "## 2. Task loop" "## 3. Done means" "## 4. Pause if (ask a human)"; do + TOTAL=$((TOTAL + 1)) + if grep -qF "$ZONE" "$AGENTS_MD" 2>/dev/null; then + echo -e " ${GREEN}PASS${NC} AGENTS.md has zone '$ZONE'" + PASS=$((PASS + 1)) + else + echo -e " ${RED}FAIL${NC} AGENTS.md missing zone '$ZONE'" + FAIL=$((FAIL + 1)) + fi +done + +# The literal "Pause if" (the fourth zone's stable phrase, also the goal section). +TOTAL=$((TOTAL + 1)) +if grep -qF 'Pause if' "$AGENTS_MD" 2>/dev/null; then + echo -e " ${GREEN}PASS${NC} AGENTS.md carries the literal 'Pause if'" + PASS=$((PASS + 1)) +else + echo -e " ${RED}FAIL${NC} AGENTS.md lost the literal 'Pause if'" + FAIL=$((FAIL + 1)) +fi + +# The CC-only-enforcement statement (PHILOSOPHY honesty rule: never over-claim +# portable enforcement). A drift to "enforcement is portable" would be a lie. +TOTAL=$((TOTAL + 1)) +if grep -qF 'Enforcement is Claude-Code-only' "$AGENTS_MD" 2>/dev/null; then + echo -e " ${GREEN}PASS${NC} AGENTS.md states enforcement is Claude-Code-only" + PASS=$((PASS + 1)) +else + echo -e " ${RED}FAIL${NC} AGENTS.md lost the CC-only-enforcement statement" + FAIL=$((FAIL + 1)) +fi + +# commands/assign.md carries the six-section projection (the writer side). A +# wording flip on any section name breaks the AGENTS.md->assign.md projection. +ASSIGN_MD="$KIT_DIR/commands/assign.md" +for SECTION in "Context-to-read" "Constraints" "Operating rules" "Validation loop" "Done-when" "Pause-if"; do + TOTAL=$((TOTAL + 1)) + if grep -qF "$SECTION" "$ASSIGN_MD" 2>/dev/null; then + echo -e " ${GREEN}PASS${NC} assign.md has projection section '$SECTION'" + PASS=$((PASS + 1)) + else + echo -e " ${RED}FAIL${NC} assign.md missing projection section '$SECTION'" + FAIL=$((FAIL + 1)) + fi +done + +# Low-cost regression guards: TASK-003 (hello-spec AGENTS.md w/ "Pause if") and +# TASK-006 (spec.md template's "## After state"). Pin both so they cannot silently +# regress. +DEMO_AGENTS="$KIT_DIR/examples/hello-spec/AGENTS.md" +TOTAL=$((TOTAL + 1)) +if [ -f "$DEMO_AGENTS" ] && grep -qF 'Pause if' "$DEMO_AGENTS" 2>/dev/null; then + echo -e " ${GREEN}PASS${NC} examples/hello-spec/AGENTS.md exists + carries 'Pause if' (TASK-003)" + PASS=$((PASS + 1)) +else + echo -e " ${RED}FAIL${NC} examples/hello-spec/AGENTS.md missing or lost 'Pause if'" + FAIL=$((FAIL + 1)) +fi + +TOTAL=$((TOTAL + 1)) +if grep -qF '## After state' "$KIT_DIR/commands/spec.md" 2>/dev/null; then + echo -e " ${GREEN}PASS${NC} commands/spec.md template carries '## After state' (TASK-006)" + PASS=$((PASS + 1)) +else + echo -e " ${RED}FAIL${NC} commands/spec.md template lost '## After state'" + FAIL=$((FAIL + 1)) +fi + +# ------------------------------------------------------------ +# Part B: install.sh merge-with-existing-hooks regression (DEC-004). +# The existing installer test runs into a HOME with NO settings.json, so it never +# exercises the jq clean+merge path. This test pre-seeds settings.json with a +# THIRD-PARTY hook (a command that does NOT contain "dwarves-kit") and asserts the +# merge preserves it, yields valid JSON, and still pulls in a dwarves-kit hook. +MERGE_HOME=$(mktemp -d) +mkdir -p "$MERGE_HOME/.claude" +THIRD_PARTY_CMD="/opt/acme/hooks/audit-log.sh" +# Build the pre-existing settings via jq so it is always well-formed JSON. +jq -n --arg cmd "$THIRD_PARTY_CMD" '{ + hooks: { + PreToolUse: [ + { matcher: "Bash", hooks: [ { type: "command", command: $cmd } ] } + ] + } +}' > "$MERGE_HOME/.claude/settings.json" + +HOME="$MERGE_HOME" bash "$KIT_DIR/install.sh" >/dev/null 2>&1 +MERGED_SETTINGS="$MERGE_HOME/.claude/settings.json" + +# (a) the third-party hook command survived the merge. +TOTAL=$((TOTAL + 1)) +if grep -qF "$THIRD_PARTY_CMD" "$MERGED_SETTINGS" 2>/dev/null; then + echo -e " ${GREEN}PASS${NC} install merge preserves the third-party hook (DEC-004)" + PASS=$((PASS + 1)) +else + echo -e " ${RED}FAIL${NC} install merge DROPPED the third-party hook (merge bug)" + FAIL=$((FAIL + 1)) +fi + +# (b) the resulting settings.json is valid JSON. +TOTAL=$((TOTAL + 1)) +if jq '.' "$MERGED_SETTINGS" >/dev/null 2>&1; then + echo -e " ${GREEN}PASS${NC} merged settings.json is valid JSON" + PASS=$((PASS + 1)) +else + echo -e " ${RED}FAIL${NC} merged settings.json is not valid JSON (merge corrupted it)" + FAIL=$((FAIL + 1)) +fi + +# (c) at least one dwarves-kit hook was merged in alongside the third-party one. +TOTAL=$((TOTAL + 1)) +KIT_HOOK_COUNT=$(jq '[.hooks | to_entries[] | .value[] | .hooks[] | select(.command | tostring | contains("dwarves-kit"))] | length' "$MERGED_SETTINGS" 2>/dev/null || echo 0) +if [ "${KIT_HOOK_COUNT:-0}" -gt 0 ]; then + echo -e " ${GREEN}PASS${NC} install merge added at least one dwarves-kit hook ($KIT_HOOK_COUNT)" + PASS=$((PASS + 1)) +else + echo -e " ${RED}FAIL${NC} install merge added no dwarves-kit hooks" + FAIL=$((FAIL + 1)) +fi +rm -rf "$MERGE_HOME" + # ============================================================ echo "" echo "=== Agent files ===" From 43fa85d6ef3fbfd90ccab286d61b1f527a01126f Mon Sep 17 00:00:00 2001 From: Han Ngo Date: Thu, 21 May 2026 23:12:51 +0700 Subject: [PATCH 13/56] chore(release): v1.7.0 AGENTS.md operating layer Bump VERSION, plugin.json, and tool.toml 1.6.0 -> 1.7.0 for the SPEC-024 feature release. Add the 1.7.0 CHANGELOG entry (AGENTS.md front door, backfill lane, assign goal projection, spec After-state section, doc-impact-map rows, downstream AGENTS.md template, and the added install settings-merge regression coverage framed as a guard, not a fix). Add a copy-if-absent AGENTS.md tip to install.sh. Co-Authored-By: Claude Opus 4.7 (1M context) --- .claude-plugin/plugin.json | 2 +- CHANGELOG.md | 13 +++++++++++-- VERSION | 2 +- install.sh | 3 +++ tool.toml | 2 +- 5 files changed, 17 insertions(+), 5 deletions(-) diff --git a/.claude-plugin/plugin.json b/.claude-plugin/plugin.json index 9d9046e..78fa52a 100644 --- a/.claude-plugin/plugin.json +++ b/.claude-plugin/plugin.json @@ -1,6 +1,6 @@ { "name": "dwarves-kit", - "version": "1.6.0", + "version": "1.7.0", "description": "Spec-driven Claude Code workflow with verification pipeline. Hooks, slash commands, and verification subagents.", "author": { "name": "Dwarves Foundation", diff --git a/CHANGELOG.md b/CHANGELOG.md index 9b9bc88..a4a5733 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -42,9 +42,18 @@ All notable changes to dwarves-kit are documented here. - **`context-readiness.sh` + `session-state-save.sh` count fragility**: `find ... | grep -v` and `grep -c ... || echo 0` both mishandled the zero-match case under `set -e`/pipefail. In a source-file-free repo the hook aborted with no output; a spec with 0 done tasks rendered `tasks:0\n0/N` plus an `integer expected` error. Now uses `{ grep -v ... || true; }` and `grep -c ... || true`. Found during SPEC-010 execution (ID-013). - **`install.sh` never materialized the hooks `settings.json` references (SPEC-025)**: settings hard-code every hook (and the statusline) at `$HOME/.claude/dwarves-kit/hooks/