10 changes: 6 additions & 4 deletions .claude/skills/gaia/references/plan.md
@@ -44,6 +44,8 @@ The deep synthesis runs in the planner spawned at step 4, and the planner's mode
- If the user picks option 1: spawn the agent with `model: opus`.
- If the user picks option 2: spawn without a model override (inherit current).

+This decision governs the **planner** only. The plan's **execution** sub-agents are a separate decision: they default to Sonnet, pinned in the `ORCHESTRATOR.md`/`KICKOFF.md` the planner writes (see step 4's Sub-agent invocation bullet). Do not conflate the two.
+
### 3. Resolve plan directory

Derive a short kebab-case slug from the feature description (e.g. "auth rework" → `auth-rework`).
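
The slug derivation above could be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the function name `slugify` and the `max_words` cap (one way to keep the slug "short") are assumptions introduced here.

```python
import re


def slugify(description: str, max_words: int = 4) -> str:
    """Derive a short kebab-case slug from a feature description.

    Hypothetical helper: `max_words` is an assumed cap, not part of the skill.
    """
    # Keep only lowercase alphanumeric runs; punctuation and spaces act as separators.
    words = re.findall(r"[a-z0-9]+", description.lower())
    return "-".join(words[:max_words])


print(slugify("auth rework"))  # auth-rework
```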
@@ -96,7 +98,7 @@ Then write the following files directly to `{PLAN_DIR}/`:
- Acceptance criteria (concrete and testable)
- Dependencies on other tasks in this plan

-2. **`{PLAN_DIR}/README.md`**: task graph showing phases, which tasks run in parallel within each phase, and the frozen interface contracts shared across tasks. **If `{SPEC_PATH}` was provided** (i.e. this plan was derived from a SPEC), the README MUST open with a `## Source SPEC` section naming the SPEC id and the absolute path, so plan→SPEC discovery is one read away. Format: `Derived from {SPEC-id} ({SPEC_PATH}).`
+2. **`{PLAN_DIR}/README.md`**: task graph showing phases, which tasks run in parallel within each phase, and the frozen interface contracts shared across tasks. **Annotate each phase with its execution model** (e.g. `Phase 1 (2 sub-agents, model sonnet)`); Sonnet is the default, so call out any phase you escalate to Opus explicitly and briefly say why. **If `{SPEC_PATH}` was provided** (i.e. this plan was derived from a SPEC), the README MUST open with a `## Source SPEC` section naming the SPEC id and the absolute path, so plan→SPEC discovery is one read away. Format: `Derived from {SPEC-id} ({SPEC_PATH}).`

3. **`{PLAN_DIR}/ORCHESTRATOR.md`**: instructions for running the plan. Must cover:
- **RUNNING sentinel.** As the very first step, write a sentinel file at `{PLAN_DIR}/RUNNING`. Content:
@@ -170,7 +172,7 @@ Then write the following files directly to `{PLAN_DIR}/`:

If `CONCURRENT_LIVE` is empty (no live concurrent detected): proceed without prompting.

-- **Phase order** with per-phase quality gates (`pnpm typecheck && pnpm lint`).
+- **Phase order** with per-phase quality gates (`pnpm typecheck && pnpm lint`). Name each phase's execution model in the outline (Sonnet by default; see the Sub-agent invocation bullet), so a cold orchestrator sees the model alongside the phase.
- **Pre-merge `code-review-audit` (non-skippable).** Before any `gh pr merge` call, the orchestrator spawns the `code-review-audit` agent on the current branch. The agent's clean pass writes `.gaia/local/audit/<HEAD-sha>.ok`, which the deny-hook (`.claude/hooks/pr-merge-audit-check.sh`) gates `gh pr merge` on. The orchestrator does NOT wait for the deny-hook to fire and learn from it, that round-trip is friction. Spawn the agent proactively. Contract: `wiki/concepts/PR Merge Workflow.md`. Verbatim agent-spawn template:

Task(
@@ -182,7 +184,7 @@ Then write the following files directly to `{PLAN_DIR}/`:

The audit's LOCAL Task return is terse (pointer + counts + marker line); the full per-finding detail lives in the re-run carry-forward ledger (`.gaia/local/audit/<base-sha>.rerun.json`). To surface the open findings, the orchestrator reads the ledger's `remaining[]` (enumerating Critical, Important, and escalated Suggestions for the user) instead of expecting a full inline report. Fail-open: if the ledger is absent, corrupt, or stale, the audit's return carries the full report (it emits the full report whenever it could not write the ledger), so the orchestrator surfaces the open findings from that report as today.

-- **Sub-agent invocation:** the verbatim prompt template for each task sub-agent. Sub-agents do NOT commit, push, or open/update the PR, they only edit files and report. The orchestrator owns all git operations. **The prompt template MUST require sub-agents to end their return with a `## Notes for orchestrator` section** containing any of: `### Findings` (non-obvious things they noticed), `### Deviations from plan` (where the task spec was wrong / they had to work around it), `### Follow-ups` (work the user should consider after merge). Subsections may be empty or omitted; only non-trivial signal belongs here, routine "phase done, tests green" status does NOT.
+- **Sub-agent invocation:** the verbatim prompt template for each task sub-agent. **Each task sub-agent MUST be dispatched as `general-purpose` with `model: "sonnet"` explicitly pinned.** The feature's complexity is resolved upstream, during `/gaia-spec` + its audit and `/gaia-plan` + the decomposition audit, precisely so execution can run on the cheaper model. Pin Sonnet on the dispatch itself so the executors run on Sonnet regardless of the orchestrator's own session model: a cold orchestrator is often on Opus, and an unpinned sub-agent inherits that. **Escape hatch:** the planner MAY pin `model: "opus"` on a specific phase or task it judges to be genuinely deep synthesis (a subtle parser grammar, a cross-cutting type redesign), but must name which phase and why in that phase's `ORCHESTRATOR.md` entry. Sonnet is the floor; Opus is a per-phase, justified exception, never the blanket default. Sub-agents do NOT commit, push, or open/update the PR, they only edit files and report. The orchestrator owns all git operations. **The prompt template MUST require sub-agents to end their return with a `## Notes for orchestrator` section** containing any of: `### Findings` (non-obvious things they noticed), `### Deviations from plan` (where the task spec was wrong / they had to work around it), `### Follow-ups` (work the user should consider after merge). Subsections may be empty or omitted; only non-trivial signal belongs here, routine "phase done, tests green" status does NOT.
- **Orchestrator-owned git flow.** After each phase that produces changes (and only once the quality gate is clean), the orchestrator stages, commits with a meaningful message, and pushes. The orchestrator opens the PR after the first phase's commit lands on the remote (using `gh pr create`) and updates it with subsequent commits. Never commit a broken state.
- **Phase findings ledger (`{PLAN_DIR}/SUMMARY.md`).** Append-only file the orchestrator maintains across the run, so sub-agent observations survive context compression. After each phase, the orchestrator appends a `## Phase N, <title>` block containing the phase's commit short-SHA and the merged `Notes for orchestrator` content from every sub-agent in that phase. If a phase produced no notes (all sub-agents reported only routine status), append the phase heading with `_No notes._` so the ledger reflects the full run. Sub-agents do not write to this file directly; the orchestrator owns it.
- **Stop conditions.** On any sub-agent failure or quality-gate failure: STOP and surface to the user. Do not "fix and continue", do not commit, do not push. Before stopping, append the failure context (which phase, which sub-agent, error) to `SUMMARY.md` under a `## Phase N, <title> (HALTED)` block so the user and any follow-up session see the same record.
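
To make the Sonnet-by-default rule with its per-phase Opus escape hatch concrete, here is a small sketch of how a phase's model might be resolved and rendered. Everything in it is illustrative: the `Phase` record, its field names, and the helper functions are hypothetical, and the `subagent_type` key is an assumption about how the dispatch names the `general-purpose` agent. Only the `general-purpose` / `model: "sonnet"` pairing and the `Phase 1 (2 sub-agents, model sonnet)` annotation format come from the text above.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_MODEL = "sonnet"  # the floor for execution sub-agents


@dataclass(frozen=True)
class Phase:
    """Hypothetical phase record; field names are illustrative only."""
    title: str
    sub_agents: int
    opus_reason: Optional[str] = None  # set only when the planner escalates


def model_for(phase: Phase) -> str:
    # Escalation exists only when a non-empty reason is recorded,
    # so an unjustified Opus pin cannot be expressed.
    if phase.opus_reason and phase.opus_reason.strip():
        return "opus"
    return DEFAULT_MODEL


def dispatch_args(phase: Phase) -> dict:
    # Always pin the model explicitly so the sub-agent never inherits
    # the orchestrator's session model. `subagent_type` is an assumed key.
    return {"subagent_type": "general-purpose", "model": model_for(phase)}


def annotate(phase: Phase) -> str:
    # Render the README-style phase annotation, appending the reason on escalation.
    label = f"{phase.title} ({phase.sub_agents} sub-agents, model {model_for(phase)})"
    if model_for(phase) == "opus":
        label += f": {phase.opus_reason.strip()}"
    return label


print(annotate(Phase("Phase 1", 2)))  # Phase 1 (2 sub-agents, model sonnet)
```

Tying escalation to the presence of a reason is one way to keep "Opus is a per-phase, justified exception" enforceable by construction rather than by review.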
@@ -224,7 +226,7 @@ Then write the following files directly to `{PLAN_DIR}/`:

No error surfaced. No `ExitWorktree` invocation in this branch. The continuation prompt is self-contained, the user pastes it into a new session and the cleanup completes without further investigation.

-4. **`{PLAN_DIR}/KICKOFF.md`**: the orchestrator's kickoff prompt itself, ready to be read and executed verbatim. The file is the prompt, no preamble, no "copy and paste below" instruction, no surrounding commentary, no `---` separators framing the prompt as a quoted block. The opening line addresses the orchestrator directly (e.g. "You are the orchestrator for the {feature} plan…"). Must be fully self-contained with no assumed context: absolute paths to `README.md` and `ORCHESTRATOR.md`, the goal, hard rules, and the execution outline. The kickoff also includes a one-line reference to the pre-merge `code-review-audit` obligation (e.g. "Before any `gh pr merge`, run the `code-review-audit` agent, see ORCHESTRATOR.md's Pre-merge code-review-audit section."). The line ensures a cold-started orchestrator reads the requirement before doing any work, surviving any context compression that drops the ORCHESTRATOR.md content from the first read.
+4. **`{PLAN_DIR}/KICKOFF.md`**: the orchestrator's kickoff prompt itself, ready to be read and executed verbatim. The file is the prompt, no preamble, no "copy and paste below" instruction, no surrounding commentary, no `---` separators framing the prompt as a quoted block. The opening line addresses the orchestrator directly (e.g. "You are the orchestrator for the {feature} plan…"). Must be fully self-contained with no assumed context: absolute paths to `README.md` and `ORCHESTRATOR.md`, the goal, hard rules, and the execution outline. The kickoff also includes a one-line reference to the pre-merge `code-review-audit` obligation (e.g. "Before any `gh pr merge`, run the `code-review-audit` agent, see ORCHESTRATOR.md's Pre-merge code-review-audit section.") and a one-line default-execution-model statement (e.g. "Dispatch each task sub-agent as `general-purpose` with `model: \"sonnet\"` unless ORCHESTRATOR.md's phase list escalates that phase to Opus."). Both lines ensure a cold-started orchestrator reads the requirement before doing any work, surviving any context compression that drops the ORCHESTRATOR.md content from the first read.

Before returning, delete `{PLAN_DIR}/.work/` if you created it. Use the literal repo-relative path so the project's `rm -rf .gaia/local/plans/*` permission auto-approves it without a prompt: `rm -rf .gaia/local/plans/<slug>/.work`. Do not reconstruct an absolute path from variables, which misses that match and trips the empty-variable rm guard.

4 changes: 4 additions & 0 deletions .gitignore
@@ -38,6 +38,10 @@ coverage
# Optional eslint cache
.eslintcache

+# Python bytecode (observability test infra under .gaia/tests/)
+__pycache__/
+*.pyc
+
# Claude files
.claude/agent-memory/
.claude/worktrees/
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -40,6 +40,7 @@ A release change that requires the adopter to act, run a command or hand-migrate

### Changed

+- `/gaia-plan` now pins each plan-execution task sub-agent to Sonnet by default instead of letting it inherit the orchestrator session's model (typically Opus). The feature's complexity is resolved upstream in `/gaia-spec` and `/gaia-plan` and their audits, so execution runs faster and cheaper on the lighter model; the planner can still escalate a specific deep-synthesis phase to Opus, naming which and why in `ORCHESTRATOR.md` (#538)
- `/update-deps` completes the flow on a `main`/`master` run: it now writes the update commit with a load-bearing `chore(deps)` subject (previously Phase 8 referenced "the update commit" but never wrote one), opens the PR, and merges it once the required checks are green (`--auto` under branch protection), then verifies the terminal `MERGED` state and cleans up the local checkout. A run on any other branch (or in CI) is unchanged: it pushes and leaves the PR to the branch owner. The `chore(deps)` subject clears the merge gate's dep-bump bypass, so the PR stays turnkey-mergeable without a code-review-audit marker (#534)
- GAIA's Serena code-search routing guidance is now language-agnostic: the advisory `code-search` rule nudges toward Serena's LSP-backed symbol tools for symbol queries in any language Serena indexes for the project (not only TypeScript, and no longer scoped to `app/`/`test/`), so an adopter who configures another language server gets the same routing. The enforcement guard stays deliberately TypeScript-conservative and tsconfig-gated, since a wrong hard-block on a non-TS search is worse than a miss while the rule only nudges (#533)
- `/gaia-debt` drives its fix PR through the standard PR Merge Workflow to completion instead of stopping after `gh pr create` for a human to merge by hand: it confirms intent (open-only or merge, defaulting to merge), then resolves the marker handshake, the maintainer-only CHANGELOG gate, the merge (`--auto` under branch protection, never `--admin`), and the post-merge verify-and-cleanup itself. The `code-review-audit` marker gate is unchanged, the skill drives the merge and never bypasses, fakes, or pre-empts it (#523)
2 changes: 1 addition & 1 deletion wiki/concepts/GAIA Plan.md
@@ -21,7 +21,7 @@ tags: [concept, claude, skill, orchestration]

## Orchestrator contract

-`ORCHESTRATOR.md` mandates a brief **final summary** before awaiting merge confirmation: phases completed, sub-agents run, files touched (count), commits pushed (count + short SHAs), PR URL, and quality-gate status. A few lines, not a recap of every change. The final self-cleanup phase (deleting `.gaia/local/plans/{slug}/`) only runs after the user confirms the PR is ready to merge; see [[Task Orchestration]] for the gitignored-vs-tracked branching.
+`ORCHESTRATOR.md` pins each task sub-agent to `model: sonnet` by default, decoupling execution from the orchestrator's own session model, since the spec and plan audits resolve the complexity upstream so execution runs on the cheaper model. The planner escalates a specific phase to Opus only when it names a genuinely deep-synthesis reason. `ORCHESTRATOR.md` also mandates a brief **final summary** before awaiting merge confirmation: phases completed, sub-agents run, files touched (count), commits pushed (count + short SHAs), PR URL, and quality-gate status. A few lines, not a recap of every change. The final self-cleanup phase (deleting `.gaia/local/plans/{slug}/`) only runs after the user confirms the PR is ready to merge; see [[Task Orchestration]] for the gitignored-vs-tracked branching.

## Pairs with
