feat(gaia): reliable, cycle-aware cost accounting for plan execution (SPEC-017) #539

Merged: stevensacks merged 8 commits into main from spec-017-execute-cost-accounting on Jul 3, 2026

@stevensacks

Summary

Hardens SPEC-013's execute-time token + time tally so it:

  • Fires deterministically off the orchestrator's per-phase git operations (a PreToolUse hook on git commit/git push), not a prose bullet a cold orchestrator can skip.
  • Aggregates the whole execution across every session it took (resumed, halted, worktree) by recording each session's execute row to the durable ledger.
  • Dedups a session's re-invocation rows on read (non-partial, max total, tiebreak latest ended_at).
  • Renders a full-cycle spec / plan / execute / total breakdown at a merge-time PostToolUse hook on gh pr merge, omitting the spec line for a spec-less plan.
  • Replaces the existing instruction-driven execute tally so no phase is double-counted.
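A minimal sketch of the dedup-on-read rule from the third bullet, assuming the ledger is JSON Lines and reading the rule as "prefer non-partial rows, then the largest total, then the latest ended_at". The field names session_id, partial, and total are assumptions for illustration (only ended_at is named above), and dedup_sessions is a hypothetical helper, not the code in token-rollup.sh. It also assumes every input line is a well-formed JSON object:

```shell
#!/usr/bin/env bash
# Hypothetical sketch; field names other than ended_at are assumptions.

# Keep one row per session: non-partial beats partial, then the highest
# total wins, then the latest ended_at (ISO-8601 strings sort correctly
# as plain text).
dedup_sessions() {
  jq -s -c '
    group_by(.session_id)
    | map(sort_by([(.partial | not), .total, .ended_at]) | last)
    | .[]
  ' "$@"
}

# Demo: session "a" was re-invoked three times, session "b" once.
printf '%s\n' \
  '{"session_id":"a","partial":false,"total":100,"ended_at":"2026-07-03T10:00:00Z"}' \
  '{"session_id":"a","partial":false,"total":250,"ended_at":"2026-07-03T11:00:00Z"}' \
  '{"session_id":"a","partial":true,"total":900,"ended_at":"2026-07-03T12:00:00Z"}' \
  '{"session_id":"b","partial":false,"total":40,"ended_at":"2026-07-03T09:00:00Z"}' \
  | dedup_sessions
```

In the demo, the partial row for session "a" loses despite its larger total, so the surviving rows are the 250-token row for "a" and the single row for "b". Summing the survivors gives the per-execution aggregate without double-counting re-invocations.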

Phases

Phase 1 (this PR, first commit):

  • .gaia/scripts/token-rollup.sh: the ledger roll-up reader plus dedup/aggregate/render (FC-1). 15/15 bats tests pass.
  • .claude/hooks/token-tally-git-op.sh plus shared resolver lib .claude/hooks/lib/gaia-active-plan.sh: the git-op recording hook (FC-2/FC-4). 13/13 bats tests pass.

Phase 2 (follow-up commit):

  • .claude/hooks/token-rollup-merge.sh: the merge-time roll-up hook (FC-3).
  • .claude/settings.json and plan.md wiring: register both hooks, remove the manual execute tally (FC-5).

Test plan

  • bats .gaia/scripts/tests/token-rollup.bats — green
  • bats .gaia/tests/hooks/token-tally-git-op.bats — green
  • pnpm typecheck && pnpm lint — clean
  • Run code-review-audit before merge.

🤖 Generated with Claude Code

…g hook

Add .gaia/scripts/token-rollup.sh: reads the token ledger, dedups each
session's re-invocation rows (non-partial, max total, tiebreak latest
ended_at), and renders the spec/plan/execute/total cycle breakdown with
summed active-span elapsed and the four billing buckets. Always exits 0;
degrades to a partial/absent figure with a marker, never fabricates.

Add .claude/hooks/token-tally-git-op.sh (PreToolUse Bash) plus shared
resolver lib .claude/hooks/lib/gaia-active-plan.sh: record this
execution session's ground-truth tally on the orchestrator's per-phase
git commit/push, gated on an active plan folder, keyed to the feature.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions

github-actions Bot commented Jul 3, 2026

code-review-audit skipped: no audit-relevant files changed in the un-audited delta (since the last clean audit); GAIA-Audit commit status stamped on HEAD so the merge gate is satisfied with no local audit run

Add .claude/hooks/token-rollup-merge.sh (PostToolUse Bash): on gh pr
merge, resolve the feature key from the active plan folder (or the
ledger's most recent execute row as a labeled fallback) and render the
full-cycle spec/plan/execute/total roll-up into the merging session.

Register both new hooks in .claude/settings.json (token-tally-git-op.sh
under PreToolUse, token-rollup-merge.sh under PostToolUse) and replace
plan.md's manual execute-tally instruction with the automatic git-op
recording plus roll-up reporting, so no phase is double-counted.

Also drop a stray em dash from token-rollup.sh's header comment.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions

github-actions Bot commented Jul 3, 2026

code-review-audit skipped: no audit-relevant files changed in the un-audited delta (since the last clean audit); GAIA-Audit commit status stamped on HEAD so the merge gate is satisfied with no local audit run

Format token-rollup.sh's rendered action totals, grand total, and billing
buckets with thousands separators, right-aligned into shared columns
(totals to the grand-total width, buckets to the widest bucket) for
readability. Ledger values and all internal arithmetic stay raw; grouping
happens only at the print layer. Update the reader's bats oracles to match.
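The print-layer grouping described above could look roughly like this. It is a sketch under assumptions, not the code in token-rollup.sh: group_thousands and render_totals are hypothetical names, the labels and sample figures are made up, and only the action totals are shown (the billing buckets would get the same treatment with their own column width). All arithmetic runs on raw integers and commas are added only when printing:

```shell
#!/usr/bin/env bash
# Hypothetical helpers; not the code in token-rollup.sh.

# Insert a comma every three digits. Expects a non-negative integer string.
group_thousands() {
  local n=$1 out=
  while [ "${#n}" -gt 3 ]; do
    out=",${n: -3}${out}"      # peel off the last three digits
    n=${n:0:${#n}-3}
  done
  printf '%s%s\n' "$n" "$out"
}

# Print spec/plan/execute/total, right-aligned to the grand-total width.
render_totals() {
  local spec=$1 plan=$2 execute=$3
  local total=$((spec + plan + execute))   # raw integer arithmetic
  local grand width label value
  grand=$(group_thousands "$total")
  width=${#grand}                          # widest value sets the column
  for label in spec plan execute total; do
    value=${!label}                        # indirect lookup of the local
    printf '%-8s %*s\n' "$label" "$width" "$(group_thousands "$value")"
  done
}

render_totals 12034 8450 1234567
```

Using the grand total to size the column works because it is always at least as wide as any of its addends.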

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions

github-actions Bot commented Jul 3, 2026

code-review-audit skipped: no audit-relevant files changed in the un-audited delta (since the last clean audit); GAIA-Audit commit status stamped on HEAD so the merge gate is satisfied with no local audit run

stevensacks and others added 4 commits July 3, 2026 23:07
… lines (#539)

A ledger line that is valid JSON but not an object (a bare scalar or array)
survived the try/fromjson corrupt-line guard, then threw on .spec_id indexing
and aborted the whole filter, silently dropping every good row. Coerce
non-objects to the existing __BAD__ sentinel so they bump the bad count and fire
the partial marker instead. Adds a bare-scalar/array fixture and a regression
test.
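A minimal jq sketch of the guard this commit describes, assuming the reader consumes the ledger as raw lines. The __BAD__ sentinel and the .spec_id field come from the message above; summarize_ledger, the output shape, and the sample lines are hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the corrupt-line guard; not the actual filter.

# Parse each raw line. Invalid JSON becomes "__BAD__" via try/catch, and
# valid JSON that is not an object (scalar or array) is coerced to the
# same sentinel, so .spec_id is only ever applied to objects.
summarize_ledger() {
  jq -R -n -c '
    [ inputs
      | (try fromjson catch "__BAD__")
      | if type == "object" then . else "__BAD__" end ]
    | { spec_ids: map(select(type == "object") | .spec_id),
        bad: (map(select(. == "__BAD__")) | length) }
    | .partial = (.bad > 0)
  ' "$@"
}

# Demo: two good rows, one corrupt line, one bare scalar, one array.
printf '%s\n' \
  '{"spec_id":"SPEC-017","total":100}' \
  'not json at all' \
  '42' \
  '[1,2,3]' \
  '{"spec_id":"SPEC-017","total":250}' \
  | summarize_ledger
```

Without the type check, the 42 and [1,2,3] lines would pass fromjson and then fail when indexed with .spec_id, aborting the filter. With it, the demo should report both good rows, three bad lines, and the partial flag set.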

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… edge (#539)

The command matchers use a newline as a shell separator, so a heredoc body
line that begins with the matched command does match. The comments overclaimed
'never inside a heredoc body'; correct them to note the benign, accepted edge
(mid-line prose in a quoted string still does not match).
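To illustrate the edge, here is a sketch of a matcher that treats a newline as a command separator. The real matchers are not shown in this PR, so the regex and the name matches_git_op are assumptions, as is the choice of ;, & and | as the other separators:

```shell
#!/usr/bin/env bash
# Hypothetical matcher; the hooks' actual patterns may differ.

# Match "git commit" or "git push" at the start of the command string or
# right after a separator: newline, ';', '&' or '|'.
matches_git_op() {
  local cmd=$1
  local re=$'(^|[\n;&|])[[:space:]]*git[[:space:]]+(commit|push)([[:space:]]|$)'
  [[ $cmd =~ $re ]]
}

check() {
  if matches_git_op "$1"; then echo "match:    $2"; else echo "no match: $2"; fi
}

check 'git commit -m "wip"'                           'plain commit'
check 'cd repo && git push'                           'push after &&'
check 'echo "remember to git push later"'             'mid-line prose'
check $'cat <<EOF\ngit push origin main\nEOF'         'heredoc body line'
```

The heredoc case matches because the body line follows a newline, which the pattern cannot distinguish from a real command boundary. The mid-line prose case does not match because no separator precedes "git".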

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions

github-actions Bot commented Jul 3, 2026

code-review-audit skipped: GAIA-Audit trailer matches version 1.6.1 tree 08733ae

stevensacks merged commit 9485383 into main on Jul 3, 2026
6 checks passed
stevensacks deleted the spec-017-execute-cost-accounting branch on July 3, 2026 14:58