Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .claude-plugin/agents/cache-analyzer.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
---
name: cache-analyzer
description: Analyze disk usage, cache accumulation, and orphaned worktrees in the Buddy Evolver plugin. Use when asked to "analyze cache", "check disk usage", "find orphaned worktrees", or "cache report".
model: inherit
tools:
- Bash
- Read
Expand Down
2 changes: 1 addition & 1 deletion .github/PULL_REQUEST_TEMPLATE.md
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,7 @@ cheap Ubuntu checks run in GitHub Actions automatically.
Before requesting review, run the full local suite on macOS:

- [ ] `make test-all` — all 9 tiers pass (328 tests: smoke / unit / security / integration / functional / UI / e2e / snapshots / docs)
- [ ] `scripts/upload-test-results.sh` — Check Run appears on this PR's head commit
- [ ] `scripts/upload-test-results.sh` — run after push, **before** opening this PR (posts commit status that CI checks immediately)
- [ ] If touching UI: `scripts/test-visual-smoke.sh` — visual checks pass, screenshot attached below

**Additional checks (run when relevant):**
Expand Down
34 changes: 13 additions & 21 deletions .github/workflows/ci-verify-local.yml
Original file line number Diff line number Diff line change
Expand Up @@ -2,14 +2,14 @@ name: Verify Local Tests

# Checks that the contributor has run local tests and uploaded results.
# Accepts two forms of evidence (in order of preference):
# 1. A "Local Tests (macOS)" Check Run on the head commit (created by
# upload-test-results.sh when the token has checks:write via GitHub App)
# 1. A "Local Tests (macOS)" commit status on the head commit (created by
# upload-test-results.sh using the Statuses API — works with any
# repo-scoped PAT, no GitHub App required)
# 2. A PR comment from upload-test-results.sh containing the results table
# (fallback used when only a PAT is available -- public-repo Checks API
# requires GitHub App auth)
# (fallback used if the Statuses API call fails)
#
# If neither is found, posts a sticky comment asking the contributor to run
# scripts/test-all.sh && scripts/upload-test-results.sh.
# scripts/test-all.sh && scripts/upload-test-results.sh after pushing.
#
# SECURITY: All event data flows through env: variables. No direct
# ${{ github.event.* }} interpolation in run: blocks.
Expand Down Expand Up @@ -41,22 +41,14 @@ jobs:
run: |
set -euo pipefail

# ── Option 1: Check Run (requires GitHub App token) ────────
runs_json=$(gh api "repos/$REPO/commits/$HEAD_SHA/check-runs" --paginate)
status=$(echo "$runs_json" | python3 -c '
import json, sys
data = json.load(sys.stdin)
runs = data.get("check_runs", [])
matches = [r for r in runs if r.get("name") == "Local Tests (macOS)"]
if not matches:
print("missing")
else:
print(matches[0].get("conclusion", "unknown"))
')

if [ "$status" != "missing" ]; then
echo "status=$status" >> "$GITHUB_OUTPUT"
echo "Found Check Run with status: $status"
# ── Option 1: Commit status (any repo-scoped PAT) ──────────
state=$(gh api "repos/$REPO/commits/$HEAD_SHA/status" \
--jq '[.statuses[] | select(.context == "Local Tests (macOS)") | .state][0]' \
2>/dev/null || echo "")

if [ -n "$state" ] && [ "$state" != "null" ]; then
echo "status=$state" >> "$GITHUB_OUTPUT"
echo "Found commit status: $state"
exit 0
fi

Expand Down
29 changes: 17 additions & 12 deletions CLAUDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -43,13 +43,13 @@ scripts/test-smoke.sh Smoke tier: build sanity + CLI contract (<30s,
scripts/test-security.sh Security validation test suite (~25 tests)
scripts/test-ui.sh Buddy card rendering against fixtures (23 tests)
scripts/test-snapshots.sh Golden file comparison for CLI output (6 tests)
scripts/test-docs.sh Documentation path + link + count consistency (16 tests)
scripts/test-docs.sh Documentation path + link + count consistency (18 tests)
scripts/test-perf.sh Performance benchmarks (7 benchmarks, on-demand)
scripts/coverage.sh Local HTML coverage report (test-results/coverage/)
scripts/test-ui-renderer.py Standalone Python renderer (reference for /buddy-status)
scripts/test-visual-smoke.sh Manual pre-release visual check (interactive)
scripts/test-all.sh Master runner — all tiers, JSON/JUnit output
scripts/upload-test-results.sh Uploads results to GitHub as a Check Run
scripts/upload-test-results.sh Uploads results to GitHub as a commit status
scripts/bump-version.sh Atomic version bump across plugin.json + marketplace.json + README badge
scripts/update-changelog.sh Move [Unreleased] content to dated [X.Y.Z] section in CHANGELOG.md
scripts/setup-labels.sh One-time GitHub label setup for new/forked repos
Expand Down Expand Up @@ -170,19 +170,21 @@ Four workflows in `.github/workflows/`:

### Local → GitHub bridge

`scripts/upload-test-results.sh` reads `test-results/results.json` and POSTs a GitHub Check Run via `gh api`. On permission failure (e.g. forks without `checks:write`), falls back to `gh pr comment`.
`scripts/upload-test-results.sh` reads `test-results/results.json` and POSTs a commit status via the GitHub Statuses API (any repo-scoped PAT — no GitHub App required). Run this **after pushing but before opening the PR** so CI finds the status the moment the PR is created. Falls back to `gh pr comment` if the Statuses API fails.

### Contributor workflow

1. Edit code on macOS.
2. Run `scripts/test-all.sh` — all 6 tiers must pass.
3. Run `scripts/upload-test-results.sh` — results appear as a Check Run on the current commit.
4. Push the branch, open a PR.
5. `ci-quality.yml` runs on Ubuntu; `ci-verify-local.yml` confirms the Check Run is present and green.
6. Maintainer reviews and merges.
3. Commit and push the branch (Desktop App or `git push`).
4. Run `scripts/upload-test-results.sh` — posts a commit status on the pushed commit.
5. Open a PR — `ci-verify-local.yml` finds the commit status and passes immediately.
6. `ci-quality.yml` runs on Ubuntu; maintainer reviews and merges.

## Automations

**Session skills** (`/start-session` through `/session-exit`) are coordination-only — they run at your current main-session model and do not dispatch subagents. Only the agents below have a `model:` field that specifies an independent model for that agent's work.

### Hook: session-start context injection

A `SessionStart` hook in `hooks/hooks.json` runs `hooks/session-start.sh` at the start of each Claude Code session. **Dynamic discovery**: parses frontmatter from every SKILL.md, agent markdown file, and hook definition to emit up-to-date lists with no hardcoded drift. Compares the current branch to `origin/main` via a cached `git fetch` (5-min TTL) and warns if >10 commits behind. Always exits 0 (never blocks session startup). Timeout: 10s.
Expand All @@ -204,11 +206,14 @@ Plan → Execute transition checkpoint. Run after a plan is approved (Plan Mode
Pre-commit wrap-up. Run BEFORE clicking the Desktop App's "Commit Changes" button. Unconditional linear pipeline:
1. Token review with `--apply --force`
2. Full test pipeline via `scripts/test-all.sh` — all 6 tiers
3. Upload results to GitHub as a Check Run via `scripts/upload-test-results.sh`
4. Security review via `security-reviewer` agent (conditional on Swift changes)
5. Sync docs via `/sync-docs`
6. Comment review via the `comment-reviewer` Haiku agent
7. Unified summary table
3. Security review via `security-reviewer` agent (conditional on Swift changes)
4. Sync docs via `/sync-docs`
5. Comment review via the `comment-reviewer` Haiku agent
6. Unified summary table + reminder to run `scripts/upload-test-results.sh` after push

### Phase 4: GitHub (commit / PR / merge)

Handled via the Desktop App buttons — no Claude Code automation at this phase by design. Keeping humans at the publish gate.

### Skill: /session-deploy

Expand Down
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -176,4 +176,4 @@ use `findAll()`, assert byte-length, add `[DRY RUN]` branch, call
- Swift changes trigger the `security-reviewer` agent (read-only, reviews for
validation gaps, byte-length violations, unsafe patterns)
- Maintainer checks byte-length invariant compliance on all patcher changes
- CI must pass before merge: `ci-quality.yml` (lint/JSON/hygiene, runs on Ubuntu) and `ci-verify-local.yml` (confirms that `scripts/test-all.sh && scripts/upload-test-results.sh` has posted a passing Check Run on the head commit)
- CI must pass before merge: `ci-quality.yml` (lint/JSON/hygiene, runs on Ubuntu) and `ci-verify-local.yml` (confirms that `scripts/test-all.sh && scripts/upload-test-results.sh` has posted a passing commit status on the head commit — run the upload after push, before opening the PR)
9 changes: 5 additions & 4 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -574,13 +574,13 @@ bash scripts/test-all.sh
| Snapshots | `scripts/test-snapshots.sh` | full-system | Golden file comparison for CLI output |
| Docs | `scripts/test-docs.sh` | peripheral | Documentation path + link + count consistency |

**CI** is local-first: `ci-quality.yml` runs on Ubuntu for every PR (shellcheck, JSON/YAML validation, hygiene checks). macOS-dependent tests run on contributor machines via `scripts/test-all.sh && scripts/upload-test-results.sh`; `ci-verify-local.yml` blocks merge until the upload appears and passes.
**CI** is local-first: `ci-quality.yml` runs on Ubuntu for every PR (shellcheck, JSON/YAML validation, hygiene checks). macOS-dependent tests run on contributor machines via `scripts/test-all.sh && scripts/upload-test-results.sh`; `ci-verify-local.yml` blocks merge until a passing commit status appears. Run the upload after pushing but before opening the PR.

Run everything locally:

```bash
bash scripts/test-all.sh # all 6 tiers, emits test-results/results.json
bash scripts/upload-test-results.sh # publish as GitHub Check Run on this commit
bash scripts/upload-test-results.sh # post commit status (run after push, before PR)
bash scripts/coverage.sh # local HTML coverage → test-results/coverage/index.html
```

Expand All @@ -600,8 +600,9 @@ rather than opening a public issue. See [SECURITY.md](SECURITY.md) for details.
3. Make your changes
4. Run `swift test --package-path scripts/BuddyPatcher && bash scripts/test-security.sh`
5. Run `bash scripts/test-all.sh` — all 6 tiers must pass
6. Run `bash scripts/upload-test-results.sh` to post results as a Check Run
7. Open a PR against `main` — the [PR template](.github/PULL_REQUEST_TEMPLATE.md) will guide you
6. Commit and push the branch
7. Run `bash scripts/upload-test-results.sh` — posts commit status on the pushed commit
8. Open a PR against `main` — `ci-verify-local.yml` will find the status immediately

**Key constraints** — if you modify the Swift source in `scripts/BuddyPatcher/`:

Expand Down
2 changes: 1 addition & 1 deletion agents/security-reviewer.md
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,7 @@ description: Use this agent when code changes are made to BuddyPatcher Swift fil
</commentary>
</example>

model: inherit
model: sonnet
color: red
tools: ["Read", "Grep", "Glob"]
---
Expand Down
75 changes: 75 additions & 0 deletions scripts/test-docs.sh
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@
# 8. hooks/hooks.json shell scripts exist on disk
# 9. Session workflow skills (session-end, session-deploy) only reference real files
# 10. Retired skills (buddy, test-patch, update-species-map) stay retired
# 11. session-execute agent model table matches all agent frontmatter
#
# Output: "Results: N passed, M failed" on the last line.
set -uo pipefail
Expand Down Expand Up @@ -419,6 +420,80 @@ fi

echo

# ── Group 11: Agent model table drift check ──────────────────────
#
# Validates that the agent model table in skills/session-execute/SKILL.md
# matches the actual model: fields in each agent's frontmatter.
# Prevents the table from silently drifting when an agent's model is changed.

echo " --- Group 11: Agent model table drift ---"
echo

drift_output=$(python3 - <<'PY'
import re, pathlib, sys

content = pathlib.Path("skills/session-execute/SKILL.md").read_text()

# Find the code block that starts with "Component Model Recommendations"
code_block_match = re.search(
r'```\nComponent Model Recommendations\n(.*?)```',
content, re.DOTALL
)
if not code_block_match:
print("ERROR: Could not find model table in skills/session-execute/SKILL.md")
sys.exit(1)

table_text = code_block_match.group(1)

# Agent rows follow the "Agent Model Configured in" header and its rule line,
# and end before the next box-drawing rule line.
agent_section_match = re.search(
r'Agent\s+Model\s+Configured in\n.+\n((?:.+\n)+)',
table_text
)
if not agent_section_match:
print("ERROR: Could not parse agent section from model table")
sys.exit(1)

rows = []
for line in agent_section_match.group(1).splitlines():
line = line.strip()
if not line or not line[0].isalpha():
break # stop at closing rule line (box-drawing chars are not alpha)
parts = line.split()
if len(parts) >= 3:
rows.append((parts[0], parts[1], parts[2]))

mismatches = []
for agent_name, expected_model, agent_file in rows:
try:
agent_content = pathlib.Path(agent_file).read_text()
m = re.search(r'^model:\s*(\S+)', agent_content, re.MULTILINE)
actual = m.group(1) if m else "inherit"
if actual != expected_model:
mismatches.append(
f"{agent_name}: table={expected_model!r} frontmatter={actual!r} ({agent_file})"
)
except FileNotFoundError:
mismatches.append(f"{agent_name}: agent file missing: {agent_file}")

for msg in mismatches:
print(f" {msg}")
sys.exit(1 if mismatches else 0)
PY
)
drift_exit=$?

if [ "$drift_exit" -eq 0 ]; then
assert_pass "Agent model table in /session-execute matches all agent frontmatter" "0"
else
echo " [FAIL] Agent model table drift detected:"
echo "$drift_output"
FAILED=$((FAILED + 1))
fi

echo

# ── Summary ────────────────────────────────────────────────────────

echo "Results: $PASSED passed, $FAILED failed"
Expand Down
68 changes: 32 additions & 36 deletions scripts/upload-test-results.sh
Original file line number Diff line number Diff line change
@@ -1,22 +1,24 @@
#!/bin/bash
# Upload local test results as a GitHub Check Run.
# Upload local test results as a GitHub commit status.
#
# Reads test-results/results.json (produced by test-all.sh) and creates a
# Check Run on the current commit via the GitHub Checks API. This is the
# local-to-CI bridge: macOS-dependent tests run on the contributor's machine,
# but GitHub sees the pass/fail state alongside the Ubuntu-side quality checks.
# commit status on the current commit via the GitHub Statuses API. This is
# the local-to-CI bridge: macOS-dependent tests run on the contributor's
# machine, but GitHub sees the pass/fail state alongside the Ubuntu-side
# quality checks.
#
# Requires:
# - gh CLI authenticated with a token that has `checks:write` on the repo
# - gh CLI authenticated with a repo-scoped token (no GitHub App needed)
# - test-results/results.json from a prior test-all.sh run
# - Must be run from inside the repo (git rev-parse used to locate commit)
# - Must be run AFTER the commit is pushed — the commit SHA must exist on
# the remote before CI fires (run before opening the PR)
#
# Usage:
# scripts/upload-test-results.sh # create Check Run on HEAD
# scripts/upload-test-results.sh # post commit status on HEAD
# scripts/upload-test-results.sh --dry-run # print payload, don't POST
#
# Fallback: if the Checks API isn't available (missing perms, no PR), the
# script tries to comment on the current PR instead (gh pr comment).
# Fallback: if the Statuses API fails, the script tries to comment on the
# current PR instead (gh pr comment).

set -uo pipefail

Expand Down Expand Up @@ -162,6 +164,8 @@ PY
TITLE=$(echo "$SUMMARY_JSON" | python3 -c "import json,sys; print(json.load(sys.stdin)['title'])")
SUMMARY=$(echo "$SUMMARY_JSON" | python3 -c "import json,sys; print(json.load(sys.stdin)['summary'])")
CONCLUSION=$(echo "$SUMMARY_JSON" | python3 -c "import json,sys; print(json.load(sys.stdin)['conclusion'])")
TOTAL_PASSED=$(echo "$SUMMARY_JSON" | python3 -c "import json,sys; print(json.load(sys.stdin)['total_passed'])")
TOTAL_TESTS=$(echo "$SUMMARY_JSON" | python3 -c "import json,sys; print(json.load(sys.stdin)['total_tests'])")

echo
echo " Uploading test results for:"
Expand All @@ -172,49 +176,41 @@ echo " title: $TITLE"
echo

if [ "$DRY_RUN" -eq 1 ]; then
echo " [DRY RUN] would POST to /repos/$REPO_SLUG/check-runs"
STATE=$( [ "$CONCLUSION" = "success" ] && echo "success" || echo "failure" )
echo " [DRY RUN] would POST to /repos/$REPO_SLUG/statuses/$COMMIT_SHA"
echo " state: $STATE"
echo " context: Local Tests (macOS)"
echo " description: $TOTAL_PASSED/$TOTAL_TESTS passed"
echo
echo " Summary body:"
echo " ─────────────"
echo " Summary body (PR comment fallback):"
echo " ────────────────────────────────────"
echo "$SUMMARY"
echo " ─────────────"
echo " ────────────────────────────────────"
exit 0
fi

# ── Create Check Run via Checks API ────────────────────────────────
# ── Create commit status via Statuses API ──────────────────────────
# Works with any repo-scoped PAT — no GitHub App required.

CHECK_RUN_PAYLOAD=$(python3 - <<PY
import json
print(json.dumps({
"name": "Local Tests (macOS)",
"head_sha": "$COMMIT_SHA",
"status": "completed",
"conclusion": "$CONCLUSION",
"output": {
"title": $(python3 -c "import json,sys; print(json.dumps(sys.argv[1]))" "$TITLE"),
"summary": $(python3 -c "import json,sys; print(json.dumps(sys.argv[1]))" "$SUMMARY"),
},
}))
PY
)
STATE=$( [ "$CONCLUSION" = "success" ] && echo "success" || echo "failure" )
DESCRIPTION="$TOTAL_PASSED/$TOTAL_TESTS passed"

# POST to Checks API. Capture response to detect permission failures.
RESPONSE=$(echo "$CHECK_RUN_PAYLOAD" | gh api \
"repos/$REPO_SLUG/check-runs" \
RESPONSE=$(gh api \
"repos/$REPO_SLUG/statuses/$COMMIT_SHA" \
--method POST \
--input - 2>&1)
-f state="$STATE" \
-f context="Local Tests (macOS)" \
-f description="$DESCRIPTION" 2>&1)
STATUS=$?

if [ "$STATUS" -eq 0 ]; then
CHECK_URL=$(echo "$RESPONSE" | python3 -c "import json,sys; print(json.load(sys.stdin).get('html_url', ''))" 2>/dev/null || echo "")
echo " [+] Check Run created"
[ -n "$CHECK_URL" ] && echo " $CHECK_URL"
echo " [+] Commit status posted (context: Local Tests (macOS), state: $STATE)"
exit 0
fi

# ── Fallback: PR comment ────────────────────────────────────────────

echo " [!] Checks API failed (likely missing checks:write perm)"
echo " [!] Statuses API failed"
echo " Response: $RESPONSE" | head -3
echo " [~] Falling back to PR comment..."

Expand Down
Loading