feat(doctor): add swamp doctor audit preflight diagnostic (swamp-club#156) by stack72 · Pull Request #1227 · systeminit/swamp

stack72 · 2026-04-24T23:05:41Z

Summary

Adds swamp doctor audit [--tool <name>] — a new preflight diagnostic that verifies the AI-tool audit integration configured in the repo is healthy end-to-end. Non-zero exit on any fail, safe to gate in CI.

doctor is registered as a parent namespace so future diagnostics (doctor datastore, doctor vault, ...) plug in as siblings. Implements swamp-club#156.

Five checks

Check	Scope	What it catches
`binary-on-path`	all 4 tools	AI tool binary missing from PATH
`swamp-binary-on-path`	all 4 tools (+ Kiro baked-path)	swamp missing from PATH, or kiro hook's baked absolute path orphaned by `brew upgrade`
`agent-config-loadable`	per-tool parser	config missing, malformed, or broken (e.g. Kiro `tools: ["*"]`)
`default-agent-set`	Kiro only	`.kiro/settings/cli.json` not pointing `chat.defaultAgent` at `swamp`
`recording-smoke-test`	all 4 tools	upstream normalizer drift — synthetic `postToolUse` payload piped through `swamp audit record --from-hook` must land as a row in today's audit JSONL

The headline verification

Temporarily removing "shell" from KIRO_SHELL_TOOL_NAMES in hook_input.ts, recompiling, and running doctor against a freshly-init'd kiro repo produces:

✗ recording-smoke-test  synthetic kiro payload did NOT land in today's audit JSONL
    hint: The hook normalizer in src/domain/audit/hook_input.ts may have drifted from the upstream contract...

4 passed, 1 failed, 0 skipped — OVERALL: FAIL

Reverted before this commit. This is the load-bearing proof that the feature protects against future upstream drift instead of being theatre.

Name override

The issue body says "please NOT doctor". The team picked doctor anyway — the command is a namespace and will grow; the name fits that shape. Acknowledging here for reviewer context.

Scope note on the three cited Kiro bugs

Issue 156 cites three concrete bugs as motivation. Their status in this PR:

tools: ["*"] in kiro agent config → agent-config-loadable flags it with a hint; NOT fixed here.
Missing chat.defaultAgent: "swamp" → default-agent-set flags it; NOT fixed here.
Hard-coded tool_name === "execute_bash" in Kiro normalizer → already fixed upstream of this PR. src/domain/audit/hook_input.ts:146-150 has KIRO_SHELL_TOOL_NAMES = new Set(["execute_bash", "shell", "execute_cmd"]). This PR's diagnostic catches the class of bug, it does not fix that specific one.

Ancillary changes

Extract todaysAuditFilePath helper. src/infrastructure/persistence/jsonl_audit_repository.ts had two duplicate copies of the commands-YYYY-MM-DD.jsonl format string; moved to src/domain/audit/audit_path.ts and consumed by both the writer and the doctor smoke-test reader. Single source of truth for the filename format.
Reserve echo swamp-doctor-smoke-test as a sentinel command prefix. The smoke-test writes a real audit row with this prefix; the audit timeline filters it out of the default view. Opt in with swamp audit --include-diagnostic.
Add parseAiToolOrThrow in src/cli/ai_tool_parser.ts — central validator for --tool flags, so bogus values fail with a usage error listing valid names.
Docs: design/audit.md, design/audit-doctor.md, and a short "verify the audit integration" block in the swamp-repo skill.

Follow-ups

UAT coverage tracked in systeminit/swamp-uat#160 (CLI) and systeminit/swamp-uat#161 (adversarial).
swamp-club has GitHub issues disabled — content/manual/reference/doctor.md and a "verify the integration" section for each content/manual/how-to/use-swamp-with-*.md need to be routed through the team's docs process.
POSIX-only PATH resolution — documented; Windows support deferred.

Test Plan

deno check — clean
deno lint — clean
deno fmt --check — clean
deno run test — 4922 passed, 0 failed, 1 ignored
deno run compile — produces working binary
Manual smoke-run on freshly-init'd kiro / claude / cursor / opencode repos — all pass
Regression injection (remove "shell" from KIRO_SHELL_TOOL_NAMES, recompile, rerun) — smoke-test correctly FAILS on kiro
Real-repo test against ~/code/kiro-audit (existing audit history) — passes; sentinel filter confirmed with swamp audit --include-diagnostic
CWD-independence adversarial test — ran from CWD=/ with --repo-dir /tmp/..., synthetic row landed in the target repo, not the CWD

🤖 Generated with Claude Code

…#156) Adds a new `swamp doctor` parent command with `doctor audit [--tool <name>]` as its first subcommand. Runs five independent preflight checks against the AI-tool audit integration configured in the repo and reports per-check pass/fail/skip with actionable hints in log and JSON modes. Exits non-zero on any check fail — safe to gate in CI. Checks: - binary-on-path: AI tool's own binary is resolvable on PATH - swamp-binary-on-path: swamp is on PATH (all four tools); for Kiro, also verifies the absolute swamp path baked into the hook file still resolves - agent-config-loadable: per-tool config parses and is well-formed (e.g. Kiro's `tools: ["*"]` regression) - default-agent-set: Kiro workspace default-agent points at `swamp` - recording-smoke-test: end-to-end — synthetic postToolUse payload round-trips through `swamp audit record --from-hook` into today's JSONL The smoke-test writes a real audit row tagged with the reserved command prefix `echo swamp-doctor-smoke-test`; the audit timeline filters that prefix out of the default view (`--include-diagnostic` opts back in). Also: - Extracts `todaysAuditFilePath` / `auditFilePathForTimestamp` / related helpers into `src/domain/audit/audit_path.ts`, consumed by both the JSONL writer and the doctor smoke-test reader — single source of truth for the `commands-YYYY-MM-DD.jsonl` filename format (previously duplicated twice inside jsonl_audit_repository.ts). - Introduces `parseAiToolOrThrow` in `src/cli/ai_tool_parser.ts` for centralised `--tool` validation. - Adds `design/audit.md` + `design/audit-doctor.md` and updates the swamp-repo skill. **Name override:** the issue body requests "please NOT `doctor`"; the team picked `doctor` anyway — the command namespace is meant to grow (future `doctor datastore`, `doctor vault`, etc.), and the name fits that shape. **Scope note:** issue 156 cites three concrete Kiro bugs as motivation (wildcard tools, missing default-agent, hard-coded `execute_bash` normalizer). The `execute_bash` one is **already fixed** upstream of this PR — `src/domain/audit/hook_input.ts:146-150` has `KIRO_SHELL_TOOL_NAMES = new Set(["execute_bash", "shell", "execute_cmd"])`. This PR adds a durable preflight for the whole class of upstream-drift bugs, not a fix for any specific one. The other two bugs are caught by the new checks but not fixed by this PR. **Load-bearing manual verification (ADV-2):** temporarily removing `"shell"` from `KIRO_SHELL_TOOL_NAMES` and recompiling produces a FAIL on `recording-smoke-test` against a freshly-init'd kiro repo, confirming the diagnostic does catch normalizer drift. Reverted before this commit. **UAT coverage:** filed systeminit/swamp-uat#160 (CLI scenarios) and systeminit/swamp-uat#161 (adversarial scenarios). swamp-club has issues disabled — docs gap (new `content/manual/reference/doctor.md` and "verify the integration" section in the four `use-swamp-with-*.md` how-tos) needs to be routed through the team's docs process. 4773 tests passing. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

…azzling-stallman

Extracts tool-resolution logic into `resolveTargetTool` and covers: - flag tool wins over marker - marker fallback when no flag - NoToolConfiguredError when both absent - --tool validation rejects garbage - audit-skip tools (codex/copilot/none) accepted as overrides - command is registered as subcommand of doctorCommand - --tool and --repo-dir options present Closes the CLI-level integration gap flagged in the feature PR. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

github-actions

CLI UX Review

Blocking

NoToolConfiguredError shows a stack trace instead of a clean error message (src/domain/audit/doctor/check.ts)

NoToolConfiguredError extends plain Error, not UserError. When a user runs swamp doctor audit in a repo with no tool configured (the most likely first-run scenario), the error propagates to main.ts → renderError, which branches on instanceof UserError. Since it's not a UserError, it falls into the else branch that logs the full error object including a stack trace:
```
Error: No AI tool configured in .swamp.yaml and no --tool flag provided...
    at new NoToolConfiguredError (file:///.../check.ts:12:5)
    at resolveTargetTool (file:///.../doctor_audit.ts:54:11)
    ...
```
The error message itself is excellent and actionable — the stack trace contradicts it. The UserError class exists precisely for this: user mistakes that should render cleanly without a stack trace.

Fix: Change class NoToolConfiguredError extends Error → class NoToolConfiguredError extends UserError in src/domain/audit/doctor/check.ts (and update the import).

Suggestions

check-started event doubles every check name in log output (src/presentation/renderers/audit_doctor.ts)

The log renderer prints … binary-on-path on check-started, then immediately ✓ binary-on-path <message> on check-completed. Because writeOutput has no cursor movement, users see two lines per check. Since checks run fast (no long waits), the in-progress … line adds noise without helping readability. Dropping the check-started handler entirely (or making it a no-op like the JSON renderer) would produce the cleaner output users expect from a doctor-style tool.
--tool flag inconsistency with repo init (src/cli/commands/doctor_audit.ts)

swamp repo init and swamp repo upgrade expose -t, --tool (short + long). doctor audit exposes only --tool. Minor, since swamp audit record also omits -t, so this matches the audit-subcommand convention. Just flagging for awareness when the doctor namespace grows.
warn exit-0 may surprise CI users who pass --tool codex (src/cli/commands/doctor_audit.ts)

overallStatus: "warn" (all checks skipped) exits 0. A CI gate like swamp doctor audit --tool codex && echo ok will pass silently. The OVERALL: WARN is visible in log mode and overallStatus is in the JSON — but $? can't distinguish it. Documenting this behavior in the command description or adding an --warn-as-fail flag would help. Low priority since codex/copilot/none are explicitly no-hook tools.

Verdict

NEEDS CHANGES — one blocking issue: NoToolConfiguredError must extend UserError to avoid showing a stack trace on the most common first-run error.

- NoToolConfiguredError now extends UserError, so the first-run "no tool configured" path renders cleanly via renderError instead of dumping a stack trace. (CLI UX Review blocking finding.) - Drop check-started log handler — checks run fast enough that the in-progress line is visual noise. Users see one line per check (completion) instead of two. (CLI UX Review suggestion.) - Trim swamp-repo SKILL.md description from 1183 to 885 chars so tessl's 1024-char validation passes. Kept all load-bearing triggers (doctor audit, verify audit, preflight, audit log empty); dropped redundant aliases. (Skill Review blocking failure.) Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

github-actions

CLI UX Review

Blocking

None.

Suggestions

No progress indicator during smoke test (src/presentation/renderers/audit_doctor.ts, check-started handler): The comment justifies the no-op as "checks run fast enough." The smoke test is the exception — it spawns a real swamp audit record subprocess, which requires binary resolution + file I/O. On cold starts or slow machines this could silently pause for 1–2s between lines and feel like a hang. Consider emitting a dim "running…" (or similar) specifically for the smoke test check, or at minimum revisit the blanket no-op if more slow checks are added under doctor.
"OVERALL: WARN" for unsupported tools (src/domain/audit/doctor/doctor_service.ts:74, renderer summaryLine): Running swamp doctor audit --tool codex (or copilot) today results in all checks skipping → 0 passed, 0 failed, N skipped — OVERALL: WARN. That status is technically correct but opaque — a user who doesn't know these tools aren't fully covered yet will be confused about what WARN means. A one-line note in the summary (e.g., hint: all checks were skipped — no audit integration is defined for this tool yet) would clarify intent. Not blocking since the valid tool list in the --tool flag help text does include these values, but it's a discoverability gap.
--include-diagnostic description (src/cli/commands/audit.ts): The description "Include rows written by swamp doctor audit's smoke test (filtered by default)" is accurate, but it's not immediately obvious to a user who hasn't read the doctor audit docs why smoke-test rows would appear in the audit timeline at all. Minor; a brief parenthetical like (sentinel rows prefixed echo swamp-doctor-smoke-test) would help power users correlating raw JSONL with CLI output.

Verdict

PASS — no blocking issues. Error messages are clear and actionable (NoToolConfiguredError is especially good), log/JSON dual-mode is correctly wired, exit codes are right, --tool validation is consistent, and the output format is readable and scannable.

…shold tessl review flagged swamp-extension-model at 86% (below 90% threshold), blocking the Skill Review CI gate on every PR that touches any skill. Pre-existing issue surfaced by PR #1227 triggering a Skill Review run. Changes: - Add explicit "## Development Workflow" with eight numbered steps and per-step validation checkpoints (author → verify registration → smoke test → unit tests → adversarial review → version → dry-run → publish). Addresses the "workflow_clarity" score (2/3 → 3/3). - Tighten "## When to Create a Custom Model" from a verbose multi- paragraph structure into a compact four-step decision list. Tessl score after: description 100%, content 85%, average 92.5%. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

…eator Initial fix for the 86% tessl score was itself verbose (added 34 lines of prose that duplicated content already elsewhere in the file, violating "concise is key" and bloating the file further past the 500-line soft limit). Compact the Development Workflow to link-heavy one-line bullets that point at existing sections (Quick Start, smoke_testing, adversarial review). Net add over pre-PR state: +5 lines instead of +35. Content score holds at 85%, workflow_clarity stays 3/3, total average 92.5%. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

github-actions

CLI UX Review

Blocking

None.

Suggestions

warn exit-code semantics in CI: swamp doctor audit --tool none (or codex/copilot) exits 0 with OVERALL: WARN in yellow. The PR description advertises this as "safe to gate in CI", but a user who accidentally passes a no-hooks tool gets a green CI step with a yellow banner — easy to miss. Consider a one-line note in the --tool description or the warn summary message explaining that WARN means "this tool has no audit hooks; nothing was checked". Not blocking — the check names and messages already imply it.
audit record --tool description stays vague: The sibling swamp audit record --tool still says "AI tool providing hook input" with no valid-values hint, while the new doctor audit --tool lists them inline. Consistency win if audit record got the same treatment, but that's outside this PR's scope.
--include-diagnostic discoverability: The new flag is added to swamp audit (hidden command), so its help text is only reachable via swamp audit --help. Since audit is hidden, users who hit a polluted timeline from a smoke test will struggle to find this flag. A mention in the recording-smoke-test hint text ("run swamp audit --include-diagnostic to inspect") would close the loop.

Verdict

PASS — help text is complete, error messages are actionable (NoToolConfiguredError is particularly good), log/JSON output cover all fields, exit codes are correct.

github-actions

Adversarial Review

Critical / High

None found.

Medium

--include-diagnostic is silently defeated by the noise filter — src/domain/audit/audit_service.ts:148-176

DIAGNOSTIC_COMMAND_PREFIX is "echo swamp-doctor-smoke-test". The diagnostic filter (line 148–154) correctly passes the row through when includeDiagnostic is true. However, the row then hits the noise filter at line 175: isNoiseCommand matches because "echo" is in NOISE_COMMAND_PREFIXES (line 37), and the command starts with "echo ".

Breaking input: Run swamp audit --include-diagnostic (without --all). Diagnostic rows are passed through the first filter but caught by the second, so they never appear in the timeline. The --include-diagnostic flag is dead without --all.

Suggested fix: Exempt diagnostic rows from the noise filter:
```
const isDiagnosticRow = trimmedCommand.startsWith(DIAGNOSTIC_COMMAND_PREFIX);
if (!options.showAll && !isDiagnosticRow && isNoiseCommand(trimmedCommand)) {
  continue;
}
```

Low

Day-boundary TOCTOU in recording smoke test — src/domain/audit/doctor/checks/recording_smoke.ts:76-92

The subprocess writes a row timestamped by new Date() inside audit record, and then readAuditRowsFromToday calls todaysAuditFilePath, which also uses new Date(). If the two new Date() calls straddle midnight UTC, the writer's row lands in tomorrow's file while the reader looks in today's file. The check would false-fail.

Practically negligible (sub-second window), but worth noting for awareness — especially in CI environments running near midnight UTC.
No subprocess timeout in makeSwampSpawnFn — src/cli/commands/doctor_audit.ts:69-103

The spawned swamp audit record --from-hook process has no timeout. If it hangs (e.g., waiting for network in a future code path, or a deadlocked pipe), the doctor command hangs indefinitely. The AbortSignal is checked between checks (doctor_service.ts:110) but not during a check's subprocess call.

Consider passing signal: deps.abortSignal to Deno.Command or AbortSignal.timeout(N) as a safety net.
Inconsistent error propagation in checkOpenCode — src/domain/audit/doctor/checks/agent_config_loadable.ts:217-219

checkOpenCode re-throws non-NotFound errors (e.g., permission denied), while checkKiro, checkClaude, and checkCursor all catch and return a fail result for the same category of errors. The outer try/catch in doctor_service.ts:123 catches it safely, but the resulting error message is the generic "check threw" rather than a tool-specific hint.

Verdict

PASS — No critical or high severity issues. The core doctor logic is well-structured: checks are independent, errors don't propagate, the streaming protocol is correct, and test coverage is thorough. The noise-filter interaction (finding 1) is a real bug in the --include-diagnostic flag but only affects the audit timeline view, not the doctor command itself. Worth fixing before merge but not blocking.

github-actions

Code Review

Well-structured PR. Clean DDD layering with domain types/services in src/domain/audit/doctor/, application-layer orchestration via auditDoctor() exported through src/libswamp/mod.ts, CLI in src/cli/commands/, and presentation properly separated. The PreflightCheck interface, SpawnFn port, and ResolveBinary port enable clean testing without real subprocesses or binary resolution. Streaming event pattern is consistent with the rest of the codebase (datastoreStatus, etc.).

Every new source file has a companion _test.ts with thorough coverage — fake spawn functions, temp directories for filesystem checks, round-trip normalization tests for synthetic payloads. The compile-time contract in synthetic_payloads_test.ts (round-tripping fixtures through the normalizer) is a smart pattern that catches drift at CI time.

Blocking Issues

None.

Suggestions

CONFIG_FILES in agent_config_loadable.ts (line 30) — the file-path arrays in the values are never read programmatically; only the keys drive appliesTo via tool in CONFIG_FILES. The actual paths are hardcoded again inside each checkKiro/checkClaude/etc. function. Consider either using the arrays to derive paths (single source of truth) or replacing with a Set<string> for appliesTo to make intent clearer.
No resolve_binary_test.ts — binaryNameFor() is tested indirectly through binary_on_path_test.ts, but a direct unit test would strengthen coverage for the default branch and document the kiro → kiro-cli mapping explicitly.
Minor: AiTool import inconsistency — domain code in check.ts imports AiTool from ../../repo/repo_service.ts while the CLI command in doctor_audit.ts imports it from ../../infrastructure/persistence/repo_marker_repository.ts. Exporting AiTool through libswamp/mod.ts would let the CLI import from the public API surface. (Low priority — the existing audit.ts command follows the same infrastructure import pattern.)

stack72 and others added 3 commits April 24, 2026 01:46

Merge remote-tracking branch 'origin/main' into worktree-glistening-d…

ad4aeae

…azzling-stallman

github-actions Bot requested changes Apr 24, 2026

View reviewed changes

github-actions Bot previously approved these changes Apr 24, 2026

View reviewed changes

stack72 dismissed github-actions[bot]’s stale review via 6546ac3 April 24, 2026 23:35

github-actions Bot approved these changes Apr 24, 2026

View reviewed changes

github-actions Bot reviewed Apr 24, 2026

View reviewed changes

github-actions Bot approved these changes Apr 24, 2026

View reviewed changes

stack72 merged commit 0022d1c into main Apr 24, 2026
10 checks passed

stack72 deleted the worktree-glistening-dazzling-stallman branch April 24, 2026 23:43

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(doctor): add swamp doctor audit preflight diagnostic (swamp-club#156)#1227

feat(doctor): add swamp doctor audit preflight diagnostic (swamp-club#156)#1227
stack72 merged 6 commits intomainfrom
worktree-glistening-dazzling-stallman

stack72 commented Apr 24, 2026

Uh oh!

github-actions Bot left a comment

Uh oh!

github-actions Bot left a comment

Uh oh!

github-actions Bot left a comment

Uh oh!

github-actions Bot left a comment

Uh oh!

github-actions Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

stack72 commented Apr 24, 2026

Summary

Five checks

The headline verification

Name override

Scope note on the three cited Kiro bugs

Ancillary changes

Follow-ups

Test Plan

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

CLI UX Review

Blocking

Suggestions

Verdict

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

CLI UX Review

Blocking

Suggestions

Verdict

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

CLI UX Review

Blocking

Suggestions

Verdict

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Adversarial Review

Critical / High

Medium

Low

Verdict

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Code Review

Blocking Issues

Suggestions

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant