Skip to content

feat(doctor): add swamp doctor audit preflight diagnostic (swamp-club#156)#1227

Merged
stack72 merged 6 commits intomainfrom
worktree-glistening-dazzling-stallman
Apr 24, 2026
Merged

feat(doctor): add swamp doctor audit preflight diagnostic (swamp-club#156)#1227
stack72 merged 6 commits intomainfrom
worktree-glistening-dazzling-stallman

Conversation

@stack72
Copy link
Copy Markdown
Contributor

@stack72 stack72 commented Apr 24, 2026

Summary

Adds swamp doctor audit [--tool <name>] — a new preflight diagnostic that verifies the AI-tool audit integration configured in the repo is healthy end-to-end. Non-zero exit on any fail, safe to gate in CI.

doctor is registered as a parent namespace so future diagnostics (doctor datastore, doctor vault, ...) plug in as siblings. Implements swamp-club#156.

Five checks

Check Scope What it catches
binary-on-path all 4 tools AI tool binary missing from PATH
swamp-binary-on-path all 4 tools (+ Kiro baked-path) swamp missing from PATH, or kiro hook's baked absolute path orphaned by brew upgrade
agent-config-loadable per-tool parser config missing, malformed, or broken (e.g. Kiro tools: ["*"])
default-agent-set Kiro only .kiro/settings/cli.json not pointing chat.defaultAgent at swamp
recording-smoke-test all 4 tools upstream normalizer drift — synthetic postToolUse payload piped through swamp audit record --from-hook must land as a row in today's audit JSONL

The headline verification

Temporarily removing "shell" from KIRO_SHELL_TOOL_NAMES in hook_input.ts, recompiling, and running doctor against a freshly-init'd kiro repo produces:

✗ recording-smoke-test  synthetic kiro payload did NOT land in today's audit JSONL
    hint: The hook normalizer in src/domain/audit/hook_input.ts may have drifted from the upstream contract...

4 passed, 1 failed, 0 skipped — OVERALL: FAIL

Reverted before this commit. This is the load-bearing proof that the feature protects against future upstream drift instead of being theatre.

Name override

The issue body says "please NOT doctor". The team picked doctor anyway — the command is a namespace and will grow; the name fits that shape. Acknowledging here for reviewer context.

Scope note on the three cited Kiro bugs

Issue 156 cites three concrete bugs as motivation. Their status in this PR:

  • tools: ["*"] in kiro agent configagent-config-loadable flags it with a hint; NOT fixed here.
  • Missing chat.defaultAgent: "swamp"default-agent-set flags it; NOT fixed here.
  • Hard-coded tool_name === "execute_bash" in Kiro normalizeralready fixed upstream of this PR. src/domain/audit/hook_input.ts:146-150 has KIRO_SHELL_TOOL_NAMES = new Set(["execute_bash", "shell", "execute_cmd"]). This PR's diagnostic catches the class of bug, it does not fix that specific one.

Ancillary changes

  • Extract todaysAuditFilePath helper. src/infrastructure/persistence/jsonl_audit_repository.ts had two duplicate copies of the commands-YYYY-MM-DD.jsonl format string; moved to src/domain/audit/audit_path.ts and consumed by both the writer and the doctor smoke-test reader. Single source of truth for the filename format.
  • Reserve echo swamp-doctor-smoke-test as a sentinel command prefix. The smoke-test writes a real audit row with this prefix; the audit timeline filters it out of the default view. Opt in with swamp audit --include-diagnostic.
  • Add parseAiToolOrThrow in src/cli/ai_tool_parser.ts — central validator for --tool flags, so bogus values fail with a usage error listing valid names.
  • Docs: design/audit.md, design/audit-doctor.md, and a short "verify the audit integration" block in the swamp-repo skill.

Follow-ups

  • UAT coverage tracked in systeminit/swamp-uat#160 (CLI) and systeminit/swamp-uat#161 (adversarial).
  • swamp-club has GitHub issues disabled — content/manual/reference/doctor.md and a "verify the integration" section for each content/manual/how-to/use-swamp-with-*.md need to be routed through the team's docs process.
  • POSIX-only PATH resolution — documented; Windows support deferred.

Test Plan

  • deno check — clean
  • deno lint — clean
  • deno fmt --check — clean
  • deno run test4922 passed, 0 failed, 1 ignored
  • deno run compile — produces working binary
  • Manual smoke-run on freshly-init'd kiro / claude / cursor / opencode repos — all pass
  • Regression injection (remove "shell" from KIRO_SHELL_TOOL_NAMES, recompile, rerun) — smoke-test correctly FAILS on kiro
  • Real-repo test against ~/code/kiro-audit (existing audit history) — passes; sentinel filter confirmed with swamp audit --include-diagnostic
  • CWD-independence adversarial test — ran from CWD=/ with --repo-dir /tmp/..., synthetic row landed in the target repo, not the CWD

🤖 Generated with Claude Code

stack72 and others added 3 commits April 24, 2026 01:46
…#156)

Adds a new `swamp doctor` parent command with `doctor audit [--tool <name>]`
as its first subcommand. Runs five independent preflight checks against the
AI-tool audit integration configured in the repo and reports per-check
pass/fail/skip with actionable hints in log and JSON modes. Exits non-zero
on any check fail — safe to gate in CI.

Checks:
- binary-on-path: AI tool's own binary is resolvable on PATH
- swamp-binary-on-path: swamp is on PATH (all four tools); for Kiro, also
  verifies the absolute swamp path baked into the hook file still resolves
- agent-config-loadable: per-tool config parses and is well-formed
  (e.g. Kiro's `tools: ["*"]` regression)
- default-agent-set: Kiro workspace default-agent points at `swamp`
- recording-smoke-test: end-to-end — synthetic postToolUse payload
  round-trips through `swamp audit record --from-hook` into today's JSONL

The smoke-test writes a real audit row tagged with the reserved command
prefix `echo swamp-doctor-smoke-test`; the audit timeline filters that
prefix out of the default view (`--include-diagnostic` opts back in).

Also:
- Extracts `todaysAuditFilePath` / `auditFilePathForTimestamp` / related
  helpers into `src/domain/audit/audit_path.ts`, consumed by both the
  JSONL writer and the doctor smoke-test reader — single source of truth
  for the `commands-YYYY-MM-DD.jsonl` filename format (previously
  duplicated twice inside jsonl_audit_repository.ts).
- Introduces `parseAiToolOrThrow` in `src/cli/ai_tool_parser.ts` for
  centralised `--tool` validation.
- Adds `design/audit.md` + `design/audit-doctor.md` and updates the
  swamp-repo skill.

**Name override:** the issue body requests "please NOT `doctor`"; the
team picked `doctor` anyway — the command namespace is meant to grow
(future `doctor datastore`, `doctor vault`, etc.), and the name fits
that shape.

**Scope note:** issue 156 cites three concrete Kiro bugs as motivation
(wildcard tools, missing default-agent, hard-coded `execute_bash`
normalizer). The `execute_bash` one is **already fixed** upstream of
this PR — `src/domain/audit/hook_input.ts:146-150` has
`KIRO_SHELL_TOOL_NAMES = new Set(["execute_bash", "shell", "execute_cmd"])`.
This PR adds a durable preflight for the whole class of upstream-drift
bugs, not a fix for any specific one. The other two bugs are caught by
the new checks but not fixed by this PR.

**Load-bearing manual verification (ADV-2):** temporarily removing
`"shell"` from `KIRO_SHELL_TOOL_NAMES` and recompiling produces a FAIL
on `recording-smoke-test` against a freshly-init'd kiro repo, confirming
the diagnostic does catch normalizer drift. Reverted before this commit.

**UAT coverage:** filed systeminit/swamp-uat#160 (CLI scenarios) and
systeminit/swamp-uat#161 (adversarial scenarios). swamp-club has
issues disabled — docs gap (new `content/manual/reference/doctor.md`
and "verify the integration" section in the four `use-swamp-with-*.md`
how-tos) needs to be routed through the team's docs process.

4773 tests passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Extracts tool-resolution logic into `resolveTargetTool` and covers:
- flag tool wins over marker
- marker fallback when no flag
- NoToolConfiguredError when both absent
- --tool validation rejects garbage
- audit-skip tools (codex/copilot/none) accepted as overrides
- command is registered as subcommand of doctorCommand
- --tool and --repo-dir options present

Closes the CLI-level integration gap flagged in the feature PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLI UX Review

Blocking

  1. NoToolConfiguredError shows a stack trace instead of a clean error message (src/domain/audit/doctor/check.ts)

    NoToolConfiguredError extends plain Error, not UserError. When a user runs swamp doctor audit in a repo with no tool configured (the most likely first-run scenario), the error propagates to main.tsrenderError, which branches on instanceof UserError. Since it's not a UserError, it falls into the else branch that logs the full error object including a stack trace:

    Error: No AI tool configured in .swamp.yaml and no --tool flag provided...
        at new NoToolConfiguredError (file:///.../check.ts:12:5)
        at resolveTargetTool (file:///.../doctor_audit.ts:54:11)
        ...
    

    The error message itself is excellent and actionable — the stack trace contradicts it. The UserError class exists precisely for this: user mistakes that should render cleanly without a stack trace.

    Fix: Change class NoToolConfiguredError extends Errorclass NoToolConfiguredError extends UserError in src/domain/audit/doctor/check.ts (and update the import).

Suggestions

  1. check-started event doubles every check name in log output (src/presentation/renderers/audit_doctor.ts)

    The log renderer prints … binary-on-path on check-started, then immediately ✓ binary-on-path <message> on check-completed. Because writeOutput has no cursor movement, users see two lines per check. Since checks run fast (no long waits), the in-progress line adds noise without helping readability. Dropping the check-started handler entirely (or making it a no-op like the JSON renderer) would produce the cleaner output users expect from a doctor-style tool.

  2. --tool flag inconsistency with repo init (src/cli/commands/doctor_audit.ts)

    swamp repo init and swamp repo upgrade expose -t, --tool (short + long). doctor audit exposes only --tool. Minor, since swamp audit record also omits -t, so this matches the audit-subcommand convention. Just flagging for awareness when the doctor namespace grows.

  3. warn exit-0 may surprise CI users who pass --tool codex (src/cli/commands/doctor_audit.ts)

    overallStatus: "warn" (all checks skipped) exits 0. A CI gate like swamp doctor audit --tool codex && echo ok will pass silently. The OVERALL: WARN is visible in log mode and overallStatus is in the JSON — but $? can't distinguish it. Documenting this behavior in the command description or adding an --warn-as-fail flag would help. Low priority since codex/copilot/none are explicitly no-hook tools.

Verdict

NEEDS CHANGES — one blocking issue: NoToolConfiguredError must extend UserError to avoid showing a stack trace on the most common first-run error.

- NoToolConfiguredError now extends UserError, so the first-run
  "no tool configured" path renders cleanly via renderError instead
  of dumping a stack trace. (CLI UX Review blocking finding.)
- Drop check-started log handler — checks run fast enough that the
  in-progress line is visual noise. Users see one line per check
  (completion) instead of two. (CLI UX Review suggestion.)
- Trim swamp-repo SKILL.md description from 1183 to 885 chars so
  tessl's 1024-char validation passes. Kept all load-bearing
  triggers (doctor audit, verify audit, preflight, audit log empty);
  dropped redundant aliases. (Skill Review blocking failure.)

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
github-actions[bot]
github-actions Bot previously approved these changes Apr 24, 2026
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLI UX Review

Blocking

None.

Suggestions

  1. No progress indicator during smoke test (src/presentation/renderers/audit_doctor.ts, check-started handler): The comment justifies the no-op as "checks run fast enough." The smoke test is the exception — it spawns a real swamp audit record subprocess, which requires binary resolution + file I/O. On cold starts or slow machines this could silently pause for 1–2s between lines and feel like a hang. Consider emitting a dim "running…" (or similar) specifically for the smoke test check, or at minimum revisit the blanket no-op if more slow checks are added under doctor.

  2. "OVERALL: WARN" for unsupported tools (src/domain/audit/doctor/doctor_service.ts:74, renderer summaryLine): Running swamp doctor audit --tool codex (or copilot) today results in all checks skipping → 0 passed, 0 failed, N skipped — OVERALL: WARN. That status is technically correct but opaque — a user who doesn't know these tools aren't fully covered yet will be confused about what WARN means. A one-line note in the summary (e.g., hint: all checks were skipped — no audit integration is defined for this tool yet) would clarify intent. Not blocking since the valid tool list in the --tool flag help text does include these values, but it's a discoverability gap.

  3. --include-diagnostic description (src/cli/commands/audit.ts): The description "Include rows written by swamp doctor audit's smoke test (filtered by default)" is accurate, but it's not immediately obvious to a user who hasn't read the doctor audit docs why smoke-test rows would appear in the audit timeline at all. Minor; a brief parenthetical like (sentinel rows prefixed echo swamp-doctor-smoke-test) would help power users correlating raw JSONL with CLI output.

Verdict

PASS — no blocking issues. Error messages are clear and actionable (NoToolConfiguredError is especially good), log/JSON dual-mode is correctly wired, exit codes are right, --tool validation is consistent, and the output format is readable and scannable.

…shold

tessl review flagged swamp-extension-model at 86% (below 90% threshold),
blocking the Skill Review CI gate on every PR that touches any skill.
Pre-existing issue surfaced by PR #1227 triggering a Skill Review run.

Changes:
- Add explicit "## Development Workflow" with eight numbered steps and
  per-step validation checkpoints (author → verify registration →
  smoke test → unit tests → adversarial review → version → dry-run →
  publish). Addresses the "workflow_clarity" score (2/3 → 3/3).
- Tighten "## When to Create a Custom Model" from a verbose multi-
  paragraph structure into a compact four-step decision list.

Tessl score after: description 100%, content 85%, average 92.5%.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
…eator

Initial fix for the 86% tessl score was itself verbose (added 34 lines
of prose that duplicated content already elsewhere in the file,
violating "concise is key" and bloating the file further past the
500-line soft limit).

Compact the Development Workflow to link-heavy one-line bullets that
point at existing sections (Quick Start, smoke_testing, adversarial
review). Net add over pre-PR state: +5 lines instead of +35. Content
score holds at 85%, workflow_clarity stays 3/3, total average 92.5%.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLI UX Review

Blocking

None.

Suggestions

  1. warn exit-code semantics in CI: swamp doctor audit --tool none (or codex/copilot) exits 0 with OVERALL: WARN in yellow. The PR description advertises this as "safe to gate in CI", but a user who accidentally passes a no-hooks tool gets a green CI step with a yellow banner — easy to miss. Consider a one-line note in the --tool description or the warn summary message explaining that WARN means "this tool has no audit hooks; nothing was checked". Not blocking — the check names and messages already imply it.

  2. audit record --tool description stays vague: The sibling swamp audit record --tool still says "AI tool providing hook input" with no valid-values hint, while the new doctor audit --tool lists them inline. Consistency win if audit record got the same treatment, but that's outside this PR's scope.

  3. --include-diagnostic discoverability: The new flag is added to swamp audit (hidden command), so its help text is only reachable via swamp audit --help. Since audit is hidden, users who hit a polluted timeline from a smoke test will struggle to find this flag. A mention in the recording-smoke-test hint text ("run swamp audit --include-diagnostic to inspect") would close the loop.

Verdict

PASS — help text is complete, error messages are actionable (NoToolConfiguredError is particularly good), log/JSON output cover all fields, exit codes are correct.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adversarial Review

Critical / High

None found.

Medium

  1. --include-diagnostic is silently defeated by the noise filtersrc/domain/audit/audit_service.ts:148-176

    DIAGNOSTIC_COMMAND_PREFIX is "echo swamp-doctor-smoke-test". The diagnostic filter (line 148–154) correctly passes the row through when includeDiagnostic is true. However, the row then hits the noise filter at line 175: isNoiseCommand matches because "echo" is in NOISE_COMMAND_PREFIXES (line 37), and the command starts with "echo ".

    Breaking input: Run swamp audit --include-diagnostic (without --all). Diagnostic rows are passed through the first filter but caught by the second, so they never appear in the timeline. The --include-diagnostic flag is dead without --all.

    Suggested fix: Exempt diagnostic rows from the noise filter:

    const isDiagnosticRow = trimmedCommand.startsWith(DIAGNOSTIC_COMMAND_PREFIX);
    if (!options.showAll && !isDiagnosticRow && isNoiseCommand(trimmedCommand)) {
      continue;
    }

Low

  1. Day-boundary TOCTOU in recording smoke testsrc/domain/audit/doctor/checks/recording_smoke.ts:76-92

    The subprocess writes a row timestamped by new Date() inside audit record, and then readAuditRowsFromToday calls todaysAuditFilePath, which also uses new Date(). If the two new Date() calls straddle midnight UTC, the writer's row lands in tomorrow's file while the reader looks in today's file. The check would false-fail.

    Practically negligible (sub-second window), but worth noting for awareness — especially in CI environments running near midnight UTC.

  2. No subprocess timeout in makeSwampSpawnFnsrc/cli/commands/doctor_audit.ts:69-103

    The spawned swamp audit record --from-hook process has no timeout. If it hangs (e.g., waiting for network in a future code path, or a deadlocked pipe), the doctor command hangs indefinitely. The AbortSignal is checked between checks (doctor_service.ts:110) but not during a check's subprocess call.

    Consider passing signal: deps.abortSignal to Deno.Command or AbortSignal.timeout(N) as a safety net.

  3. Inconsistent error propagation in checkOpenCodesrc/domain/audit/doctor/checks/agent_config_loadable.ts:217-219

    checkOpenCode re-throws non-NotFound errors (e.g., permission denied), while checkKiro, checkClaude, and checkCursor all catch and return a fail result for the same category of errors. The outer try/catch in doctor_service.ts:123 catches it safely, but the resulting error message is the generic "check threw" rather than a tool-specific hint.

Verdict

PASS — No critical or high severity issues. The core doctor logic is well-structured: checks are independent, errors don't propagate, the streaming protocol is correct, and test coverage is thorough. The noise-filter interaction (finding 1) is a real bug in the --include-diagnostic flag but only affects the audit timeline view, not the doctor command itself. Worth fixing before merge but not blocking.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Well-structured PR. Clean DDD layering with domain types/services in src/domain/audit/doctor/, application-layer orchestration via auditDoctor() exported through src/libswamp/mod.ts, CLI in src/cli/commands/, and presentation properly separated. The PreflightCheck interface, SpawnFn port, and ResolveBinary port enable clean testing without real subprocesses or binary resolution. Streaming event pattern is consistent with the rest of the codebase (datastoreStatus, etc.).

Every new source file has a companion _test.ts with thorough coverage — fake spawn functions, temp directories for filesystem checks, round-trip normalization tests for synthetic payloads. The compile-time contract in synthetic_payloads_test.ts (round-tripping fixtures through the normalizer) is a smart pattern that catches drift at CI time.

Blocking Issues

None.

Suggestions

  1. CONFIG_FILES in agent_config_loadable.ts (line 30) — the file-path arrays in the values are never read programmatically; only the keys drive appliesTo via tool in CONFIG_FILES. The actual paths are hardcoded again inside each checkKiro/checkClaude/etc. function. Consider either using the arrays to derive paths (single source of truth) or replacing with a Set<string> for appliesTo to make intent clearer.

  2. No resolve_binary_test.tsbinaryNameFor() is tested indirectly through binary_on_path_test.ts, but a direct unit test would strengthen coverage for the default branch and document the kirokiro-cli mapping explicitly.

  3. Minor: AiTool import inconsistency — domain code in check.ts imports AiTool from ../../repo/repo_service.ts while the CLI command in doctor_audit.ts imports it from ../../infrastructure/persistence/repo_marker_repository.ts. Exporting AiTool through libswamp/mod.ts would let the CLI import from the public API surface. (Low priority — the existing audit.ts command follows the same infrastructure import pattern.)

@stack72 stack72 merged commit 0022d1c into main Apr 24, 2026
10 checks passed
@stack72 stack72 deleted the worktree-glistening-dazzling-stallman branch April 24, 2026 23:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant