Conversation
`pnpm task start` now drops `agent-scope/.pending-onboarding` (gitignored)
and copies the trigger to the clipboard. Three parallel consumers compete
for the marker — whichever reads it deletes it, so onboarding fires once:
- sessionStart hook: consumes on any new chat
- postToolUse hook: consumes after any tool call in an existing chat
- Agent rule: mandatory top-of-turn `Read` check covers the gap
when the user sends a purely conversational message
This closes the "hi-in-existing-chat doesn't trigger" gap without relying
on Cursor's unreleased `beforeSubmitPrompt` additional_context support.
Also removes unused `agent-scope/.pending-onboarding` entry-point noise
from the pre-existing `pnpm task start` output.
Made-with: Cursor
Extends agent-scope to enforce / surface task-scoped writes across more
than just Cursor.
Hard enforcement (hook-supporting agents):
- Cursor (already shipped)
- Claude Code: thin .claude/hooks/ adapters that translate Claude
Code's PreToolUse / PostToolUse / SessionStart / UserPromptSubmit
JSON I/O to the same agent-scope/lib/ policy used by the Cursor
hooks. UserPromptSubmit additionally gives Claude Code transparent
one-shot onboarding for any chat (new or existing), since unlike
Cursor's beforeSubmitPrompt we can inject additionalContext there.
Soft enforcement (no hook system available; agent self-enforces):
- Codex CLI: AGENTS.md (OpenAI's project-instruction convention)
- Gemini CLI: GEMINI.md
- Continue / Cline / older Cursor: .cursorrules legacy fallback
New verification command:
pnpm task check-agent (or pnpm scope:check-agent)
Detects each supported agent in the repo, prints per-agent status
(active / soft / needs attention / not configured), tells the user
exactly what (if anything) they have to do after a fresh git pull.
9 unit tests cover the detection logic.
Other changes:
- PROTECTED_PATTERNS extended to defend the new system surfaces
(.claude/hooks/**, .claude/settings.json, AGENTS.md, GEMINI.md,
.cursorrules). 24 e2e checks confirm hard blocks fire correctly
in Claude Code.
- .gitignore: keep ignoring per-developer .claude/* state but
explicitly include settings.json + hooks/ so coworkers get hard
enforcement on git pull.
- Docs: README "Supported agents" matrix + per-agent setup notes;
CLAUDE.md / .cursor/rules/agent-scope.mdc updated to reflect
cross-agent coverage.
- 149 unit tests pass (was 140; +9 check-agent tests).
Made-with: Cursor
| const porcelain = gitPorcelain(root); | ||
| if (porcelain === null) return emit({}); | ||
|
|
||
| const entries = parsePorcelain(porcelain); |
There was a problem hiding this comment.
🔴 Bug: This hook diffs against the entire current worktree, not the delta from the shell command that just ran. In a dirty repo, any pre-existing user edit or untracked scratch file in a denied path will be reverted/deleted the next time the agent executes any shell command, which breaks the documented "humans are not restricted" behavior. Capture a pre-command snapshot and only revert paths introduced by that command.
| if (typeof relPath !== 'string' || relPath.length === 0) return 'deny'; | ||
| if (relPath.includes('..')) return 'deny'; | ||
|
|
||
| if (checkProtected(relPath, root) === 'deny') return 'protected'; |
There was a problem hiding this comment.
🔴 Bug: Bootstrap only bypasses checkProtected(). If a task is active, protected files still fall through to manifest matching and come back as deny, so the documented touch agent-scope/.bootstrap-token workflow cannot actually unlock protected files during a scoped task. If bootstrap is meant to disable the guard, checkPath() needs to short-circuit before task-scope matching as well.
| const c = glob[i]; | ||
| if (c === '*') { | ||
| if (glob[i + 1] === '*') { | ||
| re += '.*'; |
There was a problem hiding this comment.
🔴 Bug: Translating ** to .* and then skipping the following / makes **/ patterns broader than the manifest author asked for. For example, pkg/**/test.ts now matches pkg/footest.ts, not just pkg/test.ts or deeper descendants, which weakens the write boundary. Treat **/ as zero-or-more complete path segments instead of eliding the slash.
| } | ||
|
|
||
| const violations = []; | ||
| for (const sub of splitCommands(command)) { |
There was a problem hiding this comment.
🔴 Bug: Each fragment is scanned in isolation here, so shell state from earlier fragments is lost. cd .cursor && rm -rf hooks is analyzed as rm -rf hooks at repo root, and cd packages/foo && rm src/bar.ts has the same problem. That lets destructive commands against protected/out-of-scope files slip past the pre-check before the after-shell backstop runs. Either track cwd/pipeline context or conservatively deny destructive commands after cd/similar stateful constructs.
The default `pnpm task start` now walks the user through a short
questionnaire (description, packages, extras) and writes + activates the
manifest directly — no agent round-trip, works in every agent (and with
no agent at all). The legacy agent-guided flow is preserved behind
`pnpm task start --chat` and is also used automatically when stdin is
not a TTY so CI and pipes don't hang.
- lib/wizard.mjs pure logic: discoverPackages (pnpm-workspace.yaml,
package.json workspaces, or packages/* fallback),
deriveTaskId, suggestPackagesFromDescription
(keyword overlap scoring), draftGlobs, buildManifest
- lib/prompter.mjs tiny readline-based prompter (ask / askYesNo /
askChoice / askMultiNumber / askLines) with
injectable streams for tests
- bin/task.mjs rewired start(): interactive by default, --chat
forces legacy marker+clipboard flow, no-TTY
auto-falls-back; preview/edit/cancel step before
saving; overwrite confirmation on id collision
- wizard.test.mjs 25 unit tests covering every pure helper
- docs README/CLAUDE.md/Cursor rule/AGENTS.md/GEMINI.md/
.cursorrules updated: wizard is the default path,
agent-guided onboarding only fires under --chat
Made-with: Cursor
…description capture
--smart captures a multi-line task description in the CLI, embeds it in the
one-shot marker, and hands off to the agent. The agent reads the description
verbatim (no re-asking), explores the repo, and proposes a scope via a rich
two-part AskQuestion (multi-select packages + single-select action).
- onboarding.mjs: add buildOnboardingTrigger({description}) and
extractDescription(); keep ONBOARDING_TRIGGER_TEXT as a backcompat alias.
New trigger text embeds the description in a fenced block and describes
the two-question AskQuestion protocol the agent must use.
- onboarding.test.mjs: +8 tests for buildOnboardingTrigger / extractDescription
round-trip, multi-line preservation, whitespace trimming, empty-description
fallback, and malformed-marker tolerance.
- bin/task.mjs: rewire start() — add --smart mode with multi-line description
capture (blank-line-terminated, safe against closed stdin), keep --chat as a
deprecated alias with a warning, remove the old TTY auto-fallback and error
out cleanly with 'use pnpm task create' guidance instead. Default
pnpm task start still runs the interactive wizard.
- Rules + docs (.cursor/rules, CLAUDE.md, AGENTS.md, GEMINI.md, .cursorrules,
README.md): full rewrite of the onboarding protocol — agents now must check
for the description block and use the two-part AskQuestion layout with a
package picker and action picker.
Tests: 183 green in ~0.8s. Schemas + base manifests still validate.
Made-with: Cursor
The smart-onboarding flow ends with a plan-mode AskQuestion where the user explicitly approves the proposed scope. Until now, the agent still had to bounce the `pnpm task create` command back for the user to run manually — the afterShell hooks would otherwise delete the new manifest as an untracked write inside the protected `agent-scope/tasks/**` path. Add a narrow allowlist in both hooks (`.cursor/hooks/shell-diff-check.mjs` and `.claude/hooks/shell-diff-check.mjs`) driven by `extractTaskCreateId()` in `shell-parse.mjs`. When the shell command that just ran matches the canonical shapes — - pnpm task create <id> ... - pnpm run task create <id> ... - node agent-scope/bin/task.mjs create <id> ... — and the id validates against the manifest-id regex, the hook lets the two specific files that command legitimately writes (`agent-scope/tasks/<id>.json` and `agent-scope/active`) persist. Everything else inside the same turn — impostor `echo > ...`, `cp`, opaque evaluators, other task manifests, ids with path-escape chars, non-canonical wrappers (npm/yarn/bun), Write/Edit tool calls — is still reverted/deleted. Each waived write is audited to the denial log as `afterShell.approved-create`. Protocol docs (CLAUDE.md, .cursor/rules/agent-scope.mdc, AGENTS.md, README.md) are updated to say the agent runs `pnpm task create` itself on approve. - 14 new shell-parse tests (60 in the parse suite; 200 total). - E2E smoke (/tmp/allowlist-smoke.sh) covers 6 scenarios including chained commands and log auditing. Made-with: Cursor
Previously `pnpm task start --smart` required two Enter presses to
submit a description: one to end the line, one to signal "done". The
reader waited for a blank line after seeing content, which was cumbersome
for the common case of a short one-line task summary.
Fix: single-Enter submits immediately. Multi-line pastes are still
captured in full via paste-detection — terminals deliver each pasted
line as a separate `line` event within a few ms, so after the first
blocking line read we poll `tryReadLine(80ms)` and only stop once no
new line arrives inside that quiet window. Trailing blank lines from
pastes are trimmed; blank lines inside a paste (paragraph breaks) are
preserved.
Implementation:
- New `tryReadLine(timeoutMs)` primitive on the prompter that cleans up
its own waiter on timeout (no leak if the line never arrives).
- New `askPasteableDescription(prompt, opts)` prompter method composing
blocking first-line read + paste-detection tail. Configurable quiet
window, line cap, and leading-blank tolerance.
- `readMultilineDescription()` in `bin/task.mjs` replaced with a direct
call to `prompter.askPasteableDescription('> ')`.
- CLI copy updated: "Finish with an empty line." → "Press Enter to
submit. (Multi-line pastes are captured in full.)"
- README onboarding-flow section updated to match.
Tests: 16 new cases in `agent-scope/lib/prompter.test.mjs` covering
tryReadLine (buffered/timeout/late/closed/no-steal) and
askPasteableDescription (single-line, multi-line paste, paragraph
breaks, trailing-blank trim, leading-blank tolerance, bail path,
inside-window late line, outside-window late line, maxLines cap on
runaway input). Added to `scope:test` — full suite: 216 tests, <1s.
Made-with: Cursor
The plan-mode AskQuestion surface was too busy. Onboarding asked a
two-part question with a multi-select package list plus a seven-option
action menu. Denials asked agents to include a 5-bullet prompt (denied
path + why restricted + agent reasoning + recommendation + full options
list), where "full options" could mean up to 6 verbose entries like
`Add "packages/foo/bar.ts" to my-task's manifest`. The prose carried
ALL-CAPS banners ("PROTECTED PATH —", "OUT OF TASK SCOPE —", "STOP.")
and meta copy ("Agent: surface the menu below via AskQuestion"). On
both ends — the user's and the LLM's — it read like a compliance form.
The user asked for something closer to plan mode: one question, two
options — the LLM's recommendation and "something else — tell me what"
— phrased like a human chatting with a coworker.
Changes:
- `agent-scope/lib/denial.mjs` now emits two new fields in every
structured denial payload:
- `humanSummary`: one or two natural-language sentences describing
the situation. Agents are told to quote it verbatim in the
AskQuestion prompt. Replaces the old multi-block prose (banners,
"Why this file is guarded", "What happens if the user says YES/NO",
file lists, STOP notices, agent-directed copy).
- `simpleOptions`: always exactly two entries — the recommended
action (with a short casual label like "Add this folder to the
task and try again", "Skip it", "Yes, unlock it so I can do this
edit") and a `custom_instruction` free-text fallback labelled
"Something else — tell me what". The verbose `options` list stays
for audit / back-compat / tests but is not surfaced to the user.
- Rendered prose is now a one-line `agent-scope: <humanSummary>` plus
the fenced JSON. No banners. No agent-directed meta copy. After-shell
context still lists reverted/deleted paths below the summary for
reference.
- `.cursor/rules/agent-scope.mdc`, `CLAUDE.md`, `AGENTS.md`:
- Smart-onboarding step 4 is now "one AskQuestion, two options":
`go` ("Yes, go with that") + `custom_instruction` ("Tell me what
to change"). The prompt is a 3-sentence max rephrase + scope
bullets + "Sound good?"
- Denial protocol step 3 now reads: quote `humanSummary` verbatim,
add one short sentence of your own reasoning, pass `simpleOptions`
verbatim. Never surface the verbose `options` list.
- New "Phrasing rules" section: no ALL-CAPS banners, no architecture
explanations in prompts, one sentence of reasoning, no emoji
unless the user uses them first.
- `agent-scope/lib/onboarding.mjs` trigger payload matches the new
protocol (was instructing the agent to ask Q1/Q2 and print the
command for the user to run manually — both outdated).
- `agent-scope/bin/task.mjs` description prompt is now the casual
"What are you working on?" / "One or two sentences is plenty. Paste
longer briefs if you have them." / "Press Enter to send." rather
than the old "Describe the task — what to build or fix, which
packages / behaviours / tests, and any files you already know about."
- `agent-scope/README.md`: onboarding-flow + denial-menu sections
rewritten to document `humanSummary` and `simpleOptions` as the
user-facing surface, with a worked example.
Tests:
- `denial.test.mjs`: banner-era assertions (`OUT OF TASK SCOPE`,
`Why this file is guarded`, `Reverted via`) replaced with asserts
that `humanSummary` is present, short (<= 400 chars), mentions the
denied path/task, has no banners, and is quoted in the rendered
prose. Three new test cases cover the `simpleOptions` invariant
(exactly two entries; first matches `recommendedOptionId`; second
is always `custom_instruction`), the natural-label mapping
(add_glob → "Add this folder to the task and try again", cancel →
"Skip it", etc.), and the `humanSummary`-shape contract across all
five builders.
- Full suite: 219 tests, ~1.2s.
Made-with: Cursor
Drop the interactive CLI wizard and the --smart flag. `pnpm task start` now always captures a task description, drops the one-shot marker, and lets the agent propose a scope in chat. - bin/task.mjs: collapse start() + startSmart() + startInteractive() into a single start(); remove --smart/--chat/--interactive flags and all wizard-only helpers. - lib/wizard.mjs + wizard.test.mjs: removed (unused). - package.json: drop wizard.test.mjs from scope:test. - Rule files (.cursor/rules/agent-scope.mdc, CLAUDE.md, AGENTS.md, GEMINI.md, .cursorrules) and agent-scope/README.md: rewrite the onboarding section to describe one flow; rename "Smart onboarding protocol" to "Task onboarding protocol". All 194 unit tests pass. Made-with: Cursor
Strip the explanatory paragraphs, parentheticals, and marker-file plumbing from the user-facing output. Keep only what someone actually needs to read: what to type, and what to do next. Made-with: Cursor
- `pnpm task start` prompt drops the "one or two sentences" hint; just asks "What are you working on?". - Post-capture message: "Send any message in chat (e.g. `start working`)" and "ask you to accept a scope" (was "OK a scope"). - `custom_instruction` option label rewritten from "Something else — tell me what" / "Tell me what to change" to "Type what you want instead" across denial.mjs, onboarding trigger, and rule files (cursor, claude, agents, readme) — makes it unambiguous that option B is the free-text entry. - Onboarding scope proposal now uses a numbered list (1), 2), 3) …) instead of bullet points. Examples + rules updated in sync. Made-with: Cursor
…aper) Two bugs were causing correct agent-scope flows to break: 1. `2>&1` and similar fd-duplication tokens were being parsed as writes to a file literally named `&1`, so read-only commands like `pnpm task show 2>&1`, `tsc --noEmit 2>&1`, or `wc -l foo 2>&1` got denied by beforeShellExecution. Fixed in shell-parse.extractRedirections by skipping any redirect target that starts with `&` (fd reference, not a path). Added unit tests for `2>&1`, `1>&2`, `>&1`, `1>&-`, `&>&1`, and the `&>/dev/null` positive case. 2. The afterShellExecution reaper was deleting the currently active task's manifest (`agent-scope/tasks/<id>.json`) and the `agent-scope/active` pointer whenever any subsequent shell command ran, because those files are untracked and live in a protected path. The existing approved-write allowlist only covered the exact `pnpm task create` turn, not future turns. Added an active-task exemption in both the Cursor and Claude Code hooks so the active task's own state survives unrelated shell calls. Stale manifests for other ids are still reverted/deleted. Made-with: Cursor
| ], | ||
| "PreToolUse": [ | ||
| { | ||
| "matcher": "Write|Edit|MultiEdit|NotebookEdit", |
There was a problem hiding this comment.
🔴 Bug: This matcher omits StrReplace, Delete, and EditNotebook, but .claude/hooks/scope-guard.mjs explicitly handles them. In Claude Code those tool invocations will never hit the guard, so an agent can modify or delete protected/out-of-scope files through those write paths. Please align the matcher with the full guarded tool set (as you already do in .cursor/hooks.json).
| activeTaskExemptions.add('agent-scope/active'); | ||
| } | ||
|
|
||
| const entries = parsePorcelain(porcelain); |
There was a problem hiding this comment.
🔴 Bug: entries is the entire current worktree, not the delta from the Bash command that just ran. If the repo is already dirty, any unrelated user change in a denied/protected path will be reverted or deleted on the next shell command. Please diff against a pre-command snapshot (or use tool-reported touched paths) so this hook only cleans up changes introduced by the just-finished command.
| continue; | ||
| } | ||
| try { | ||
| execSync(`git checkout -- ${JSON.stringify(path)}`, { |
There was a problem hiding this comment.
🔴 Bug: execSync runs through a shell, and JSON.stringify(path) does not neutralize shell expansion inside double quotes. A filename containing $(...), backticks, or $VAR will execute or expand during cleanup. Please switch to execFileSync('git', ['checkout', '--', path], ...) here and in the Cursor mirror.
| } | ||
|
|
||
| async function main() { | ||
| if (process.env.AGENT_SCOPE_BOOTSTRAP === '1') return allow(); |
There was a problem hiding this comment.
🔴 Bug: Bootstrap mode is documented as disabling only the hardcoded protected-path check, but this early return disables all Bash scope enforcement when AGENT_SCOPE_BOOTSTRAP=1 is set. That lets destructive shell commands hit any out-of-scope path instead of just protected files. Please remove the short-circuit and let checkPath/coversProtected apply the narrower bootstrap semantics (same issue in the Cursor hook).
| let task = null; | ||
| if (taskId) { | ||
| try { task = loadTask(root, taskId); } | ||
| catch { return allow(); } |
There was a problem hiding this comment.
🔴 Bug: If the active manifest is malformed or missing, this falls open and lets Bash run without any scope check. The mirrored catch in shell-diff-check.mjs also returns early, so a broken agent-scope/tasks/<id>.json becomes a full shell escape hatch. Please surface the same manifest-load denial you already use in scope-guard.mjs, or otherwise fail closed.
| 'agent-scope/active', | ||
| 'agent-scope/.bootstrap-token', | ||
| 'AGENTS.md', | ||
| 'GEMINI.md', |
There was a problem hiding this comment.
🔴 Bug: Claude support in this PR relies on CLAUDE.md for onboarding and denial behavior, but CLAUDE.md is not in PROTECTED_PATTERNS. That means a Claude agent can still rewrite its own instructions without bootstrap if its task covers the repo root. Please add CLAUDE.md here (and to the documented hardcoded-path lists) alongside AGENTS.md/GEMINI.md.
| ], | ||
| "PreToolUse": [ | ||
| { | ||
| "matcher": "Write|Edit|MultiEdit|NotebookEdit", |
There was a problem hiding this comment.
🔴 Bug: This matcher only wires scope-guard.mjs into Write|Edit|MultiEdit|NotebookEdit, but the hook itself explicitly handles StrReplace, Delete, and EditNotebook too. Those write-capable tools would bypass the protected-path/task-scope guard entirely in Claude Code. Expand the matcher to cover the same tool names the hook enforces.
| // Extracted for unit-testability. No IO, no dependencies on scope.mjs. | ||
|
|
||
| // Split on &&, ||, ;, | — treat each sub-command independently. | ||
| export function splitCommands(cmd) { |
There was a problem hiding this comment.
🔴 Bug: Treating each shell segment independently loses cwd state. A command like cd agent-scope/tasks && rm sync.json gets checked as rm sync.json relative to the repo root, so the pre-shell guard can miss protected or out-of-scope targets after a cd. Track directory changes while scanning, or conservatively deny destructive segments once an earlier segment changes cwd.
| // task-create invocation. Task id validation matches the JSON schema | ||
| // (kebab-case, alphanumerics + hyphens/underscores, 1-64 chars). | ||
|
|
||
| const TASK_ID_RE = /^[a-zA-Z0-9][a-zA-Z0-9_-]{0,63}$/; |
There was a problem hiding this comment.
🔴 Bug: This regex does not match the task-id format accepted elsewhere (task.mjs/validateManifest allow dots), so a valid command like pnpm task create release.1 --activate will not be recognized as approved. The after-shell hook will then delete/revert agent-scope/tasks/release.1.json and agent-scope/active immediately after the user-approved onboarding flow. Reuse the manifest/task-create id regex here so the allowlist stays in sync.
| let task = null; | ||
| if (taskId) { | ||
| try { task = loadTask(root, taskId); } | ||
| catch { return allow(); } |
There was a problem hiding this comment.
🔴 Bug: Falling back to allow() when loadTask() fails makes the shell guard fail open for a broken active manifest. In that state Write/Edit are blocked correctly, but Bash can still mutate out-of-scope or protected files. This should deny with the same manifest-load error instead of disabling enforcement (and the Claude mirror should do the same).
| // unrelated shell command runs (because it shows up as untracked in a | ||
| // protected path). Only shield the active-task id — every other | ||
| // manifest (including stale ones) is still reverted/deleted. | ||
| const activeTaskExemptions = new Set(); |
There was a problem hiding this comment.
🔴 Bug: Exempting the active manifest and agent-scope/active from after-shell cleanup on every command creates a persistence hole for any write the pre-shell parser misses. For example, after cd agent-scope/tasks && rm sync.json, the actual protected file can survive because it is globally exempted here. Limit this exception to the explicit approved task create flow, or compare against a pre-command baseline instead of blanket-exempting these files.
| ], | ||
| "PreToolUse": [ | ||
| { | ||
| "matcher": "Write|Edit|MultiEdit|NotebookEdit", |
There was a problem hiding this comment.
🔴 Bug: This matcher omits StrReplace, Delete, and EditNotebook, so those Claude Code write tools never invoke scope-guard.mjs even though the hook itself knows how to police them. In an active task, an agent can still modify or delete out-of-scope files through those paths. Expand the matcher to the full write-class set used by the guard.
| if (!relPath || typeof relPath !== 'string') return 'deny'; | ||
| if (bootstrapActive(root)) return 'allow'; | ||
| for (const pattern of PROTECTED_PATTERNS) { | ||
| if (globToRegex(pattern).test(relPath)) return 'deny'; |
There was a problem hiding this comment.
🔴 Bug: checkProtected() only tests the .../** patterns against the exact path, so the directory roots themselves still return allow (.claude/hooks, .cursor/hooks, agent-scope/lib, etc.). That means a single Delete of the protected directory can bypass the hardcoded-path guard. Either add the bare directory paths too, or make /** patterns cover the directory root as well.
| let task = null; | ||
| if (taskId) { | ||
| try { task = loadTask(root, taskId); } | ||
| catch { return allow(); } |
There was a problem hiding this comment.
🔴 Bug: If the active task was inferred from the branch/git config and its manifest is missing or invalid, this catch disables Bash enforcement entirely. That turns a broken manifest into unrestricted shell writes, even though session-start.mjs tells the agent that all writes will be denied. Fail closed here (and in the mirrored post-shell hooks) so a bad manifest can't silently drop protection.
| continue; | ||
| } | ||
| try { | ||
| execSync(`git checkout -- ${JSON.stringify(path)}`, { |
There was a problem hiding this comment.
🔴 Bug: This interpolates the repo path into a shell command. A filename containing $(), backticks, or similar shell metacharacters will be evaluated when the hook tries to revert it, which turns an attacker-controlled path into command execution. Invoke Git without a shell, e.g. execFileSync('git', ['checkout', '--', path], ...).
| // task-create invocation. Task id validation matches the JSON schema | ||
| // (kebab-case, alphanumerics + hyphens/underscores, 1-64 chars). | ||
|
|
||
| const TASK_ID_RE = /^[a-zA-Z0-9][a-zA-Z0-9_-]{0,63}$/; |
There was a problem hiding this comment.
🔴 Bug: This task-id regex is stricter than the manifest/CLI validator, which allows dots (^[a-z0-9][a-z0-9-_.]{0,63}$). A valid onboarding command like pnpm task create foo.bar --activate won't be recognized as approved, so the after-shell hook will revert agent-scope/tasks/foo.bar.json and agent-scope/active. Keep this parser in sync with the canonical task-id rule.
| ], | ||
| "PreToolUse": [ | ||
| { | ||
| "matcher": "Write|Edit|MultiEdit|NotebookEdit", |
There was a problem hiding this comment.
🔴 Bug: The Claude pre-tool hook is only registered for Write|Edit|MultiEdit|NotebookEdit, but .claude/hooks/scope-guard.mjs also expects mutating tools like StrReplace, Delete, and EditNotebook. Those writes would bypass scope/protected-path enforcement entirely. Expand the matcher (or add separate entries) so every write-capable tool is guarded.
| activeTaskExemptions.add('agent-scope/active'); | ||
| } | ||
|
|
||
| const entries = parsePorcelain(porcelain); |
There was a problem hiding this comment.
🔴 Bug: This post-shell revert logic scans the entire current git status, not just files touched by the Bash command that just ran. In a dirty worktree, any pre-existing out-of-scope user edits or untracked files will get reverted/deleted on the next Bash invocation. Capture a before/after snapshot or otherwise limit the cleanup to paths introduced by this command. The mirrored Cursor hook has the same problem.
| continue; | ||
| } | ||
| try { | ||
| execSync(`git checkout -- ${JSON.stringify(path)}`, { |
There was a problem hiding this comment.
🔴 Bug: This shells out with a filename-derived string. JSON.stringify(path) still leaves command substitutions/backticks active inside double quotes, so a crafted repo path can execute arbitrary shell when the hook tries to revert it. Use execFileSync('git', ['checkout', '--', path]) (same fix needed in the mirrored Cursor hook).
| if (!inPlace) return { cmd: head, targets: [] }; | ||
| } | ||
|
|
||
| for (const t of rest) { |
There was a problem hiding this comment.
🔴 Bug: The generic operand scan treats every non-flag token as a write target. That misclassifies non-path/read-only operands like the sed -i script (s/x/y/) and cp sources, so valid in-scope commands get blocked under an active task. Parse these verbs separately and only return actual write destinations / mutated paths.
| // task-create invocation. Task id validation matches the JSON schema | ||
| // (kebab-case, alphanumerics + hyphens/underscores, 1-64 chars). | ||
|
|
||
| const TASK_ID_RE = /^[a-zA-Z0-9][a-zA-Z0-9_-]{0,63}$/; |
There was a problem hiding this comment.
🟡 Issue: This allowlist regex is narrower than the manifest schema and task create, which both accept dots in task ids (^[a-z0-9][a-z0-9-_.]{0,63}$). A valid pnpm task create my.id --activate will not be recognized here, so the after-shell hook will delete/revert the manifest it just created. Reuse the shared task-id validator/regex so approved-create detection stays in sync.
| ], | ||
| "PreToolUse": [ | ||
| { | ||
| "matcher": "Write|Edit|MultiEdit|NotebookEdit", |
There was a problem hiding this comment.
🔴 Bug: This matcher only attaches the guard to Write|Edit|MultiEdit|NotebookEdit, but .claude/hooks/scope-guard.mjs is explicitly written to protect StrReplace, Delete, and EditNotebook too. In Claude Code those write-capable tools would bypass agent-scope entirely. Expand the matcher to the full guarded set (or remove the matcher and let the hook filter internally).
| // task-create invocation. Task id validation matches the JSON schema | ||
| // (kebab-case, alphanumerics + hyphens/underscores, 1-64 chars). | ||
|
|
||
| const TASK_ID_RE = /^[a-zA-Z0-9][a-zA-Z0-9_-]{0,63}$/; |
There was a problem hiding this comment.
🔴 Bug: TASK_ID_RE no longer matches the task schema/CLI (^[a-z0-9][a-z0-9-_.]{0,63}$). Valid ids like foo.bar are rejected here, so pnpm task create foo.bar --activate will succeed in the CLI but the after-shell allowlist will miss it and immediately revert/delete the protected writes. Reuse the same id regex as the manifest validator/CLI.
| // Scan the full command (which may contain multiple sub-commands joined | ||
| // with `&&` / `||` / `;` / `|`) and return the FIRST approved task-create | ||
| // id we find, or null. | ||
| export function extractTaskCreateId(command) { |
There was a problem hiding this comment.
🔴 Bug: The comments say the post-shell allowlist should only apply to the canonical pnpm task create ... shape, but this helper approves any compound command that happens to contain one matching subcommand. That means node some-script.js && pnpm task create foo --activate can preserve protected writes to agent-scope/tasks/foo.json / agent-scope/active from the first subcommand. Only approve when the entire command is exactly a canonical task-create invocation, and gate agent-scope/active on --activate.
| continue; | ||
| } | ||
| try { | ||
| execSync(`git checkout -- ${JSON.stringify(path)}`, { |
There was a problem hiding this comment.
🔴 Bug: This feeds a repo-controlled path into a shell command string. A filename containing $() or backticks will be expanded when the hook runs git checkout, which turns the revert path into command execution. Use execFileSync('git', ['checkout', '--', path], ...) here instead of execSync(...) (the Claude mirror has the same issue).
| ], | ||
| "PreToolUse": [ | ||
| { | ||
| "matcher": "Write|Edit|MultiEdit|NotebookEdit", |
There was a problem hiding this comment.
🔴 Bug: scope-guard.mjs explicitly handles StrReplace, Delete, and EditNotebook, but this matcher never invokes the hook for those tool names. In Claude Code those mutations can bypass agent-scope entirely. Expand the matcher to the full guarded set the hook expects.
| // `checkPath(task, ...)` → unchanged (works on the synthetic task). | ||
|
|
||
| export function resolveActiveTaskId(root, _opts = {}) { | ||
| const scope = resolveActiveScopeSync({ root }); |
There was a problem hiding this comment.
🔴 Bug: every enforcement hook goes through resolveActiveTaskId(), and this shim only reads the cached scope. Once that cache expires it falls back to no-active-task instead of refreshing from the daemon, so tasks created/finished mid-session never change enforcement until the next SessionStart. The hot path needs an async refresh on cache miss/staleness instead of failing open.
| } | ||
|
|
||
| async function main() { | ||
| if (process.env.AGENT_SCOPE_BOOTSTRAP === '1') return allow(); |
There was a problem hiding this comment.
🔴 Bug: bootstrap mode is documented to activate via either AGENT_SCOPE_BOOTSTRAP=1 or agent-scope/.bootstrap-token, but this hook only checks the env var. If the user follows the documented token-file flow, protected shell edits are still blocked here (same issue in the Claude hook). Use isBootstrapActive(root) instead.
|
|
||
| SELECT ?task ?title ?modified ?scope WHERE { | ||
| ?task a tasks:Task ; | ||
| tasks:status "in_progress" ; |
There was a problem hiding this comment.
🔴 Bug: this query assumes each task has only one live tasks:status triple. Pre-existing tasks in the graph still store status on their main assertion, so after this migration an old in_progress triple can survive alongside the new task-status-* assertion and keep the guard treating the task as active. Either migrate/clean legacy status triples or query for the latest status across both layouts.
| const triples: Array<{ subject: string; predicate: string; object: string }> = []; | ||
| emit(triples, U(taskUri), U(NS.tasks + 'status'), L(status)); | ||
| emit(triples, U(taskUri), U(ModifiedP), L(nowIso, XSD_DATETIME)); | ||
| emit(triples, U(taskUri), U(AttrP), U(config.agentUri)); |
There was a problem hiding this comment.
🔴 Bug: writing prov:wasAttributedTo on every status update lets any updater 'adopt' the task. The scope query only checks whether the task subject has any matching attribution, so if agent B updates agent A's task, that task's scopedToPath globs start counting toward B's allow-list. Keep updater provenance on the StatusEvent (or enforce owner-only updates) instead of adding another attribution triple to the task subject.
Coworkers pulling this branch previously had to handcraft `.dkg/config.yaml`
themselves and could hit Cursor MCP wiring issues if pnpm/tsx weren't on
the spawn PATH. With this change, the onboarding flow becomes:
pnpm install # postinstall auto-runs scripts/scope-setup.mjs
pnpm build # builds packages/mcp-dkg/dist/index.js
dkg start # in another terminal
# open Cursor and chat normally
What's new:
- scripts/scope-setup.mjs auto-creates .dkg/config.yaml with team
defaults (api=localhost:9200, tokenFile=~/.dkg/auth.token,
contextGraph=dev-coordination) and a per-machine agent URI derived
as `urn:dkg:agent:cursor-${user}-${hostname}`. If the daemon's up
it also creates the dev-coordination paranet via the daemon API.
Idempotent, silent on re-run, never fails postinstall when the
daemon is down.
- package.json exposes `pnpm scope:setup` (manual) and `postinstall`
(auto with --auto flag) so the script runs on every fresh install
without anyone having to remember it.
- .cursor/mcp.json switched from `pnpm exec tsx <source>` to
`node packages/mcp-dkg/dist/index.js`. Cursor spawns MCP servers
in a non-interactive environment that often lacks NVM-managed
pnpm/tsx on PATH, but `node` almost always resolves; this also
cuts ~500ms off MCP startup.
- agent-scope/README.md gains an "Onboarding (new clone)" section
with the new flow plus a fallback recipe for the rare cases where
the project-level Cursor MCP doesn't load.
Made-with: Cursor
| ], | ||
| "PreToolUse": [ | ||
| { | ||
| "matcher": "Write|Edit|MultiEdit|NotebookEdit", |
There was a problem hiding this comment.
🔴 Bug: This matcher only invokes scope-guard.mjs for Write|Edit|MultiEdit|NotebookEdit, but the guard implementation also handles StrReplace, Delete, and EditNotebook. In Claude those tool paths would bypass the hard block entirely, so protected or out-of-scope files can still be deleted or string-replaced. Expand the matcher to the full write-tool set.
| .replace(/[^a-z0-9-]+/g, '-') | ||
| .replace(/^-+|-+$/g, '') | ||
| .slice(0, 40) || 'x'; | ||
| return `urn:dkg:agent:cursor-${slug(rawUser)}-${slug(rawHost)}`; |
There was a problem hiding this comment.
🔴 Bug: This hardcodes every auto-generated identity to urn:dkg:agent:cursor-*, and the same .dkg/config.yaml is then consumed by Claude/Codex/Gemini too. Different agents on the same machine will therefore share one prov:wasAttributedTo value, so the scope query unions their in-progress tasks and can grant writes across agents. The default URI needs an agent-specific segment or separate per-agent configs.
| 'Use also when the agent wants to file follow-up work detected during a chat (e.g. "revisit ' + | ||
| 'SHACL on promote path"). Attribution via prov:wasAttributedTo.', | ||
| inputSchema: { | ||
| title: z.string().describe('Imperative, e.g. "Add SHACL validation on /promote endpoint".'), |
There was a problem hiding this comment.
🔴 Bug: The docs/examples added in this PR tell agents to call dkg_add_task({ taskUri: ... }), but this schema doesn't accept taskUri, so that field will be ignored and the tool will silently mint a different URI. That breaks the documented "re-file the same task" flow and makes a later dkg_update_task_status({ taskUri }) target the wrong entity unless the caller parses the returned URI. Please accept an optional taskUri and prefer it when supplied.
| } | ||
| emit(triples, U(id), U(AttrP), U(config.agentUri)); | ||
|
|
||
| const assertion = `agent-task-${slug}-${rand(4)}`; |
There was a problem hiding this comment.
🔴 Bug: The main task assertion is still written under a random name and never discarded. Re-running dkg_add_task for the same task can only append extra scopedToPath/touches triples; it can never replace or remove old ones, so stale scope stays live indefinitely. Use a stable assertion key per task and discard/rewrite it before writing.
| } | ||
|
|
||
| async function main() { | ||
| if (process.env.AGENT_SCOPE_BOOTSTRAP === '1') return allow(); |
There was a problem hiding this comment.
🔴 Bug: Bootstrap via agent-scope/.bootstrap-token is not honored here. checkPath() respects the token file, but the opaque-body and xargs branches above still call bodyTouchesProtected(PROTECTED_PATTERNS) directly, so protected-path shell commands remain blocked unless the env var is set. Gate the precheck on isBootstrapActive(root) instead (same issue exists in the Cursor mirror).
|
|
||
| function gitPorcelain(root) { | ||
| try { | ||
| return execSync('git status --porcelain', { |
There was a problem hiding this comment.
🔴 Bug: git status --porcelain hides ignored untracked files, so out-of-scope generated artefacts (for example dist/ or coverage/) survive the post-shell cleanup even though the guard promises to delete untracked leakage. Use porcelain with ignored entries or another filesystem walk so ignored files in denied paths are cleaned too.
| } | ||
|
|
||
| const isMain = (() => { | ||
| try { return import.meta.url === `file://${process.argv[1]}` || import.meta.url.endsWith(process.argv[1] || ''); } |
There was a problem hiding this comment.
🟡 Issue: This main-module check fires on import under node -e/tests because endsWith(process.argv[1]) can match unrelated argv values. That makes runCli() print the full report as a side effect of importing detectAgents. Compare fileURLToPath(import.meta.url) against a resolved process.argv[1] exactly, or use import.meta.main when it's available.
The previous commit got coworkers most of the way to "pull, install, build, start node, chat" — but left one rough edge: the dev-coordination paranet was only created if the daemon happened to be up at install time, which it usually isn't. A coworker would chat normally, the agent would try `dkg_add_task`, the call would 404, and they'd have to remember to run `pnpm scope:setup` manually after `dkg start`. This commit closes that gap by having the MCP server itself ensure the paranet exists on startup, before serving any tools. Specifically: - `DkgClient.createContextGraph` is added, mirroring the v10 `/api/context-graph/create` endpoint with a legacy `/api/paranet/create` fallback so older daemons still work. - `ensureContextGraph()` in the MCP entrypoint runs once per session: list projects, check if the configured `contextGraph` is present, create it if not. Best-effort — daemon down or auth errors are logged and the server continues serving (read tools still work against existing graphs, and the next session will retry). So now every Cursor / Claude Code session naturally "heals" the paranet on first connect. The setup script's "daemon was down at install time" warning is softened accordingly: there's nothing to re-run, the MCP will handle it. README onboarding section trimmed to match. Made-with: Cursor
| const cacheFile = cachePathFor(cfg.projectId, cfg.agentUri); | ||
| const cached = readCache(cacheFile); | ||
| if (cached) return { ...cached, fromCache: true, stale: false }; | ||
| return { ...makeEmpty(cfg, 'no-active-task', 'cache miss / expired; resolve async first'), fromCache: false, stale: true }; |
There was a problem hiding this comment.
🔴 Bug: A cache miss/expiry is being downgraded to no-active-task, and every hook treats that as “scope enforcement off”. Since the cache TTL is only 5s and the async refresh only happens on session start, any session that stays open longer than that can write anywhere until something else repopulates the cache. Preserve the last known scope or force a refresh/deny when stale === true instead of failing open here.
| const triples: Array<{ subject: string; predicate: string; object: string }> = []; | ||
| emit(triples, U(taskUri), U(NS.tasks + 'status'), L(status)); | ||
| emit(triples, U(taskUri), U(ModifiedP), L(nowIso, XSD_DATETIME)); | ||
| emit(triples, U(taskUri), U(AttrP), U(config.agentUri)); |
There was a problem hiding this comment.
🔴 Bug: Writing prov:wasAttributedTo onto the task subject on every status flip changes who the guard thinks owns this task. resolveDkgScope() matches any tasks:Task with prov:wasAttributedTo <agent>, so if agent B updates agent A’s task to in_progress, both agents now satisfy the query and inherit the task’s scopedToPath. Keep updater provenance on the StatusEvent (or a separate field), but leave the task subject’s attribution stable.
| # Auto-derived from \`${process.env.USER || 'anon'}@${os.hostname()}\` so | ||
| # each coworker / machine has a distinct identity. Override this if you | ||
| # want a different URI (e.g. one per parallel Cursor chat). | ||
| uri: ${agentUri} |
There was a problem hiding this comment.
🔴 Bug: This emits one machine-level agent.uri into .dkg/config.yaml, and the config loader gives file values precedence over DKG_AGENT_URI. That means Cursor, Claude Code, Codex, etc. on the same workstation all share one identity and therefore one active scope, so a task opened in one client authorizes writes in the others. Either derive a client-specific URI here or leave agent.uri unset so each client can supply its own.
| ], | ||
| "PreToolUse": [ | ||
| { | ||
| "matcher": "Write|Edit|MultiEdit|NotebookEdit", |
There was a problem hiding this comment.
🔴 Bug: this matcher omits StrReplace, Delete, and EditNotebook, even though scope-guard.mjs is written to guard them and the Cursor wiring includes them. On Claude Code those operations will bypass agent-scope entirely, so an out-of-scope delete/replace can still land. Expand the matcher to the full guarded tool set.
| // `checkPath(task, ...)` → unchanged (works on the synthetic task). | ||
|
|
||
| export function resolveActiveTaskId(root, _opts = {}) { | ||
| const scope = resolveActiveScopeSync({ root }); |
There was a problem hiding this comment.
🔴 Bug: every write-time hook resolves scope through this cache-only sync path. Once the 5s cache expires, or after a mid-session dkg_add_task / dkg_update_task_status, resolveActiveTaskId() drops to null and checkPath() treats that as "no active scope" -> allow. In practice that disables task enforcement after the cache TTL. The pre-write hooks need an async refresh on stale/miss, or stale cache needs to fail closed instead of allowing writes.
| if (!inPlace) return { cmd: head, targets: [] }; | ||
| } | ||
|
|
||
| for (const t of rest) { |
There was a problem hiding this comment.
🔴 Bug: sed -i commands are parsed as though every non-flag token is a file target, so the replacement program (s/a/b/, -e ..., etc.) gets treated as a path. That makes otherwise in-scope sed -i edits fail the scope check because the guard tries to authorize the script text as a repo path. Special-case sed so only actual file operands are returned.
| const status = line.slice(0, 2); | ||
| const rest = line.slice(3); | ||
| const arrow = rest.indexOf(' -> '); | ||
| const path = arrow >= 0 ? rest.slice(arrow + 4) : rest; |
There was a problem hiding this comment.
🔴 Bug: porcelain rename entries are old -> new; keeping only the new path means the revert step loses the original tracked path. A bypass like git mv can therefore leave an out-of-scope/protected rename unreverted because git checkout -- <new> won't restore old. Preserve both sides of renames here (the Claude mirror has the same issue).
System that doesnt let agents collide with other code and work