fix(agent-harness): give raw security-policy blocks a workaround + relay directive (#4094) by senamakel · Pull Request #4478 · tinyhumansai/openhuman

senamakel · 2026-07-03T22:57:38Z

Summary

Give the ~20 raw [policy-blocked] denials emitted deep in SecurityPolicy / the tools a structured Blocked / Reason / Workaround / relay message, so the agent gets a concrete way forward instead of dead-ending.
Add PolicyDenial::SecurityPolicyBlocked { tool, raw_reason } + a maybe_enrich_policy_block helper, wired into ToolOutcomeCaptureMiddleware::after_tool (the last after_tool hook). The [policy-blocked] marker is preserved; already-structured ToolPolicyMiddleware denials are not double-wrapped.
(CI) One unrelated commit adds the two config_*_privacy_mode methods to a stale schema-catalog golden — see ## Related.

Problem

Issue #4094: when a tool policy / permission boundary blocks an action, the agent often dead-ends — no clear message, no workaround — leaving the user stuck.

PR #4443 fixed this for the pluggable-ToolPolicy and channel-permission denials (rendered via PolicyDenial in ToolPolicyMiddleware). But the security-policy / autonomy layer was untouched: every [policy-blocked] denial produced inside SecurityPolicy and the tools themselves — read-only autonomy (enforcement.rs), command classification (command_checks.rs), path checks (path_checks.rs), and ~14 per-tool guards (filesystem write/edit/apply_patch, git, schedule, install_tool, browser, screenshot, mouse, …) — still returns a bare marker line like "[policy-blocked] Security policy: read-only mode — only read commands are allowed". The model gets no reason expansion, no workaround, and no instruction to relay it to the user, so it halts.

Solution

Enrich those raw denials at a single seam rather than the ~20 emission sites:

PolicyDenial::SecurityPolicyBlocked { tool, raw_reason } renders the standard Blocked: … Reason: … Workaround: … <relay> shape. It strips the marker from the reason and re-adds it on the Blocked: line, so the [policy-blocked] marker survives for downstream contains(POLICY_BLOCKED_MARKER) checks (failure classification, the repeated-failure loop-breaker). The workaround points at the levers that actually unblock the family — raise the agent-access tier / autonomy (Settings → Agent access / config.update_autonomy_settings / [autonomy]), or use a permitted alternative.
maybe_enrich_policy_block(tool, content) returns the enriched string only when the content carries the marker and lacks a Workaround: suffix — so already-structured ToolPolicyMiddleware denials (from fix(agent-harness): structured policy-block messages — rework of #4390 onto post-#4399 #4443) and plain errors are left untouched (no double-wrapping).
Call it at the top of ToolOutcomeCaptureMiddleware::after_tool, which already inspects the marker for classification and runs last in the after_tool chain, so the enriched text is what the transcript keeps.

Submission Checklist

Tests added or updated (happy path + failure/edge) — security_policy_block_keeps_marker_and_adds_workaround_and_relay, maybe_enrich_only_touches_raw_marker_results (raw → enriched, structured → skipped, plain error → skipped), and a middleware test raw_security_policy_block_is_enriched_with_workaround_and_relay + already_structured_denial_is_not_double_wrapped.
Diff coverage ≥ 80% — the new render arm, helper, and middleware branch are all exercised by the added tests.
Coverage matrix updated — N/A: behaviour-only change (extends existing policy-denial messaging).
All affected feature IDs listed under ## Related — N/A (no new feature row).
No new external network dependencies introduced.
Manual smoke checklist updated if this touches release-cut surfaces — N/A: internal harness messaging.
Linked issue closed via Closes #NNN.

Impact

Desktop/CLI agent harness. No schema/migration/network change. Purely enriches the text of already-failing policy-blocked tool results; the marker and failure classification are preserved, so the loop-breaker and timeline behave identically.

Closes: [Harness] Policy-blocked actions should return a clear message + workaround #4094
Follow-up PR(s)/TODOs: the test(ci): commit here adds config_get_privacy_mode / config_set_privacy_mode to the worker_a_controller_schemas_are_fully_exposed golden — an unrelated pre-existing break (feat(privacy): Privacy Mode + local-only inference enforcement (#4435) #4446 registered those RPC methods without updating the golden; also fixed in feat(orchestration): nested contacts → sessions with create + session send #4475). Drop that commit once main carries the fix.

AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

Key: N/A
URL: N/A

Commit & Branch

Branch: fix/GH-4094-security-policy-block-messages
Commit SHA: see head

Validation Run

pnpm --filter openhuman-app format:check — N/A (no frontend changes)
pnpm typecheck — N/A (no TS changes)
Focused tests: openhuman::tinyagents::policy_denial::tests + the two ToolOutcomeCaptureMiddleware enrichment tests — execution deferred to CI (shared local build host saturated)
Rust fmt/check (if changed): rustfmt --edition 2021 --check clean; cargo check --bin openhuman-core passed (Finished dev)
Tauri fmt/check (if changed): N/A

Validation Blocked

command: cargo test --lib -- openhuman::tinyagents::policy_denial openhuman::tinyagents::middleware::tests
error: none — shared build host saturated by a concurrent worktree's build; test-binary compile did not finish in-window
impact: cargo check compiled clean; the tests are deterministic assertions over the added code and are validated by CI

Behavior Changes

Intended behavior change: raw security-policy/autonomy [policy-blocked] tool results now carry a workaround + a relay-to-the-user directive.
User-visible effect: on a policy block, the agent explains what was blocked and why and offers a concrete next step, instead of stopping silently.

Parity Contract

Legacy behavior preserved: the [policy-blocked] marker is retained, so ToolFailureClass::BlockedByPolicy classification and the RepeatedToolFailureMiddleware hard-reject path match exactly as before. ToolPolicyMiddleware (fix(agent-harness): structured policy-block messages — rework of #4390 onto post-#4399 #4443) output is detected via its Workaround: suffix and left untouched.
Guard/fallback/dispatch parity checks: enrichment only fires when the marker is present and no Workaround: suffix exists; success results and plain errors are never rewritten.

Duplicate / Superseded PR Handling

Duplicate PR(s): none
Canonical PR: this one

…lay directive (tinyhumansai#4094) The pluggable-ToolPolicy and channel-permission denials were given a structured "Blocked / Reason / Workaround / relay" message by tinyhumansai#4443, but the ~20 `[policy-blocked]` denials emitted deep in `SecurityPolicy` and the tools themselves (read-only autonomy, command classification, path checks, per-tool guards) still returned a bare marker line with no workaround and no relay instruction — so the agent dead-ends on them with a diagnosis and no path forward. Enrich those raw denials at a single seam: - Add `PolicyDenial::SecurityPolicyBlocked { tool, raw_reason }` and a `maybe_enrich_policy_block` helper that wraps a marker-bearing result into the same structured shape (keeps the reason, appends a concrete workaround — raise the agent-access tier / use a permitted alternative — and the relay-to-the-user directive). The `[policy-blocked]` marker is preserved so failure classification and the repeated-failure breaker still match. - Call it at the top of `ToolOutcomeCaptureMiddleware::after_tool` (the last after_tool hook, so the enriched text is what the transcript keeps). Already-structured ToolPolicyMiddleware denials (which carry a `Workaround:` suffix) are left untouched — no double-wrapping. Tests: the new variant keeps the marker + adds workaround/relay; the helper enriches raw blocks, skips already-structured denials and plain errors; a middleware test drives a raw read-only-mode block through `after_tool` and asserts the enrichment. Claude-Session: https://claude.ai/code/session_01KcmdqJVpjmnH31HqTHRLwG

Unrelated CI unblock: tinyhumansai#4446 (Privacy Mode) registered `config_get_privacy_mode` / `config_set_privacy_mode` unconditionally but did not update the `worker_a_controller_schemas_are_fully_exposed` golden, so that test fails on main for every PR without the fix. Add the two methods at their sorted positions (same fix as tinyhumansai#4475) so this PR's Rust E2E / coverage jobs go green. Claude-Session: https://claude.ai/code/session_01KcmdqJVpjmnH31HqTHRLwG

coderabbitai · 2026-07-03T22:57:47Z

Warning

Review limit reached

You’ve reached a temporary PR review limit under our Fair Usage Limits Policy.

Your recent review volume is higher than typical usage, so adaptive limits are currently applied.

Next review available in: 40 minutes

Enable usage-based reviews in Billing to review now. Otherwise, wait until the next included review is available.
You're only billed for reviews past your plan's rate limits ($0.25/file).

How can I continue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based reviews.

How do review limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window.

Please refer docs for additional details.

Review details

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: cdc7b49c-593a-4fad-9e39-64d7f5c47c9a

📥 Commits

Reviewing files that changed from the base of the PR and between 4d71702 and 6574b03.

📒 Files selected for processing (3)

src/openhuman/tinyagents/middleware.rs
src/openhuman/tinyagents/policy_denial.rs
tests/config_auth_app_state_connectivity_e2e.rs

_{Comment @coderabbitai help to get the list of available commands.}

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6574b034be

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-07-03T23:00:07Z

+        if let Some(enriched) =
+            super::policy_denial::maybe_enrich_policy_block(&result.name, &result.content)
+        {


Gate policy-block enrichment to failed tool results

When a successful tool returns content containing the literal [policy-blocked] marker, this unconditional enrichment rewrites the success payload into a fake security-policy denial before success is computed. This is reachable with file_read, which returns file contents verbatim on success, so reading source/docs that mention the marker corrupts the transcript while the call is still recorded as successful; only actual failed, marker-prefixed policy errors should be enriched.

Useful? React with 👍 / 👎.

…rge) Two tests came in red from upstream/main via the merge (upstream main is itself red on these — its own PR CI fails on Rust Core Coverage); our changes don't touch the code they exercise: - tinyagents adapter_inventory_registers_model_tools_and_middleware: recent upstream middleware-stack churn produces 13 lifecycle middleware, but the literal still said 14 (its own enumeration lists 13). Match the code. - monitor_agent_e2e orchestrator_gets_denial_when_monitor_command_violates_policy: tinyhumansai#4478 reworded the security-policy denial to "[policy-blocked] Tool '<tool>' was blocked by the security policy"; the test still asserted the old contiguous "[policy-blocked] Security policy". Update the golden substring to the new wording. (ops_install::non_2xx_install_fetch_returns_err_for_4xx_and_5xx also flaked here — a pre-existing env-var-race under llvm-cov, untouched by this branch.) Claude-Session: https://claude.ai/code/session_014RLnG2QbdL3n9TLtfomdhB

senamakel added 2 commits July 3, 2026 22:55

senamakel requested a review from a team July 3, 2026 22:57

chatgpt-codex-connector Bot reviewed Jul 3, 2026

View reviewed changes

senamakel merged commit 83c6e01 into tinyhumansai:main Jul 4, 2026
13 of 15 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(agent-harness): give raw security-policy blocks a workaround + relay directive (#4094)#4478

fix(agent-harness): give raw security-policy blocks a workaround + relay directive (#4094)#4478
senamakel merged 2 commits into
tinyhumansai:mainfrom
senamakel:fix/GH-4094-security-policy-block-messages

senamakel commented Jul 3, 2026

Uh oh!

coderabbitai Bot commented Jul 3, 2026

Review limit reached

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Jul 3, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

senamakel commented Jul 3, 2026

Summary

Problem

Solution

Submission Checklist

Impact

Related

AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

Commit & Branch

Validation Run

Validation Blocked

Behavior Changes

Parity Contract

Duplicate / Superseded PR Handling

Uh oh!

coderabbitai Bot commented Jul 3, 2026

Review limit reached

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jul 3, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant