fix(auto): clear stale stop reason codes by Q00 · Pull Request #1194 · Q00/ouroboros

Q00 · 2026-05-23T05:48:44Z

Summary

Clear AutoPipelineState.last_error_code whenever the session transitions out of BLOCKED.
Add regression coverage for recovery and later hard-failure transitions so AutoPipelineResult.stop_reason_code cannot report a stale interview blocker.

Review context

This follows up the post-v0.39.1 stop_reason_code surface from PR #1151. The public result envelope copies state.last_error_code; without clearing it on non-blocked transitions, a resumed session could recover or fail for a different reason while still exposing an old canonical interview blocker code.

Validation

SETUPTOOLS_SCM_PRETEND_VERSION=0.39.2 uv run pytest tests/unit/auto/test_stop_reason_code.py tests/unit/auto/test_pipeline_watchdog_integration.py tests/unit/auto/test_surface.py -q -> 96 passed
SETUPTOOLS_SCM_PRETEND_VERSION=0.39.2 uv run pytest tests/unit/auto tests/unit/mcp/tools/test_generate_seed_force_flag.py tests/unit/orchestrator/test_parallel_executor.py tests/unit/orchestrator/test_runtime_evidence.py tests/unit/plugin tests/unit/runtime tests/canonical -q -> 1677 passed, 2 skipped
SETUPTOOLS_SCM_PRETEND_VERSION=0.39.2 uv run ruff check src/ouroboros/auto/state.py tests/unit/auto/test_stop_reason_code.py -> passed
SETUPTOOLS_SCM_PRETEND_VERSION=0.39.2 uv run ruff format --check src/ouroboros/auto/state.py tests/unit/auto/test_stop_reason_code.py -> passed
git diff --check -> passed

ouroboros-agent

Review — ouroboros-agent[bot]

Verdict: APPROVE

Reviewing commit 6692560 for PR #1194

Review record: 98227064-ba83-4b50-b381-ad5e22234d2e

Blocking Findings

No in-scope blocking findings remained after policy filtering.

Non-blocking Suggestions

None.

Design Notes

Unable to complete the code review: every attempt to read /tmp/pr_diff_1194.patch, changed-file lists, or comments failed before execution because the shell sandbox cannot create its namespace in this environment.

Recovery Notes

First recoverable review artifact generated from codex analysis log.

Reviewed by ouroboros-agent[bot] via Codex deep analysis

ouroboros-agent

Review — ouroboros-agent[bot]

Verdict: REQUEST_CHANGES

PR #1194
Branch: fix/auto-clear-stale-stop-reason-code | 2 files, +32/-0 | CI: Bridge TypeScript pass 15s https://github.com/Q00/ouroboros/actions/runs/26325011929/job/77501002797
Scope: architecture-level
HEAD checked: 6692560484d70139fb7ab877ad29214088aa500a

What Improved

AutoPipelineState.transition() now clears last_error_code whenever the session leaves BLOCKED, preventing stale canonical stop reasons from leaking onto recovered, failed, or completed results.
Added direct regression coverage for recovery and hard-failure transitions in tests/unit/auto/test_stop_reason_code.py.
Added broader runtime/CLI/plugin documentation and test surface around new backend and monitoring paths.

Issue #N/A Requirements

Requirement	Status
Clear `AutoPipelineState.last_error_code` when the session transitions out of `BLOCKED`	Satisfied
Add regression coverage for stale `stop_reason_code` after recovery/failure	Satisfied
Preserve affected consumer contract surfaces while broad runtime/status changes are present	Not satisfied: `ouroboros status health` diagnostics are truncated and current unit tests fail
Full current-HEAD unit suite passes	Not satisfied

Prior Findings Status

Prior Finding	Status
Prior review context	MODIFIED — No prior review concerns were provided in this review context; none maintained, modified, or withdrawn.

Blockers

#	File:Line	Severity	Confidence	Finding
1	src/ouroboros/cli/commands/status.py:303	High	95%	`health` now embeds all diagnostic detail into the `name` field, but that value is rendered through `create_status_table()`, whose `Name` column is `no_wrap=True` and gets ellipsized by Rich. As a result, the command hides the exact missing CLI path, `credentials.yaml` path, warning text, and effective runtime path; the current HEAD unit suite fails five health-contract tests (`tests/unit/cli/test_status.py::{test_health_exits_nonzero_when_runtime_cli_missing,test_health_exits_nonzero_when_credentials_file_missing,test_health_warns_but_exits_zero_for_missing_database_and_empty_key,test_health_honors_agent_runtime_and_cli_path_environment_overrides,test_health_honors_effective_backend_configured_cli_when_runtime_env_overrides}`). This is a user-facing diagnostics regression and a merge-blocking test failure; keep detail in a non-truncating column or render health rows as plain/key-value output.

Follow-ups

#	File:Line	Priority	Confidence	Suggestion
—	—	—	—	None.

Test Coverage

Passed: SETUPTOOLS_SCM_PRETEND_VERSION=0.39.2 uv run pytest tests/unit/auto/test_stop_reason_code.py tests/unit/mcp/tools/test_ac_tree_hud_handler.py tests/unit/providers/test_copilot_cli_adapter.py -q -> 102 passed.
Passed: targeted ruff check over the reviewed auto/Copilot/AC HUD files.
Failed: SETUPTOOLS_SCM_PRETEND_VERSION=0.39.2 uv run pytest tests/unit -q -> 5 failed, 9519 passed, 7 skipped. All failures are in tests/unit/cli/test_status.py and match the health output truncation regression above.

Design / Roadmap Gate

Affected-boundary review covered the advertised auto state-machine/persistence/result boundary and the wider current-HEAD diff surfaces: CLI health diagnostics, runtime backend selection, Copilot adapter/runtime, OpenCode bridge, AC HUD MCP rendering, packaging, and CI. The advertised stop-reason state transition is sound, but the PR cannot pass the design gate because the CLI health boundary is a consumer contract: users rely on exact paths and warning details to repair installation/auth/runtime problems, and current HEAD renders those details through a truncating table column.

Merge Recommendation

Do not merge until the health output contract is fixed and tests/unit/cli/test_status.py plus the full unit suite pass on current HEAD.

Review-Metadata:
verdict: REQUEST_CHANGES
github_event: COMMENT
review_kind: full
merge_eligible: false
head_sha: 6692560
source_read_ok: true
diff_read_ok: true
blocking_count: 0

shaun0927

Cross-checked against #1151 (the merged stop_reason_code envelope) and #1157 (the ooo auto SSOT, which names this PR as the remaining L2 follow-up for stop-reason-code clearing). The two-line guard in state.transition() is the right scope: it clears last_error_code exactly when the session leaves BLOCKED, leaves the BLOCKED→BLOCKED self-edge untouched, and the two added tests pin both the recover() and mark_failed() paths. No regression risk vs. the documented last_error_code semantics.

FYI #1205 introduces a competing hunk at the same location with condition if error is None, which diverges on transition(BLOCKED, msg, error=None) and on non-BLOCKED→non-BLOCKED with an explicit error string — I left a change-request on #1205 to drop its hunk so this PR can own the contract cleanly. Land this before #1205 rebases.

Drops the five hunks called out in PR review against #1157 SSOT / merged design decisions, while keeping every legitimate fix in this PR: - Restore DeferredProbe.passed=True (#1181 contract: probe-PASS placeholder so the L3 verifier flags the gap without failing the grade). - Restore `_inference.single` consumer call so unmatched ledgers still apply the safe LIBRARY fallback (#1177 + #1188 decision). - Drop the `if error is None` last_error_code clearing — sibling PR #1194 owns this hunk with the stricter `next_phase is not BLOCKED` condition. - Restore `env.setdefault(...)` for the three plugin runtime env vars introduced by #1193 so downstream entrypoints can still pre-seed. - Drop the conditional `active_task_class` MCP meta block — sibling PR #1196 owns it with unconditional null surfacing (clients need to tell ambiguous-inference apart from a missing protocol field). Kept fixes: `_matches_web_service` independence, `defaulted_sections` CLI rendering, `status --limit > 0` validation, mechanical-eval evidence linkage, TimeoutExpired stdout/stderr bytes-safe decode, workflow lifecycle same-timestamp restart ordering, MCP meta surfacing for `interview_closure_mode` + `runtime_probe_evidence`, plugin-mode Ralph `product_status=not_verified_complete` downgrade. Verification: - uv run pytest tests/unit/auto/test_pipeline_task_class_envelope.py tests/unit/auto/test_domain_inference.py tests/unit/auto/test_surface.py tests/unit/orchestrator/test_runtime_evidence.py tests/unit/cli/test_status_run.py tests/unit/cli/test_auto_command.py tests/unit/plugin/test_firewall.py tests/unit/orchestrator/test_workflow_lifecycle_events.py tests/integration/test_mechanical_eval_projection.py tests/unit/auto/test_stop_reason_code.py -q → 243 passed - uv run ruff check src tests → passed - uv run ruff format --check src tests → passed

fix(auto): clear stale stop reason codes

6692560

ouroboros-agent Bot approved these changes May 23, 2026

View reviewed changes

This was referenced May 23, 2026

Meta SSOT: AgentOS roadmap sequencing (#920–#960) #961

Open

Meta SSOT: ooo auto Vision — Autonomous Completion Engine #1157

Open

Meta SSOT slice: L2 — Runtime watchdog v1 (minimal, lifts #578) #1172

Closed

ouroboros-agent Bot reviewed May 24, 2026

View reviewed changes

shaun0927 mentioned this pull request May 25, 2026

fix: resolve post-merge review blocker foundations #1205

Merged

shaun0927 approved these changes May 25, 2026

View reviewed changes

This was referenced May 25, 2026

fix: keep plugin artifact hooks self-contained #1206

Merged

fix: persist runtime probe evidence #1213

Merged

shaun0927 merged commit f467311 into main May 25, 2026
8 checks passed

shaun0927 deleted the fix/auto-clear-stale-stop-reason-code branch May 25, 2026 07:53

shaun0927 mentioned this pull request May 25, 2026

fix(orchestrator): correlate verifier diagnostics #1198

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(auto): clear stale stop reason codes#1194

fix(auto): clear stale stop reason codes#1194
shaun0927 merged 1 commit into
mainfrom
fix/auto-clear-stale-stop-reason-code

Q00 commented May 23, 2026

Uh oh!

ouroboros-agent Bot left a comment

Uh oh!

ouroboros-agent Bot left a comment

Uh oh!

shaun0927 left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Q00 commented May 23, 2026

Summary

Review context

Validation

Uh oh!

ouroboros-agent Bot left a comment

Choose a reason for hiding this comment

Review — ouroboros-agent[bot]

Blocking Findings

Non-blocking Suggestions

Design Notes

Recovery Notes

Uh oh!

ouroboros-agent Bot left a comment

Choose a reason for hiding this comment

Review — ouroboros-agent[bot]

What Improved

Issue #N/A Requirements

Prior Findings Status

Blockers

Follow-ups

Test Coverage

Design / Roadmap Gate

Merge Recommendation

Uh oh!

shaun0927 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants