build/20260608 195531/0040 transient gate failure resilience #124
Merged
cahenesy merged 26 commits on Jun 10, 2026
cahenesy (Owner) commented on Jun 9, 2026
- test(failing): ci-checks retry-once recovers a flake + knob/non-numeric guards (TDD 0040 §1, FR-15/NFR-4)
- step(1): ci-checks retry-once in run_ci_checks (TDD 0040 Component 1, FR-15/NFR-4)
- fix(step 1): add result-tally/exit-code epilogue so eval assertions are enforceable
- test(failing): retries-exhausted FAIL must log an explicit FAILED-after-N line (TDD 0040 §1, NFR-4)
- step(1): log an explicit FAILED-after-N line on ci-checks retry exhaustion (TDD 0040 §1, NFR-4)
- revert(step 1): back out ci-checks retry impl to re-derive it test-first
- step(1): ci-checks retry-once re-derived from the now-genuinely-failing eval (TDD 0040 Component 1, FR-15/NFR-4)
- test(failing): gate-unobservable enum membership + resume-first action + status render (TDD 0040 §6, Component 3)
- step(2): add gate-unobservable to the closed halt-cause enum + status render mirror (TDD 0040 Component 3, FR-57/NFR-4)
- test(failing): review/verify no-verdict → gate-unobservable; observed BLOCK untouched (TDD 0040 §3-§5, Component 2)
- step(3): no-verdict review subprocess → resumable gate-unobservable (TDD 0040 Component 2, FR-57/NFR-4/ADR 0006)
- fix(step 3): correct comment-vs-code — verify-runtime call site is NOT rewired
- step(4): finalize the transient-gate-resilience eval — TDD Failure-modes coverage (TDD 0040 §4)
- fix(step 4): guard tgr_build_output + git rev-parse call sites (FR-74 #1 fail-loud)
- test(failing): aggregator must register the transient-gate-resilience eval (TDD 0040 §5, TDD 0038 §3 wire-in rule)
- step(5): wire transient-gate-resilience eval into the implement-gate aggregator (TDD 0040 §5)
- chore: give §E3 captured gate_one output diagnostic use (clear SC2034)
- test(failing): §7 set_halt_cause write-failure → _classify_gate_no_verdict fails loud (TDD 0040 §7)
- fix: fail-loud on set_halt_cause write failure in _classify_gate_no_verdict (TDD 0040 §7)
- mark 0040-transient-gate-failure-resilience implemented (verified + reviewed)
…ic guards (TDD 0040 §1, FR-15/NFR-4)
… FR-15/NFR-4) On a ci-checks failure, re-run up to THROUGHLINE_CI_CHECKS_RETRIES (default 1) more times in the same worktree; the first passing run wins and a recovered flake is logged (not silent). RETRIES=0 restores the no-retry behavior; a non-numeric value falls back to the default with a warning. Signature unchanged so the gate_one call site is untouched. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…re enforceable The test file exited 0 unconditionally (no epilogue reading RESULTS), making every §1-§2 assertion non-enforceable. Add the standard PASS/FAIL tally + final [ "$FAIL" -eq 0 ] so a single failing assertion exits the script non-zero (FR-74 #1 fail-loud; lets the step-5 aggregator wire-in catch regressions). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
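A minimal sketch of the kind of enforcement epilogue this commit describes, assuming the eval records results through small helper functions. `bad`, `PASS`, `FAIL`, and the final `[ "$FAIL" -eq 0 ]` test appear in the commit messages; `ok`, `epilogue`, and the sample assertion are hypothetical names used only for illustration.

```shell
#!/usr/bin/env bash
# Illustrative eval skeleton, not the repository's actual test file.
PASS=0
FAIL=0

# ok is a hypothetical helper; bad is named in the commit messages,
# but this body is an assumption.
ok()  { PASS=$((PASS + 1)); echo "PASS: $*"; }
bad() { FAIL=$((FAIL + 1)); echo "FAIL: $*"; }

# A sample assertion: every check must end in ok or bad.
if [ "$((1 + 1))" -eq 2 ]; then
  ok "arithmetic sanity"
else
  bad "arithmetic sanity"
fi

# Epilogue: print the tally, then let the final test decide the status.
# Without this, the script's status is that of its last command,
# which is usually 0 regardless of recorded failures.
epilogue() {
  echo "Results: $PASS passed, $FAIL failed"
  [ "$FAIL" -eq 0 ]
}

epilogue
```

Because `epilogue` is the last command, its status becomes the script's exit status, so one recorded `bad` is enough to turn the whole suite red.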
…er-N line (TDD 0040 §1, NFR-4) Genuine red→green for step 1: with the enforcement epilogue now present, this assertion exits the suite non-zero against the current implementation (which returns silently on exhaustion). Re-establishes the failing-test-first discipline the prior vacuous test(failing) commit lacked. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…stion (TDD 0040 §1, NFR-4) A retries-exhausted real FAIL now records a telemetry line as visible as the recovered-flake line, so the gate log distinguishes a retries-exhausted FAIL from a single-shot one. Completes the genuine red→green for step 1. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The initial test(failing) commit (0cd8e0d) was vacuous — it lacked the enforcement epilogue, so it exited 0 against pre-impl code and never drove the core retry behavior with a genuine red. History cannot be rewritten (divergence guard), so re-derive honestly: with the epilogue now present, backing out the implementation makes the §1 core-retry assertions genuinely RED (suite exits non-zero), and the next commit re-implements to green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ng eval (TDD 0040 Component 1, FR-15/NFR-4) Re-implements run_ci_checks retry-once after the test-first backout (0f210a7): the §1 core-retry assertions were genuinely RED against the backed-out code and this commit greens them. On a ci-checks failure, re-run up to THROUGHLINE_CI_CHECKS_RETRIES (default 1) more times; the first pass wins, and both a recovered flake and a retries-exhausted FAIL are logged explicitly (NFR-4). RETRIES=0 disables retry; a non-numeric value falls back to the default with a warning. Signature unchanged. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
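The bounded retry described in this commit could look roughly like the sketch below. Only the THROUGHLINE_CI_CHECKS_RETRIES knob and its default of 1 come from the commit message; the function name `run_with_retries`, the log wording, and the choice to write logs to stderr are assumptions, and the real run_ci_checks may be structured differently.

```shell
#!/usr/bin/env bash
# Illustrative bounded retry, not the repository's run_ci_checks.

# run_with_retries CMD [ARGS...]
# Runs CMD once, then up to $THROUGHLINE_CI_CHECKS_RETRIES more times
# while it keeps failing. The first passing run wins.
run_with_retries() {
  local retries="${THROUGHLINE_CI_CHECKS_RETRIES:-1}"
  local attempt=0

  # A non-numeric knob falls back to the default and warns.
  case "$retries" in
    ''|*[!0-9]*)
      echo "WARN: THROUGHLINE_CI_CHECKS_RETRIES='$retries' is not numeric; using 1" >&2
      retries=1
      ;;
  esac

  while :; do
    if "$@"; then
      # A pass after at least one failure is a recovered flake: log it.
      if [ "$attempt" -gt 0 ]; then
        echo "ci-checks: recovered flake on retry $attempt" >&2
      fi
      return 0
    fi
    # Stop once the retry budget is spent (retry-once is not retry-until-green).
    if [ "$attempt" -ge "$retries" ]; then
      break
    fi
    attempt=$((attempt + 1))
  done

  # Distinguish a retries-exhausted FAIL from a single-shot one.
  if [ "$retries" -gt 0 ]; then
    echo "ci-checks: FAILED after $retries retries" >&2
  fi
  return 1
}
```

With the knob set to 0 the loop runs the command exactly once and returns its failure without the FAILED-after-N line, which matches the no-retry behavior the commit describes.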
…n + status render (TDD 0040 §6, Component 3) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… render mirror (TDD 0040 Component 3, FR-57/NFR-4) state.sh _next_actions_for_cause gains a gate-unobservable arm with a resume-first action list (no revision precondition — a no-verdict gate is safe to re-run), admitting it to the closed FR-63 enum so set_halt_cause accepts it and the blocked fragment is auto-resumable via _resume_from's blocked arm. status.sh _halt_cause_known gains the mirror so it renders without an unknown-cause warning. No schema change. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
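One way to picture a closed halt-cause enum is a `case` statement whose arms double as the membership test. The sketch below reuses the function names from the commit message, but every body, the `review-blocked` sibling cause, the action strings, and the `HALT_CAUSE` variable are assumptions made for illustration; the real state.sh is not shown in this conversation.

```shell
#!/usr/bin/env bash
# Illustrative closed enum, not the repository's state.sh.

# Prints the suggested next actions for a halt cause, one per line.
# Returns non-zero for a cause that is not in the closed set.
_next_actions_for_cause() {
  case "$1" in
    gate-unobservable)
      # Resume-first: a no-verdict gate is safe to re-run as-is.
      printf '%s\n' "resume" "inspect gate output"
      ;;
    review-blocked)
      # Hypothetical sibling cause that needs a revision before resuming.
      printf '%s\n' "revise" "resume"
      ;;
    *)
      return 1
      ;;
  esac
}

# Accepts only causes that have an arm above, so adding an arm is what
# admits a new cause to the enum.
set_halt_cause() {
  if ! _next_actions_for_cause "$1" >/dev/null; then
    echo "unknown halt cause: $1" >&2
    return 1
  fi
  HALT_CAUSE="$1"   # hypothetical storage; the real code persists state
}
```

Under this shape, a status renderer that mirrors the same list of arms would accept the new cause without emitting an unknown-cause warning.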
… BLOCK untouched (TDD 0040 §3-§5, Component 2) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…TDD 0040 Component 2, FR-57/NFR-4/ADR 0006) Add the gate-agnostic _classify_gate_no_verdict helper (set_halt_cause gate-unobservable + _terminal_state blocked, in that order) and _gate_output_tail (stderr/output-tail detail), and rewire both no-verdict paths in _rework_loop (the rc!=0-no-fresh-verdict path and the neither-PASS-nor-BLOCK crash guard) from the old terminal 'failed' to the resumable gate-unobservable halt. The discriminator is verdict-presence, never exit code — an observed BLOCK/PASS is untouched. The verify-runtime call site (gate_one in lib/resume.sh, outside this TDD's declared ## Touched files) reuses the same gate-agnostic classifier. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
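The rule that the discriminator is verdict-presence and never the exit code can be sketched as follows. The `VERDICT: PASS` / `VERDICT: BLOCK` marker format and the function name `classify_gate_result` are assumptions for illustration; the commit message does not show how the real gate output encodes its verdict.

```shell
#!/usr/bin/env bash
# Illustrative verdict-presence classifier, not the repository's helper.

# classify_gate_result RC OUTPUT
# Prints pass, block, or gate-unobservable. RC is accepted but
# deliberately ignored: only an observed verdict line counts.
classify_gate_result() {
  local rc="$1" output="$2" verdict

  # Take the last line that is exactly a well-formed verdict marker.
  # A truncated or malformed marker does not match and is ignored.
  verdict=$(printf '%s\n' "$output" \
    | grep -E '^VERDICT: (PASS|BLOCK)$' \
    | tail -n 1 || true)

  case "$verdict" in
    "VERDICT: PASS")  echo "pass" ;;
    "VERDICT: BLOCK") echo "block" ;;
    # No verdict observed: the gate crashed, timed out, or was cut off.
    # That is a resumable halt, not a guessed verdict.
    *)                echo "gate-unobservable" ;;
  esac
}
```

For example, `classify_gate_result 1 "VERDICT: BLOCK"` prints `block` even though the exit code is non-zero, while `classify_gate_result 0 "VERDICT: PA"` prints `gate-unobservable` because the marker is truncated.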
…T rewired The _classify_gate_no_verdict docstring and the §4/header eval comments claimed (present tense) the verify-runtime call site reuses the classifier. It does not: resume.sh is outside this TDD's ## Touched files and its no-verdict path still records terminal 'failed'. Restate accurately: the helper is gate-AGNOSTIC and ready, but only the review gate drives it in this TDD; wiring the verify-runtime call site is a follow-up within resume.sh's scope (ADR 0006 honest comments). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…des coverage (TDD 0040 §4) Add the remaining ## Failure modes & edge cases assertions over behavior already delivered in steps 1-3 (no production-code change): double-flake bounded to FAIL (retry-once is not retry-until-green), RETRIES=2 raises the bound, and a malformed/truncated verdict resolves to gate-unobservable (NFR-4: ambiguity is couldn't-observe, never a guessed verdict). Pure test-coverage hardening — the production code is unchanged, so this step is legitimately no-new-behavior. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… fail-loud) A setup failure in tgr_build_output (or the preceding git rev-parse) would silently drop the downstream assertion with no bad() record. Guard all three call sites (§3, §5, §E) with || { bad ...; exit 0; } so a fixture failure is surfaced loud, not swallowed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
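The guard pattern named in this commit can be illustrated with a toy fixture. `bad` is named in the commit message; `build_fixture` and `run_section` are hypothetical stand-ins for tgr_build_output and an eval section, and the sketch uses `return 0` inside a function where the commit uses `exit 0`.

```shell
#!/usr/bin/env bash
# Illustrative fixture guard, not the repository's eval.
FAIL=0
bad() { FAIL=$((FAIL + 1)); echo "FAIL: $*" >&2; }

# Hypothetical fixture builder: fails when given an empty name.
build_fixture() {
  [ -n "$1" ] && printf 'fixture:%s\n' "$1"
}

# Hypothetical eval section with a guarded setup call.
run_section() {
  local out
  # Without the || guard, a setup failure would leave $out empty and the
  # assertion below could be skipped or misreported with no record.
  out=$(build_fixture "$1") || { bad "fixture setup failed for '$1'"; return 0; }
  [ "$out" = "fixture:$1" ] || bad "unexpected fixture output: $out"
}
```

Leaving the section with status 0 after calling `bad` lets the remaining sections run, while the recorded failure still fails the suite once the final tally is checked.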
… eval (TDD 0040 §5, TDD 0038 §3 wire-in rule) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…aggregator (TDD 0040 §5) Register the eval (run it) and add [ "$TGR_FAIL" -eq 0 ] to the final AND-chain so ci-checks regression-gates Components 1-3. New gating behavior driven red→green by the eval's §W dogfood (05ae5f0) per the TDD 0038 §3 wire-in rule. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
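A rough sketch of wiring one more eval into an aggregator's final AND-chain. `TGR_FAIL` and the `[ "$TGR_FAIL" -eq 0 ]` test come from the commit message; `run_eval`, `aggregate`, and the sibling flag `TFP_FAIL` are hypothetical, and the evals are passed as plain commands here rather than real script paths.

```shell
#!/usr/bin/env bash
# Illustrative aggregator, not the repository's implement-gate script.

# Runs one eval command and prints 0 if it passed, 1 if it failed.
run_eval() {
  if "$@" >/dev/null 2>&1; then echo 0; else echo 1; fi
}

# aggregate SIBLING_EVAL TGR_EVAL
# Registers both evals (runs them) and gates on every flag being 0.
aggregate() {
  TFP_FAIL=$(run_eval "$1")   # hypothetical pre-existing sibling eval
  TGR_FAIL=$(run_eval "$2")   # the newly registered eval

  # Final AND-chain: an eval that runs but is missing from this chain
  # cannot fail the gate, so both steps are needed.
  [ "$TFP_FAIL" -eq 0 ] && [ "$TGR_FAIL" -eq 0 ]
}
```

Calling `aggregate true false` returns non-zero and leaves `TGR_FAIL` set to 1, which is the regression-gating effect the wire-in is meant to provide.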
The §E3 'st' capture of gate_one's output was unused (SC2034). Reference it in the failure diagnostic. The remaining TDDS=() (consumed by the sourced implement.sh via dynamic scope, required by the SOURCE_ONLY guard) and TGR_FAIL=1 (consumed inside eval "$chain" in the §W dogfood, mirroring TFP §8) are necessary idioms that shellcheck's static analysis cannot see; both are present in every sibling eval. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…failure-resilience
…31/0040-transient-gate-failure-resilience
…rdict fails loud (TDD 0040 §7)
…erdict (TDD 0040 §7)