refactor(test/e2e): consolidate shell-probe with scenario spawn infra#5033
refactor(test/e2e): consolidate shell-probe with scenario spawn infra#5033laitingsheng wants to merge 3 commits into
Conversation
…wn infra Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughExtracts trusted-shell token validation and a superviseChild helper; migrates shell-probe, phase orchestrator, probe utilities, and probes to validate argv tokens and use the shared supervisor for detached process-group management, timeout escalation, abort handling, and stdout/stderr streaming. ChangesUnified Shell Infrastructure Consolidation
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
E2E Advisor RecommendationRequired E2E: Dispatch hint: Full advisor summaryE2E Recommendation AdvisorBase: Required E2E
Optional E2E
New E2E recommendations
Dispatch hint
|
E2E Scenario Advisor RecommendationRequired scenario E2E: Dispatch required scenario E2E:
Full scenario advisor summaryE2E Scenario AdvisorBase: Required scenario E2E
Optional scenario E2E
Relevant changed files
|
PR Review AdvisorFindings: 0 needs attention, 0 worth checking, 0 nice ideas This is an automated advisory review. A human maintainer must make the final merge decision. |
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
test/e2e-scenario/scenarios/orchestrators/phase.ts (1)
174-181:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPass declared secret values into
redactString.Line 203/206 and Line 453/456 only run the regex/context redactors. Since these children still receive
action.secretEnv/step.secretEnv, a probe that echoes one of those values verbatim can now leak it into the evidence log andstderrTailunchanged.🔧 Minimal fix
+ const explicitRedactions = (action.secretEnv ?? []) + .map((key) => env[key]) + .filter((value): value is string => typeof value === "string" && value.length > 0); + const supervised = await superviseChild(child, { timeoutMs: timeoutSeconds * 1_000, onStdout: (chunk) => { - logStream.write(redactString(chunk)); + logStream.write(redactString(chunk, explicitRedactions)); }, onStderr: (chunk) => { - const redacted = redactString(chunk); + const redacted = redactString(chunk, explicitRedactions); logStream.write(redacted); stderrTail = (stderrTail + redacted).slice(-4096); }, });Mirror the same change in
runShellStepusingstep.secretEnv.Also applies to: 202-208, 390-397, 452-458
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e-scenario/scenarios/orchestrators/phase.ts` around lines 174 - 181, The regex/context-only redaction calls need to also consider declared secret values so verbatim secrets from child processes aren't leaked; update runShellStep to pass step.secretEnv into redactString (just like the change already applied for action.secretEnv when building env) wherever you produce evidence logs or stderrTail using redactString, ensuring both action.secretEnv and step.secretEnv are forwarded to redactString in the code paths that create evidence/log entries; search for runShellStep, redactString, step.secretEnv and update the redaction calls to include the secret list.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/e2e-scenario/framework/shell/supervisor.ts`:
- Around line 101-114: The abort path can be misreported as a timeout because
the wall-clock timeout remains active after opts.signal triggers; update the
abort handling so that onAbort clears the timeout (clearTimeout(timeout)) before
calling terminate and ensure that terminate (or the surrounding logic that sets
timedOut) is not setting timedOut when termination was caused by opts.signal;
specifically modify the onAbort handler and/or terminate to clear the timer and
propagate an explicit abort reason (or a boolean) so timedOut remains false for
external cancels (references: timeout, timedOut, terminate, onAbort,
opts.signal).
In `@test/e2e-scenario/scenarios/orchestrators/phase.ts`:
- Around line 183-184: validateShellToken can throw on NUL bytes and those
exceptions currently escape runAction / runShellStep (seen where bashArgs are
mapped and where runShellStep is called), aborting the entire phase; wrap the
validateShellToken call(s) inside runShellStep/runAction so any thrown error is
caught and converted into a structured failure object with type "runner-infra"
(include the original error message in the failure details) and return that
result instead of letting the exception bubble to run()/runStep(); update both
call sites (the bashArgs mapping and the other occurrence around runShellStep)
to follow this pattern and ensure callers handle the structured failure return.
In `@test/e2e-scenario/scenarios/probes/diagnostics.ts`:
- Around line 89-95: When supervised.spawnError is present, don't collapse it
into a normal child exit path; instead record the spawn failure separately (e.g.
set a boolean like infraFailure/runnerInfra or a distinct sentinel variable)
while still appending the message to stderrTail, and leave exitCode
unset/nullable (or set to a distinct sentinel) so later logic that inspects
exitCode !== 0 can still differentiate a real child exit from a runner-infra
spawn failure; update the code around supervised.spawnError, exitCode and any
later classification logic that checks exitCode to use the new
infraFailure/runnerInfra signal when classifying failures.
In `@test/e2e-scenario/scenarios/probes/docs-validation.ts`:
- Around line 78-88: The runCheck() path in docs-validation.ts currently masks
supervised.spawnError by converting it into exitCode: 127; update runCheck() and
the DocsCheckResult shape so it returns the original spawn error instead of an
artificial exit code: add/return a spawnError (or error) field on
DocsCheckResult when supervised.spawnError is present (preserve
stderrTail/stdoutTail and elapsedMs), and ensure any callers of
runCheck()/DocsCheckResult (the docs check runner/runner-infra classifier)
inspect this new spawnError field rather than treating exitCode 127 as a normal
script failure; update type/interface for DocsCheckResult and usages that
construct/consume it accordingly (look for runCheck and DocsCheckResult in
docs-validation.ts and consumer code).
In `@test/e2e-scenario/scenarios/probes/util.ts`:
- Around line 126-141: The returned command result currently omits the
supervised.timedOut flag, so update both return paths (the spawnError branch and
the normal branch around supervised.spawnError, supervised.exitCode,
supervised.signal) to include timedOut: supervised.timedOut (or false if absent)
so callers can detect timeouts reliably; search for superviseChild /
SuperviseResult and ensure the returned object mirrors timedOut from the
supervise result (and then update downstream callers to branch on
result.timedOut instead of inferring from signal).
---
Outside diff comments:
In `@test/e2e-scenario/scenarios/orchestrators/phase.ts`:
- Around line 174-181: The regex/context-only redaction calls need to also
consider declared secret values so verbatim secrets from child processes aren't
leaked; update runShellStep to pass step.secretEnv into redactString (just like
the change already applied for action.secretEnv when building env) wherever you
produce evidence logs or stderrTail using redactString, ensuring both
action.secretEnv and step.secretEnv are forwarded to redactString in the code
paths that create evidence/log entries; search for runShellStep, redactString,
step.secretEnv and update the redaction calls to include the secret list.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 73dd647b-2149-4482-834c-320b399ff340
📒 Files selected for processing (8)
test/e2e-scenario/framework-tests/e2e-shell-supervisor.test.tstest/e2e-scenario/framework/shell-probe.tstest/e2e-scenario/framework/shell/supervisor.tstest/e2e-scenario/framework/shell/trusted-command.tstest/e2e-scenario/scenarios/orchestrators/phase.tstest/e2e-scenario/scenarios/probes/diagnostics.tstest/e2e-scenario/scenarios/probes/docs-validation.tstest/e2e-scenario/scenarios/probes/util.ts
Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
Summary
Consolidate the TypeScript spawn boundaries shared between the Vitest fixture layer and the scenario orchestrator into a single framework module. Every TS spawn site now reaches the same detached process-group cleanup, SIGTERM -> SIGKILL escalation, and NUL-byte argv guard, and the
validation_suites/sandbox-exec.shtransport stays centralised for in-sandbox commands.Related Issue
Resolves #4988.
Changes
test/e2e-scenario/framework/shell/: a lifecycle-onlysuperviseChild(timeout, abort, SIGTERM -> SIGKILL on a detached process group, chunk callbacks) andtrustedShellCommand/validateShellToken(NUL-byte argv guard, brand symbol).framework/shell-probe.ts,scenarios/orchestrators/phase.ts(runAction+runShellStep), and the probe helpers (probes/util.ts,probes/docs-validation.ts,probes/diagnostics.ts) now spawn at their own site (preserving each site's argv contract and CodeQL marker) and hand the child to the shared supervisor.spawnBash,runHostCmd, docs / diagnostics probes) gain detached process-group cleanup and SIGKILL escalation;runHostCmdgains the NUL-byte argv guard.framework-tests/e2e-shell-supervisor.test.tsto cover the headline guarantees at the leaf level; existinge2e-fixture-context.test.tsande2e-phase-orchestrators.test.tscontinue to cover end-to-end behaviour.Type of Change
Verification
Signed-off-by: Tinson Lai tinsonl@nvidia.com
Summary by CodeRabbit
New Features
Tests
Bug Fixes