Skip to content

[codex] Add per-probe runtime deadline#500

Merged
robinebers merged 2 commits into
robinebers:mainfrom
zergzorg:codex/resource-budget-probe-timeout
May 25, 2026
Merged

[codex] Add per-probe runtime deadline#500
robinebers merged 2 commits into
robinebers:mainfrom
zergzorg:codex/resource-budget-probe-timeout

Conversation

@zergzorg
Copy link
Copy Markdown
Contributor

@zergzorg zergzorg commented May 23, 2026

Summary

  • add a 30s wall-clock budget for each plugin probe execution
  • use the QuickJS interrupt handler to stop CPU-bound plugin scripts after the probe budget expires
  • pass the same deadline into the host API for the currently deadline-aware blocking calls: host.http and host.ccusage
  • keep the existing plugin output contract by returning an Error badge when a probe times out

Why

This is another small slice of #487. Previous follow-ups limit duplicate auto-update probes and batch fan-out; this one bounds the duration of an individual provider probe for the highest-risk paths: CPU-bound JavaScript loops and the existing long-running HTTP/ccusage calls.

The timeout is intentionally conservative at 30 seconds per provider. Existing provider-level HTTP defaults and ccusage timeouts continue to work, but they cannot exceed the remaining per-probe budget.

Scope and limitations

This is not a complete hard deadline for every possible host API path yet. It does not rewrite every synchronous local operation into a killable subprocess helper. In particular, local filesystem operations and the remaining system-command based helpers (security, sqlite3, ps, lsof, plus env shell reads) are left for a later, more invasive slice.

That follow-up is called out separately so this PR stays small and reviewable: first stop CPU-bound plugin scripts and clamp the known long-running HTTP/ccusage calls, then add a shared timeout wrapper for the remaining host subprocess helpers.

This does not change frontend scheduling, event names, provider enablement, or manual refresh behavior.

Validation

  • bun run bundle:plugins
  • cargo test plugin_engine::runtime::tests
  • cargo test plugin_engine::host_api::tests::probe_deadline_clamps_host_timeout_to_remaining_budget
  • cargo test plugin_engine::host_api::tests::ccusage_timeout_stops_runner_fallback
  • cargo test plugin_engine::host_api::tests::ccusage_runner_retries_legacy_package_when_current_package_fails
  • cargo test --lib -- --skip ccusage_timeout_kills_descendant_and_closes_pipes
  • bun run build
  • node ./node_modules/vitest/vitest.mjs run

I also ran full cargo test; Rust compilation and 97 tests passed, but the existing ccusage_timeout_kills_descendant_and_closes_pipes test still fails locally while trying to read descendant.pid before it exists. I did not change that unrelated test in this PR. That test is handled separately in #501.

Part of #487.

@github-actions github-actions Bot added the rust Pull requests that update rust code label May 23, 2026
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found across 2 files

Re-trigger cubic

@zergzorg
Copy link
Copy Markdown
Contributor Author

Stack note for maintainers/agents:

Recommended #487 merge order:

  1. [codex] Skip auto-update probes already in flight #498 first: skip duplicate auto-update probes for providers already in flight.
  2. [codex] Cap concurrent plugin probes per batch #499 second: cap backend batch fan-out to 4 probe workers.
  3. [codex] Add per-probe runtime deadline #500 third: add a per-provider runtime deadline.

This PR is based on main and has no hard code dependency on #498 or #499, but it is easiest to review last: after duplicate work and batch width are bounded, this bounds the duration of a single provider probe.

Remaining follow-up candidates after this stack: shared timeout wrapper for remaining host subprocess calls, bounded local HTTP API accept path, cache-write debounce, and lightweight probe diagnostics.

#490 is separate from this stack and can merge anytime.

@validatedev
Copy link
Copy Markdown
Collaborator

@codex review

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a per-probe wall-clock deadline (default 30s) to prevent plugin probes from running indefinitely. It enforces the deadline both for CPU-bound QuickJS execution (via the QuickJS interrupt handler) and for blocking host API calls by clamping host.http and host.ccusage timeouts to the remaining probe budget, while preserving the existing “Error badge on failure” output contract.

Changes:

  • Introduce a 30s probe budget and QuickJS interrupt-based cancellation for CPU-bound scripts.
  • Propagate a ProbeDeadline into the host API and clamp HTTP/ccusage timeouts to remaining budget.
  • Add targeted tests for CPU-bound timeout behavior and host timeout clamping.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
src-tauri/src/plugin_engine/runtime.rs Adds per-probe timeout orchestration and QuickJS interrupt handling; returns a consistent timeout error badge.
src-tauri/src/plugin_engine/host_api.rs Introduces ProbeDeadline and clamps host.http / host.ccusage timeouts to the remaining probe budget; adds deadline-related tests.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src-tauri/src/plugin_engine/host_api.rs Outdated
@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Can't wait for the next one!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@zergzorg zergzorg force-pushed the codex/resource-budget-probe-timeout branch from d5eeb2f to 767394b Compare May 24, 2026 07:03
@zergzorg
Copy link
Copy Markdown
Contributor Author

Addressed the review feedback in the latest push:

  • ProbeDeadline::clamp_duration now returns no timeout when the remaining budget is already exhausted instead of inventing an extra 1ms;
  • host.http and host.ccusage now short-circuit as timed out in that case before starting blocking work;
  • added coverage for the elapsed-budget case.

Verified locally with cargo test probe_deadline, cargo test run_probe_times_out_cpu_bound_script, and cargo test ccusage_timeout_stops_runner_fallback. The updated branch is also rebased on current main, and CI is green.

Copy link
Copy Markdown
Owner

@robinebers robinebers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@robinebers robinebers merged commit 9a9f01d into robinebers:main May 25, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

rust Pull requests that update rust code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants