fix(tunnel): detect gateway with rewritten bare 'openclaw' argv (#4951) #4960

Open
abhi-0906 wants to merge 1 commit into NVIDIA:main from abhi-0906:fix/issue-4951-tunnel-stop-gateway

abhi-0906 commented Jun 8, 2026

Summary

nemoclaw tunnel stop (and its alias nemoclaw stop) reported success and exited 0 while the in-sandbox OpenClaw gateway and its channel pollers kept running. The stop script's process matcher never found the gateway because OpenClaw rewrites its own argv to a bare openclaw after startup. This change teaches the matcher that third argv form, so the gateway is actually detected and stopped.

Related Issue

Fixes #4951

Root cause

stopSandboxChannels runs an in-sandbox GATEWAY_STOP_SCRIPT (src/lib/tunnel/services.ts) whose find_gateway_pids scans ps -eo args= and matches the gateway by argv. It recognized only:

  • openclaw-gateway (the re-execed binary name), and
  • openclaw gateway run … (the launcher command nemoclaw-start runs).

But OpenClaw sets process.title = 'openclaw' after startup, so the live gateway's argv is just openclaw with no gateway suffix. As a result, find_gateway_pids returned empty → the script exited 1 → reportStopResult interpreted exit 1 as "gateway was not running" → the command printed success and exited 0, leaving the gateway and the Slack/Telegram/Discord pollers alive. (Same process.title root cause as the sandbox HEALTHCHECK bug, NVB#6282411 / NVB#6282413.)
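
The three argv forms described above can be sketched as a small classifier. This is only an illustration in TypeScript, not the awk program from GATEWAY_STOP_SCRIPT: the function names, the form labels, and the exact regular expressions are assumptions chosen to mirror the prose.

```typescript
// Hypothetical sketch: classify one line of `ps -eo args=` output.
// The real matcher is an awk program; these regexes only approximate it.
type GatewayArgvForm = "re-exec" | "launcher" | "bare-title";

function classifyGatewayArgv(args: string): GatewayArgvForm | null {
  const line = args.trim();
  // Form 1: the re-execed binary name, optionally followed by arguments.
  if (/(^|[\s\/])openclaw-gateway(\s|$)/.test(line)) return "re-exec";
  // Form 2: the launcher command, e.g. "openclaw gateway run ...".
  if (/(^|[\s\/])openclaw\s+gateway(\s|$)/.test(line)) return "launcher";
  // Form 3: argv rewritten via process.title. The whole line must be
  // exactly "openclaw", so "openclawish" is rejected.
  if (/^openclaw$/.test(line)) return "bare-title";
  return null;
}

// Behavior before the fix (assumed): only forms 1 and 2 were recognized.
function classifyBeforeFix(args: string): GatewayArgvForm | null {
  const form = classifyGatewayArgv(args);
  return form === "bare-title" ? null : form;
}

// A simulated snapshot; no real processes are inspected here.
const psSnapshot = ["openclaw", "openclawish", "openclaw gateway run", "sleep 30"];
const foundAfterFix = psSnapshot.filter((l) => classifyGatewayArgv(l) !== null);
const foundBeforeFix = psSnapshot.filter((l) => classifyBeforeFix(l) !== null);
```

In this sketch the pre-fix classifier misses the bare openclaw line, which is the live gateway after the title rewrite, while the end-of-string anchor keeps the decoy openclawish out of both results.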

Changes

  • src/lib/tunnel/services.ts — add a third argv form to the find_gateway_pids awk matcher: a bare openclaw anchored to end-of-string, so it still rejects unrelated names like openclawish. Document the three forms and why example argv tokens must stay out of the awk program body (awk's own argv is captured by the concurrent ps snapshot, so an in-program literal like openclaw gateway makes awk match itself and the kill/verify scan never drains). Export GATEWAY_STOP_SCRIPT for end-to-end testing.
  • src/lib/tunnel/services-sandbox.test.ts — assert the matcher contains all three argv forms, and add two Linux-gated tests that execute the real GATEWAY_STOP_SCRIPT against live processes reproducing each argv form: the bare-argv gateway is found and killed (exit 0), and a non-gateway decoy is spared (exit 1).
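
The self-match hazard mentioned in the first bullet can be illustrated with a simulated ps snapshot. This is a minimal sketch under stated assumptions: the two awk command lines are hypothetical stand-ins (the PR's actual program text is not shown here), and passing the tokens through -v variables is just one possible way to keep them out of the program body.

```typescript
// Simulated `ps -eo args=` lines; nothing is spawned or scanned for real.
// Hypothetical awk invocation with the example token written literally
// inside the program text, as ps would display it (shell quotes removed).
const literalAwkArgv = 'awk index($0, "openclaw gateway") { print }';

// Hypothetical alternative: the tokens arrive as separate -v variables and
// are only joined at run time, so the contiguous literal never appears in argv.
const splitAwkArgv = 'awk -v a=openclaw -v b=gateway index($0, a " " b) { print }';

// Stand-in for the launcher-form match the scan performs on each line.
function isLauncherLine(psLine: string): boolean {
  return psLine.includes("openclaw gateway");
}

// The scan sees its own awk process because ps runs concurrently with it.
function scan(snapshot: string[]): string[] {
  return snapshot.filter(isLauncherLine);
}

const hitsWithLiteral = scan(["sleep 30", literalAwkArgv]);
const hitsWithSplit = scan(["sleep 30", splitAwkArgv]);

console.log(hitsWithLiteral.length); // 1: the scanner matched itself
console.log(hitsWithSplit.length); // 0: nothing left to kill, scan drains
```

With the literal form, every kill/verify pass would report one surviving "gateway" that is really the scanner, which is why the scan could never drain.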

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Ran locally on the affected files: biome check (clean), tsc -p jsconfig.json (clean), vitest run src/lib/tunnel/services-sandbox.test.ts (17 passed, 2 Linux-gated execution tests skipped on the Windows dev box), plus source-shape and test-size checks. The two Linux-gated execution tests were validated against the compiled script on Ubuntu 24.04 (WSL): bare openclaw killed → exit 0; decoy spared → exit 1.

Summary by CodeRabbit

  • Bug Fixes
    • Fixed tunnel stop command to properly terminate all gateway processes. Previously, the command would exit successfully but leave certain gateway processes running in the background.

…IA#4951)

`nemoclaw tunnel stop` could not stop the in-sandbox gateway: OpenClaw
rewrites its own argv via process.title after startup, so the running
gateway shows just `openclaw` with no `gateway` suffix. The awk matcher in
GATEWAY_STOP_SCRIPT only recognized `openclaw-gateway` and `openclaw gateway`,
so find_gateway_pids returned empty, reportStopResult misread exit 1 as
"not running", and the command exited 0 while the gateway (and its channel
pollers) kept running.
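
The false-success chain in this paragraph can be sketched as follows. The names below are hypothetical and are not the real reportStopResult. The exit-code contract (0 when a gateway was found and killed, 1 when none was found) follows the PR text, while the handling of any other code is an assumption.

```typescript
type StopOutcome = { message: string; cliExitCode: number };

// Assumed script contract: exit 0 if at least one gateway PID was matched
// and killed, exit 1 if the matcher found nothing.
function scriptExitCode(matchedPids: number[]): number {
  return matchedPids.length > 0 ? 0 : 1;
}

// Hypothetical stand-in for how the CLI interprets the script's exit code.
function interpretStopExit(code: number): StopOutcome {
  if (code === 0) return { message: "gateway stopped", cliExitCode: 0 };
  // Exit 1 is treated as a benign "nothing to stop", so the CLI still exits 0.
  if (code === 1) return { message: "gateway was not running", cliExitCode: 0 };
  // Assumption: any other code is surfaced as a failure.
  return { message: `stop script failed (exit ${code})`, cliExitCode: 1 };
}

// Before the fix: the gateway is alive, but the matcher returns no PIDs.
const beforeFix = interpretStopExit(scriptExitCode([]));
// After the fix: the bare-argv gateway (hypothetical PID 4242) is matched.
const afterFix = interpretStopExit(scriptExitCode([4242]));
```

Because "not found" and "not running" share exit code 1, a matcher that misses the live process is indistinguishable from a clean no-op, which is how the command could exit 0 with the gateway still up.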

Add a third argv form to the matcher: a bare `openclaw` anchored to
end-of-string (so it still rejects names like `openclawish`). The example
argv tokens are kept out of the awk program text itself, because awk's argv
is captured by the concurrent `ps` snapshot and any such literal would make
awk match itself and prevent the scan from draining.

Export GATEWAY_STOP_SCRIPT and add Linux-gated tests that execute it against
real processes reproducing each argv form, asserting the bare gateway is
killed (exit 0) and a non-gateway decoy is spared (exit 1).

Signed-off-by: Abhimanyu Kumar <abhimanyukumar7290@gmail.com>

copy-pr-bot Bot commented Jun 8, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai Bot commented Jun 8, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: a80b5c17-6dcd-4503-b2ee-aa0fe74b9dbe

📥 Commits

Reviewing files that changed from the base of the PR and between ec408c8 and 97e3f1d.

📒 Files selected for processing (2)
  • src/lib/tunnel/services-sandbox.test.ts
  • src/lib/tunnel/services.ts

📝 Walkthrough

This PR exports the embedded GATEWAY_STOP_SCRIPT and fixes its awk-based process matcher to detect the bare openclaw process form that emerges after OpenClaw rewrites process.title. Documentation is expanded and comprehensive test coverage—both unit assertions and a new Linux-only E2E test suite—validates the fix.

Changes

Gateway Stop Script Bug Fix

| Layer / File(s) | Summary |
| --- | --- |
| Stop script logic and documentation (src/lib/tunnel/services.ts) | Expanded stopSandboxChannels documentation explains the failure mode where bare openclaw processes were missed. Awk matcher condition is extended to match bare openclaw in addition to openclaw-gateway and openclaw gateway forms. Comments clarify required argv forms and awk text constraints. |
| Unit and end-to-end test coverage (src/lib/tunnel/services-sandbox.test.ts) | GATEWAY_STOP_SCRIPT is imported for end-to-end execution. Existing unit test assertions are updated to validate matching of additional gateway argv forms. New Linux-only E2E test suite spawns fake processes with controlled argv[0], runs the real stop script via sh -lc, and verifies correct termination of gateway processes and preservation of non-matching decoys. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

Suggested labels

v0.0.60, area: cli, area: sandbox, bug-fix

Suggested reviewers

  • ericksoa
  • prekshivyas
  • cv

Poem

🐰 A rabbit hops through process trees,
Finding openclaw forms with ease—
Bare and gateway, all the same,
No more stop scripts miss their game!
E2E tests now guard the way,

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main fix: detecting the gateway process with rewritten bare 'openclaw' argv, which directly addresses the root cause of the bug described in the PR objectives. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

wscurran commented Jun 8, 2026

Contributor

✨ Thanks for submitting this detailed PR about detecting the gateway with rewritten bare 'openclaw' argv. This proposes a way to fix the regression in the tunnel stop functionality.


Related open issues:

Labels

  • bug-fix: PR fixes a bug or regression
  • integration: openclaw: OpenClaw integration behavior

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[All Platforms][CLI&UX] nemoclaw tunnel stop reports gateway not running and exits 0 while in-sandbox openclaw gateway keeps running

2 participants