ROB-3570: Add holmes_validate_toolset action for toolset validation#2034
ROB-3570: Add holmes_validate_toolset action for toolset validation#2034
Conversation
Adds a new action that calls the HolmesGPT POST /api/toolsets/validate endpoint, allowing validation of toolset YAML configurations. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
|
|
✅ Docker image ready for
Use this tag to pull the image for testing. 📋 Copy commandsgcloud auth configure-docker us-central1-docker.pkg.dev
docker pull us-central1-docker.pkg.dev/robusta-development/temporary-builds/robusta-runner:7d1b92f
docker tag us-central1-docker.pkg.dev/robusta-development/temporary-builds/robusta-runner:7d1b92f me-west1-docker.pkg.dev/robusta-development/development/robusta-runner-dev:7d1b92f
docker push me-west1-docker.pkg.dev/robusta-development/development/robusta-runner-dev:7d1b92fPatch Helm values in one line: helm upgrade --install robusta robusta/robusta \
--reuse-values \
--set runner.image=me-west1-docker.pkg.dev/robusta-development/development/robusta-runner-dev:7d1b92f |
WalkthroughAdds Holmes toolset validate and refresh features: new request/response models and params, two websocket actions that discover Holmes URL and POST to Holmes endpoints, helm allowlist entries, and a conditional shutdown fallback in the config loader. Changes
Sequence DiagramsequenceDiagram
participant Event as ExecutionBaseEvent
participant Action as holmes_validate_toolset
participant Discovery as HolmesDiscovery
participant HolmesAPI as Holmes API
participant WS as WebSocket
Event->>Action: trigger with HolmesValidateToolsetParams
Action->>Discovery: find_holmes_url(params.holmes_url)
Discovery-->>Action: holmes_url or error
alt Holmes endpoint unavailable
Action->>Action: raise ActionException (HOLMES_DISCOVERY_FAILED)
else Holmes endpoint found
Action->>Action: build HolmesValidateToolsetRequest (yaml_config)
Action->>HolmesAPI: POST {holmes_url}/api/toolsets/validate
HolmesAPI-->>Action: HTTP response
Action->>Action: response.raise_for_status()
Action->>Action: parse HolmesValidateToolsetResponse
Action->>WS: set event.response with parsed JSON
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/robusta/core/playbooks/internal/ai_integration.py (1)
400-406: Add a timeout to therequests.postcall to prevent indefinite hangs.The static analysis tool (Ruff S113) correctly flags that this
requests.postcall lacks a timeout. Without a timeout, the call can hang indefinitely if the Holmes server becomes unresponsive, potentially blocking resources.While I note that other
requests.postcalls in this file also lack timeouts (a pre-existing pattern), it's worth addressing here to improve reliability.🔧 Suggested fix
holmes_req = HolmesValidateToolsetRequest(yaml_config=params.yaml_config) url = f"{holmes_url}/api/toolsets/validate" - result = requests.post(url, data=holmes_req.json()) + result = requests.post(url, data=holmes_req.json(), timeout=60) result.raise_for_status()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/robusta/core/playbooks/internal/ai_integration.py` around lines 400 - 406, The requests.post call that builds holmes_req (HolmesValidateToolsetRequest) and posts to f"{holmes_url}/api/toolsets/validate" should include a timeout to avoid indefinite hangs: add a timeout kwarg (e.g. timeout=10) to requests.post(...) and keep using result.raise_for_status() and HolmesValidateToolsetResponse(...) as before; optionally wrap the call to catch requests.Timeout or requests.RequestException and surface the error via event.ws or logging so the caller can handle failures gracefully.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/robusta/core/playbooks/internal/ai_integration.py`:
- Around line 400-406: The requests.post call that builds holmes_req
(HolmesValidateToolsetRequest) and posts to
f"{holmes_url}/api/toolsets/validate" should include a timeout to avoid
indefinite hangs: add a timeout kwarg (e.g. timeout=10) to requests.post(...)
and keep using result.raise_for_status() and HolmesValidateToolsetResponse(...)
as before; optionally wrap the call to catch requests.Timeout or
requests.RequestException and surface the error via event.ws or logging so the
caller can handle failures gracefully.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 86d0f306-a0f6-4acd-b910-872200d1ddbb
📒 Files selected for processing (3)
src/robusta/core/model/base_params.pysrc/robusta/core/playbooks/internal/ai_integration.pysrc/robusta/core/reporting/holmes.py
…atus field - Fix AttributeError by using event.response instead of event.ws() - Add holmes_validate_toolset to lightActions in Helm values - Update status field description to 'valid'/'invalid' Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/robusta/core/playbooks/internal/ai_integration.py (1)
403-403: Consider adding a timeout to therequests.postcall.The static analysis tool (Ruff S113) flags this HTTP request as lacking a timeout. While this is consistent with other
requests.postcalls in this file, missing timeouts can cause the action to hang indefinitely if the Holmes service becomes unresponsive.⏱️ Proposed fix to add timeout
- result = requests.post(url, data=holmes_req.json()) + result = requests.post(url, data=holmes_req.json(), timeout=30)Note: This is a broader pattern in this file. Consider addressing timeouts across all
requestscalls as a follow-up improvement.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/robusta/core/playbooks/internal/ai_integration.py` at line 403, The requests.post call that sends holmes_req.json() (the line assigning result = requests.post(url, data=holmes_req.json())) should include a timeout to avoid hanging; update that call to pass a sensible timeout (e.g., timeout=10) and ensure the surrounding function (in ai_integration.py where holmes_req is prepared and result is used) can handle request timeouts by catching requests.Timeout or requests.RequestException and treating it the same way other network errors are handled in this module.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/robusta/core/playbooks/internal/ai_integration.py`:
- Line 403: The requests.post call that sends holmes_req.json() (the line
assigning result = requests.post(url, data=holmes_req.json())) should include a
timeout to avoid hanging; update that call to pass a sensible timeout (e.g.,
timeout=10) and ensure the surrounding function (in ai_integration.py where
holmes_req is prepared and result is used) can handle request timeouts by
catching requests.Timeout or requests.RequestException and treating it the same
way other network errors are handled in this module.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 4e2bc8b0-82c6-4dda-aeba-a24a9c239b56
📒 Files selected for processing (3)
helm/robusta/values.yamlsrc/robusta/core/playbooks/internal/ai_integration.pysrc/robusta/core/reporting/holmes.py
✅ Files skipped from review due to trivial changes (1)
- src/robusta/core/reporting/holmes.py
Adds a new light action that calls HolmesGPT's POST /api/toolsets/refresh endpoint to trigger toolset status refresh and DB reporting. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
🧹 Nitpick comments (2)
src/robusta/core/playbooks/internal/ai_integration.py (2)
401-411: Add timeout to the HTTP request for reliability.The
requests.postcall lacks a timeout parameter, which could cause the action to hang indefinitely if the Holmes server becomes unresponsive. While existing calls in this file also lack timeouts, it's good practice to add one for new code.♻️ Suggested fix
try: holmes_req = HolmesValidateToolsetRequest(yaml_config=params.yaml_config) url = f"{holmes_url}/api/toolsets/validate" - result = requests.post(url, data=holmes_req.json()) + result = requests.post(url, data=holmes_req.json(), timeout=60) result.raise_for_status() holmes_response = HolmesValidateToolsetResponse(**json.loads(result.text)) event.response = {"success": True, **holmes_response.dict()}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/robusta/core/playbooks/internal/ai_integration.py` around lines 401 - 411, The requests.post call used to validate the Holmes toolset (where HolmesValidateToolsetRequest is built and HolmesValidateToolsetResponse is parsed) has no timeout and can hang; update the requests.post invocation that posts to f"{holmes_url}/api/toolsets/validate" to include a sensible timeout (e.g., timeout=5 or a configured constant) and propagate/handle timeout exceptions the same way other exceptions are handled (so handle_holmes_error(e) still runs on failure); change only the requests.post call and, if needed, add a small module-level constant for the timeout to keep the call readable.
423-431: Add timeout to the HTTP request for reliability.Same concern as
holmes_validate_toolset— therequests.postcall should include a timeout to prevent indefinite hangs.♻️ Suggested fix
try: url = f"{holmes_url}/api/toolsets/refresh" - result = requests.post(url) + result = requests.post(url, timeout=60) result.raise_for_status() event.response = {"success": True}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/robusta/core/playbooks/internal/ai_integration.py` around lines 423 - 431, The HTTP call that refreshes Holmes toolsets (the requests.post in the try block that builds url = f"{holmes_url}/api/toolsets/refresh") lacks a timeout and can hang; add a sensible timeout parameter (matching holmes_validate_toolset) to the requests.post(...) call so the request fails fast and still triggers the existing except block and handle_holmes_error(e) path.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/robusta/core/playbooks/internal/ai_integration.py`:
- Around line 401-411: The requests.post call used to validate the Holmes
toolset (where HolmesValidateToolsetRequest is built and
HolmesValidateToolsetResponse is parsed) has no timeout and can hang; update the
requests.post invocation that posts to f"{holmes_url}/api/toolsets/validate" to
include a sensible timeout (e.g., timeout=5 or a configured constant) and
propagate/handle timeout exceptions the same way other exceptions are handled
(so handle_holmes_error(e) still runs on failure); change only the requests.post
call and, if needed, add a small module-level constant for the timeout to keep
the call readable.
- Around line 423-431: The HTTP call that refreshes Holmes toolsets (the
requests.post in the try block that builds url =
f"{holmes_url}/api/toolsets/refresh") lacks a timeout and can hang; add a
sensible timeout parameter (matching holmes_validate_toolset) to the
requests.post(...) call so the request fails fast and still triggers the
existing except block and handle_holmes_error(e) path.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 6ab01256-ff66-44c1-88de-7082c3c5c086
📒 Files selected for processing (3)
helm/robusta/values.yamlsrc/robusta/core/model/base_params.pysrc/robusta/core/playbooks/internal/ai_integration.py
os.killpg is not available on Windows, so fall back to os.kill with the current process ID. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/robusta/runner/config_loader.py (1)
284-284: Remove commented-out dead code.The commented-out line serves no purpose and should be removed to keep the codebase clean.
Proposed fix
- # os.killpg(os.getpgid(0), signal.SIGTERM)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/robusta/runner/config_loader.py` at line 284, Remove the dead commented-out line "os.killpg(os.getpgid(0), signal.SIGTERM)" from config_loader.py so the codebase is clean; simply delete that commented line (no code behaviour changes), leaving surrounding logic intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/robusta/runner/config_loader.py`:
- Line 284: Remove the dead commented-out line "os.killpg(os.getpgid(0),
signal.SIGTERM)" from config_loader.py so the codebase is clean; simply delete
that commented line (no code behaviour changes), leaving surrounding logic
intact.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 4aa071a4-efc1-4809-9993-52b13d1e5092
📒 Files selected for processing (1)
src/robusta/runner/config_loader.py
Changes
HolmesValidateToolsetParamsclass to define parameters for toolset validation withyaml_configfieldholmes_validate_toolsetaction that validates toolset configuration via Holmes API endpoint/api/toolsets/validateHolmesValidateToolsetRequest: Request model with YAML configurationHolmesValidateToolsetResult: Individual toolset validation resultHolmesValidateToolsetResponse: Response wrapper containing list of validation resultsDetails
This enables validation of Holmes toolset configurations through the Holmes API, with proper error handling and response serialization.