Skip to content

ROB-3570: Add holmes_validate_toolset action for toolset validation#2034

Open
alonelish wants to merge 4 commits intomasterfrom
claude/nifty-vaughan
Open

ROB-3570: Add holmes_validate_toolset action for toolset validation#2034
alonelish wants to merge 4 commits intomasterfrom
claude/nifty-vaughan

Conversation

@alonelish
Copy link
Copy Markdown
Collaborator

Changes

  • base_params.py: Add HolmesValidateToolsetParams class to define parameters for toolset validation with yaml_config field
  • ai_integration.py: Implement holmes_validate_toolset action that validates toolset configuration via Holmes API endpoint /api/toolsets/validate
  • holmes.py: Add request/response models for toolset validation:
    • HolmesValidateToolsetRequest: Request model with YAML configuration
    • HolmesValidateToolsetResult: Individual toolset validation result
    • HolmesValidateToolsetResponse: Response wrapper containing list of validation results

Details

This enables validation of Holmes toolset configurations through the Holmes API, with proper error handling and response serialization.

Adds a new action that calls the HolmesGPT POST /api/toolsets/validate
endpoint, allowing validation of toolset YAML configurations.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@CLAassistant
Copy link
Copy Markdown

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 24, 2026

Docker image ready for 7d1b92f (built in 1m 49s)

⚠️ Warning: does not support ARM (ARM images are built on release only - not on every PR)

Use this tag to pull the image for testing.

📋 Copy commands

⚠️ Temporary images are deleted after 30 days. Copy to a permanent registry before using them:

gcloud auth configure-docker us-central1-docker.pkg.dev
docker pull us-central1-docker.pkg.dev/robusta-development/temporary-builds/robusta-runner:7d1b92f
docker tag us-central1-docker.pkg.dev/robusta-development/temporary-builds/robusta-runner:7d1b92f me-west1-docker.pkg.dev/robusta-development/development/robusta-runner-dev:7d1b92f
docker push me-west1-docker.pkg.dev/robusta-development/development/robusta-runner-dev:7d1b92f

Patch Helm values in one line:

helm upgrade --install robusta robusta/robusta \
  --reuse-values \
  --set runner.image=me-west1-docker.pkg.dev/robusta-development/development/robusta-runner-dev:7d1b92f

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 24, 2026

Walkthrough

Adds Holmes toolset validate and refresh features: new request/response models and params, two websocket actions that discover Holmes URL and POST to Holmes endpoints, helm allowlist entries, and a conditional shutdown fallback in the config loader.

Changes

Cohort / File(s) Summary
Params
src/robusta/core/model/base_params.py
Added HolmesValidateToolsetParams(HolmesParams) with required yaml_config: str and HolmesRefreshToolsetsParams(HolmesParams) (empty model).
Reporting Models
src/robusta/core/reporting/holmes.py
Added HolmesValidateToolsetRequest(yaml_config: str), HolmesValidateToolsetResult(toolset_name, status, error?, description?), and HolmesValidateToolsetResponse(results: List[...]).
Actions / AI integration
src/robusta/core/playbooks/internal/ai_integration.py
Added holmes_validate_toolset and holmes_refresh_toolsets actions. Both discover Holmes URL, POST to /api/toolsets/validate or /api/toolsets/refresh, call raise_for_status(), set event.response, and route exceptions through handle_holmes_error.
Runner shutdown behavior
src/robusta/runner/config_loader.py
Replaced unconditional os.killpg(...) with conditional: use os.killpg when available, otherwise fall back to os.kill(os.getpid(), SIGTERM).
Helm values
helm/robusta/values.yaml
Added holmes_validate_toolset and holmes_refresh_toolsets to global.lightActions allowlist.

Sequence Diagram

sequenceDiagram
    participant Event as ExecutionBaseEvent
    participant Action as holmes_validate_toolset
    participant Discovery as HolmesDiscovery
    participant HolmesAPI as Holmes API
    participant WS as WebSocket

    Event->>Action: trigger with HolmesValidateToolsetParams
    Action->>Discovery: find_holmes_url(params.holmes_url)
    Discovery-->>Action: holmes_url or error

    alt Holmes endpoint unavailable
        Action->>Action: raise ActionException (HOLMES_DISCOVERY_FAILED)
    else Holmes endpoint found
        Action->>Action: build HolmesValidateToolsetRequest (yaml_config)
        Action->>HolmesAPI: POST {holmes_url}/api/toolsets/validate
        HolmesAPI-->>Action: HTTP response
        Action->>Action: response.raise_for_status()
        Action->>Action: parse HolmesValidateToolsetResponse
        Action->>WS: set event.response with parsed JSON
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • arikalon1
  • moshemorad
  • RoiGlinik
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: adding a holmes_validate_toolset action for toolset validation, which aligns with the primary objective of the changeset.
Description check ✅ Passed The description clearly outlines the changes across multiple files and explains the purpose of enabling Holmes toolset validation via the API, which is directly related to the changeset.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch claude/nifty-vaughan

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/robusta/core/playbooks/internal/ai_integration.py (1)

400-406: Add a timeout to the requests.post call to prevent indefinite hangs.

The static analysis tool (Ruff S113) correctly flags that this requests.post call lacks a timeout. Without a timeout, the call can hang indefinitely if the Holmes server becomes unresponsive, potentially blocking resources.

While I note that other requests.post calls in this file also lack timeouts (a pre-existing pattern), it's worth addressing here to improve reliability.

🔧 Suggested fix
         holmes_req = HolmesValidateToolsetRequest(yaml_config=params.yaml_config)
         url = f"{holmes_url}/api/toolsets/validate"
-        result = requests.post(url, data=holmes_req.json())
+        result = requests.post(url, data=holmes_req.json(), timeout=60)
         result.raise_for_status()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/robusta/core/playbooks/internal/ai_integration.py` around lines 400 -
406, The requests.post call that builds holmes_req
(HolmesValidateToolsetRequest) and posts to
f"{holmes_url}/api/toolsets/validate" should include a timeout to avoid
indefinite hangs: add a timeout kwarg (e.g. timeout=10) to requests.post(...)
and keep using result.raise_for_status() and HolmesValidateToolsetResponse(...)
as before; optionally wrap the call to catch requests.Timeout or
requests.RequestException and surface the error via event.ws or logging so the
caller can handle failures gracefully.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/robusta/core/playbooks/internal/ai_integration.py`:
- Around line 400-406: The requests.post call that builds holmes_req
(HolmesValidateToolsetRequest) and posts to
f"{holmes_url}/api/toolsets/validate" should include a timeout to avoid
indefinite hangs: add a timeout kwarg (e.g. timeout=10) to requests.post(...)
and keep using result.raise_for_status() and HolmesValidateToolsetResponse(...)
as before; optionally wrap the call to catch requests.Timeout or
requests.RequestException and surface the error via event.ws or logging so the
caller can handle failures gracefully.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 86d0f306-a0f6-4acd-b910-872200d1ddbb

📥 Commits

Reviewing files that changed from the base of the PR and between 14d6ba5 and 4122b14.

📒 Files selected for processing (3)
  • src/robusta/core/model/base_params.py
  • src/robusta/core/playbooks/internal/ai_integration.py
  • src/robusta/core/reporting/holmes.py

…atus field

- Fix AttributeError by using event.response instead of event.ws()
- Add holmes_validate_toolset to lightActions in Helm values
- Update status field description to 'valid'/'invalid'

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/robusta/core/playbooks/internal/ai_integration.py (1)

403-403: Consider adding a timeout to the requests.post call.

The static analysis tool (Ruff S113) flags this HTTP request as lacking a timeout. While this is consistent with other requests.post calls in this file, missing timeouts can cause the action to hang indefinitely if the Holmes service becomes unresponsive.

⏱️ Proposed fix to add timeout
-        result = requests.post(url, data=holmes_req.json())
+        result = requests.post(url, data=holmes_req.json(), timeout=30)

Note: This is a broader pattern in this file. Consider addressing timeouts across all requests calls as a follow-up improvement.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/robusta/core/playbooks/internal/ai_integration.py` at line 403, The
requests.post call that sends holmes_req.json() (the line assigning result =
requests.post(url, data=holmes_req.json())) should include a timeout to avoid
hanging; update that call to pass a sensible timeout (e.g., timeout=10) and
ensure the surrounding function (in ai_integration.py where holmes_req is
prepared and result is used) can handle request timeouts by catching
requests.Timeout or requests.RequestException and treating it the same way other
network errors are handled in this module.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/robusta/core/playbooks/internal/ai_integration.py`:
- Line 403: The requests.post call that sends holmes_req.json() (the line
assigning result = requests.post(url, data=holmes_req.json())) should include a
timeout to avoid hanging; update that call to pass a sensible timeout (e.g.,
timeout=10) and ensure the surrounding function (in ai_integration.py where
holmes_req is prepared and result is used) can handle request timeouts by
catching requests.Timeout or requests.RequestException and treating it the same
way other network errors are handled in this module.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4e2bc8b0-82c6-4dda-aeba-a24a9c239b56

📥 Commits

Reviewing files that changed from the base of the PR and between 4122b14 and e4ec031.

📒 Files selected for processing (3)
  • helm/robusta/values.yaml
  • src/robusta/core/playbooks/internal/ai_integration.py
  • src/robusta/core/reporting/holmes.py
✅ Files skipped from review due to trivial changes (1)
  • src/robusta/core/reporting/holmes.py

@alonelish alonelish changed the title Add holmes_validate_toolset action for toolset validation ROB-3570: Add holmes_validate_toolset action for toolset validation Mar 24, 2026
Adds a new light action that calls HolmesGPT's POST /api/toolsets/refresh
endpoint to trigger toolset status refresh and DB reporting.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
src/robusta/core/playbooks/internal/ai_integration.py (2)

401-411: Add timeout to the HTTP request for reliability.

The requests.post call lacks a timeout parameter, which could cause the action to hang indefinitely if the Holmes server becomes unresponsive. While existing calls in this file also lack timeouts, it's good practice to add one for new code.

♻️ Suggested fix
     try:
         holmes_req = HolmesValidateToolsetRequest(yaml_config=params.yaml_config)
         url = f"{holmes_url}/api/toolsets/validate"
-        result = requests.post(url, data=holmes_req.json())
+        result = requests.post(url, data=holmes_req.json(), timeout=60)
         result.raise_for_status()
         holmes_response = HolmesValidateToolsetResponse(**json.loads(result.text))
         event.response = {"success": True, **holmes_response.dict()}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/robusta/core/playbooks/internal/ai_integration.py` around lines 401 -
411, The requests.post call used to validate the Holmes toolset (where
HolmesValidateToolsetRequest is built and HolmesValidateToolsetResponse is
parsed) has no timeout and can hang; update the requests.post invocation that
posts to f"{holmes_url}/api/toolsets/validate" to include a sensible timeout
(e.g., timeout=5 or a configured constant) and propagate/handle timeout
exceptions the same way other exceptions are handled (so handle_holmes_error(e)
still runs on failure); change only the requests.post call and, if needed, add a
small module-level constant for the timeout to keep the call readable.

423-431: Add timeout to the HTTP request for reliability.

Same concern as holmes_validate_toolset — the requests.post call should include a timeout to prevent indefinite hangs.

♻️ Suggested fix
     try:
         url = f"{holmes_url}/api/toolsets/refresh"
-        result = requests.post(url)
+        result = requests.post(url, timeout=60)
         result.raise_for_status()
         event.response = {"success": True}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/robusta/core/playbooks/internal/ai_integration.py` around lines 423 -
431, The HTTP call that refreshes Holmes toolsets (the requests.post in the try
block that builds url = f"{holmes_url}/api/toolsets/refresh") lacks a timeout
and can hang; add a sensible timeout parameter (matching
holmes_validate_toolset) to the requests.post(...) call so the request fails
fast and still triggers the existing except block and handle_holmes_error(e)
path.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/robusta/core/playbooks/internal/ai_integration.py`:
- Around line 401-411: The requests.post call used to validate the Holmes
toolset (where HolmesValidateToolsetRequest is built and
HolmesValidateToolsetResponse is parsed) has no timeout and can hang; update the
requests.post invocation that posts to f"{holmes_url}/api/toolsets/validate" to
include a sensible timeout (e.g., timeout=5 or a configured constant) and
propagate/handle timeout exceptions the same way other exceptions are handled
(so handle_holmes_error(e) still runs on failure); change only the requests.post
call and, if needed, add a small module-level constant for the timeout to keep
the call readable.
- Around line 423-431: The HTTP call that refreshes Holmes toolsets (the
requests.post in the try block that builds url =
f"{holmes_url}/api/toolsets/refresh") lacks a timeout and can hang; add a
sensible timeout parameter (matching holmes_validate_toolset) to the
requests.post(...) call so the request fails fast and still triggers the
existing except block and handle_holmes_error(e) path.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6ab01256-ff66-44c1-88de-7082c3c5c086

📥 Commits

Reviewing files that changed from the base of the PR and between e4ec031 and d2c5412.

📒 Files selected for processing (3)
  • helm/robusta/values.yaml
  • src/robusta/core/model/base_params.py
  • src/robusta/core/playbooks/internal/ai_integration.py

os.killpg is not available on Windows, so fall back to os.kill
with the current process ID.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/robusta/runner/config_loader.py (1)

284-284: Remove commented-out dead code.

The commented-out line serves no purpose and should be removed to keep the codebase clean.

Proposed fix
-                # os.killpg(os.getpgid(0), signal.SIGTERM)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/robusta/runner/config_loader.py` at line 284, Remove the dead
commented-out line "os.killpg(os.getpgid(0), signal.SIGTERM)" from
config_loader.py so the codebase is clean; simply delete that commented line (no
code behaviour changes), leaving surrounding logic intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/robusta/runner/config_loader.py`:
- Line 284: Remove the dead commented-out line "os.killpg(os.getpgid(0),
signal.SIGTERM)" from config_loader.py so the codebase is clean; simply delete
that commented line (no code behaviour changes), leaving surrounding logic
intact.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4aa071a4-efc1-4809-9993-52b13d1e5092

📥 Commits

Reviewing files that changed from the base of the PR and between d2c5412 and 3bf63c6.

📒 Files selected for processing (1)
  • src/robusta/runner/config_loader.py

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants