WIP: Plan CodeQL sandbox recipe refactor#45
Conversation
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
Coverage Report
Generated by pytest-cov on |
17d5d25 to
f140f4b
Compare
Add tools/sandbox/recipe.py with load_recipe, validate_recipe, dump_recipe. Add recipe-validate and recipe-print subcommands to sandbox-bootstrap.py. Add templates/sandbox-recipe.yml.example (schema v1). Add tests/test_sandbox_recipe.py (24 tests covering valid/invalid schemas). Update docs/sandbox.md and templates/sandboxes/README.md with recipe docs. Update .project/codeql-sandbox-recipe-refactor-plan.md with locked plan.
tools/codeql/packs.py: load_codeql_plan requires schema_version: 2, v1 raises PackResolverError with clear upgrade guidance. templates/codeql-plan.yml: updated to v2 schema with build_provider, sandbox_build_target, and new commentary. tools/phases/phase_1_gates.py: gate 1a now calls load_codeql_plan() so v1 plans are rejected at the gate with the upgrade error message. prompts/phase-1a-profile.md: updated to v2 rules, including sandbox_build_target, build_provider, and new language about sandbox being built in Phase 1b. Tests: all plan fixtures bumped to v2, new test_schema_v1_rejected_at_gate_1a.
Rename prompts/phase-1c-sandbox.md -> prompts/phase-1b-sandbox.md. Update content to reflect new position (second sub-stage instead of third). Add sandbox-recipe.yml as a first-class required output alongside sandbox-plan.md. Add recipe schema rules, per-target guidance, and validation instructions. Update test_phase_1c_reads_threat_model to use the renamed prompt file.
tools/codeql/health.py: compute_health() with classifications covering disabled, skipped, failed, extraction-failed, analysis-failed, completed-empty-valid, completed-with-signals, and more. tools/codeql/pipeline.py: generate run_id (UTC timestamp + hash), create per-run directory under itemdb/codeql/runs/<id>/, write last-run-manifest.yml and current-run.txt, compute health before importing risk signals. tools/codeql/runner.py: accept optional run_dir parameter; write SARIF, databases, and selected-query-packs.yml into per-run dir. tools/codeql/artifacts.py: prefer last-run-manifest.yml, fall back to run-manifest.yml; consume health block for usability gating. tests/test_codeql_health.py: 11 tests covering all health classifications. tests/test_codeql_pipeline.py: updated for per-run layout. tests/test_codeql_artifacts.py: updated for manifest fallback.
tools/codeql/platform.py: host_platform() and container_platform() for detecting OS/arch mismatch between host and sandbox container. tools/codeql/in_docker.py: check_platform() and exec_codeql() for running CodeQL inside a Docker Compose service. Platform guard classifies units as unavailable when host (e.g. Darwin) and container (e.g. Linux) platforms differ under mount-host-bundle strategy. templates/sandboxes/_shared/codeql.sh: wrapper script invoked by the harness, resolving compose file + service from env or defaults. templates/sandboxes/*/docker-compose.yml: added read-only bind mount of host .tools/codeql/current at /opt/codeql inside every container. tests/test_codeql_platform.py: 7 tests for platform detection. tests/test_codeql_in_docker.py: 6 tests for Docker executor.
… repair Phase 1 orchestra tor reordered: 1a: Target Profile → gate check_phase_1a 1b: Sandbox Bootstrap → gate check_phase_1c CodeQL runs after sandbox (post-sandbox) 1c: Detailed Reconnaissance → gate check_phase_1b Removed _run_codeql_repair_if_needed and all CodeQL repair helpers. Removed codeql-plan auto-repair from subphase retry loop. Removed build_codeql_plan_resume_prompt and build_codeql_build_failure_resume_prompt from completion.py. Stopped referencing prompts/phase-1-codeql-repair.md (file kept in tree until cleanup). Renamed prompts/phase-1b-recon.md → prompts/phase-1c-recon.md. Updated gate messages and test references. Skipped 17 tests pending deletion/update in commits 7-8.
prompts/phase-1c-recon.md: updated to Phase 1c, added health-aware CodeQL artifact reading rules (read last-run-manifest.yml, skip signals when health.usable is false, record health summary in threat-model.md). docs/workflow.md: rewritten Phase 1 section with the new 1a→1b→1c order. Phase 1a produces target profile + build model + codeql plan. Phase 1b produces sandbox + sandbox-recipe.yml. Phase 1c produces full recon notes enriched with CodeQL when usable. Makefile help text: Sandbox bootstrap (Phase 1c) → (Phase 1b). .opencode/skills/sandbox-bootstrap/SKILL.md and .opencode/agents/recon.md already reference Phase 1b as sandbox bootstrap — aligned with new order.
Delete prompts/phase-1-codeql-repair.md and tests/test_phase_1_codeql_plan_repair.py. Remove three skipped repair-related tests from test_codecome_check_codeql.py. Full test suite: 730 passed, 0 failures.
9445216 to
7058e5c
Compare
There was a problem hiding this comment.
Pull request overview
This PR advances the planned “CodeQL + sandbox recipe refactor” by introducing the sandbox recipe contract + validator, reordering Phase 1 (sandbox bootstrap before CodeQL, detailed recon after), and reshaping CodeQL execution artifacts around per-run directories with health classification.
Changes:
- Add
itemdb/notes/sandbox-recipe.ymlschema example plus Python loader/validator and CLI helpers (recipe-validate,recipe-print). - Reorder Phase 1 to
1a → 1b (sandbox) → CodeQL → 1c (recon)and update gates/prompts/docs/tests accordingly. - Introduce per-run CodeQL output layout (
runs/<run-id>/...),last-run-manifest.yml,current-run.txt, and a health model used by artifact gating and recon prompts.
Reviewed changes
Copilot reviewed 54 out of 55 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| tools/sandbox/recipe.py | New loader/validator for sandbox-recipe.yml. |
| tools/sandbox/init.py | Establish sandbox as an importable package. |
| tools/sandbox-bootstrap.py | Add recipe-validate / recipe-print subcommands and constants for recipe path. |
| tools/phases/phase_1_gates.py | Switch CodeQL plan loading to shared loader; update Phase 1 messaging for new order. |
| tools/phases/completion.py | Remove CodeQL plan/build failure resume prompt helpers tied to the old repair flow. |
| tools/codeql/runner.py | Add optional run_dir support to write artifacts per-run (legacy mode preserved). |
| tools/codeql/platform.py | New host/container platform detection helpers. |
| tools/codeql/pipeline.py | Create per-run directories, write last-run-manifest.yml/current-run.txt, compute health, gate risk import. |
| tools/codeql/packs.py | Enforce CodeQL plan schema_version: 2 with actionable upgrade error. |
| tools/codeql/in_docker.py | New helpers for CodeQL execution inside Docker Compose services + platform guard. |
| tools/codeql/health.py | New health classification model for “usable” vs “unusable” CodeQL runs. |
| tools/codeql/artifacts.py | Gate artifacts based on health when present; support last-run manifest. |
| tools/codecome/phase_1.py | Reorder Phase 1 execution; remove prior CodeQL repair loop and associated validation logic. |
| tests/test_sandbox_recipe.py | New unit tests for recipe load/validation rules. |
| tests/test_phase_failure_state_reset.py | Update prompt path usage consistent with new Phase 1 prompt naming. |
| tests/test_phase_1_prompts_threat_model.py | Update prompt existence/content checks for renamed Phase 1 prompts. |
| tests/test_phase_1_mid_turn_forgiveness.py | Update prompt path usage consistent with new Phase 1 prompt naming. |
| tests/test_phase_1_gates.py | Update CodeQL plan fixtures to schema v2; add test that v1 is rejected. |
| tests/test_phase_1_codeql_plan_repair.py | Remove tests for deleted CodeQL repair flow. |
| tests/test_codeql_runner.py | Update CodeQL plan fixtures to schema v2 where needed. |
| tests/test_codeql_platform.py | New tests for platform detection utilities. |
| tests/test_codeql_pipeline.py | Update pipeline tests for per-run layout + last-run manifest behavior. |
| tests/test_codeql_packs.py | Update pack/plan fixtures for CodeQL plan schema v2. |
| tests/test_codeql_in_docker.py | New tests for Docker execution helpers and platform guard behavior. |
| tests/test_codeql_health.py | New tests for health classification outcomes. |
| tests/test_codeql_artifacts.py | Update assertions to match new “manifest missing” wording. |
| tests/test_codecome_check_codeql.py | Update plan schema fixture (v2) and remove repair-related tests. |
| templates/sandboxes/web-static/docker-compose.yml | Mount host CodeQL bundle into sandbox container. |
| templates/sandboxes/rust/docker-compose.yml | Mount host CodeQL bundle into sandbox container. |
| templates/sandboxes/ruby/docker-compose.yml | Mount host CodeQL bundle into sandbox container. |
| templates/sandboxes/python/docker-compose.yml | Mount host CodeQL bundle into sandbox container. |
| templates/sandboxes/php/docker-compose.yml | Mount host CodeQL bundle into sandbox container. |
| templates/sandboxes/node/docker-compose.yml | Mount host CodeQL bundle into sandbox container. |
| templates/sandboxes/nested-virt/docker-compose.yml | Mount host CodeQL bundle into sandbox container. |
| templates/sandboxes/multi-service-compose/docker-compose.yml | Mount host CodeQL bundle into sandbox container. |
| templates/sandboxes/java-maven/docker-compose.yml | Mount host CodeQL bundle into sandbox container. |
| templates/sandboxes/iac-terraform/docker-compose.yml | Mount host CodeQL bundle into sandbox container. |
| templates/sandboxes/go/docker-compose.yml | Mount host CodeQL bundle into sandbox container. |
| templates/sandboxes/generic/docker-compose.yml | Mount host CodeQL bundle into sandbox container. |
| templates/sandboxes/erlang-otp/docker-compose.yml | Mount host CodeQL bundle into sandbox container. |
| templates/sandboxes/dotnet/docker-compose.yml | Mount host CodeQL bundle into sandbox container. |
| templates/sandboxes/c-cpp/docker-compose.yml | Mount host CodeQL bundle into sandbox container. |
| templates/sandboxes/_shared/codeql.sh | New shared wrapper for invoking CodeQL via docker compose exec. |
| templates/sandboxes/README.md | Document the new sandbox recipe artifact and CLI helpers. |
| templates/sandbox-recipe.yml.example | Add example schema for the sandbox recipe contract. |
| templates/codeql-plan.yml | Bump CodeQL plan template to schema v2 and document sandbox mapping fields. |
| prompts/phase-1c-recon.md | Update recon prompt to Phase 1c and use health-aware artifact consumption rules. |
| prompts/phase-1b-sandbox.md | Update sandbox bootstrap prompt to Phase 1b and require sandbox-recipe.yml. |
| prompts/phase-1a-profile.md | Update Phase 1a prompt rules for CodeQL plan schema v2. |
| prompts/phase-1-codeql-repair.md | Remove obsolete repair prompt (no longer used). |
| Makefile | Update help text to reflect sandbox bootstrap as Phase 1b. |
| docs/workflow.md | Update workflow docs for 3-substage Phase 1 and new outputs. |
| docs/sandbox.md | Document sandbox recipe artifact and how to validate/print it. |
| .project/codeql-sandbox-recipe-refactor-plan.md | Add/extend detailed implementation plan document. |
| .gitignore | Ignore token-usage-output.txt. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| Branch purpose: define the target architecture and implementation plan before changing runtime code. | ||
|
|
||
| ## 1. Problem statement |
| def _generate_run_id() -> str: | ||
| ts = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ") | ||
| fingerprint = hashlib.sha256(ts.encode()).hexdigest()[:8] | ||
| return f"{ts}-{fingerprint}" |
| out.separator(tone=T.SUCCESS) | ||
| out.success("Ready to run Phase 1c (Sandbox Bootstrap).") | ||
| out.success("Ready to run Phase 1c (Detailed Reconnaissance).") | ||
| return 0 |
| def test_host_platform_known_os() -> None: | ||
| plat = host_platform() | ||
| assert any(kw in plat.lower() for kw in ("linux", "darwin", "windows")) | ||
|
|
| sandbox_path = Path(root) / sandbox_path_str | ||
| if not sandbox_path.exists(): | ||
| errors.append( | ||
| f"sandbox-recipe.yml: sandbox.path {sandbox_path_str!r} does not exist" | ||
| ) |
| manifest.setdefault("health", health) | ||
|
|
| result = subprocess.run( | ||
| ["uname", "-sm"], capture_output=True, text=True, timeout=10 | ||
| ) | ||
| return result.stdout.strip() |
| ], | ||
| capture_output=True, text=True, timeout=timeout, | ||
| ) | ||
| return result.stdout.strip() or result.stderr.strip() |
| f"CodeQL bundle is for {host_plat}; sandbox service " | ||
| f"{service!r} runs {container_plat}. " | ||
| "install_strategy=mount-host-bundle cannot cross platforms. " | ||
| "Use install_strategy=download-in-container or image-preinstalled " | ||
| "(not yet supported)." |
tests/test_prompts.py: add test_no_stale_phase_1b_1c_references_in_prompts that greps all prompt files for contradictory Phase 1b/1c patterns, plus self-consistency tests for phase-1b-sandbox.md and phase-1c-recon.md. prompts/README.md: update filenames and descriptions to new 1b/1c order. prompts/sweep.md: add 'CodeQL signals (conditional)' section instructing the sweep to check health.usable before importing CodeQL signals. prompts/phase-6-report.md: add health-aware rule in 'Reporting rules' section instructing the reporter to check last-run-manifest.yml health.
There was a problem hiding this comment.
Actionable comments posted: 13
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
prompts/phase-1b-sandbox.md (1)
138-138:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winFix sub-stage reference in retry-exhaustion instruction.
This line says “stop Phase 1c” inside the Phase 1b prompt. It should refer to stopping Phase 1b (or stopping Phase 1 progression) to avoid contradictory operator guidance.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@prompts/phase-1b-sandbox.md` at line 138, Update the incorrect sub-stage reference in the retry-exhaustion instruction so it no longer tells the operator to "stop Phase 1c"; change the phrase to "stop Phase 1b" (or "stop Phase 1 progression") in the sentence that mentions retry budget and logging to sandbox-plan.md and references CODECOME_BOOTSTRAP_MAX_RETRIES, ensuring the instruction consistently refers to Phase 1b.prompts/phase-1c-recon.md (1)
178-180:⚠️ Potential issue | 🟠 Major | ⚡ Quick winFix stale CodeQL artifact path and stale Phase filename in Phase 1c prompt.
This prompt still contains pre-refactor references that conflict with the new Phase 1c/per-run model: Line 178 points to
itemdb/codeql/normalized/file-signals.ymlinstead of the per-run location, and Line 274 still writesphase-1b-summaryduring Phase 1c. Both can cause incorrect file reads/writes by the agent.🛠️ Proposed doc fix
-If CodeQL file signals exist (`itemdb/codeql/normalized/file-signals.yml`), incorporate them: +If CodeQL file signals exist (`itemdb/codeql/runs/<run-id>/normalized/file-signals.yml`), incorporate them: @@ - runs/phase-1b-summary-YYYY-MM-DD-HHMMSS.md + runs/phase-1c-summary-YYYY-MM-DD-HHMMSS.mdAlso applies to: 274-274
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@prompts/phase-1c-recon.md` around lines 178 - 180, Update the stale CodeQL artifact path and the Phase filename in the prompt: replace references to the legacy static path "itemdb/codeql/normalized/file-signals.yml" with the per-run CodeQL artifact location used by the new Phase 1c model (the per-run path your pipeline writes for CodeQL file signals), and change the summary output filename from "phase-1b-summary" to "phase-1c-summary" so the agent reads/writes the correct per-run artifacts; search for those exact strings in the prompt text and update them accordingly.tests/test_codeql_pipeline.py (1)
212-253:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDuplicate test name overrides the earlier test case.
Line 253 redefines the same function name from Line 212, so the first test is never executed.
Suggested fix
-def test_pipeline_normalize_failure_marks_soft_failed_for_soft_policy(tmp_path: Path) -> None: +def test_pipeline_normalize_failure_sets_expected_last_manifest_for_soft_policy(tmp_path: Path) -> None:🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_codeql_pipeline.py` around lines 212 - 253, The file defines test_pipeline_normalize_failure_marks_soft_failed_for_soft_policy twice, causing the first definition to be overridden; rename the duplicate test function to a unique, descriptive name (for example change the second occurrence to test_pipeline_normalize_failure_marks_soft_failed_for_soft_policy_duplicate or a name reflecting its intent) so both tests are distinct, and update any references or fixtures if needed to match the new function name (look for the symbol test_pipeline_normalize_failure_marks_soft_failed_for_soft_policy in this diff to locate both definitions).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.project/codeql-sandbox-recipe-refactor-plan.md:
- Around line 857-859: The doc uses a non-canonical key `Sandbox_build_target`
(capital S) instead of the agreed canonical `sandbox_build_target`, causing
copy/paste drift; update the text at the occurrence of `Sandbox_build_target` to
use `sandbox_build_target` so it matches the schema and examples (search for the
token `Sandbox_build_target` and replace with `sandbox_build_target` in the
description of Multi-target invocation).
In `@Makefile`:
- Line 109: You changed the Makefile (the printf line that prints "Sandbox
bootstrap (Phase 1b)"), which is disallowed; revert that modification so the
Makefile remains exactly as in main (remove that hunk from your commit or reset
the file), then amend the PR to exclude any Makefile/orchestration/config edits;
reference the changed printf line to locate and undo the change and ensure only
allowed source files are included in the branch before pushing.
In `@prompts/phase-1a-profile.md`:
- Around line 68-73: The doc contradicts itself about requiring build_command:
clarify that when build_provider is "sandbox-recipe"
analysis_units[].sandbox_build_target should point to the sandbox-recipe.yml
build_targets[].id and build_command must be left empty even if build_mode is
"manual"; update the guidance so that build_command is only required when
build_provider is "none" or when build_mode is "manual" and build_provider is
not "sandbox-recipe" (i.e., for manual, non-recipe builds supply build_command),
and state that codeql-plan.yml should never include a concrete build_command for
units using sandbox-recipe (use sandbox_build_target instead).
In `@prompts/sweep.md`:
- Around line 24-27: Update the prompt to use per-run CodeQL paths: read the run
id from itemdb/codeql/last-run-manifest.yml (as before) and then reference
enrichment files under itemdb/codeql/runs/<run_id>/normalized/alerts.yml and
itemdb/codeql/runs/<run_id>/normalized/file-signals.yml instead of the old
top-level itemdb/codeql/normalized/... paths; ensure any mention of
"normalized/alerts.yml" or "file-signals.yml" in this prompt is replaced with
the per-run path pattern using the manifest's run_id.
In `@tests/test_phase_1_prompts_threat_model.py`:
- Around line 64-66: The test test_phase_1c_reads_threat_model is reading the
wrong fixture ("phase-1b-sandbox.md"); change the call to _read_prompt to open
the Phase 1c recon prompt (e.g., "phase-1c-recon.md") so the test actually
verifies the Phase 1c recon requirement, leaving the assert ("threat-model.md"
in content) as-is; update the string literal in the test function
test_phase_1c_reads_threat_model to the correct prompt filename.
In `@tests/test_prompts.py`:
- Around line 30-53: The regex checks in stale_patterns can miss matches
spanning multiple lines because re.search is called without DOTALL; update the
search to allow '.' to match newlines (e.g., use re.DOTALL / re.S) so patterns
like "Phase 1b.*Detailed Reconnaissance" will match across line breaks. Locate
the loop that iterates prompts_dir.glob("*.md") and the call to
re.search(pattern, content, re.IGNORECASE) and change the flags to re.IGNORECASE
| re.DOTALL (or compile the patterns with re.S) while preserving the existing
exceptions logic and pattern strings.
In `@tests/test_sandbox_recipe.py`:
- Around line 81-85: Replace the use of the bare `assert False, "Expected
ValueError"` in the exception tests that call load_recipe(path) with an explicit
exception raise to avoid removal under optimized execution; specifically, change
those failure branches to `raise AssertionError("Expected ValueError")` (and
apply the same replacement for the similar block around the second exception
test) so the test still fails clearly when no ValueError is raised.
In `@tools/codeql/health.py`:
- Around line 183-191: The current logic returns "skipped" whenever
has_languages is false even if status == "failed", hiding real failures; update
the conditional so that failures are preserved—either move the status ==
"failed" block above the has_languages check or change the has_languages branch
to only return "skipped" when status != "failed"; make sure to still use db_ok
and analyze_ok (the existing variables) to return the appropriate "failed"
message ("CodeQL database creation failed.", "CodeQL analysis failed.", or
"CodeQL pipeline failed.").
In `@tools/codeql/in_docker.py`:
- Around line 25-37: The guard only treats "mount-host-bundle" as a host-bundle
case but misses "copy-host-bundle", so update the conditional that checks
install_strategy to include "copy-host-bundle" (e.g., install_strategy not in
("mount-host-bundle", "copy-host-bundle")) and ensure the subsequent validation
uses host_platform(), container_platform(service, compose_file), and
platforms_compatible() to reject cross-platform combinations with an error
message that mentions both mount-host-bundle and copy-host-bundle as unsupported
for cross-platform installs.
In `@tools/codeql/pipeline.py`:
- Around line 16-19: The _generate_run_id function currently derives fingerprint
solely from ts, causing identical run IDs for starts within the same second;
update _generate_run_id to add non-deterministic entropy (e.g., include
microseconds, process id, or a random value such as uuid.uuid4().hex or
secrets.token_hex) into the hashed input or as part of the fingerprint so each
invocation is unique; keep the return format f"{ts}-{fingerprint}" and update
the fingerprint calculation in _generate_run_id to hash timestamp plus the
chosen random/entropy source (or replace hashing with a truncated uuid4) to
avoid collisions.
In `@tools/codeql/platform.py`:
- Around line 15-20: The probe currently returns result.stdout.strip() even when
the command fails; change the subprocess.run invocation handling in the probe
(the call that uses ["uname", "-sm"] and the similar probe at lines 31-40) to
treat non-zero exit codes as failures: after subprocess.run(...) check
result.returncode and if it's non-zero (or result.stdout is empty/unreliable)
return "unknown" instead of returning stdout/stderr; update the code paths
around the subprocess.run call and the existing return result.stdout.strip() so
both probes consistently return "unknown" on non-zero exits or empty output.
In `@tools/sandbox-bootstrap.py`:
- Around line 1562-1606: The cmd_recipe_validate and cmd_recipe_print functions
implement business logic in the top-level tools/sandbox-bootstrap.py; per policy
this file must be a thin delegating wrapper—move the implementation into a
package module (e.g., create tools.sandbox.recipe_commands or similar) and keep
these functions as one-line delegators that import and call the package
functions (e.g., recipe_commands.validate_recipe_cmd and
recipe_commands.print_recipe_cmd) passing through args; ensure all imports and
heavy logic (load_recipe, validate_recipe, dump_recipe, _emit, error handling
and printing) live in the new package module and update the wrapper functions
cmd_recipe_validate and cmd_recipe_print to simply import and return the package
function results.
In `@tools/sandbox/recipe.py`:
- Around line 175-192: The helper _validate_codeql_hints currently skips
validating the top-level codeql key default_execution_mode; add a validation
block (similar to the preferred_mode check) that reads
codeql.get("default_execution_mode"), ensures it's a string and one of
VALID_EXECUTION_MODES, and appends an error with the same format
(f"sandbox-recipe.yml: {prefix}.default_execution_mode {value!r} invalid
(allowed: {valid})") when invalid so validate_recipe() will catch bad
default_execution_mode values.
---
Outside diff comments:
In `@prompts/phase-1b-sandbox.md`:
- Line 138: Update the incorrect sub-stage reference in the retry-exhaustion
instruction so it no longer tells the operator to "stop Phase 1c"; change the
phrase to "stop Phase 1b" (or "stop Phase 1 progression") in the sentence that
mentions retry budget and logging to sandbox-plan.md and references
CODECOME_BOOTSTRAP_MAX_RETRIES, ensuring the instruction consistently refers to
Phase 1b.
In `@prompts/phase-1c-recon.md`:
- Around line 178-180: Update the stale CodeQL artifact path and the Phase
filename in the prompt: replace references to the legacy static path
"itemdb/codeql/normalized/file-signals.yml" with the per-run CodeQL artifact
location used by the new Phase 1c model (the per-run path your pipeline writes
for CodeQL file signals), and change the summary output filename from
"phase-1b-summary" to "phase-1c-summary" so the agent reads/writes the correct
per-run artifacts; search for those exact strings in the prompt text and update
them accordingly.
In `@tests/test_codeql_pipeline.py`:
- Around line 212-253: The file defines
test_pipeline_normalize_failure_marks_soft_failed_for_soft_policy twice, causing
the first definition to be overridden; rename the duplicate test function to a
unique, descriptive name (for example change the second occurrence to
test_pipeline_normalize_failure_marks_soft_failed_for_soft_policy_duplicate or a
name reflecting its intent) so both tests are distinct, and update any
references or fixtures if needed to match the new function name (look for the
symbol test_pipeline_normalize_failure_marks_soft_failed_for_soft_policy in this
diff to locate both definitions).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 1f7f88ac-4287-42f3-9df6-7c4db6983e79
📒 Files selected for processing (59)
.gitignore.project/codeql-sandbox-recipe-refactor-plan.mdMakefiledocs/sandbox.mddocs/workflow.mdprompts/README.mdprompts/phase-1-codeql-repair.mdprompts/phase-1a-profile.mdprompts/phase-1b-sandbox.mdprompts/phase-1c-recon.mdprompts/phase-6-report.mdprompts/sweep.mdtemplates/codeql-plan.ymltemplates/sandbox-recipe.yml.exampletemplates/sandboxes/README.mdtemplates/sandboxes/_shared/codeql.shtemplates/sandboxes/c-cpp/docker-compose.ymltemplates/sandboxes/dotnet/docker-compose.ymltemplates/sandboxes/erlang-otp/docker-compose.ymltemplates/sandboxes/generic/docker-compose.ymltemplates/sandboxes/go/docker-compose.ymltemplates/sandboxes/iac-terraform/docker-compose.ymltemplates/sandboxes/java-maven/docker-compose.ymltemplates/sandboxes/multi-service-compose/docker-compose.ymltemplates/sandboxes/nested-virt/docker-compose.ymltemplates/sandboxes/node/docker-compose.ymltemplates/sandboxes/php/docker-compose.ymltemplates/sandboxes/python/docker-compose.ymltemplates/sandboxes/ruby/docker-compose.ymltemplates/sandboxes/rust/docker-compose.ymltemplates/sandboxes/web-static/docker-compose.ymltests/test_codecome_check_codeql.pytests/test_codeql_artifacts.pytests/test_codeql_health.pytests/test_codeql_in_docker.pytests/test_codeql_packs.pytests/test_codeql_pipeline.pytests/test_codeql_platform.pytests/test_codeql_runner.pytests/test_phase_1_codeql_plan_repair.pytests/test_phase_1_gates.pytests/test_phase_1_mid_turn_forgiveness.pytests/test_phase_1_prompts_threat_model.pytests/test_phase_failure_state_reset.pytests/test_prompts.pytests/test_sandbox_recipe.pytools/codecome/phase_1.pytools/codeql/artifacts.pytools/codeql/health.pytools/codeql/in_docker.pytools/codeql/packs.pytools/codeql/pipeline.pytools/codeql/platform.pytools/codeql/runner.pytools/phases/completion.pytools/phases/phase_1_gates.pytools/sandbox-bootstrap.pytools/sandbox/__init__.pytools/sandbox/recipe.py
💤 Files with no reviewable changes (3)
- prompts/phase-1-codeql-repair.md
- tests/test_phase_1_codeql_plan_repair.py
- tools/phases/completion.py
| 7. **Multi-target invocation**: the runner resolves `Sandbox_build_target` from | ||
| the recipe per (analysis unit, language) pair and invokes CodeQL with the | ||
| resolved `build_command`. Identical commands across targets are allowed; |
There was a problem hiding this comment.
Use the canonical key name sandbox_build_target consistently.
Line 857 uses Sandbox_build_target (capital S), but the schema examples and surrounding contract use sandbox_build_target. This can cause implementation drift from copy/paste.
✏️ Proposed doc fix
-7. **Multi-target invocation**: the runner resolves `Sandbox_build_target` from
+7. **Multi-target invocation**: the runner resolves `sandbox_build_target` from📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| 7. **Multi-target invocation**: the runner resolves `Sandbox_build_target` from | |
| the recipe per (analysis unit, language) pair and invokes CodeQL with the | |
| resolved `build_command`. Identical commands across targets are allowed; | |
| 7. **Multi-target invocation**: the runner resolves `sandbox_build_target` from | |
| the recipe per (analysis unit, language) pair and invokes CodeQL with the | |
| resolved `build_command`. Identical commands across targets are allowed; |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.project/codeql-sandbox-recipe-refactor-plan.md around lines 857 - 859, The
doc uses a non-canonical key `Sandbox_build_target` (capital S) instead of the
agreed canonical `sandbox_build_target`, causing copy/paste drift; update the
text at the occurrence of `Sandbox_build_target` to use `sandbox_build_target`
so it matches the schema and examples (search for the token
`Sandbox_build_target` and replace with `sandbox_build_target` in the
description of Multi-target invocation).
| @printf " $(BOLD)make sandbox-test$(RESET) Test the target inside the sandbox\n" | ||
| @printf "\n" | ||
| @printf " $(BOLD)$(CYAN)Sandbox bootstrap (Phase 1c):$(RESET)\n" | ||
| @printf " $(BOLD)$(CYAN)Sandbox bootstrap (Phase 1b):$(RESET)\n" |
There was a problem hiding this comment.
Do not modify Makefile without explicit instruction.
This PR changes Makefile, which violates the repository rule forbidding orchestration/config-file edits unless explicitly requested.
As per coding guidelines, "{codecome.yml,AGENTS.md,Makefile,.opencode/**}: Never modify project orchestration or configuration files ... unless explicitly instructed".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@Makefile` at line 109, You changed the Makefile (the printf line that prints
"Sandbox bootstrap (Phase 1b)"), which is disallowed; revert that modification
so the Makefile remains exactly as in main (remove that hunk from your commit or
reset the file), then amend the PR to exclude any Makefile/orchestration/config
edits; reference the changed printf line to locate and undo the change and
ensure only allowed source files are included in the branch before pushing.
Source: Coding guidelines
| - For compiled languages (c-cpp, go, csharp, java-kotlin, swift) set `analysis_units[].sandbox_build_target` to the `build_targets[].id` from `sandbox-recipe.yml` that provides the build command for this unit. If the recipe has not been generated yet (this is Phase 1a), pick a sensible id such as `root` — Phase 1b will flesh out the recipe and the id can be updated if needed. | ||
| - For each language, set `build_provider`: | ||
| - `"sandbox-recipe"` — for compiled languages whose build command should be resolved from `sandbox-recipe.yml` after Phase 1b. Leave `build_command` empty (the runner resolves it from the recipe). | ||
| - `"none"` — for no-build languages (python, javascript-typescript, ruby). | ||
| - Avoid concrete build shell snippets in `build_command` unless the build is obvious and stable and no recipe is available. Prefer `build_provider: sandbox-recipe` for everything that needs a build. | ||
| - For each language in each analysis unit, select the appropriate pack profiles: |
There was a problem hiding this comment.
Resolve contradictory build_command guidance for manual compiled units.
Line 70 says to leave build_command empty for build_provider: sandbox-recipe, but Line 85 still requires build_command when build_mode is manual. This ambiguity can produce inconsistent codeql-plan.yml output.
Proposed wording tweak
- For each language, set `build_provider`:
- - `"sandbox-recipe"` — for compiled languages whose build command should be resolved from `sandbox-recipe.yml` after Phase 1b. Leave `build_command` empty (the runner resolves it from the recipe).
+ - `"sandbox-recipe"` — for compiled languages whose build command should be resolved from `sandbox-recipe.yml` after Phase 1b. Leave `build_command` empty even when `build_mode` is `manual` (the runner resolves it from the recipe).
- `"none"` — for no-build languages (python, javascript-typescript, ruby).
-- Avoid concrete build shell snippets in `build_command` unless the build is obvious and stable and no recipe is available. Prefer `build_provider: sandbox-recipe` for everything that needs a build.
+- Avoid concrete build shell snippets in `build_command` unless the build is obvious/stable and `build_provider` is not `"sandbox-recipe"`.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| - For compiled languages (c-cpp, go, csharp, java-kotlin, swift) set `analysis_units[].sandbox_build_target` to the `build_targets[].id` from `sandbox-recipe.yml` that provides the build command for this unit. If the recipe has not been generated yet (this is Phase 1a), pick a sensible id such as `root` — Phase 1b will flesh out the recipe and the id can be updated if needed. | |
| - For each language, set `build_provider`: | |
| - `"sandbox-recipe"` — for compiled languages whose build command should be resolved from `sandbox-recipe.yml` after Phase 1b. Leave `build_command` empty (the runner resolves it from the recipe). | |
| - `"none"` — for no-build languages (python, javascript-typescript, ruby). | |
| - Avoid concrete build shell snippets in `build_command` unless the build is obvious and stable and no recipe is available. Prefer `build_provider: sandbox-recipe` for everything that needs a build. | |
| - For each language in each analysis unit, select the appropriate pack profiles: | |
| - For compiled languages (c-cpp, go, csharp, java-kotlin, swift) set `analysis_units[].sandbox_build_target` to the `build_targets[].id` from `sandbox-recipe.yml` that provides the build command for this unit. If the recipe has not been generated yet (this is Phase 1a), pick a sensible id such as `root` — Phase 1b will flesh out the recipe and the id can be updated if needed. | |
| - For each language, set `build_provider`: | |
| - `"sandbox-recipe"` — for compiled languages whose build command should be resolved from `sandbox-recipe.yml` after Phase 1b. Leave `build_command` empty even when `build_mode` is `manual` (the runner resolves it from the recipe). | |
| - `"none"` — for no-build languages (python, javascript-typescript, ruby). | |
| - Avoid concrete build shell snippets in `build_command` unless the build is obvious/stable and `build_provider` is not `"sandbox-recipe"`. | |
| - For each language in each analysis unit, select the appropriate pack profiles: |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@prompts/phase-1a-profile.md` around lines 68 - 73, The doc contradicts itself
about requiring build_command: clarify that when build_provider is
"sandbox-recipe" analysis_units[].sandbox_build_target should point to the
sandbox-recipe.yml build_targets[].id and build_command must be left empty even
if build_mode is "manual"; update the guidance so that build_command is only
required when build_provider is "none" or when build_mode is "manual" and
build_provider is not "sandbox-recipe" (i.e., for manual, non-recipe builds
supply build_command), and state that codeql-plan.yml should never include a
concrete build_command for units using sandbox-recipe (use sandbox_build_target
instead).
| If `itemdb/codeql/last-run-manifest.yml` exists, read its `health` block. | ||
| When `health.usable` is `true`, use `itemdb/codeql/normalized/alerts.yml` | ||
| and `file-signals.yml` as enrichment for the target file. When | ||
| `health.usable` is `false`, the CodeQL run did not produce trustworthy |
There was a problem hiding this comment.
CodeQL signal paths are outdated for per-run layout.
After this refactor, normalized outputs are per-run (itemdb/codeql/runs/<run_id>/normalized/...), not top-level itemdb/codeql/normalized/.... This prompt can point sweep to missing/stale files.
Suggested fix
-If `itemdb/codeql/last-run-manifest.yml` exists, read its `health` block.
-When `health.usable` is `true`, use `itemdb/codeql/normalized/alerts.yml`
-and `file-signals.yml` as enrichment for the target file. When
+If `itemdb/codeql/last-run-manifest.yml` exists, read its `health` block and `run_id`.
+When `health.usable` is `true`, read:
+- `itemdb/codeql/runs/<run_id>/normalized/alerts.yml`
+- `itemdb/codeql/runs/<run_id>/normalized/file-signals.yml`
+as enrichment for the target file. When
`health.usable` is `false`, the CodeQL run did not produce trustworthy
output — do not import its signals.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@prompts/sweep.md` around lines 24 - 27, Update the prompt to use per-run
CodeQL paths: read the run id from itemdb/codeql/last-run-manifest.yml (as
before) and then reference enrichment files under
itemdb/codeql/runs/<run_id>/normalized/alerts.yml and
itemdb/codeql/runs/<run_id>/normalized/file-signals.yml instead of the old
top-level itemdb/codeql/normalized/... paths; ensure any mention of
"normalized/alerts.yml" or "file-signals.yml" in this prompt is replaced with
the per-run path pattern using the manifest's run_id.
| def test_phase_1c_reads_threat_model() -> None: | ||
| content = _read_prompt("phase-1c-sandbox.md") | ||
| content = _read_prompt("phase-1b-sandbox.md") | ||
| assert "threat-model.md" in content |
There was a problem hiding this comment.
Test target file mismatches the test intent.
test_phase_1c_reads_threat_model currently reads phase-1b-sandbox.md, so it no longer verifies the Phase 1c recon prompt requirement it is named for.
✅ Proposed fix
def test_phase_1c_reads_threat_model() -> None:
- content = _read_prompt("phase-1b-sandbox.md")
+ content = _read_prompt("phase-1c-recon.md")
assert "threat-model.md" in content🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/test_phase_1_prompts_threat_model.py` around lines 64 - 66, The test
test_phase_1c_reads_threat_model is reading the wrong fixture
("phase-1b-sandbox.md"); change the call to _read_prompt to open the Phase 1c
recon prompt (e.g., "phase-1c-recon.md") so the test actually verifies the Phase
1c recon requirement, leaving the assert ("threat-model.md" in content) as-is;
update the string literal in the test function test_phase_1c_reads_threat_model
to the correct prompt filename.
| if install_strategy not in ("mount-host-bundle",): | ||
| return True, "" | ||
|
|
||
| host_plat = host_platform() | ||
| container_plat = container_platform(service, compose_file) | ||
|
|
||
| if not platforms_compatible(host_plat, container_plat): | ||
| return False, ( | ||
| f"CodeQL bundle is for {host_plat}; sandbox service " | ||
| f"{service!r} runs {container_plat}. " | ||
| "install_strategy=mount-host-bundle cannot cross platforms. " | ||
| "Use install_strategy=download-in-container or image-preinstalled " | ||
| "(not yet supported)." |
There was a problem hiding this comment.
copy-host-bundle bypasses platform compatibility checks.
Line 25 only guards mount-host-bundle, but copy-host-bundle also carries a host-built CodeQL bundle and can fail the same way on cross-platform host/container pairs. This should be validated identically.
🔧 Proposed fix
- if install_strategy not in ("mount-host-bundle",):
+ if install_strategy not in ("mount-host-bundle", "copy-host-bundle"):
return True, ""
@@
- "install_strategy=mount-host-bundle cannot cross platforms. "
- "Use install_strategy=download-in-container or image-preinstalled "
- "(not yet supported)."
+ f"install_strategy={install_strategy} cannot cross platforms. "
+ "Use install_strategy=image-preinstalled."🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tools/codeql/in_docker.py` around lines 25 - 37, The guard only treats
"mount-host-bundle" as a host-bundle case but misses "copy-host-bundle", so
update the conditional that checks install_strategy to include
"copy-host-bundle" (e.g., install_strategy not in ("mount-host-bundle",
"copy-host-bundle")) and ensure the subsequent validation uses host_platform(),
container_platform(service, compose_file), and platforms_compatible() to reject
cross-platform combinations with an error message that mentions both
mount-host-bundle and copy-host-bundle as unsupported for cross-platform
installs.
| def _generate_run_id() -> str: | ||
| ts = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ") | ||
| fingerprint = hashlib.sha256(ts.encode()).hexdigest()[:8] | ||
| return f"{ts}-{fingerprint}" |
There was a problem hiding this comment.
Run ID generation can collide for runs started in the same second.
Line 18 hashes only ts, so it’s deterministic per second. Two starts within one second produce identical run_id, breaking per-run isolation.
Suggested fix
import hashlib
+import uuid
@@
def _generate_run_id() -> str:
ts = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
- fingerprint = hashlib.sha256(ts.encode()).hexdigest()[:8]
+ fingerprint = uuid.uuid4().hex[:8]
return f"{ts}-{fingerprint}"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tools/codeql/pipeline.py` around lines 16 - 19, The _generate_run_id function
currently derives fingerprint solely from ts, causing identical run IDs for
starts within the same second; update _generate_run_id to add non-deterministic
entropy (e.g., include microseconds, process id, or a random value such as
uuid.uuid4().hex or secrets.token_hex) into the hashed input or as part of the
fingerprint so each invocation is unique; keep the return format
f"{ts}-{fingerprint}" and update the fingerprint calculation in _generate_run_id
to hash timestamp plus the chosen random/entropy source (or replace hashing with
a truncated uuid4) to avoid collisions.
| result = subprocess.run( | ||
| ["uname", "-sm"], capture_output=True, text=True, timeout=10 | ||
| ) | ||
| return result.stdout.strip() | ||
| except Exception: | ||
| return "unknown" |
There was a problem hiding this comment.
Handle non-zero probe exits as unknown instead of returning arbitrary stderr text.
Both probes only fall back on exceptions, but subprocess.run(...) with default check=False does not raise on command failure. A failed probe can therefore return empty/diagnostic text and be treated as a real platform string, causing false incompatibility in downstream gating.
Suggested fix
def host_platform() -> str:
@@
- result = subprocess.run(
+ result = subprocess.run(
["uname", "-sm"], capture_output=True, text=True, timeout=10
)
- return result.stdout.strip()
+ if result.returncode != 0:
+ return "unknown"
+ value = result.stdout.strip()
+ return value or "unknown"
except Exception:
return "unknown"
@@
- result = subprocess.run(
+ result = subprocess.run(
[
"docker", "compose", "-f", str(compose_file),
"exec", "-T", service,
"uname", "-sm",
],
capture_output=True, text=True, timeout=timeout,
)
- return result.stdout.strip() or result.stderr.strip()
+ if result.returncode != 0:
+ return "unknown"
+ value = result.stdout.strip()
+ return value or "unknown"
except Exception:
return "unknown"Also applies to: 31-40
🧰 Tools
🪛 Ruff (0.15.15)
[error] 16-16: Starting a process with a partial executable path
(S607)
[warning] 19-19: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tools/codeql/platform.py` around lines 15 - 20, The probe currently returns
result.stdout.strip() even when the command fails; change the subprocess.run
invocation handling in the probe (the call that uses ["uname", "-sm"] and the
similar probe at lines 31-40) to treat non-zero exit codes as failures: after
subprocess.run(...) check result.returncode and if it's non-zero (or
result.stdout is empty/unreliable) return "unknown" instead of returning
stdout/stderr; update the code paths around the subprocess.run call and the
existing return result.stdout.strip() so both probes consistently return
"unknown" on non-zero exits or empty output.
| def cmd_recipe_validate(args: argparse.Namespace) -> int: | ||
| path = Path(args.path) if hasattr(args, "path") and args.path else SANDBOX_RECIPE_PATH | ||
| if not path.is_file(): | ||
| print(C.fail(f"Sandbox recipe not found at {path}"), file=sys.stderr) | ||
| return 1 | ||
|
|
||
| try: | ||
| from sandbox.recipe import load_recipe, validate_recipe | ||
| recipe = load_recipe(path) | ||
| except Exception as exc: | ||
| print(C.fail(f"Failed to load recipe: {exc}"), file=sys.stderr) | ||
| return 1 | ||
|
|
||
| errors = validate_recipe(recipe, root=str(ROOT)) | ||
| if errors: | ||
| print(C.fail(f"Sandbox recipe at {path} has {len(errors)} validation error(s):"), file=sys.stderr) | ||
| for err in errors: | ||
| print(f" {C.SYM_BULLET} {err}") | ||
| return 1 | ||
|
|
||
| print(C.ok(f"Sandbox recipe at {path} is valid.")) | ||
| return 0 | ||
|
|
||
|
|
||
| def cmd_recipe_print(args: argparse.Namespace) -> int: | ||
| path = Path(args.path) if hasattr(args, "path") and args.path else SANDBOX_RECIPE_PATH | ||
| if not path.is_file(): | ||
| print(C.fail(f"Sandbox recipe not found at {path}"), file=sys.stderr) | ||
| return 1 | ||
|
|
||
| try: | ||
| from sandbox.recipe import load_recipe | ||
| recipe = load_recipe(path) | ||
| except Exception as exc: | ||
| print(C.fail(f"Failed to load recipe: {exc}"), file=sys.stderr) | ||
| return 1 | ||
|
|
||
| if args.format == "json": | ||
| _emit(recipe, "json") | ||
| else: | ||
| from sandbox.recipe import dump_recipe | ||
| print(dump_recipe(recipe).rstrip()) | ||
|
|
||
| return 0 | ||
|
|
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift
Keep tools/sandbox-bootstrap.py as delegation-only; move recipe command logic into a package module.
These new command bodies add implementation in a root tools/*.py script. Per repo policy, this file should stay a thin wrapper and delegate to implementation under a package (e.g., tools/sandbox/...).
As per coding guidelines: “Standalone scripts at the tools/ root … should be thin wrappers that delegate to their respective packages. Implementation must live in the package, not the script.”
🧰 Tools
🪛 Ruff (0.15.15)
[warning] 1571-1571: Do not catch blind exception: Exception
(BLE001)
[warning] 1595-1595: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tools/sandbox-bootstrap.py` around lines 1562 - 1606, The cmd_recipe_validate
and cmd_recipe_print functions implement business logic in the top-level
tools/sandbox-bootstrap.py; per policy this file must be a thin delegating
wrapper—move the implementation into a package module (e.g., create
tools.sandbox.recipe_commands or similar) and keep these functions as one-line
delegators that import and call the package functions (e.g.,
recipe_commands.validate_recipe_cmd and recipe_commands.print_recipe_cmd)
passing through args; ensure all imports and heavy logic (load_recipe,
validate_recipe, dump_recipe, _emit, error handling and printing) live in the
new package module and update the wrapper functions cmd_recipe_validate and
cmd_recipe_print to simply import and return the package function results.
Source: Coding guidelines
| def _validate_codeql_hints(codeql: dict[str, Any], prefix: str) -> list[str]: | ||
| errors: list[str] = [] | ||
|
|
||
| install_strategy = codeql.get("install_strategy") | ||
| if install_strategy is not None: | ||
| if not isinstance(install_strategy, str) or install_strategy not in VALID_INSTALL_STRATEGIES: | ||
| valid = ", ".join(sorted(VALID_INSTALL_STRATEGIES)) | ||
| errors.append( | ||
| f"sandbox-recipe.yml: {prefix}.install_strategy {install_strategy!r} invalid (allowed: {valid})" | ||
| ) | ||
|
|
||
| preferred_mode = codeql.get("preferred_execution_mode") | ||
| if preferred_mode is not None: | ||
| if not isinstance(preferred_mode, str) or preferred_mode not in VALID_EXECUTION_MODES: | ||
| valid = ", ".join(sorted(VALID_EXECUTION_MODES)) | ||
| errors.append( | ||
| f"sandbox-recipe.yml: {prefix}.preferred_execution_mode {preferred_mode!r} invalid (allowed: {valid})" | ||
| ) |
There was a problem hiding this comment.
Top-level codeql.default_execution_mode is currently unvalidated.
validate_recipe() passes top-level codeql into _validate_codeql_hints(), but this helper only validates preferred_execution_mode. That leaves invalid codeql.default_execution_mode values undetected even though the example schema uses that key.
🔧 Proposed fix
def _validate_codeql_hints(codeql: dict[str, Any], prefix: str) -> list[str]:
errors: list[str] = []
@@
- preferred_mode = codeql.get("preferred_execution_mode")
- if preferred_mode is not None:
- if not isinstance(preferred_mode, str) or preferred_mode not in VALID_EXECUTION_MODES:
+ # Build-target hints use preferred_execution_mode; top-level codeql uses default_execution_mode.
+ mode_key = "preferred_execution_mode" if "preferred_execution_mode" in codeql else "default_execution_mode"
+ mode_value = codeql.get(mode_key)
+ if mode_value is not None:
+ if not isinstance(mode_value, str) or mode_value not in VALID_EXECUTION_MODES:
valid = ", ".join(sorted(VALID_EXECUTION_MODES))
errors.append(
- f"sandbox-recipe.yml: {prefix}.preferred_execution_mode {preferred_mode!r} invalid (allowed: {valid})"
+ f"sandbox-recipe.yml: {prefix}.{mode_key} {mode_value!r} invalid (allowed: {valid})"
)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tools/sandbox/recipe.py` around lines 175 - 192, The helper
_validate_codeql_hints currently skips validating the top-level codeql key
default_execution_mode; add a validation block (similar to the preferred_mode
check) that reads codeql.get("default_execution_mode"), ensures it's a string
and one of VALID_EXECUTION_MODES, and appends an error with the same format
(f"sandbox-recipe.yml: {prefix}.default_execution_mode {value!r} invalid
(allowed: {valid})") when invalid so validate_recipe() will catch bad
default_execution_mode values.
Summary
Adds a detailed WIP implementation plan for refactoring the CodeQL integration around sandbox-generated build recipes.
The plan proposes:
itemdb/notes/sandbox-recipe.ymlas the machine-readable contract between sandbox and CodeQL;codeql-plan.ymlas the CodeQL analysis-intent document;Scope
This PR is intentionally documentation-only for review. No implementation code is changed yet.
Review focus
Please review especially:
sandbox-recipe.ymlschema.codeql-plan.ymlanalysis units and sandbox build targets.Summary by CodeRabbit
New Features
Refactoring
Improvements