feat(Release): 2026-05-12 #58
Merged
Co-authored-by: Alexander Raihelgaus <[email protected]>
closes #6
Covers the seven PRs in this release:
- #35: parse --level default → reachable (CLI consistency with scan + Python CLI)
- #36: auto-detect dep changes via ~/.openant/venv/.deps-hash
- #37: lazy JS parser npm bootstrap on first use
- #39: TypeScript/NestJS DI-aware call resolution (constructor + field + functional inject())
- #40: --language auto opt-in for openant init + non-git path support + shared config/languages.json
- #49: Express anonymous route handler extraction (route_handler / route_middleware)
- #50: --llm-reachability opt-in stage + cross-parser call_graph.json contract

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
…line.py

When test_pipeline.py is invoked as a subprocess by core/parser_adapter.py (which is the production path for openant scan/parse), the cwd may not include the openant-core root on sys.path. The `from utilities.file_io import ...` line was running before the sys.path.insert(...) that adds the core root, causing ModuleNotFoundError under any environment that didn't already have openant-core installed via `pip install -e` (e.g. local dev).

CI passes via the python-tests workflow's `pip install -e .` step, so the issue was invisible there. Production also works for the same reason. But local pytest from openant-core/ without PYTHONPATH=. surfaced the bug (6 cross-parser tests failed with ModuleNotFoundError).

Move the sys.path.insert(...) above the utilities imports so all utilities imports resolve via the explicit path mod, matching the JS/Go pattern established in #56.

Verified locally: 15 passed / 2 Docker-skipped (was 9 passed / 6 failed) on tests/test_call_graph_output.py without PYTHONPATH.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
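The ordering issue above can be reproduced in isolation. This is a minimal sketch, not the project's code: `demo_utilities` and `import_with_explicit_root` are hypothetical stand-ins, and a temporary directory plays the role of the openant-core root.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_with_explicit_root(core_root: Path, module_name: str):
    """Put core_root on sys.path first, then import module_name."""
    root = str(core_root)
    if root not in sys.path:
        sys.path.insert(0, root)
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


# Hypothetical stand-in for the core root: a temp dir holding one package.
demo_root = Path(tempfile.mkdtemp())
package_dir = demo_root / "demo_utilities"
package_dir.mkdir()
(package_dir / "__init__.py").write_text("", encoding="utf-8")
(package_dir / "file_io.py").write_text(
    "def read_json(path):\n    return {}\n", encoding="utf-8"
)

# Importing before the path is extended fails, as in the bug report.
try:
    importlib.import_module("demo_utilities.file_io")
    failed_before = False
except ModuleNotFoundError:
    failed_before = True

# Extending sys.path first makes the same import resolve.
file_io = import_with_explicit_root(demo_root, "demo_utilities.file_io")
```

The point of the sketch is only the ordering: the path mutation has to run before the first import that depends on it, regardless of the caller's cwd.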
…stage

The four file I/O sites in the LLM-reachability stage of scan_repository were using bare `with open(..., encoding="utf-8")` instead of the read_json / write_json helpers from utilities.file_io introduced in #56. Functionally equivalent (both go through open_utf8 under the hood) but inconsistent with the post-#56 convention used elsewhere in the file (line 30 already imports read_json; the rest of scanner.py at lines 167, 696, 704 already uses it).

Convert all four sites:
- dataset load (active_dataset_path) → read_json
- app context load (app_context_path) → read_json
- signals write (llm_reachability.json) → write_json
- dataset persist (active_dataset_path) → write_json

Verified locally: 44 passed / 2 Docker-skipped on test_llm_reachability.py + test_call_graph_output.py.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
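For readers without the repository at hand, the helper layering described above might look roughly like this. It is an illustrative sketch only: the names read_json, write_json, and open_utf8 come from the commit message, but the signatures, defaults, and JSON formatting options here are assumptions, not the actual utilities.file_io implementation.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Union

PathLike = Union[str, Path]


def open_utf8(path: PathLike, mode: str = "r"):
    """Open a text file with an explicit UTF-8 encoding (assumed signature)."""
    return open(path, mode, encoding="utf-8")


def read_json(path: PathLike) -> Any:
    """Load JSON through open_utf8 rather than a bare open()."""
    with open_utf8(path) as handle:
        return json.load(handle)


def write_json(path: PathLike, data: Any) -> None:
    """Write JSON through open_utf8; indent and ensure_ascii are assumptions."""
    with open_utf8(path, "w") as handle:
        json.dump(data, handle, indent=2, ensure_ascii=False)


# Round trip with a non-ASCII value to show why the explicit encoding matters.
demo_path = Path(tempfile.mkdtemp()) / "llm_reachability.json"
write_json(demo_path, {"unit": "handler", "note": "café"})
loaded = read_json(demo_path)
```

Routing every site through one pair of helpers keeps the encoding decision in a single place, which is the consistency argument the commit makes.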
Per-unit code-blob truncation in the LLM reachability stage was hardcoded
at MAX_CODE_BYTES = 1500. That caps the LLM at ~30-50 lines per unit and
silently drops entry-point indicators past that cutoff in long handlers,
generated code, and class methods where the security-relevant pattern is
embedded mid-body.

Default stays 1500 (no behaviour change for existing users) but power
users can opt into a larger budget via --llm-reachability-max-code-bytes
on `openant scan`. Common values:
- 1500 (default): cheapest, ~30-50 lines per unit
- 4096: ~$1-4 extra per scan, fits a full handler body + utility funcs
- 8192: ~$2-8 extra per scan, edge cases with very long handlers

Surface:
- Go CLI: --llm-reachability-max-code-bytes int (default 1500),
  forwarded to the Python CLI only when non-default.
- Python CLI: matching --llm-reachability-max-code-bytes argparse flag,
  threaded into scan_repository(llm_reachability_max_code_bytes=...).
- core.scanner.scan_repository: new param, passed into
  analyze_reachability(max_code_bytes=...).
- core.llm_reachability: max_code_bytes parameter chained through
  analyze_reachability → build_prompt → _unit_for_prompt → _trim_code.

Backward compatibility:
- Module constant MAX_CODE_BYTES = 1500 kept as alias of new
  DEFAULT_MAX_CODE_BYTES so any external caller importing the old name
  still works.
- All function signatures default the new param, so existing callers
  (including tests) work unchanged.

Test: tests/test_llm_reachability.py adds
test_max_code_bytes_override_keeps_more_context which verifies a
FINAL_MARKER past byte 1500 is dropped at default but preserved at
max_code_bytes=4096.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
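The effect of the byte budget on a long unit can be illustrated with a small sketch. The constant names mirror the commit message, but `trim_code` is a hypothetical stand-in for the private `_trim_code`, and the UTF-8-safe cut is an assumption about how truncation could be done, not a description of the real function.

```python
DEFAULT_MAX_CODE_BYTES = 1500
# Old name kept as an alias so existing imports keep working.
MAX_CODE_BYTES = DEFAULT_MAX_CODE_BYTES


def trim_code(code: str, max_code_bytes: int = DEFAULT_MAX_CODE_BYTES) -> str:
    """Cut code to at most max_code_bytes of UTF-8 (hypothetical helper)."""
    raw = code.encode("utf-8")
    if len(raw) <= max_code_bytes:
        return code
    # errors="ignore" drops a multi-byte character split by the cut.
    return raw[:max_code_bytes].decode("utf-8", errors="ignore")


# A long handler whose interesting line sits at byte offset 2400.
long_handler = "x = 1\n" * 400 + "FINAL_MARKER\n"

trimmed_default = trim_code(long_handler)
trimmed_large = trim_code(long_handler, max_code_bytes=4096)
```

With the default budget the marker falls past the cut and is lost; at 4096 bytes the whole unit fits, which is the behaviour the new test asserts.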
shahar-davidson
approved these changes
May 13, 2026
Summary
Release bundle for 2026-05-12. Combines seven independently-reviewed PRs into one RC for cleaner master history. Scope is parser depth (TypeScript DI, Express anonymous handlers), dependency UX (auto-reinstall on `pyproject.toml` change, JS parser bootstrap), CLI consistency (`parse` default), and a new opt-in LLM reachability stage.

Included PRs
- fix: default parse --level to reachable to match scan and Python CLI (#35)
  Brings the Go CLI's `parse` command into alignment with `scan` and the Python CLI, both of which already defaulted to `reachable`. The documentation has always said the default is `reachable`; PR #35 makes the code match the documented contract.
- feat: auto-detect dependency changes and reinstall openant (#36)
  SHA-256 hash of `pyproject.toml` stored at `~/.openant/venv/.deps-hash`. Every CLI invocation compares the stored hash against the current one and re-runs `pip install -e <core>` when they differ. Catches a stale venv after `git pull`.
- fix: lazy-install JS parser npm deps on first use (#37)
  `openant parse` on a JS/TS repo no longer fails with `Cannot find module 'ts-morph'`. Auto-runs `npm install` once on first JS parse with `node_modules/.package-lock.json` as the completion sentinel. Cross-platform file lock prevents concurrent install corruption. Closes #6 (JS/TS parser fails on fresh install: missing npm dependencies).
- feat: DI-aware call resolution with nominal type matching for TypeScript/NestJS (#39)
  Resolves `this.userService.findById()` style calls in NestJS / Angular codebases. Covers constructor injection, field-decorator injection (`@Inject` / `@InjectRepository` / etc.), and Angular's functional `inject()` API. Resolution priority: exact type → nominal (`implements`/`extends`) → unambiguous prefix. All steps return `null` on ambiguity. Class-level metadata is file-qualified by `relativePath:className` for multi-module monorepos.
- feat: auto-detect language in init (#40)
  `openant init` now works on non-git directories (`commit_sha = "nogit"` placeholder). Shared `config/languages.json` consumed by both Go CLI and Python parser adapter (eliminates Go↔Python extension-list drift). Language auto-detection exposed as opt-in via `-l auto` (experimental dominance heuristic; see #61, "Validate language auto-detection accuracy before defaulting to it", for the validation work needed before it becomes the default).
- fix: extract Express.js anonymous route handler callbacks (#49)
  `router.post('/x', auth, async (req, res) => {...})` style handlers are now extracted as units. Synth units carry `route_handler` (last callback) or `route_middleware` (earlier callbacks); both are registered in `ENTRY_POINT_TYPES` so the reachability filter doesn't drop them. Receiver filter prevents false positives on cache/query-builder `.get`/`.post(...)` calls. Named middleware identifiers become call-graph edges so `authenticateToken` shows up as an upstream dependency. Closes #21 ([bug] JavaScript parser misses Express.js anonymous route handler callbacks).
- feat: LLM review stage for enhanced reachability detection (#50)
  New opt-in `--llm-reachability` pass that uses Opus to surface entry points the structural analysis misses (framework handlers, plugin/CLI registrations, message queues, external input sites). Promote-only semantics: it never demotes structurally-detected units. Bundle includes `call_graph.json` being written by all 5 previously-missing parsers (C/Ruby/PHP/JS/Go) so the post-LLM re-filter works across languages.

Changelog
`CHANGELOG.md` has a top entry dated [2026-05-12], titled "Parser depth, dependency UX, and LLM reachability (opt-in)", covering the user-facing impact of each PR. (Not yet committed; landing in a separate commit before merge.)

Test plan
Closes #6
Closes #21
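The dependency check described for #36 (hash `pyproject.toml`, compare with the stored value, reinstall on mismatch) can be sketched as follows. This is a rough illustration under stated assumptions: `ensure_deps_current` and its `reinstall` callback are hypothetical names, the real check runs inside the CLI, and the actual `pip install -e <core>` step is replaced here by a recording stub so the sketch runs without touching any environment.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable, List


def file_sha256(path: Path) -> str:
    """Hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def ensure_deps_current(
    pyproject: Path, hash_file: Path, reinstall: Callable[[], None]
) -> bool:
    """Reinstall when pyproject's hash differs from the stored one.

    Returns True if a reinstall was triggered (hypothetical helper).
    """
    current = file_sha256(pyproject)
    stored = (
        hash_file.read_text(encoding="utf-8").strip()
        if hash_file.exists()
        else None
    )
    if stored == current:
        return False
    reinstall()
    # Assumption: record the hash only after reinstall returns, so a
    # failed install is retried on the next invocation.
    hash_file.parent.mkdir(parents=True, exist_ok=True)
    hash_file.write_text(current + "\n", encoding="utf-8")
    return True


# Demo in a temp dir; the stub stands in for `pip install -e <core>`.
workdir = Path(tempfile.mkdtemp())
pyproject = workdir / "pyproject.toml"
hash_file = workdir / "venv" / ".deps-hash"
install_calls: List[str] = []


def record_install() -> None:
    install_calls.append("install")


pyproject.write_text('[project]\nname = "demo"\n', encoding="utf-8")
first_run = ensure_deps_current(pyproject, hash_file, record_install)
second_run = ensure_deps_current(pyproject, hash_file, record_install)

# Simulate a `git pull` that changes the dependency list.
pyproject.write_text(
    '[project]\nname = "demo"\ndependencies = ["requests"]\n', encoding="utf-8"
)
third_run = ensure_deps_current(pyproject, hash_file, record_install)
```

The first run installs because no hash is stored yet, the second is a no-op, and the third reinstalls after the file changes.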