feat(python): initial support for cross-file function resolution via function registry #395

Open
sachin9058 wants to merge 4 commits into cbomkit:main from sachin9058:feat/python-cross-file-resolution
@sachin9058 sachin9058 commented May 5, 2026

Adds initial support for cross-file function resolution in Python analysis using a registry-based approach.

Motivation

Issue #9 highlights that cryptographic operations are not detected when wrapped inside functions defined in other files. Previous approaches relied on re-scanning files, which is not compatible with Sonar’s production model.

This PR introduces a safer, production-compatible approach.

What this PR does

  • Collects function definitions across files using a shared in-memory registry
  • Resolves function calls by matching names against registered definitions
  • Traverses function bodies using the existing visitor instead of re-scanning files
  • Avoids any dependency on test utilities or manual file scanning

Implementation Details

  • Function definitions are stored in a registry: Map<String, List<Tree>>
  • Resolution is performed inside visitCallExpression
  • A visited set prevents recursive re-traversal of the same function
  • Registry is cleared per top-level scan to avoid state leakage
  • Feature is guarded behind ENABLE_REGISTRY = false to ensure no regression
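The steps above can be sketched in Python with the standard `ast` module. This is only an analogy under stated assumptions, not the PR's Java code or the Sonar API: `CallResolver`, `analyze`, and the `enable_registry` parameter are hypothetical names, and the sketch collects definitions in a separate pass for simplicity, whereas the PR registers them as files are visited.

```python
import ast


class CallResolver(ast.NodeVisitor):
    """Hypothetical visitor: follows calls into registered function bodies."""

    def __init__(self, registry, enable_registry=False):
        self.registry = registry          # name -> list of ast.FunctionDef
        self.enable_registry = enable_registry
        self.visited = set()              # ids of definitions already traversed
        self.findings = []                # (method name, line) of each sign call

    def visit_Call(self, node):
        func = node.func
        if isinstance(func, ast.Attribute) and func.attr == "sign":
            self.findings.append((func.attr, node.lineno))
        elif self.enable_registry and isinstance(func, ast.Name):
            for definition in self.registry.get(func.id, []):
                if id(definition) not in self.visited:
                    self.visited.add(id(definition))
                    self.generic_visit(definition)  # reuse this visitor on the body
        self.generic_visit(node)


def analyze(caller_source, other_sources, enable_registry=False):
    registry = {}                         # fresh registry per top-level scan
    for source in other_sources:
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, ast.FunctionDef):
                registry.setdefault(node.name, []).append(node)
    resolver = CallResolver(registry, enable_registry)
    resolver.visit(ast.parse(caller_source))
    return resolver.findings


CALLER = "from helper import custom_sign\nsig = custom_sign(data)\n"
HELPER = (
    "def custom_sign(data):\n"
    "    return private_key.sign(data, ec.ECDSA(hashes.SHA256()))\n"
)
```

With the flag on, `analyze(CALLER, [HELPER], enable_registry=True)` reports the `sign` call on line 2 of the helper source. With the default (flag off) it reports nothing, which mirrors the no-regression guard described above.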

Example

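# file1.py
from helper import custom_sign
data = b"message"  # assumed input, for illustration
sig = custom_sign(data)

# helper.py
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec

private_key = ec.generate_private_key(ec.SECP256R1())  # assumed key setup

def custom_sign(data):
    return private_key.sign(data, ec.ECDSA(hashes.SHA256()))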

With this change, the sign call inside helper.py is detected.

Known Limitations

  • Resolution is name-based only (no symbol-level disambiguation)
  • Does not handle shadowing, aliases, or complex import paths
  • Depends on analysis order (definitions must be visited before calls)
  • Uses a static registry (prototype design, not scoped per analysis context)
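Two of these limitations are easy to reproduce in a small Python sketch. It uses a plain dictionary as a stand-in for the registry; `collect_defs` and the two `encode` functions are hypothetical and exist only for illustration.

```python
import ast


def collect_defs(registry, source):
    """Register every function definition in `source` under its bare name."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            registry.setdefault(node.name, []).append(node)


registry = {}

# Order dependence: a call resolved before any definition is seen finds nothing.
before = len(registry.get("encode", []))

# Name-only matching: two unrelated modules define `encode`, and both end up
# under the same key, so a call to either one would be followed into both.
collect_defs(registry, "def encode(data):\n    return key.sign(data)\n")
collect_defs(registry, "def encode(data):\n    return data.hex()\n")
after = len(registry.get("encode", []))
```

Here `before` is 0 and `after` is 2: the lookup cannot tell which `encode` a given call site refers to.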

These limitations are intentional to keep the implementation safe and incremental.

Validation

  • Added regression test for cross-file RSA sign detection
  • All Python tests pass: 46 tests, 0 failures
  • No regressions observed with registry disabled by default

Notes

This PR focuses on validating a production-compatible approach without introducing breaking changes. Further improvements (context-scoped registry, symbol resolution) can be addressed in follow-up work.

…ple imports)

- Track imported functions via visitImportFrom
- Resolve and analyze imported modules on function calls
- Prevent recursive scanning using visited file tracking
- Add regression test for cross-file RSA sign detection
- Limit scope to same-directory modules (prototype implementation)
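The flow in this commit message can be approximated in Python as follows. This is a rough sketch using the standard `ast` and `pathlib` modules rather than the plugin's Java visitor; all function names are hypothetical, and, like the prototype, it ignores aliases, relative imports, and packages.

```python
import ast
from pathlib import Path


def imported_functions(source):
    """Map each name from `from module import name` to its module (no aliases)."""
    imports = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            for alias in node.names:
                imports[alias.name] = node.module
    return imports


def resolve_same_directory(caller_path, module):
    """Return the sibling `<module>.py` if it exists; dotted modules are skipped."""
    if "." in module:
        return None
    candidate = Path(caller_path).parent / f"{module}.py"
    return candidate if candidate.is_file() else None


def modules_to_scan(caller_path, source, visited):
    """List imported modules reached by a call, skipping files already visited."""
    imports = imported_functions(source)
    targets = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            module = imports.get(node.func.id)
            path = resolve_same_directory(caller_path, module) if module else None
            if path is not None and path not in visited:
                visited.add(path)
                targets.append(path)
    return targets
```

Passing the same `visited` set through nested scans is what keeps two modules that import each other from being scanned in a loop.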

Signed-off-by: Sachin Kumar <sachinkumar905846@gmail.com>
Copilot AI review requested due to automatic review settings May 5, 2026 13:29
@sachin9058 sachin9058 requested a review from a team as a code owner May 5, 2026 13:29
@sachin9058

@n1ckl0sk0rtge

I’ve implemented an initial version of cross-file function resolution
supporting simple "from x import y" cases within the same directory.

This allows detection of cryptographic operations across files for basic scenarios.

This PR is intentionally scoped as a first step. More advanced cases
(package resolution, symbol linking, Sonar API integration) can be built on top.

Would appreciate feedback on whether this direction aligns with the intended design.

Copilot AI left a comment

Pull request overview

Adds an initial prototype for Python cross-file crypto detection by following simple from ... import ... calls into another module and scanning that module during analysis. This fits the existing Python detection pipeline by extending PythonBaseDetectionRule and adding a regression test around an imported signing helper.

Changes:

  • Track ImportFrom symbols and trigger imported-module analysis when an imported function is called.
  • Add path resolution / visited-file guarding for recursive module scanning.
  • Add a regression test fixture covering an imported RSA signing helper.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| python/src/main/java/com/ibm/plugin/rules/detection/PythonBaseDetectionRule.java | Adds import tracking, imported-module resolution, recursive scan guarding, and cross-file scan invocation. |
| python/src/test/java/com/ibm/plugin/rules/resolve/ResolveImportedSignTest.java | Adds a regression test that captures and asserts the detection tree for an imported signing helper. |
| python/src/test/files/rules/resolve/ResolveImportedSignTestFile.py | Provides the caller-side Python fixture that imports and invokes the helper function. |
| python/src/test/files/rules/resolve/imports/ResolveImportedSignImport.py | Provides the imported helper module containing RSA key generation and signing logic. |

Comment thread python/src/main/java/com/ibm/plugin/rules/detection/PythonBaseDetectionRule.java Outdated
Comment thread python/src/main/java/com/ibm/plugin/rules/detection/PythonBaseDetectionRule.java Outdated
Comment thread python/src/main/java/com/ibm/plugin/rules/detection/PythonBaseDetectionRule.java Outdated
Comment thread python/src/main/java/com/ibm/plugin/rules/detection/PythonBaseDetectionRule.java Outdated
- Fix lifecycle bug using scanDepth
- Guard TestPythonVisitorRunner via reflection
- Prevent repeated scans
- Document limitations clearly

Signed-off-by: Sachin Kumar <sachinkumar905846@gmail.com>
@sachin9058

  • Fixed lifecycle issue by resetting visited-file state per top-level scan using scanDepth.
  • Replaced direct usage of TestPythonVisitorRunner with a guarded reflective call to avoid production dependency issues.
  • Retained visited-file guard to prevent repeated scans.
  • Documented current limitations (name-based resolution, full-module scanning).
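The depth-counter idea behind the lifecycle fix might look like the Python sketch below. The PR only states that `scanDepth` is used to reset visited-file state per top-level scan; the `ScanState` class and its method names are assumptions made for illustration.

```python
from contextlib import contextmanager


class ScanState:
    """Hypothetical holder for the scan depth and the visited-file guard."""

    def __init__(self):
        self.scan_depth = 0
        self.visited_files = set()

    @contextmanager
    def scan(self, path):
        if self.scan_depth == 0:
            # Entering a top-level scan: drop state left over from earlier scans.
            self.visited_files.clear()
        self.scan_depth += 1
        self.visited_files.add(path)
        try:
            yield
        finally:
            self.scan_depth -= 1


state = ScanState()
with state.scan("file1.py"):
    with state.scan("helper.py"):          # nested scan keeps the outer state
        nested = set(state.visited_files)
with state.scan("other.py"):               # new top-level scan starts clean
    fresh = set(state.visited_files)
```

Resetting only at depth zero means nested scans share one guard, while each new top-level scan starts without entries from the previous one.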

This keeps the implementation safe while validating the cross-file resolution approach.

Happy to iterate further based on feedback.

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

Comment thread python/src/main/java/com/ibm/plugin/rules/detection/PythonBaseDetectionRule.java Outdated
Comment thread python/src/main/java/com/ibm/plugin/rules/detection/PythonBaseDetectionRule.java Outdated
Comment thread python/src/main/java/com/ibm/plugin/rules/detection/PythonBaseDetectionRule.java Outdated
Comment thread python/src/main/java/com/ibm/plugin/rules/detection/PythonBaseDetectionRule.java Outdated
- clear registry per top-level scan
- add null-safe iteration
- prevent duplicate function definitions

Signed-off-by: Sachin Kumar <sachinkumar905846@gmail.com>
@sachin9058

@n1ckl0sk0rtge

I’ve replaced the previous prototype (file re-scanning) with a registry-based approach
that leverages Sonar’s AST traversal instead of invoking scans manually.

The implementation is intentionally scoped and guarded behind a feature flag
to avoid impacting existing behavior.

I’ve also addressed code-level concerns:

  • cleared registry per top-level scan to prevent state leakage
  • added null-safe iteration
  • prevented duplicate function registrations
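These three fixes can be illustrated with a small Python stand-in for the registry. It is a sketch under assumptions: `FunctionRegistry` and `definitions_for` are hypothetical names, and the actual Java implementation may differ.

```python
class FunctionRegistry:
    """Hypothetical name -> definitions map with identity-based de-duplication."""

    def __init__(self):
        self._defs = {}

    def clear(self):
        # Called at the start of each top-level scan to avoid state leakage.
        self._defs.clear()

    def register(self, name, definition):
        bucket = self._defs.setdefault(name, [])
        # Skip a definition object that is already registered under this name.
        if not any(existing is definition for existing in bucket):
            bucket.append(definition)

    def get(self, name):
        return self._defs.get(name)        # None when the name is unknown


def definitions_for(registry, name):
    """Null-safe lookup: unknown names yield an empty list instead of None."""
    return list(registry.get(name) or ())


registry = FunctionRegistry()
node = object()                            # stand-in for an AST node
registry.register("custom_sign", node)
registry.register("custom_sign", node)     # second registration is ignored
```

After this, `definitions_for(registry, "custom_sign")` holds a single entry, and `definitions_for(registry, "unknown")` is an empty list that is safe to iterate.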

Current limitations (name-based resolution, analysis order dependency, static registry)
are documented in code and PR, and can be improved in follow-up work.

Happy to iterate further based on guidance.

…e prototype

Signed-off-by: Sachin Kumar <sachinkumar905846@gmail.com>
@sachin9058 sachin9058 changed the title from "feat(python): initial support for cross-file function resolution (simple imports)" to "feat(python): initial support for cross-file function resolution via function registry" May 5, 2026