Skip to content

fix(ruby-extractor): capture keyword params and members of nested classes/modules#439

Open
tirth8205 wants to merge 2 commits into
Egonex-AI:mainfrom
tirth8205:fix/ruby-extractor-edge-cases
Open

fix(ruby-extractor): capture keyword params and members of nested classes/modules#439
tirth8205 wants to merge 2 commits into
Egonex-AI:mainfrom
tirth8205:fix/ruby-extractor-edge-cases

Conversation

@tirth8205

Copy link
Copy Markdown
Contributor

Problem

The Ruby extractor dropped two common constructs from structural analysis:

  1. Keyword parameters. extractParams switched on the parameter node type but had no case for keyword_parameter — the tree-sitter node for Ruby keyword args (name:, age: 30). So def configure(name:, age: 30) produced params: [], and a mixed signature def build(a, b, name:, opts: {}) produced params: ["a","b"], losing every keyword arg. Keyword arguments are pervasive in modern Ruby, so this dropped a large fraction of real method signatures.

  2. Members of nested classes/modules. extractClassBody only inspected method, singleton_method, and call members of a class/module body. It never handled nested class or module members, so module Outer; class Inner; def foo; end; end; end produced classes: ['Outer'] and functions: [] — the Inner class and its foo method vanished. Nesting a class inside a module is the standard Ruby namespacing idiom.

Fix

In understand-anything-plugin/packages/core/src/plugins/extractors/ruby-extractor.ts:

  • Added a keyword_parameter case in extractParams that pushes ident.text + ":" (the trailing : distinguishes keyword args from positional ones).
  • Threaded the shared classes array into extractClassBody and added class/module member handling that recurses through the existing extractClass/extractModule helpers, appending nested declarations and their methods to the top-level classes/functions arrays.

Testing

Added three test cases to ruby-extractor.test.ts (extracts keyword parameters, extracts mixed positional and keyword parameters, extracts classes/modules nested inside a module). All three fail before the fix (keyword params dropped; Inner missing) and pass after. The full core Vitest suite (695 tests across all extractors) is green, tsc --noEmit exits 0, and ESLint is clean on both changed files.

🤖 Generated with Claude Code

…sses/modules

extractParams had no case for `keyword_parameter`, so Ruby keyword args
(`name:`, `age: 30`) were silently dropped from method signatures.
extractClassBody also only inspected method/singleton_method/call members,
so classes/modules nested inside a module (the standard namespacing idiom)
and their methods vanished from the structural analysis.

Add a `keyword_parameter` case (emitting `name:`) and thread the shared
`classes` array through extractClassBody so nested class/module members
recurse into extractClass/extractModule.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@thejesh23 thejesh23 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few concerns before this lands.

1. Nested classes lose namespace context in the flat classes[]. module Outer; class Inner; end; end now produces two entries named "Outer" and "Inner" — not "Outer::Inner". Two modules each defining class Config will both register as "Config" and become indistinguishable downstream (no way to attribute methods/functions back to the right namespace). Compare with the existing namespaced-class path which preserves "Foo::Bar" via scope_resolution. Either qualify on push ("${outerName}::${innerName}") or document the ambiguity. Same shape will hit the Dart extractor in #435 once it grows nested-class handling.

2. Exports drift from classes[]. extractStructure's top-level switch is the only place that appends to exports. Pre-fix that was fine because nested classes were dropped from classes too; now classes contains Inner but exports does not. Anything that joins exports against the class list (or treats classes[] as the export surface for Ruby) will see a mismatch. Worth either also surfacing nested decls as exports or noting this is intentional.

3. Test uses arrayContaining, so ordering regressions slip through. Because classes.push happens after the body recursion, the inner class is pushed before its enclosing module — order in classes[] is [Inner, Outer], not source order. The new test is order-agnostic, so a future change that fixes (or further breaks) ordering won't be caught. Also no assertion that Outer.methods does NOT contain foo — the current code is correct on this point but it's the most likely regression vector if someone later folds the class/module branches back into the method path.

Nit: the extractParams JSDoc still lists the handled cases exhaustively; please add keyword_parameter to the comment so the next reader doesn't think it's missing.

…k test invariants

Document at the nested class/module recursion site that nested declarations
are surfaced into classes[] with their bare name only (no namespace
qualification, matching python-extractor) and are intentionally omitted from
exports[] (only top-level decls are exports). Add keyword_parameter to the
extractParams JSDoc handled-cases list.

Strengthen the nested-class test: assert exact classes[] order [Inner, Outer]
and that the nested method `foo` lands on Inner.methods, not Outer.methods.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@tirth8205

Copy link
Copy Markdown
Contributor Author

Addressed all four points.

1 — Nested classes lose namespace context. Kept bare names but documented the contract at the nested recursion site in extractClassBody. Namespace qualification (Outer::Inner) would introduce a new cross-extractor naming convention: python-extractor pushes bare nameNode.text and doesn't surface nested classes at all, and the existing Foo::Bar case only preserves the qualified form because the source literally writes a scope_resolution node — it is not runtime-nesting qualification. Threading an enclosing-name prefix through extractClass/extractModule/extractClassBody is a behavior expansion beyond this edge-case fix, so I took the documented-ambiguity option you offered. The comment spells out that two same-named nested decls in different scopes are indistinguishable in the flat classes[].

2 — Exports drift from classes[]. Documented as intentional in the same comment: per the extractor's stated contract only top-level declarations are exports (Ruby has no formal export syntax, and a class nested inside a module is not a top-level export surface), so classes[] and exports[] can legitimately diverge for nested decls. No consumer treats classes[] as the Ruby export surface, so I did not expand exports[].

3 — Order-agnostic test + missing negative assertion. Replaced arrayContaining with an exact toEqual(['Inner', 'Outer']) (verified by running the test — extractClassBody recurses into and pushes Inner before the enclosing Outer is pushed). Added assertions that Inner.methods contains foo and Outer.methods does not, locking the method-attribution invariant you flagged as the likely regression vector.

Nit. Added keyword_parameter (name:) to the extractParams JSDoc handled-cases list.

Full core suite green (695 passed); tsc --noEmit clean.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants