feat: GraphQL SDL extraction + federation, operation→resolver & call-site graph links #1438
Open
daniil-kzn wants to merge 3 commits into
graphify has no tree-sitter grammar for GraphQL, so .graphqls schema files are silently skipped: types, inputs and mutations defined in SDL never enter the graph. This adds a focused, graphql-core based per-file extractor wired through the existing `_get_extractor` dispatch (same mechanism as the blade/mcp/manifest special cases). Zero behavior change unless a .graphqls / .graphql file is present.

Emits structured nodes (not a plain-text sidecar): types, inputs, interfaces, enums (+ values), scalars, unions, their fields, and root Mutation/Query fields as operations.

Edges:
  type --contains--> field
  field --references--> field's named type
  operation --references--> argument input type
  operation --returns--> return type

- New module graphify/graphql_sdl.py; adds graphql-core dependency.
- Degrades to a no-op if graphql-core is unavailable; malformed schemas return an error marker instead of raising.
- tests/test_graphql_sdl.py covers types/inputs/enums/operations, the operation->input/return edges, and malformed-input safety.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
a2258de to e1b8f62
Adds a call-site extractor that captures where code *invokes* a GraphQL operation (gql`...` / graphql`...` tagged template literals in TS/JS and graphql:"..." struct tags in Go), which tree-sitter indexes as opaque string literals and the SDL pass therefore can't reach. Each call site becomes a `gql_call` node; a per-repo pass links same-repo calls and a global stitch links cross-repo calls to the owning `gql_operation` by name, so a frontend's call to a backend mutation becomes a real edge.

This closes the loop opened by the SDL extractor: with calls linked, a query for a backend operation surfaces every consumer it would affect across repos (frontend documents and Go service clients alike).

- graphify/graphql_calls.py: pure scanner (root-selection parse for TS, tag parse for Go) + gql_call node builder. graphql-core not required.
- graphify/extract.py: fold call-site extraction into the per-file code extractor (so it caches/incrementally updates like AST) + per-repo call->operation linking.
- graphify/global_graph.py: _stitch_gql_calls, idempotent cross-repo linking, re-run on every global_add alongside _stitch_federation.
- graphify/cache.py: bump extractor salt so the AST cache regenerates.
- tests/test_gql_calls.py: scanner edge cases (args, aliases, inline-object args, fragments, Go tags) + per-repo and cross-repo linking + idempotency.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
What / why
graphify's structural extraction is tree-sitter based, and there is no
tree-sitter grammar for GraphQL. As a result, `.graphqls` / `.graphql` schema
files are silently skipped (the types, inputs and mutations defined in SDL
never enter the graph), and the call sites that invoke those operations live
inside `` gql`...` `` / `graphql:"..."` string literals, which tree-sitter
indexes as opaque text. So the GraphQL contract layer, first-class structure in
a lot of codebases (gqlgen, Apollo, federation), is invisible to the graph.

This adds GraphQL support along the existing extractor seams: an SDL extractor
wired through `_get_extractor`, a call-site extractor folded into the per-file
code extraction, plus graph-stitching passes that connect the new nodes to the
code and across repos. There is zero behavior change unless GraphQL is present:
every new pass early-returns when no `gql_*` nodes exist, and the call-site
scan yields nothing for files without a GraphQL literal.

The end result closes a loop graphify couldn't close before: from a backend
operation, traverse to every consumer across repos (frontend documents and
Go service clients alike) that a change to it would affect.
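To make that loop concrete, here is a minimal sketch of name-based call→operation linking followed by a consumer lookup. It is not the code in this PR: the graph shape (a dict of nodes plus a list of edge dicts), the node ids, and the helper names `stitch_gql_calls` and `consumers_of` are all assumptions made for illustration. Only the `gql_call` / `gql_operation` kinds and the `calls` relation come from the description.

```python
def stitch_gql_calls(graph):
    """Add a `calls` edge from each gql_call to the same-named gql_operation."""
    operations = {
        node["name"]: node_id
        for node_id, node in graph["nodes"].items()
        if node["kind"] == "gql_operation"
    }
    if not operations:
        return graph  # no GraphQL operations: leave the graph untouched
    seen = {(e["source"], e["target"], e["relation"]) for e in graph["edges"]}
    for node_id, node in graph["nodes"].items():
        if node["kind"] != "gql_call" or node["name"] not in operations:
            continue
        key = (node_id, operations[node["name"]], "calls")
        if key not in seen:  # re-running must not duplicate edges
            seen.add(key)
            graph["edges"].append(dict(zip(("source", "target", "relation"), key)))
    return graph


def consumers_of(graph, operation_id):
    """Return the ids of every call site linked to the given operation."""
    return sorted(
        e["source"]
        for e in graph["edges"]
        if e["relation"] == "calls" and e["target"] == operation_id
    )


# Hypothetical three-repo graph: one backend operation, two consumers.
graph = {
    "nodes": {
        "backend:gql_createUser": {"kind": "gql_operation", "name": "createUser"},
        "frontend:SignupForm": {"kind": "gql_call", "name": "createUser"},
        "billing:client": {"kind": "gql_call", "name": "createUser"},
        "frontend:Other": {"kind": "gql_call", "name": "deleteUser"},
    },
    "edges": [],
}
stitch_gql_calls(graph)
stitch_gql_calls(graph)  # second run adds nothing
```

With this toy graph, `consumers_of(graph, "backend:gql_createUser")` lists the frontend and billing call sites, while the unmatched `deleteUser` call stays unlinked.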
Design
1. SDL extractor: `graphify/graphql_sdl.py` (new)
- Parses SDL with `graphql-core` into the standard per-file `{"nodes": [...], "edges": [...]}` shape, wired in via `_get_extractor` (`extract.py`) + `.graphqls` / `.graphql` added to `CODE_EXTENSIONS`.
- Nodes: `gql_type`, `gql_input`, `gql_interface`, `gql_enum` (+ values), `gql_scalar`, `gql_union`, fields, and root Mutation/Query fields as `gql_operation`. Apollo Federation `type X @key` is tagged `gql_entity` (`federation=entity`), `extend type X @key` as `federation=extends`.
- Edges: `type --contains--> field`, `field --references--> named type`, `operation --references--> input`, `operation --returns--> return type`.

2. Operation → resolver links + dedup: `graphify/extract.py`
- `_consolidate_gql_duplicates`: SDL nodes use name-keyed ids (`gql_<name>`) so a type split across files collapses to one node; a federated owner outranks a plain stub.
- `_link_gql_operations_to_resolvers`: matches a `gql_operation` to the resolver function that implements it by normalized name, bridging schema ↔ code.
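A rough sketch of the two ideas in section 2, for reviewers who want the behavior in miniature. This is an illustration under assumptions, not the PR's implementation: the node dict shape, the `function` kind, the `resolved_by` relation name, and the exact normalization (lowercase, alphanumerics only) are all hypothetical.

```python
import re


def normalize(name):
    """Fold createUser / CreateUser / create_user onto one key."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def consolidate(nodes):
    """Collapse nodes sharing an id; a federated owner outranks a plain stub."""
    def rank(node):
        return 1 if node.get("federation") == "entity" else 0

    best = {}
    for node in nodes:
        current = best.get(node["id"])
        if current is None or rank(node) > rank(current):
            best[node["id"]] = node
    return list(best.values())


def link_operations_to_resolvers(nodes):
    """Pair each gql_operation with a function whose normalized name matches."""
    resolvers = {}
    for node in nodes:
        if node["kind"] == "function":
            resolvers.setdefault(normalize(node["name"]), node["id"])
    edges = []
    for node in nodes:
        if node["kind"] != "gql_operation":
            continue
        target = resolvers.get(normalize(node["name"]))
        if target is not None:
            # "resolved_by" is a placeholder relation name for this sketch.
            edges.append({"source": node["id"], "target": target, "relation": "resolved_by"})
    return edges


nodes = consolidate([
    {"id": "gql_User", "kind": "gql_type"},
    {"id": "gql_User", "kind": "gql_entity", "federation": "entity"},
    {"id": "gql_createUser", "kind": "gql_operation", "name": "createUser"},
    {"id": "resolver.go:CreateUser", "kind": "function", "name": "CreateUser"},
])
edges = link_operations_to_resolvers(nodes)
```

Here the two `gql_User` definitions collapse into the entity-tagged one, and `createUser` links to the Go-style `CreateUser` function because both normalize to `createuser`.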
3. Operation call sites: `graphify/graphql_calls.py` (new) + `extract.py`
- `find_gql_operation_calls` scans a source file for where code invokes an operation: the root selections of `` gql`...` `` / `` graphql`...` `` tagged template literals in TS/JS (a mutation's root field is the operation it calls), and the operation named in a Go `graphql:"..."` struct tag. Nested fields, aliases, inline-object arguments and fragment spreads are skipped, so the result is the operations the document actually calls: no string-content guessing, no `graphql-core` dependency.
- `_compose_with_gql_calls` folds this into the per-file code extractor returned by `_get_extractor`, so each `gql_call` node is cached and incrementally updated exactly like the AST nodes, and is anchored to the nearest enclosing symbol with a `references` edge.
- `_link_gql_calls_to_operations`: per-repo, links a `gql_call` to the `gql_operation` of the same name (a service calling its own operation).

4. Cross-repo stitch: `graphify/global_graph.py`
- `_stitch_federation`: same entity name across two repos is the same federated entity, so each `extend`-side reference gets a `federation_key` edge to the owning service.
- `_stitch_gql_calls`: a `gql_call` in one repo (e.g. a frontend) links with a `calls` edge to the `gql_operation` of the same name in the repo that defines it (e.g. a backend), which is the cross-repo frontend→mutation link.
- Both passes are idempotent and re-run on every `global_add`.

Footprint / safety
- The `graphql-core` import keeps the SDL pass optional: missing lib → no-op; malformed schema → `error` marker instead of raising (one bad file can't abort a run). The call-site scan is pure-text and dependency-free.
- Adds one dependency, `graphql-core>=3.2,<4`. `build.py` is unchanged. AST-cache salt bumped so existing caches regenerate against the new extractor output.
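The safety behavior above can be sketched roughly as follows. This is a simplified stand-in rather than the contents of `graphify/graphql_sdl.py`: it handles only object and input types, and the function name `extract_sdl`, the `gql_field` kind, the `gql_<Type>.<field>` id scheme, and the shape of the `error` marker are assumptions. The `graphql-core` calls used (`parse`, `GraphQLError`, and the AST node classes) are part of its public API.

```python
try:
    from graphql import GraphQLError, parse
    from graphql.language import (
        InputObjectTypeDefinitionNode,
        ObjectTypeDefinitionNode,
    )
except ImportError:  # graphql-core missing: the SDL pass becomes a no-op
    parse = None


def _named_type(type_node):
    """Unwrap list / non-null wrappers down to the named type."""
    while hasattr(type_node, "type"):
        type_node = type_node.type
    return type_node.name.value


def extract_sdl(text):
    """Return {"nodes": [...], "edges": [...]} for one SDL file."""
    if parse is None:
        return {"nodes": [], "edges": []}
    try:
        document = parse(text)
    except GraphQLError as exc:
        # Hypothetical marker shape: report the problem instead of raising.
        return {"nodes": [], "edges": [], "error": str(exc)}

    kinds = {
        ObjectTypeDefinitionNode: "gql_type",
        InputObjectTypeDefinitionNode: "gql_input",
    }
    nodes, edges = [], []
    for definition in document.definitions:
        kind = kinds.get(type(definition))
        if kind is None:
            continue  # other definition kinds are out of scope for this sketch
        type_id = f"gql_{definition.name.value}"
        nodes.append({"id": type_id, "kind": kind})
        for field in definition.fields or ():
            field_id = f"{type_id}.{field.name.value}"
            nodes.append({"id": field_id, "kind": "gql_field"})
            edges.append({"source": type_id, "target": field_id, "relation": "contains"})
            edges.append({
                "source": field_id,
                "target": f"gql_{_named_type(field.type)}",
                "relation": "references",
            })
    return {"nodes": nodes, "edges": edges}
```

Calling `extract_sdl("type {")` returns a dict carrying an `error` entry when `graphql-core` is installed, and an empty result when it is not, so a single bad or unparseable file never aborts the run in either case.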
Tests
`tests/test_graphql_sdl.py` (types/inputs/enums/operations, op→input/return
edges, `@key` entity tagging, malformed-input safety),
`tests/test_gql_federation.py` (consolidation, op→resolver matching, cross-repo
federation stitch + idempotency), and `tests/test_gql_calls.py` (root-selection
parse with args/aliases/inline-object args/fragments, Go tag parse, `gql_call` node
shape, per-repo and cross-repo call→operation linking + idempotency). All green;
full suite passes.
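For a feel of what the root-selection cases exercise, here is a small dependency-free scanner in the same spirit. It is a sketch under assumptions, not `find_gql_operation_calls` itself: the regexes, the token-walking strategy, and the function names are illustrative, string literals containing parentheses or braces are not handled, and the Go helper treats every `graphql:"..."` tag as a candidate rather than only operation-level ones.

```python
import re

TEMPLATE = re.compile(r"\b(?:gql|graphql)\s*`([^`]*)`")
GO_TAG = re.compile(r'graphql:"\s*(?:[A-Za-z_]\w*\s*:\s*)?([A-Za-z_]\w*)')
TOKEN = re.compile(r"\.\.\.|\$\{[^}]*\}|\$\w+|[{}()@:]|[A-Za-z_]\w*")


def root_selections(document):
    """Return the root field names of every operation in a GraphQL document."""
    tokens = TOKEN.findall(document)
    names, braces, parens, in_fragment, i = [], 0, 0, False, 0
    while i < len(tokens):
        tok = tokens[i]
        nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
        i += 1
        if tok == "(":
            parens += 1
        elif tok == ")":
            parens -= 1
        elif parens or tok.startswith("$"):
            continue  # arguments (incl. inline objects), variables, ${...}
        elif tok == "{":
            braces += 1
        elif tok == "}":
            braces -= 1
            if braces == 0:
                in_fragment = False
        elif tok == "@":
            i += 1  # skip the directive name
        elif tok == "...":
            if nxt == "on":
                i += 2  # inline fragment: skip "on" and the type name
            elif nxt[:1].isalpha() or nxt[:1] == "_":
                i += 1  # fragment spread: skip the fragment name
        elif tok == ":":
            continue
        elif braces == 0:
            in_fragment = in_fragment or tok == "fragment"
        elif braces == 1 and not in_fragment:
            if nxt == ":":
                i += 1  # tok is an alias; the real field follows the colon
            else:
                names.append(tok)
    return names


def scan_template_calls(source):
    """Collect root selections from gql`...` / graphql`...` literals."""
    calls = []
    for match in TEMPLATE.finditer(source):
        calls.extend(root_selections(match.group(1)))
    return calls


def scan_go_tag_calls(source):
    """Collect the leading field name of each Go graphql:"..." struct tag."""
    return GO_TAG.findall(source)
```

Tracking brace and parenthesis depth over a token stream keeps the scan pure-text while still separating a root field from its alias, its arguments, and anything nested beneath it.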
Scope note
Intentionally narrow: GraphQL is one structured contract you already have in the
repo, wired through existing seams, mirroring the merged `--cargo` (#1271) and
PowerShell `.psd1` (#1341) extractors. It does not broaden graphify into a
general ingestion layer (no new doc formats, archives, or network fetching) and
does not turn SDL or gql literals into opaque text; it produces the same kind of
structured, code-linked graph graphify already builds. The four passes are
additive and independently reviewable; happy to split them into separate PRs if
you'd prefer.