Generic SQLite v0 terminology providers with batch-optimized expansion #135
Closed
jmandel wants to merge 5 commits into HealthIntersections:main
Add SQLite-backed code system providers for RxNorm, LOINC, and SNOMED CT that use a shared v0 schema with closure tables, FTS5 search indexes, and a unified SQL filter pipeline for both includes and excludes.

Key features:
- Single #buildV0FilterSql code path handles all filter types (concept hierarchy, property filters, code regex, value set membership)
- Excludes reuse the same filter SQL wrapped in NOT EXISTS
- Streaming pagination for large expansions (124K+ SNOMED codes)
- Batch designation fetching for efficient display/property loading
- SNOMED expression constraint language support via adapter
- RxNorm archived concept import from RXNATOMARCHIVE
- STY registered as filterable property for RxNorm
- Opt-in perf counters (no-op when disabled)

Integrates with Grahame's CS provider API (PR HealthIntersections#133):
- getPrepContext, filterExcludeFilters, filterExcludeConcepts
- scanValueSet, handlesExcludes, handlesOffset
- Unified intent path with includeConcepts + filter + exclude

Also fixes method name bugs on legacy filter path (getExtensions -> extensions, getCodeStatus -> getStatus, getProperties -> properties).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
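The point in the commit message about excludes reusing the include filter SQL, wrapped in NOT EXISTS, can be illustrated with a minimal sketch. This is not the PR's #buildV0FilterSql: the function names, the table and column names (concept, closure, property), and the example exclude filter are all assumptions made for illustration.

```javascript
// Hypothetical sketch: table and column names (concept, closure, property)
// are illustrative assumptions, not the PR's actual v0 schema.

// Return a correlated subquery (over outer alias `c`) matching one filter,
// together with its bound parameters.
function buildFilterSubquery(filter) {
  switch (filter.op) {
    case 'descendent-of':
      return {
        sql: 'SELECT 1 FROM closure cl JOIN concept a ON a.id = cl.ancestor_id ' +
             'WHERE cl.descendant_id = c.id AND cl.depth > 0 AND a.code = ?',
        params: [filter.value],
      };
    case '=':
      return {
        sql: 'SELECT 1 FROM property p ' +
             'WHERE p.concept_id = c.id AND p.name = ? AND p.value = ?',
        params: [filter.property, filter.value],
      };
    default:
      throw new Error(`Unsupported filter op: ${filter.op}`);
  }
}

// Includes and excludes go through the same subquery builder; only the
// wrapper differs (EXISTS for includes, NOT EXISTS for excludes).
function buildExpansionSql(includeFilters, excludeFilters) {
  const clauses = [];
  const params = [];
  const add = (wrapper, filter) => {
    const sub = buildFilterSubquery(filter);
    clauses.push(`${wrapper} (${sub.sql})`);
    params.push(...sub.params);
  };
  includeFilters.forEach((f) => add('EXISTS', f));
  excludeFilters.forEach((f) => add('NOT EXISTS', f));
  const where = clauses.length > 0 ? ` WHERE ${clauses.join(' AND ')}` : '';
  return { sql: `SELECT c.code FROM concept c${where} ORDER BY c.code`, params };
}

// Usage: one include filter and one (made-up) exclude filter.
const query = buildExpansionSql(
  [{ op: 'descendent-of', value: '71388002' }],
  [{ op: '=', property: 'status', value: 'inactive' }]
);
console.log(query.sql);
console.log(query.params);
```

Because both paths share one builder, any filter type added for includes becomes available to excludes with no extra code, which is presumably the motivation for the single code path.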
Superseded by a new PR targeting the decomposed CS filter API branch (PR #133).
Summary
Generic SQLite-based CodeSystem providers that can serve any terminology imported into a unified v0 schema — replacing per-terminology custom providers for RxNorm, LOINC, and SNOMED CT. Builds on the decomposed CS filter API from PR #133, adding new backward-compatible provider methods and worker optimizations.
Real-World ValueSet Expansion Performance
Tested against real ValueSets from FHIR R4 Core and IPS specifications. All queries use the v0 SQLite providers.
Full expansion (no count limit)
(Table residue; only the filter column survives:)
- descendent-of 404684003
- descendent-of 71388002
- descendent-of 123037004
- TTY in SBD,SCD
- TTY = SBD
IPS/FHIR R4 ValueSets (count=100)
(Table residue; only the filter descriptions survive:)
- descendent-of Procedure, minus admin, bloodbank, community health, etc.
- CLASSTYPE=1 AND STATUS=ACTIVE, minus 4 CLASS values
- is-a includes
- descendent-of + 1 is-a
- descendent-of 56265001 + filter text heart
Cross-system LIMIT optimization
Previously, cross-system ValueSets never got LIMIT passed to providers. Now cs.handlesOffset() gates LIMIT passdown per-CS, which is safe because excludes are system-scoped.
Architecture
v0 Schema
A single normalized SQLite schema (tx/importers/sqlite-v2/schema-v0.sql) stores any code system. Importers for RxNorm, LOINC, and SNOMED CT transform source data into this schema.
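The commit message mentions closure tables as part of the v0 schema. As a rough illustration of why a closure table makes hierarchy filters cheap, the sketch below precomputes (ancestor, descendant, depth) rows from direct parent links in plain JavaScript. The row shape, function names, and the single-letter codes are assumptions for illustration. The actual schema in schema-v0.sql may differ.

```javascript
// Hypothetical sketch of a closure table, kept in memory for illustration.
// Row shape { ancestor, descendant, depth } is an assumption, not the v0 schema.

// parentEdges: array of [child, parent] pairs (direct is-a links).
function buildClosure(parentEdges) {
  const parents = new Map();
  for (const [child, parent] of parentEdges) {
    if (!parents.has(child)) parents.set(child, []);
    parents.get(child).push(parent);
  }
  const rows = [];
  const nodes = new Set(parentEdges.flat());
  for (const node of nodes) {
    // Breadth-first walk upward; `seen` records the shortest depth per ancestor
    // and also guards against cycles in malformed input.
    const seen = new Map([[node, 0]]);
    const queue = [node];
    while (queue.length > 0) {
      const current = queue.shift();
      for (const parent of parents.get(current) ?? []) {
        if (!seen.has(parent)) {
          seen.set(parent, seen.get(current) + 1);
          queue.push(parent);
        }
      }
    }
    for (const [ancestor, depth] of seen) {
      rows.push({ ancestor, descendant: node, depth });
    }
  }
  return rows;
}

// With self rows stored at depth 0, "is-a" is depth >= 0 and
// "descendent-of" is depth > 0: one lookup, no recursive traversal.
function select(closure, ancestor, { includeSelf }) {
  const minDepth = includeSelf ? 0 : 1;
  return closure
    .filter((r) => r.ancestor === ancestor && r.depth >= minDepth)
    .map((r) => r.descendant)
    .sort();
}

const closure = buildClosure([['B', 'A'], ['C', 'B'], ['D', 'A']]);
console.log(select(closure, 'A', { includeSelf: false })); // descendent-of A
console.log(select(closure, 'A', { includeSelf: true }));  // is-a A
```

In SQL the same lookup would be a single indexed range scan on the ancestor column, which is what lets hierarchy filters compose with property filters in one query.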
Query Pipeline
The v0 provider implements a declarative intent pipeline:
- Intent is registered through includeConcepts(), filter(), excludeConcepts(), filterExclude(), and prepareDesignations().
- executeFilters() builds one combined SQL query from all registered intent.
Batch Designation Pre-fetch
The biggest performance win.
Before: designations() ran one SELECT FROM designation WHERE concept_id=? per code during iteration: 27K queries for RxNorm SBD+SCD, 124K for Clinical Findings.
After: prepareDesignations() tells the provider what designation data will be needed. executeFilters() batch-fetches all designations in one query (chunked in batches of 500). During iteration, designations() reads from the pre-fetched Map.
New Provider API Methods
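Of the new methods, prepareDesignations() is the one behind the batch pre-fetch described in the previous section. The sketch below shows the general shape of that technique: chunk the concept ids, fetch each chunk with one query, and index the rows in a Map for later reads. The function names and the fetchChunk callback are hypothetical stand-ins for the provider's SQL call; only the batch size of 500 comes from the description above.

```javascript
// Hypothetical sketch of chunked batch pre-fetching into a Map.
const CHUNK_SIZE = 500; // batch size stated in the PR description

// Split an array into consecutive slices of at most `size` items.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// fetchChunk(ids) is an assumed callback standing in for one query such as
// `SELECT ... FROM designation WHERE concept_id IN (?, ?, ...)`.
// It must return rows that each carry a concept_id field.
function prefetchDesignations(conceptIds, fetchChunk, size = CHUNK_SIZE) {
  const byConcept = new Map();
  for (const ids of chunk(conceptIds, size)) {
    for (const row of fetchChunk(ids)) {
      if (!byConcept.has(row.concept_id)) byConcept.set(row.concept_id, []);
      byConcept.get(row.concept_id).push(row);
    }
  }
  return byConcept;
}

// Usage with a fake data source that counts how many queries were issued.
let queries = 0;
const fakeFetch = (ids) => {
  queries += 1;
  return ids.map((id) => ({ concept_id: id, value: `display-${id}` }));
};
const ids = Array.from({ length: 1200 }, (_, i) => i + 1);
const designations = prefetchDesignations(ids, fakeFetch);
console.log(queries);                 // number of chunked queries issued
console.log(designations.get(1200));  // rows read from the Map during iteration
```

At 500 ids per chunk, the 27K and 124K per-code queries mentioned above would shrink to roughly 54 and 248 chunked queries respectively. Chunking also keeps each statement under SQLite's limit on bound parameters, though whether that was the reason for choosing 500 is an assumption.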
All backward-compatible — providers that don't implement them get original behavior.
- includeConcepts(ctx, codes)
- excludeConcepts(ctx, codes)
- locateBatch(codes, filterSet)
- prepareDesignations(ctx, options)
Test Results
Files
- tx/cs/cs-sqlite-runtime-v0.js: Core v0 provider (~3,500 lines)
- tx/cs/cs-sqlite-snomed-v0.js: SNOMED specialization (expressions, ECL, hierarchy)
- tx/cs/cs-sqlite-expression-adapter.js: SNOMED expression → v0 adapter
- tx/cs/cs-sqlite-v0-specializers.js: Per-terminology specialization registry
- tx/importers/sqlite-v2/: v0 schema + importers for RxNorm, LOINC, SNOMED
- tx/cs/cs-api.js: New API method declarations
- tx/workers/expand.js: Unified intent pipeline, LIMIT passdown, prepareDesignations
- scripts/test-expand-cross-system.js: 80-test expansion test suite
- docs/open-questions.md: API changes documentation and open questions
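To make the worker-side behavior concrete, here is a hedged sketch of how a worker such as tx/workers/expand.js might apply the per-CS LIMIT gate and stay compatible with providers that lack the new methods. It is not the PR's code. It assumes handlesOffset() is synchronous and returns a truthy value when LIMIT is supported, and the helper names and the lookupOne fallback are hypothetical.

```javascript
// Hypothetical sketch of capability checks in an expansion worker.

// Pass LIMIT down only to providers that declare support for it.
// Assumption: handlesOffset() is synchronous and returns a truthy value.
function limitFor(cs, count) {
  const canLimit = typeof cs.handlesOffset === 'function' && Boolean(cs.handlesOffset());
  return canLimit ? count : undefined;
}

// Use locateBatch(codes, filterSet) when the provider implements it;
// otherwise fall back to one lookup per code. `lookupOne` is an assumed
// callback representing the provider's original per-code path.
function locateAll(provider, codes, filterSet, lookupOne) {
  if (typeof provider.locateBatch === 'function') {
    return provider.locateBatch(codes, filterSet);
  }
  return codes.map((code) => lookupOne(provider, code, filterSet));
}

// Usage with two stand-in providers.
const v0Provider = {
  handlesOffset: () => true,
  locateBatch: (codes) => codes.map((code) => ({ code, via: 'batch' })),
};
const legacyProvider = {}; // implements none of the new methods
const perCode = (provider, code) => ({ code, via: 'single' });

console.log(limitFor(v0Provider, 100));      // LIMIT is passed down
console.log(limitFor(legacyProvider, 100));  // no LIMIT: worker must trim results itself
console.log(locateAll(legacyProvider, ['a', 'b'], null, perCode));
```

Feature detection of this kind is one common way to keep an interface extension backward-compatible: older providers are never called with methods they do not define, so they keep their original behavior.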