SQLite v0 terminology providers with unified filter pipeline #136
Open
jmandel wants to merge 2 commits into HealthIntersections:2026-02-gg-cs-api-proposal
fc82680 to fb01ee0
Add SQLite-backed code system providers for RxNorm, LOINC, and SNOMED CT that use a shared v0 schema with closure tables, FTS5 search indexes, and a unified SQL filter pipeline for both includes and excludes.

Key features:
- Single #buildV0FilterSql code path handles all filter types (concept hierarchy, property filters, code regex, value set membership)
- Excludes reuse the same filter SQL wrapped in NOT EXISTS
- Streaming pagination for large expansions (124K+ SNOMED codes)
- Batch designation fetching for efficient display/property loading
- SNOMED expression constraint language support via adapter
- RxNorm archived concept import from RXNATOMARCHIVE
- STY registered as filterable property for RxNorm
- Opt-in perf counters (no-op when disabled)

Integrates with Grahame's CS provider API (PR HealthIntersections#133):
- getPrepContext, filterExcludeFilters, filterExcludeConcepts
- scanValueSet, handlesExcludes, handlesOffset
- Unified intent path with includeConcepts + filter + exclude

Also fixes method name bugs on legacy filter path (getExtensions -> extensions, getCodeStatus -> getStatus, getProperties -> properties).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
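To illustrate the idea that excludes reuse the same filter SQL wrapped in NOT EXISTS, here is a minimal sketch. It is not the real #buildV0FilterSql: the function names (buildFilterSql, buildExpansionSql) and the table and column names (concept, closure, concept_property, ancestor_id, descendant_id) are assumptions made up for this example, and only two filter operators are shown.

```javascript
// Hypothetical sketch: one function turns a filter into a SQL predicate over a
// given table alias. Includes AND the predicates together; excludes reuse the
// same predicate text inside NOT EXISTS. Schema names are invented.
function buildFilterSql(filter, alias) {
  switch (filter.op) {
    case 'is-a': // hierarchy test through an assumed closure table
      return {
        sql: `EXISTS (SELECT 1 FROM closure cl JOIN concept a ON a.id = cl.ancestor_id ` +
             `WHERE cl.descendant_id = ${alias}.id AND a.code = ?)`,
        params: [filter.value],
      };
    case '=': // property equality through an assumed property table
      return {
        sql: `EXISTS (SELECT 1 FROM concept_property p ` +
             `WHERE p.concept_id = ${alias}.id AND p.name = ? AND p.value = ?)`,
        params: [filter.property, filter.value],
      };
    default:
      throw new Error(`Unsupported filter op: ${filter.op}`);
  }
}

function buildExpansionSql(includes, excludes) {
  const where = [];
  const params = [];
  for (const f of includes) {
    const part = buildFilterSql(f, 'c');
    where.push(part.sql);
    params.push(...part.params);
  }
  for (const f of excludes) {
    // Same builder, different alias, wrapped in NOT EXISTS.
    const part = buildFilterSql(f, 'x');
    where.push(`NOT EXISTS (SELECT 1 FROM concept x WHERE x.id = c.id AND ${part.sql})`);
    params.push(...part.params);
  }
  const sql = 'SELECT c.code FROM concept c' +
    (where.length ? ' WHERE ' + where.join(' AND ') : '') +
    ' ORDER BY c.code';
  return { sql, params };
}
```

Because both paths go through one builder, a filter operator added for includes is usable in excludes without further work, which appears to be the motivation for the single code path.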
fb01ee0 to 02d11bf
Add time-based query effort limiting for v0 SQLite terminology providers.

Uses sqlite3_progress_handler to interrupt queries exceeding a configurable wall-clock time limit (default 5s, configurable via tx.effortLimitMs).

Uses a fork of better-sqlite3 (jmandel/better-sqlite3#progress-handler) that exposes db.progressHandler(interval, callback). The fork is an optional dependency: if native compilation fails (no build tools), falls back to stock better-sqlite3 and queries run without effort limits.

The progress handler callback checks performance.now() every 10,000 VM instructions (~0.2ms granularity). Queries exceeding the limit throw SQLITE_INTERRUPT, which propagates as a standard error.

Config: set modules.tx.effortLimitMs in config.json (default: 5000ms).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
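The deadline check inside the progress callback could look roughly like the sketch below. The helper name makeEffortLimiter, its start/stop/check shape, and the injectable clock are assumptions for illustration. How the fork's db.progressHandler interprets the callback's return value is also an assumption here, modelled on the C API, where a non-zero return from the progress handler interrupts the operation.

```javascript
// Hypothetical helper: tracks a wall-clock budget for the query in flight.
// `now` is injectable so the logic can be exercised without real time passing.
function makeEffortLimiter(limitMs = 5000, now = () => performance.now()) {
  let deadline = Infinity; // no query running, so nothing can be over budget
  return {
    start() { deadline = now() + limitMs; },
    stop() { deadline = Infinity; },
    // Meant to be called from the progress callback. Returning true is
    // assumed to tell SQLite to interrupt the running query.
    check() { return now() > deadline; },
  };
}

// Assumed wiring, following the commit message (not verified against the fork):
//   const limiter = makeEffortLimiter(config.effortLimitMs);
//   if (typeof db.progressHandler === 'function') {
//     db.progressHandler(10000, () => limiter.check());
//   } // with stock better-sqlite3 the method is absent and no limit applies
```

Checking for the method before calling it mirrors the described fallback: when only stock better-sqlite3 is available, queries simply run without a limit.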
f3ae920 to b04fa69
Summary
Generic SQLite-based CodeSystem providers for RxNorm, LOINC, and SNOMED CT using a unified v0 schema. Builds on the decomposed CS filter API from PR #133: single commit on top of c4d4590.

Supersedes #135 (which targeted main). Now rebased onto the 2026-02-gg-cs-api-proposal branch, adopting Grahame's API naming/signatures where they overlap with ours.

What's new vs #135
- Single #buildV0FilterSql code path handles all filter types for both includes and excludes (was duplicated before)
- concept is a virtual property not in property_def; exclude validation now uses SQL trial instead of property lookup
- active=0
- getExtensions→extensions, getCodeStatus→getStatus, getProperties→properties on legacy filter path
- _propsIfRequested: removed duplicate wrapper
- _v0Excludes leak: if provider says handlesExcludes()=true, trust it

CS Provider Interface: What We Add Beyond PR #133
PR #133 introduces the decomposed filter API (filter() → filterExcludeFilters() → executeFilters() → filterMore()/filterConcept()). We use all of those as-is. Here is what we add or change:

New method: includeConcepts(filterContext, codes)

Added to cs-api.js: no-op default, backward-compatible.

Why: PR #133 has no way for the provider to see explicit concept codes from compose.include[].concept. The worker handles those in a separate per-code locate() loop, completely outside the filter pipeline. This means the provider can't build one optimal SQL query that covers both concept enumeration and filters together.

What it does: Records intent to include specific codes. No SQL execution: executeFilters() incorporates these as WHERE code IN (...) in the combined query.

Worker side: The worker checks typeof cs.includeConcepts === 'function' before calling. If absent, falls back to the original per-code locate() loop.

Changed: handlesOffset() return value

In PR #133: handlesOffset() body is empty (returns undefined/falsy).

Our fix: Returns false explicitly. Our v0 provider returns true.

Worker side: We changed the LIMIT passdown gate from vsInfo.csDoOffset (only true for simple single-CS ValueSets) to cs.handlesOffset() (true for any provider that supports paging). This is safe because excludes are system-scoped: an exclude on system B can't drain results from system A. Result: cross-system ValueSet expansion dropped from ~4s to ~12ms with count=10.

Changed: unified intent block in worker
In PR #133: Separate if (cset.concept) and if (cset.filter) blocks, so the provider never sees the complete picture.

Our change: When the provider supports includeConcepts, the worker creates one prep context and registers all intent (concepts + filters + excludes) before calling executeFilters() once. Falls back to the original separate-block behavior when includeConcepts is absent.

Changed: skip excludeCodes() iteration when provider handles excludes

In PR #133: handleCompose() always iterates all excluded codes via excludeCodes() → per-code isExcluded(), even when the provider's handlesExcludes() returned true and filterExcludeFilters() already registered them in SQL.

Our change: When csDoExcludes is true, skip the excludeCodes() iteration entirely. The provider handles excludes in its SQL.

Bug fix: method names on legacy filter path
Lines 778-779 called cs.getExtensions(c), cs.getCodeStatus(c), cs.getProperties(c), none of which exist on any provider. The base class has extensions(), getStatus(), properties(). Fixed to use correct names.

Worker: listDisplaysFromProvider fast path

Added display fast path: when workingLanguages is set and designations aren't requested, use cs.display(context) directly instead of cs.designations(context, displays). Avoids per-code DB queries when only the primary display is needed.

Worker: designation batch pre-fetch via getPrepContext()

Grahame's updated getPrepContext(iterate, params, excludeInactive, offset, count) passes the full TxParameters. Our v0 provider reads params.includeDesignations, params.workingLanguages(), and params.designations to determine designation needs. executeFilters() then batch-fetches all designations in one query instead of per-code queries.

Architecture
Unified filter pipeline
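The worker-side flow described under "unified intent block in worker" can be sketched as follows. The method names (getPrepContext, includeConcepts, filter, filterExcludeFilters, executeFilters, locate) come from the text above; the argument lists are simplified, the calls are shown as synchronous, and the function name expandInclude is hypothetical, so treat this as an outline of the control flow rather than the actual code in tx/workers/expand.js.

```javascript
// Sketch of the worker-side decision. Signatures are simplified assumptions.
function expandInclude(cs, cset, excludes) {
  if (typeof cs.includeConcepts !== 'function') {
    // Fallback: the original behaviour, one locate() call per listed code.
    // (Filter handling on this legacy path is omitted from the sketch.)
    const found = [];
    for (const c of cset.concept ?? []) {
      const hit = cs.locate(c.code);
      if (hit) found.push(hit);
    }
    return found;
  }
  // Unified path: register every piece of intent first...
  const ctx = cs.getPrepContext();
  if (cset.concept) cs.includeConcepts(ctx, cset.concept.map((c) => c.code));
  for (const f of cset.filter ?? []) cs.filter(ctx, f);
  for (const ex of excludes) cs.filterExcludeFilters(ctx, ex.filter ?? []);
  // ...then let the provider build and run one combined query.
  return cs.executeFilters(ctx);
}
```

The point of the ordering is that executeFilters() runs only after concepts, filters, and excludes have all been registered, so the provider can fold them into one query.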
Batch designation pre-fetch
getPrepContext() tells the provider what designation data will be needed. executeFilters() batch-fetches all designations in one query. During iteration, designations() reads from a pre-fetched Map.

Performance
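The batch designation pre-fetch described under "Batch designation pre-fetch" replaces one query per code with one query per page of codes. A minimal sketch of that pattern, assuming a helper queryAll(sql, params) that returns row objects and an invented designation table with code, language, and value columns (neither is taken from the actual v0 schema):

```javascript
// Hypothetical sketch: fetch designations for a whole page of codes with one
// query, then answer per-code lookups from a Map.
function prefetchDesignations(queryAll, codes) {
  // Seed every code with an empty list so later lookups never miss.
  const byCode = new Map(codes.map((code) => [code, []]));
  if (codes.length === 0) return byCode; // avoid emitting "IN ()"
  const placeholders = codes.map(() => '?').join(', ');
  const rows = queryAll(
    `SELECT code, language, value FROM designation WHERE code IN (${placeholders})`,
    codes
  );
  for (const row of rows) {
    byCode.get(row.code)?.push({ language: row.language, value: row.value });
  }
  return byCode;
}
```

A real implementation would probably need to chunk very large code lists, since SQLite caps the number of bound parameters per statement.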
Full expansion (no count limit)
IPS/FHIR R4 ValueSets (count=100)
Cross-system LIMIT optimization
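As a rough illustration of the gate change described earlier (from vsInfo.csDoOffset to cs.handlesOffset()), the decision of whether to pass a LIMIT down to a provider might look like this. The function name limitFor and the use of null to mean "no limit" are assumptions for the sketch; only handlesOffset() comes from the text.

```javascript
// Hypothetical sketch of the LIMIT passdown gate.
function limitFor(cs, count) {
  // Strict comparison: an empty handlesOffset() body returns undefined,
  // which must be treated the same as an explicit false.
  const canPage = typeof cs.handlesOffset === 'function' && cs.handlesOffset() === true;
  // null means no LIMIT is passed down and the worker trims afterwards.
  return canPage ? count : null;
}

// Usage with two toy providers:
const baseProvider = { handlesOffset() {} };             // PR #133 default body
const v0Provider = { handlesOffset() { return true; } }; // v0 behaviour per the text
const limits = [limitFor(baseProvider, 10), limitFor(v0Provider, 10)];
```

With the gate keyed on the provider rather than on the shape of the ValueSet, each paging-capable provider in a cross-system expansion can stop early.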
Test results
Files
- tx/cs/cs-api.js: includeConcepts() + handlesOffset() fix
- tx/cs/cs-sqlite-runtime-v0.js: Core v0 provider (~3,400 lines)
- tx/cs/cs-sqlite-snomed-v0.js: SNOMED specialization (expressions, ECL, hierarchy)
- tx/cs/cs-sqlite-expression-adapter.js: SNOMED expression → v0 adapter
- tx/importers/sqlite-v2/: v0 schema + importers for RxNorm, LOINC, SNOMED CT
- tx/workers/expand.js: Unified intent pipeline, LIMIT passdown, bug fixes
- scripts/test-expand-cross-system.js: 60-test expansion test suite
- docs/open-questions.md: Open questions and resolved items