feat(extract): Phase 1 — extraction write validation with count-before/after (issue #693)#747
feat(extract): Phase 1 — extraction write validation with count-before/after (issue #693)#747jlin53882 wants to merge 2 commits into
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 601a1b415a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| pattern: /\b(def\s|class\s|import\s|from\s|async\s+def\s|print\()/, | ||
| lang: 'python', |
There was a problem hiding this comment.
Prioritize JS/TS imports over Python keyword matching
The Python detector currently matches import/from before the JavaScript pattern, so common JS/TS modules like import x from 'y' are classified as Python. In that path astChunk attempts a Python parse and usually falls back to chunkDocument, which silently disables the AST chunking fix for many real JS/TS files and reintroduces mid-function splits.
Useful? React with 👍 / 👎.
| const fullText = pendingNonDecl.length > 0 ? pendingNonDecl + '\n' + text : text; | ||
| pendingNonDecl = ''; // reset | ||
|
|
||
| if (fullText.length <= config.maxChunkSize) { | ||
| chunks.push(fullText); | ||
| metadatas.push({ | ||
| startIndex: child.startIndex, | ||
| endIndex: child.endIndex, | ||
| length: fullText.length, |
There was a problem hiding this comment.
Preserve source offsets when prepending non-declaration text
When pendingNonDecl is prepended to a declaration, the emitted chunk contains text that starts earlier than child.startIndex, but metadata still reports startIndex: child.startIndex/endIndex: child.endIndex. This produces incorrect offsets for returned chunks, so any consumer that maps chunk metadata back to original source positions will get wrong spans.
Useful? React with 👍 / 👎.
| options.onExtractionValidationFailed?.({ | ||
| expected: expectedCreated, | ||
| actual: actualCreated, | ||
| mismatch, | ||
| sessionKey, | ||
| }); |
There was a problem hiding this comment.
Guard validation callback failures from aborting extraction
The mismatch callback is invoked directly without isolation, so if onExtractionValidationFailed throws (or an async callback rejects), extractAndPersist rejects even though bulkStore already completed. This turns an optional observability hook into a write-path failure mode and can cause callers to treat successful writes as failed extractions.
Useful? React with 👍 / 👎.
|
Please rebase this branch on current The branch is behind current |
fad0fc2 to
4a05a7a
Compare
PR747 Rebased — Issue #693 Content OnlyAfter your review comment about overlap with PR745, I have rebuilt the branch to isolate only the Issue #693 extraction write validation content. Removed (overlaps with PR745)
Retained (Issue #693 — extraction write validation)
Addressing the Codex Review FeedbackThe three review comments (P1 priority + 2x P2) all pertain to the The branch now has 1 commit ( Please re-review. |
…#693) - Adds countBefore/countAfter validation in smart-extractor.ts after bulkStore - Adds ExtractionValidation type and callback interface in memory-categories.ts - Adds extraction-validation.test.mjs (6 scenarios, all pass) - Adds dedup-false-alarm.test.mjs (P0 regression: bulkStore does NOT dedup) - Registers new tests in CI manifest - Adds store.count() mock to existing tests (preference-slots, batch-embed, bulk-store)
4a05a7a to
2226403
Compare
rwmjhb
left a comment
There was a problem hiding this comment.
PR #747 Review: feat(extract): Phase 1 — extraction write validation with count-before/after (issue #693)
Verdict: REQUEST-CHANGES | 6 rounds completed | Value: 61% (codex: 70% / claude: 55%) | Size: XL | Author: jlin53882
Value Assessment
Problem: Extraction writes can silently under-persist memories because extractAndPersist reports intended creations without verifying that bulkStore actually increased the LanceDB row count. This PR adds count-before/count-after validation to detect mismatches after extraction persistence.
| Dimension | Assessment |
|---|---|
| Value Score | 61% (codex: 70% / claude: 55%) |
| Value Verdict | review |
| Issue Linked | true |
| Project Aligned | true |
| Duplicate | false |
| AI Slop Score | 1/6 |
| User Impact | high |
| Urgency | high |
Scope Drift: 2 flag(s)
- scripts/ci-test-manifest.mjs removes existing tests for issue-690-cross-call-batch, issue606_sdk-migration, infer-provider-from-baseurl, is-recall-used, and tier1-counters, which is not justified by extraction write validation
- src/memory-categories.ts exports ExtractionValidation, but src/smart-extractor.ts defines an inline callback payload type instead of using the exported type
AI Slop Signals:
- The PR has some claim/code mismatch risk: the comment says callback handling is addressed, but the diff only uses optional chaining and does not isolate callback exceptions.
Open Questions:
- Should onExtractionValidationFailed be allowed to throw and fail extraction, or should validation callbacks be isolated/logged?
- Was the removal of existing scripts/ci-test-manifest.mjs entries intentional, or an accidental rebase artifact?
- Should the exported ExtractionValidation type be used by ExtractPersistOptions instead of duplicating the shape inline?
Summary
Extraction writes can silently under-persist memories because extractAndPersist reports intended creations without verifying that bulkStore actually increased the LanceDB row count. This PR adds count-before/count-after validation to detect mismatches after extraction persistence.
Evaluation Signals
| Signal | Value |
|---|---|
| Blockers | 0 |
| Warnings | 0 |
| PR Size | XL |
| Verdict Floor | approve |
| Risk Level | normal |
| Value Model | codex |
| Primary Model | codex |
| Adversarial Model | claude |
Must Fix
- F1: SIGKILL/OOM scenario is still not detectable
- F2: Validation mismatches are silent in production by default
Nice to Have
- F3: Validation callback failures are not isolated or awaited
- F4: Supersede-created memories bypass the validation path
- F5: Unrelated CI regression tests were removed from the manifest
- F6: Count failures now fail extraction writes
- MR1: Mismatch direction documentation contradicts the math (over-write attributed to concurrent DELETE)
- MR2: ExtractionValidation type is exported but ExtractPersistOptions defines an inline duplicate
- MR3: test/dedup-false-alarm.test.mjs's makeNearDuplicateVector produces a vector with cosine similarity 1.0, not 0.95 as labeled
- MR4: No test asserts the specific direction-of-mismatch invariant (negative mismatch via concurrent ADD)
Recommended Action
Good direction — problem is worth solving. Author should address must-fix findings, then this is ready to merge.
Reviewed at 2026-05-10T07:26:46Z | 6 rounds | Value: codex | Primary: codex | Adversarial: claude
7f62090 to
2965364
Compare
F2: abortOnExtractionMismatch option in ExtractPersistOptions F3: wrap onExtractionValidationFailed callback in try/catch F4: clarify supersede entries counted in Phase 1 validation F6: throw on mismatch>0 when abortOnExtractionMismatch=true; log mismatch<0 as WARNING MR1: clarify mismatch direction doc (positive=under-write/SIGKILL, negative=over-write) MR2: use exported type from memory-categories.ts MR3: fix makeNearDuplicateVector orthogonal component to produce proper cosine similarity MR4: add T7 test for abortOnExtractionMismatch behavior F5: manifest already has all 5 cited tests (tier1-counters, issue606, is-recall-used, infer-provider, issue-690)
PR747 Fixes — Addressing All Review PointsThank you for the detailed review. All must-fix and nice-to-have items have been addressed. Must FixF1 — SIGKILL/OOM detectable via countBefore/after + Phase 2 UUID list Phase 1 now uses // Phase 2 will address concurrent-safe validation (UUID list compare):
// - Collect UUIDs of entries passed to bulkStore
// - After bulkStore, query store for those UUIDs to verify each exists
// - This catches both SIGKILL/OOM partial writes AND concurrent compactor deletionsF2 — Production validation silent by default Added if (mismatch > 0 && options.abortOnExtractionMismatch === true) {
throw new Error(
"memory-pro: smart-extractor: extraction aborted: " + mismatch + " entries failed to persist (expected=" + expectedCreated + ", actual=" + actualCreated + ")"
);
}Default is Nice to HaveF3 — Callback exception not isolated Callback is now wrapped in try {
options.onExtractionValidationFailed?.({ expected, actual, mismatch, sessionKey });
} catch (cbErr) {
this.log("memory-pro: smart-extractor: onExtractionValidationFailed callback threw: " + String(cbErr));
}F4 — Supersede bypass validation Added explicit comment clarifying that supersede entries are included in // NOTE: supersede entries are included in createEntries and therefore counted
// in expectedCreated - the count check validates them normally (F4).F5 — CI manifest removed tests Verified via F6 — Count failure fails extraction Two-mode behavior:
if (mismatch < 0) {
this.log("memory-pro: smart-extractor: WARNING - over-write detected: " + Math.abs(mismatch) + " more entries persisted than expected (expected=" + expectedCreated + ", actual=" + actualCreated + "). This is rare and may indicate a concurrent compactor or duplicate session run.");
}MR1 — Mismatch direction doc contradicts math Updated doc to correctly attribute both directions: /**
* expected - actual; positive = under-write (SIGKILL/OOM partial write, fewer rows),
* negative = over-write (concurrent compactor DELETED rows before our count, more rows).
* positive (under-write): callback invoked; if abortOnExtractionMismatch=true throws.
* negative (over-write): always logged as WARNING and never throws.
*/MR2 — Duplicate inline type vs exported type The callback payload in MR3 — makeNearDuplicateVector cosine similarity 1.0 bug Scaling a vector preserves direction → cosine similarity always equals 1.0 regardless of scale. Fixed by adding an orthogonal component: const scale = similarity;
const orthScale = Math.sqrt(Math.max(0, 1 - scale * scale));
return baseVec.map((v, i) => v * scale + orth[i] * orthScale);MR4 — No negative mismatch test Added T7 test covering three sub-cases:
Local Test ResultsAll tests pass locally (verified at
Branch: |
Issue #693 — Phase 1: Extraction Write Validation
Problem
After
extractAndPersist()callsbulkStore(), there is no verification that writes actually succeeded. A SIGKILL/OOM duringtable.add()leaves silent data loss.Solution (Phase 1)
Count-before/after verification after
bulkStore()with an optional callback for mismatch notification.Changes
src/memory-categories.tsExtractionValidationtype +OnExtractionValidationFailedcallback interfacesrc/smart-extractor.tscountBefore/countAftervalidation afterbulkStore(), callback invoked on mismatchtest/extraction-validation.test.mjstest/dedup-false-alarm.test.mjsbulkStoredoes NOT deduplicatescripts/ci-test-manifest.mjsPhase 2 (not included)
Note
Count-diff validation assumes single-process extraction (guaranteed by plugin architecture). Concurrent writes can inflate
actualCreated— acceptable for Phase 1, will be addressed in Phase 2.