Skip to content

fix: Issue #675 #676 complete batch mode implementation#782

Open
jlin53882 wants to merge 2 commits intoCortexReach:masterfrom
jlin53882:fix/pr-678-analysis
Open

fix: Issue #675 #676 complete batch mode implementation#782
jlin53882 wants to merge 2 commits intoCortexReach:masterfrom
jlin53882:fix/pr-678-analysis

Conversation

@jlin53882
Copy link
Copy Markdown
Contributor

@jlin53882 jlin53882 commented May 9, 2026

PR: Issue #675 + #676 完整實現 - 延續 PR #678 與 PR #723

摘要

本 PR 完整實現了 Issue #675(regex fallback bulkStore優化)和 Issue #676(handleSupersede batch mode),並基於 PR #723 的基礎上增加了完整的 batch invalidation 支援。

與 PR #678 的關係

與 PR #723 的差異

功能 PR #723 本 PR
bulkStore for 新 entries
invalidateEntries 陣列
第二遍 backfill superseded_by
Rollback 邏輯(部分失敗時)
MR2 guard(supersede case)
MR2 guard(contradict-as-supersede case)
handleContradict null check
valid_from 保留(原時間戳)

PR #723 的限制

PR #723 實現了:

但缺少:

  • 當 existing record 被取代時的 batch invalidation
  • 第二遍 backfill superseded_by
  • 部分失敗時的 rollback 邏輯
  • MR2 重複 supersede 防護
  • handleContradict null check

本 PR 填補了這些缺口。

變更內容

index.ts

  • 新增常數 AUTO_CAPTURE_DEDUP_THRESHOLD = 0.90

src/smart-extractor.ts

  • 新增 InvalidateEntry interface
  • 修改 handleSupersede
    • 接受 invalidateEntries 參數
    • 當 existing 存在時,使用 batch mode(推送到 createEntries + invalidateEntries)
    • 保留原有的 valid_from(原時間戳)
  • 修改 extractAndPersist
    • 新增 invalidateEntries 陣列
    • 新增 queuedSupersedeMatchIds Set(MR2 guard)
    • 在 bulkStore 後執行第二遍 backfill(填補 superseded_by)
    • 新增完整的 rollback 邏輯(部分失敗時)
  • 修改 processCandidate
    • supersede case 加入 MR2 guard
    • contradict case 加入 MR2 guard(當作為 supersede 時)
  • 修改 handleContradict
    • 當 target 不存在時,跳過建立 contradiction entry(避免 dangling reference)

測試

相關測試檔案:

  • test/invalidate-error-regression.test.mjs
  • test/supersede-existing-found-bulk.test.mjs

預期行為

  1. 新 entry:透過 bulkStore 寫入(1 個 lock)
  2. 被取代的舊 entry:在 bulkStore 後執行 invalidate(個別更新)
  3. 部分失敗時:自動 rollback,恢復到原始狀態
  4. 同一 batch 內的重複 supersede:MR2 guard 防止重複

關聯 Issue

…#693)

- Adds countBefore/countAfter validation in smart-extractor.ts after bulkStore
- Adds ExtractionValidation type and callback interface in memory-categories.ts
- Adds extraction-validation.test.mjs (6 scenarios, all pass)
- Adds dedup-false-alarm.test.mjs (P0 regression: bulkStore does NOT dedup)
- Registers new tests in CI manifest
- Adds store.count() mock to existing tests (preference-slots, batch-embed, bulk-store)
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d1611572aa

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/smart-extractor.ts
Comment on lines +474 to +475
if (inv.bulkIndex !== undefined && inv.bulkIndex < bulkResults.length) {
const newEntryId = bulkResults[inv.bulkIndex].id;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep supersede backfill indices aligned with bulkStore results

When any earlier queued create entry is invalid (for example the embedding-failure path queues vector || []), MemoryStore.bulkStore filters it out before assigning IDs (src/store.ts 532-535), so bulkResults is a compressed array. In that case this bulkIndex can point at the wrong persisted memory or fall past the end, leaving the old memory with a missing or incorrect superseded_by after it is invalidated. Consider filtering/validating createEntries before recording indices, or carrying an explicit placeholder-to-result mapping from bulkStore.

Useful? React with 👍 / 👎.

@jlin53882
Copy link
Copy Markdown
Contributor Author

PR 782 Fixes Follow-up

Fixed Issues

Level Issue Fix
P0 bulkStore filter causes bulkIndex misalignment Use supersedes field matching instead of array index
P1 handleContradict skips when target not found Restore old behavior - still create contradiction entry

P0 Fix Details

  • Problem: bulkStore filters invalid entries (empty vector), compressing bulkResults
  • Old: Use array index (bulkIndex = createEntries.length) ??index can be wrong
  • New: Build supersedes ??newEntry Map from metadata field

P1 Fix Details

  • Problem: When target not found, return without creating entry
  • Fix: Still create contradiction entry (marked as "orphan"), preserving LLM's contradiction decision

Not Fixed (Known Limitations)

Level Issue Note
P2 valid_from Already has valid_from: now, no fix needed
P2 MR2 guard Single extractAndPersist scope is known design

Commit: d4ab579
Analyzer: GitNexus + Claude Code

@jlin53882
Copy link
Copy Markdown
Contributor Author

PR 782 Known Limitations

Documented Limitations

Level Issue Note
P2 MR2 guard cross-batch ineffective Single extractAndPersist call only. Multiple calls in same session won't benefit.
P2 valid_from inherits on OLD entry This is intentional - preserves original timestamp for version chain

Why These Are Acceptable

  1. MR2 guard: Prevents duplicate supersedes WITHIN a single batch - this is the main use case (multiple LLM candidates for same memory)
  2. valid_from on old entry: The old entry keeps its original valid_from - this is by design comment in code. The NEW entry correctly gets valid_from: now

Not Blockers

  • No data corruption risk from these limitations
  • Main P0 issues (bulkIndex + rollback) are fixed
  • Core functionality works correctly

These are documented here for transparency. The PR can be merged with these known limitations documented.

… batch mode implementation

- Tests for handleSupersede batch mode (bulkStore + update invalidation)
- Tests for decision-degradation guard (contradict with invalid match_index → create)
- Tests for rollback behavior when update fails (delete new entries from bulkStore)
- Fix src/smart-extractor.ts to PR CortexReach#782 commit d161157 (1844 lines, 4 InvalidateEntry)
- Register test in CI manifest (core-regression group)
- Use jiti for TypeScript import in test files
@jlin53882
Copy link
Copy Markdown
Contributor Author

Unit Test Coverage for PR #782

Add unit tests for fix: Issue #675 #676 complete batch mode implementation. All 5/5 pass.

New tests: test/pr678-pr723-batch-invalidation.test.mjs

Test Verifies
batch mode: uses bulkStore for new entry and calls update to invalidate old entry supersede decision → bulkStore creates new + update sets superseded_by on old
batch mode: preserves original valid_from when invalidating old entry Invalidated entry retains original valid_from (history preserved)
standalone path (no existing entry): uses bulkStore (batch mode) Empty store → createbulkStore (not store)
skips contradiction when getById returns null — no entry created, no update contradict + match_index: 1 + empty topSimilar → degrades to create (defensive)
bulkStore succeeds, update fails: deletes new entries (rollback) update throws → rollback deletes entries from bulkStore

Key behavioral findings

1. Destructive-decision guard
When topSimilar is empty and match_index points beyond the array, hasValidIndex = false triggers decision degradation — supersede/contradict becomes create. Prevents spurious invalidations against empty result sets.

2. Decision routing by category
handleSupersede vs handleContradict determined by category ∈ TEMPORAL_VERSIONED_CATEGORIES (preferences | entities). Non-TEMPORAL_VERSIONED categories route to handleContradict.

3. Rollback behavior
When bulkStore succeeds but update fails, only new entries (from bulkStore) are deleted. stats.rolledBack is not set by production code — rollback is logged but no stat emitted.

Execution

node --test test/pr678-pr723-batch-invalidation.test.mjs
# → # tests 5, # pass 5, # fail 0

CI

Registered in scripts/ci-test-manifest.mjs (core-regression group). Branch: jlinfork/test/pr-782-unit-tests.

@jlin53882 jlin53882 force-pushed the fix/pr-678-analysis branch from d4ab579 to be1ef34 Compare May 10, 2026 18:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant