Timeboxing Refine: dual constraint extraction per turn (graphflow_turn + refine_background_memory both extracting) #104

@hugolytics

Problem

During Refine turns, the same constraint is extracted twice in the same turn: once by graphflow_turn (the main stage graph execution) and again by refine_background_memory (a concurrent background task). This causes:

  1. local_count growing faster than expected per turn
  2. Duplicate LLM extraction calls (wasted tokens + latency)
  3. Semantic near-duplicates accumulating (e.g., "Hockey Today" + "Hockey Training" both added for the same event)

Evidence

Session: 1772956522.019509 (2026-03-08) | log logs/timeboxing_session_20260308_075522_1772956522.019509_66949.log

  • Refine turn 3: constraints 'Evening Routines' and 'Sleep Prep' appeared in both graphflow_turn and refine_background_memory extraction logs for the same turn
  • local_count grew from 193 to 201 across the session (8 additions), but the user introduced only ~4 semantically distinct constraints
  • "Hockey Today" and "Hockey Training" each added as separate rows despite referring to the same event

Partial mitigation already in place

The add_constraints method (merged 2026-03-08, branch issue/96-calendar-sync-idempotency) now deduplicates shared-scope constraints at the store level using _group_constraints_by_semantics. Even if both graphflow_turn and refine_background_memory extract the same constraint, only one canonical row is persisted or updated.

Remaining gap

The dual LLM extraction still fires: each Refine turn makes two separate LLM calls for constraint extraction, consuming tokens and adding latency. Ideally, refine_background_memory should skip extraction when graphflow_turn has already extracted constraints in the same turn.

Proposed fix directions

  1. Coordinator-level deduplication gate: track a _last_extraction_turn_id in the session state; refine_background_memory checks if extraction already ran for this turn and skips
  2. Merge refine_background_memory extraction into graphflow_turn: make background memory a post-hook of the stage graph rather than a separate concurrent task
  3. Token-bucket / extraction-once-per-turn semaphore: shared async lock keyed by (session_key, turn_id) that only one extractor acquires

Acceptance criteria

  • Constraint extraction fires at most once per Refine turn per session
  • local_count growth per turn approximately matches the number of distinct new user constraints introduced in that turn
  • No duplicate LLM extraction events for the same constraint in the same turn (verifiable with timebox_log_query.py llm --session-key ...)
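
The first criterion could be checked mechanically once the LLM events for a session are available as records. The sketch below is illustrative only: the output format of timebox_log_query.py is not shown in this issue, so the dict fields used here (session_key, turn_id, kind) and the "constraint_extraction" value are assumptions, as is the function name.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List, Tuple


def find_duplicate_extractions(
    events: Iterable[Dict[str, Any]],
) -> List[Tuple[str, int, int]]:
    """Return (session_key, turn_id, count) for turns with >1 extraction.

    Assumes each event is a dict with hypothetical fields `session_key`,
    `turn_id`, and `kind`; only `kind == "constraint_extraction"` is counted.
    """
    counts = Counter(
        (e["session_key"], e["turn_id"])
        for e in events
        if e.get("kind") == "constraint_extraction"
    )
    return sorted((s, t, n) for (s, t), n in counts.items() if n > 1)


if __name__ == "__main__":
    sample = [
        {"session_key": "1772956522.019509", "turn_id": 3,
         "kind": "constraint_extraction", "source": "graphflow_turn"},
        {"session_key": "1772956522.019509", "turn_id": 3,
         "kind": "constraint_extraction", "source": "refine_background_memory"},
        {"session_key": "1772956522.019509", "turn_id": 4,
         "kind": "constraint_extraction", "source": "graphflow_turn"},
    ]
    print(find_duplicate_extractions(sample))
```

An empty result for a session would indicate that extraction fired at most once per Refine turn.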


Labels: bug
