Timeboxing Refine: dual constraint extraction per turn (graphflow_turn + refine_background_memory both extracting)

## Problem

During Refine turns, the same constraint is extracted twice within a single turn: once by `graphflow_turn` (the main stage graph execution) and again by `refine_background_memory` (a concurrent background task). This causes:

1. `local_count` growing faster than expected per turn
2. Duplicate LLM extraction calls (wasted tokens + latency)
3. Semantic near-duplicates accumulating (e.g., "Hockey Today" + "Hockey Training" both added for the same event)

## Evidence

**Session:** `1772956522.019509` (2026-03-08) | log `logs/timeboxing_session_20260308_075522_1772956522.019509_66949.log`

- Refine turn 3: constraints 'Evening Routines' and 'Sleep Prep' appeared in both `graphflow_turn` and `refine_background_memory` extraction logs for the same turn
- `local_count` grew from 193 → 201 across the session (8 additions), but only ~4 semantically distinct constraints were introduced by the user
- "Hockey Today" and "Hockey Training" were added as separate rows despite referring to the same event

## Partial mitigation already in place

The `add_constraints` method (merged 2026-03-08, branch `issue/96-calendar-sync-idempotency`) now deduplicates shared-scope constraints at the **store level** using `_group_constraints_by_semantics`. Even if both extractors (`graphflow_turn` and `refine_background_memory`) produce the same constraint, only one canonical row is persisted or updated.
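The effect of store-level deduplication can be illustrated with a minimal sketch. Everything here is hypothetical: the `Constraint` dataclass, `semantic_key`, and `add_constraints_sketch` are illustration-only names, and the case- and whitespace-insensitive name match is a stand-in for whatever `_group_constraints_by_semantics` actually does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraint:
    """Hypothetical constraint shape, for illustration only."""
    name: str
    scope: str


def semantic_key(c: Constraint) -> tuple:
    # Assumption: two constraints are "the same" when their scope matches and
    # their names match after lowercasing and collapsing whitespace.
    return (c.scope, " ".join(c.name.lower().split()))


def add_constraints_sketch(store: dict, incoming: list) -> int:
    """Insert only constraints whose key is not yet in the store.

    Returns the number of rows actually added, so a second extractor
    submitting the same constraint adds nothing.
    """
    added = 0
    for c in incoming:
        key = semantic_key(c)
        if key not in store:
            store[key] = c
            added += 1
    return added


store: dict = {}
first = add_constraints_sketch(store, [Constraint("Sleep Prep", "shared")])
second = add_constraints_sketch(store, [Constraint("sleep  prep", "shared")])
```

A key this simple would not merge "Hockey Today" with "Hockey Training"; catching near-duplicates like that needs a looser notion of semantic equality than name normalization.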

## Remaining gap

The dual LLM extraction still fires: each Refine turn makes two separate LLM calls for constraint extraction, consuming tokens and adding latency. Ideally, `refine_background_memory` should skip extraction if `graphflow_turn` has already extracted constraints in the same turn.

## Proposed fix directions

1. **Coordinator-level deduplication gate:** track a `_last_extraction_turn_id` in the session state; `refine_background_memory` checks whether extraction already ran for the current turn and, if so, skips its own extraction
2. **Merge `refine_background_memory` extraction into `graphflow_turn`:** make background memory a post-hook of the stage graph rather than a separate concurrent task
3. **Token-bucket / extraction-once-per-turn semaphore:** shared async lock keyed by `(session_key, turn_id)` that only one extractor acquires; the other skips extraction instead of waiting for the lock
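Directions 1 and 3 share the same core idea, which might look roughly like the sketch below. `ExtractionGate`, `try_claim`, and `run_extractor` are hypothetical names, not existing code, and `asyncio.sleep(0)` stands in for the real LLM extraction call.

```python
import asyncio


class ExtractionGate:
    """Hypothetical once-per-turn gate; not part of the existing codebase."""

    def __init__(self) -> None:
        self._claimed = set()        # (session_key, turn_id) pairs already taken
        self._lock = asyncio.Lock()  # guards check-and-add if it ever awaits

    async def try_claim(self, session_key: str, turn_id: str) -> bool:
        """Return True for the first caller of a (session, turn) pair, else False."""
        key = (session_key, turn_id)
        async with self._lock:
            if key in self._claimed:
                return False
            self._claimed.add(key)
            return True


async def run_extractor(gate, name, session_key, turn_id, calls):
    # Skip (rather than wait) when another extractor already owns this turn.
    if not await gate.try_claim(session_key, turn_id):
        return
    await asyncio.sleep(0)  # stand-in for the LLM extraction call
    calls.append(name)


async def demo() -> list:
    gate = ExtractionGate()
    calls = []
    await asyncio.gather(
        run_extractor(gate, "graphflow_turn", "s1", "turn-3", calls),
        run_extractor(gate, "refine_background_memory", "s1", "turn-3", calls),
    )
    return calls
```

Running `asyncio.run(demo())` records a single extraction for the turn. A real implementation would also need to evict old `(session_key, turn_id)` entries and decide which extractor should win when the background task happens to start first.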

## Acceptance criteria

- [ ] Constraint extraction fires at most once per Refine turn per session
- [ ] `local_count` growth per turn approximately matches the number of distinct new user constraints introduced in that turn
- [ ] No duplicate LLM extraction events for the same constraint in the same turn (verifiable in `timebox_log_query.py llm --session-key ...`)
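
The first and third criteria could be checked mechanically along the lines of the sketch below. The record shape is an assumption: the `kind`, `session_key`, and `turn_id` fields are hypothetical and may not match what `timebox_log_query.py` actually emits.

```python
from collections import Counter


def turns_with_repeated_extraction(events: list) -> list:
    """Return (session_key, turn_id) pairs with more than one extraction event.

    Assumes each event is a dict with hypothetical keys
    'kind', 'session_key', and 'turn_id'.
    """
    counts = Counter(
        (e["session_key"], e["turn_id"])
        for e in events
        if e["kind"] == "constraint_extraction"
    )
    return sorted(key for key, n in counts.items() if n > 1)


# Illustrative records only; not taken from a real log.
sample_events = [
    {"kind": "constraint_extraction", "session_key": "s1", "turn_id": 3},
    {"kind": "constraint_extraction", "session_key": "s1", "turn_id": 3},
    {"kind": "constraint_extraction", "session_key": "s1", "turn_id": 4},
    {"kind": "stage_render", "session_key": "s1", "turn_id": 4},
]
offenders = turns_with_repeated_extraction(sample_events)
```

An empty result for a session would indicate that extraction fired at most once per turn.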


Timeboxing Refine: dual constraint extraction per turn (graphflow_turn + refine_background_memory both extracting) #104
