feat: Lazy identity-flag evaluation in local-eval mode#200

Open
khvn26 wants to merge 3 commits into main from feat/lazy-identity-flags
Conversation

@khvn26 (Member) commented Apr 24, 2026

Two complementary changes targeting the local-eval latency reported in flagsmith-python-client#198 and the customer thread that motivated it.

1. Lazy identity-flag evaluation

Flagsmith.get_identity_flags now returns a Flags object that holds the evaluation context plus a precomputed segment-overrides reverse index, and resolves each feature on first access via the engine's already-public is_context_in_segment / get_flag_result_from_context primitives, instead of running a full bulk evaluation up-front.
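The first-access-with-caching pattern described above can be sketched roughly as follows. The class and attribute names here (`LazyFlags`, `_resolve`) are illustrative only, not the PR's actual identifiers; the real implementation delegates to the flag-engine primitives named above.

```python
from typing import Any, Callable, Dict


class LazyFlags:
    """Minimal sketch of lazy per-feature resolution with caching.

    Names are illustrative; the real Flags object resolves via the
    engine's is_context_in_segment / get_flag_result_from_context.
    """

    def __init__(
        self,
        context: dict,
        overrides_index: Dict[str, list],
        resolve: Callable[[str], Any],
    ) -> None:
        self._context = context
        self._overrides_index = overrides_index  # feature name -> overrides
        self._resolve = resolve                  # engine-backed resolver
        self._cache: Dict[str, Any] = {}

    def get_flag(self, name: str) -> Any:
        # Evaluate on first access only; subsequent reads hit the cache,
        # so a hot loop touching one flag never re-walks the segments.
        if name not in self._cache:
            self._cache[name] = self._resolve(name)
        return self._cache[name]

    def is_feature_enabled(self, name: str) -> bool:
        return bool(self.get_flag(name))
```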

The reverse index is rebuilt inside the _evaluation_context setter, so it stays in sync with environment refreshes with zero hot-path cost.
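Rebuilding the index inside the context setter, as described above, might look like this sketch. Attribute and key names (`segments`, `overrides`, `feature`) are placeholders for illustration, not the engine's actual document schema.

```python
class LocalEvaluator:
    """Sketch: rebuild the segment-overrides reverse index whenever the
    evaluation context is replaced (e.g. on environment refresh), so
    per-flag lookups pay no indexing cost on the hot path."""

    def __init__(self) -> None:
        self._context: dict = {}
        self._overrides_index: dict = {}

    @property
    def evaluation_context(self) -> dict:
        return self._context

    @evaluation_context.setter
    def evaluation_context(self, context: dict) -> None:
        self._context = context
        # Rebuilt once per refresh; keys are feature names, values are
        # the segments that override them.
        index: dict = {}
        for segment in context.get("segments", []):
            for override in segment.get("overrides", []):
                index.setdefault(override["feature"], []).append(segment)
        self._overrides_index = index
```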

A new lazy_identity_evaluation: bool = True constructor kwarg acts as a rollback switch.

2. Skip env-doc re-parse on no-op refresh

update_environment now sends a HEAD first and compares the x-flagsmith-document-updated-at response header against the value stored from the last successful fetch. When they match, the GET, JSON parse, map_environment_document_to_context, and overrides-index rebuild are all skipped — the cached evaluation context is reused. HEAD failures fall through to the existing GET path so no environment regresses if a proxy doesn't permit HEAD.

This eliminates the ~5 ms p99 GIL stall the polling thread otherwise imposes every environment_refresh_interval_seconds (default 60s) when re-parsing a 400+ feature env doc.
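The HEAD-first refresh above can be sketched as below. The function shape and session parameter are illustrative, not the SDK's actual internal API; only the `x-flagsmith-document-updated-at` header name comes from the PR.

```python
from typing import Any, Optional, Tuple

UPDATED_AT_HEADER = "x-flagsmith-document-updated-at"


def refresh_environment(
    session: Any,  # requests.Session-like: .head()/.get() return responses
    url: str,
    last_updated_at: Optional[str],
) -> Tuple[Optional[dict], Optional[str]]:
    """Sketch of the HEAD-first refresh. Returns (document, updated_at);
    document is None when the header matched and the cached evaluation
    context can be reused as-is."""
    try:
        head = session.head(url, timeout=5)
        head.raise_for_status()
        updated_at = head.headers.get(UPDATED_AT_HEADER)
        if updated_at is not None and updated_at == last_updated_at:
            # No-op refresh: skip the GET, JSON parse, context mapping,
            # and overrides-index rebuild.
            return None, last_updated_at
    except Exception:
        # HEAD blocked or failed (e.g. a proxy that forbids it):
        # fall through to the existing GET path.
        pass
    response = session.get(url, timeout=5)
    response.raise_for_status()
    return response.json(), response.headers.get(UPDATED_AT_HEADER)
```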

Benchmark against a real customer environment (Sevenrooms QA: 434 features, 64 segments, 305 rules)

Hot loop of get_identity_flags(...).is_feature_enabled(name), 20,000 iterations.

| Version | mean | p50 | p99 | p99.9 | max |
| --- | --- | --- | --- | --- | --- |
| flagsmith 3.8.0 (flag-engine 5.4.3) | 434 µs | 408 µs | 726 µs | 2.59 ms | 25.7 ms |
| flagsmith 5.2.0 (flag-engine 10.0.4) | 442 µs | 403 µs | 572 µs | 4.52 ms | 42.4 ms |
| this PR (lazy default) | 2.5 µs | 2.3 µs | 3.4 µs | 29 µs | 170 µs |
| this PR (lazy_identity_evaluation=False) | 456 µs | 409 µs | 1.22 ms | 5.10 ms | 26.2 ms |

That is a 100–200× speedup across every percentile vs. any released version, on the customer's actual environment shape.

Back-compat

  • Flags public surface unchanged (is_feature_enabled, get_feature_value, get_flag, all_flags).
  • FlagResult produced via the same engine helper as the bulk path — identical output shape.
  • lazy_identity_evaluation=False restores the old eager-eval path exactly, including its timing profile.
  • HEAD-skip is invisible when the document changes; HEAD failures are silently swallowed and the GET path still applies.
  • Two existing tests that mock engine.get_evaluation_result are pinned to lazy_identity_evaluation=False — they exercise the eager-path call shape, which still ships as the rollback.

Engine contract

Untouched. SDK only imports already-public symbols from flag_engine.segments.evaluator.

Tests

  • 8 new unit tests in test_models.py (override match/no-match, per-flag caching, all_flags materialisation, fallthrough to default handler, reverse-index correctness).
  • 4 new integration tests in test_flagsmith.py: lazy-by-default, rollback kwarg, HEAD-skip on unchanged docs, HEAD-failure fall-through.
  • All 97 tests pass; mypy --strict, black, and isort are clean.

``get_identity_flags`` now returns a ``Flags`` that holds the
evaluation context plus a precomputed segment-overrides reverse index,
and resolves each feature on first access via the engine primitives
(``is_context_in_segment`` + ``get_flag_result_from_context``) rather
than running a full bulk evaluation up-front.

In environments shaped like the Slack-report customer (420 features,
30 CSV-IN segments, hot loop reading one boolean flag) this takes
``get_identity_flags().is_feature_enabled(name)`` from ~430 µs to
~1.85 µs per call; 200-segment envs go from ~1200 µs to ~2 µs. The
``.all_flags()`` materialisation path is never slower than the
eager baseline in the bench matrix.

Back-compat:
  * ``Flags`` public API unchanged (``is_feature_enabled``,
    ``get_feature_value``, ``get_flag``, ``all_flags``).
  * ``FlagResult`` construction reuses the same engine helper as the
    bulk path — identical output shape.
  * New ``lazy_identity_evaluation`` constructor kwarg, default
    ``True``, lets operators flip back to the eager path if they hit
    an unexpected regression.

Engine contract is untouched: the SDK consumes only already-public
``flag_engine.segments.evaluator`` symbols.

beep boop
@khvn26 khvn26 requested a review from a team as a code owner April 24, 2026 20:10
@khvn26 khvn26 requested review from gagantrivedi and removed request for a team April 24, 2026 20:10
khvn26 added 2 commits April 24, 2026 21:24
Picks up the IN segment-condition evaluation speedup
(Flagsmith/flagsmith-engine#295), which cuts per-IN-condition latency
on segment walks by roughly 30%. Complementary to the lazy identity
evaluation added in this PR — most customer envs will benefit from both.

beep boop
``update_environment`` now sends a HEAD first and compares the
``x-flagsmith-document-updated-at`` response header against the value
stored from the last successful fetch. When they match, the GET, the
JSON parse, ``map_environment_document_to_context``, and the
overrides-index rebuild are all skipped — the cached evaluation
context is reused as-is.

On the customer's QA env this eliminates the ~5ms p99 GIL stall the
polling thread imposes every ``environment_refresh_interval_seconds``
(default 60s) — which is the largest remaining contributor to
identity-flag-eval p99 once lazy is enabled. Standard 60s polling
against a stable env now does HEAD-only round trips between actual
changes.

HEAD failures (e.g. proxy that doesn't permit it) silently fall
through to the existing GET path, so no environment regresses to a
worse-than-current behaviour if the optimisation can't apply.

beep boop
