
Cache CircuitToEinsum metadata for repeated observable sweeps#219

Open
AbdelStark wants to merge 1 commit into NVIDIA:main from AbdelStark:codex/cache-circuit-to-einsum-metadata

Conversation

@AbdelStark

Summary

This change caches CircuitToEinsum forward/inverse metadata for repeated observable-oriented calls and adds focused cache tests plus a small benchmark script.

Problem

CircuitToEinsum already caches the base forward parse through _get_inputs(), but repeated calls to:

  • expectation(...)
  • reduced_density_matrix(...)
  • marginal_probability(...)

still rebuild the forward/inverse metadata from scratch in _get_forward_inverse_metadata(...).

For repeated observable sweeps on the same circuit, this adds avoidable Python-side overhead even when:

  • the circuit topology is unchanged
  • the parser options are unchanged
  • the lightcone support is unchanged
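The cost pattern above can be modeled with a small stand-in (ToyConverter and its methods are hypothetical illustrations, not the real CircuitToEinsum class): every observable-oriented call reconstructs the metadata even when the inputs are identical.

```python
class ToyConverter:
    """Toy stand-in for the pre-change behavior (hypothetical class, not
    the real CircuitToEinsum): each observable-oriented call rebuilds the
    forward/inverse metadata from scratch."""

    def __init__(self):
        self.metadata_builds = 0  # counts how often metadata is reconstructed

    def _get_forward_inverse_metadata(self, coned_qubits):
        # Rebuilt on every call, even for identical inputs.
        self.metadata_builds += 1
        return {"qubits_frontier": {q: 0 for q in coned_qubits}}

    def expectation(self, pauli_string):
        support = tuple(range(len(pauli_string)))
        return self._get_forward_inverse_metadata(support)


converter = ToyConverter()
for pauli in ["XZ", "YY", "ZX"]:  # same circuit, same support each time
    converter.expectation(pauli)
print(converter.metadata_builds)  # 3 rebuilds for 3 identical-support calls
```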

What changed

  • added a private _forward_inverse_metadata_cache on CircuitToEinsum
  • cache key is based on:
    • lightcone
    • normalized coned-qubit positions
    • decompose_gates
    • check_diagonal
  • cache entries store:
    • parsed input metadata
    • a copied qubits_frontier
    • next_frontier
    • inverse gates
    • inverse-gate diagonal flags
  • cache hits return qubits_frontier.copy() so each call still gets an isolated mutable frontier map
  • added focused tests for:
    • repeated identical calls reusing one cache entry
    • equivalent where orderings reusing one cache entry
    • distinct supports creating distinct entries
    • lightcone=True and lightcone=False creating separate entries
  • added python/samples/tensornet/circuit_to_einsum_cache_benchmark.py for repeated-call timing
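The key-normalization and copy-on-hit behavior described above can be sketched in isolation (CachedConverter and the placeholder metadata fields are hypothetical; only the attribute name _forward_inverse_metadata_cache mirrors the change):

```python
class CachedConverter:
    """Minimal sketch of the caching scheme, under assumed names; the real
    cache lives as a private attribute on CircuitToEinsum."""

    def __init__(self):
        self._forward_inverse_metadata_cache = {}

    def _cache_key(self, lightcone, coned_qubits, decompose_gates, check_diagonal):
        # frozenset normalizes ordering, so where=(0, 2) and where=(2, 0)
        # hash to the same cache entry.
        return (lightcone, frozenset(coned_qubits), decompose_gates, check_diagonal)

    def _get_forward_inverse_metadata(self, lightcone, coned_qubits,
                                      decompose_gates=True, check_diagonal=True):
        key = self._cache_key(lightcone, coned_qubits, decompose_gates, check_diagonal)
        entry = self._forward_inverse_metadata_cache.get(key)
        if entry is None:
            entry = {"qubits_frontier": {q: 0 for q in coned_qubits},
                     "next_frontier": len(coned_qubits),
                     "inverse_gates": []}  # placeholder metadata fields
            self._forward_inverse_metadata_cache[key] = entry
        # Hand back a copied frontier so each call mutates its own map.
        return {**entry, "qubits_frontier": entry["qubits_frontier"].copy()}


c = CachedConverter()
a = c._get_forward_inverse_metadata(True, (0, 2))
b = c._get_forward_inverse_metadata(True, (2, 0))  # reordered support: cache hit
assert len(c._forward_inverse_metadata_cache) == 1
a["qubits_frontier"][0] += 1                       # mutation stays local
assert b["qubits_frontier"][0] == 0
c._get_forward_inverse_metadata(False, (0, 2))     # lightcone flip: new entry
print(len(c._forward_inverse_metadata_cache))      # 2
```

Returning a shallow copy only of qubits_frontier keeps cache hits cheap while still giving each call an isolated mutable frontier map.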

User impact

  • reduces repeated metadata construction overhead for observable sweeps
  • preserves current public APIs
  • preserves numerical behavior
  • makes repeated CircuitToEinsum use closer to a reusable compiled artifact instead of a one-shot translator

Validation

Targeted validation was done locally against the pure-Python converter path using real Qiskit and Cirq circuits.

Semantic checks covered:

  • cache reuse for repeated identical expectation(...) calls
  • cache-key normalization for reordered where arguments
  • cache separation by distinct support sets
  • cache separation by lightcone mode
  • output stability on cache hits

Local timing on a 12-qubit / depth-20 Qiskit circuit:

  • expectation(lightcone=True): cached mean 0.000939 s, uncached mean 0.012993 s, 13.84x speedup
  • reduced_density_matrix(lightcone=True): cached mean 0.000901 s, uncached mean 0.011013 s, 12.23x speedup
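A repeated-call timing harness in the spirit of the benchmark script can be sketched as follows (build_metadata_uncached and its sleep-based workload are stand-ins, not the real converter; the shape of the measurement loop is the point):

```python
import time
from functools import lru_cache


def build_metadata_uncached(support):
    # Stand-in for the Python-side parsing work done on every call.
    time.sleep(0.002)
    return {"support": support}


@lru_cache(maxsize=None)
def build_metadata_cached(support):
    # Same workload, but memoized on the (hashable) support.
    return build_metadata_uncached(support)


def mean_time(fn, support, repeats=20):
    start = time.perf_counter()
    for _ in range(repeats):
        fn(support)
    return (time.perf_counter() - start) / repeats


support = frozenset({0, 1, 2})
uncached = mean_time(build_metadata_uncached, support)
cached = mean_time(build_metadata_cached, support)
print(f"uncached mean {uncached:.6f} s, cached mean {cached:.6f} s")
```

With a fixed support, only the first cached call pays the build cost, so the cached mean collapses toward the dictionary-lookup time, which is the effect the local timings above measure.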

AbdelStark changed the title from [codex] Cache CircuitToEinsum metadata for repeated observable sweeps to Cache CircuitToEinsum metadata for repeated observable sweeps on Apr 14, 2026.
