test_gpu_orbit_bench is flaky: exact loss-step equality between CPU and GPU paths is not a stable criterion #380

## Observation

The CI run for commit abf6ca9 passed Build and Test at 2026-06-10T13:30 (run 27279802255) and failed at 2026-06-10T15:00 (run 27285400079) with no change to code or dependencies. The failing test was test_gpu_orbit_bench:

```
max |z_cpu - z_gpu| (final state) =   5.2859E+00
loss-step mismatches = 1 / 8
CPU lost = 1 / 8   confined frac =  0.8750
GPU lost = 1 / 8   confined frac =  0.8750
lost<->confined flips = 0 / 8
```

## Why it is flaky

The test passes only if the output matches `loss-step mismatches = 0` (PASS_REGULAR_EXPRESSION in test/tests/CMakeLists.txt). The CPU reference (procedure-pointer dispatch, OpenMP) and the GPU kernel (separately compiled `gpu_timestep_euler`) are numerically different code paths, so their trajectories diverge chaotically over the 51-macrostep trace: the final-state difference is O(1) even on passing runs (e.g. 2.76 locally). For a particle that is marginal against the s=1 loss boundary, the macrostep at which it crosses is therefore effectively a coin flip that depends on the runner's FP environment. In the failing run above, both paths lose the same particle (flips = 0), but at different macrosteps.
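To illustrate the mechanism only, here is a toy sketch in Python. It uses the logistic map as a stand-in for a chaotic trajectory; it is not the project's orbit integrator, and the starting value, the perturbation size and the boundary at 0.999 are all assumptions chosen for the demo. Two traces that start 1e-12 apart, roughly what a different operation order could produce, separate by many orders of magnitude within 51 steps, so the step at which either first crosses a boundary is not something the two traces can be expected to share.

```python
def logistic_orbit(x0, n_steps, r=4.0):
    """Iterate the chaotic logistic map x -> r*x*(1-x) and return the full trace."""
    trace = [x0]
    for _ in range(n_steps):
        x = trace[-1]
        trace.append(r * x * (1.0 - x))
    return trace


def first_loss_step(trace, boundary):
    """Return the first step index with trace value >= boundary, or None if never reached."""
    for step, x in enumerate(trace):
        if x >= boundary:
            return step
    return None


N_STEPS = 51          # same length as the macrostep trace in the test
PERTURBATION = 1e-12  # hypothetical stand-in for a CPU/GPU rounding difference
BOUNDARY = 0.999      # hypothetical "loss" boundary for the toy map

trace_a = logistic_orbit(0.3, N_STEPS)
trace_b = logistic_orbit(0.3 + PERTURBATION, N_STEPS)

# Largest separation reached anywhere along the two traces.
max_sep = max(abs(a - b) for a, b in zip(trace_a, trace_b))

print("initial separation:", PERTURBATION)
print("max separation    :", max_sep)
print("first loss step A :", first_loss_step(trace_a, BOUNDARY))
print("first loss step B :", first_loss_step(trace_b, BOUNDARY))
```

Whether the two printed loss steps coincide depends on the perturbation, which is the point: agreement is incidental rather than guaranteed.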

## Suggested fix

Keep the strong checks where they are well-posed and drop the ill-posed one:

- short-horizon equivalence: after a few microsteps, require max |z_cpu - z_gpu| below a tight tolerance (catches genuine kernel bugs before chaos amplifies),
- long-horizon statistics: require `lost<->confined flips = 0` (classification agreement) instead of exact loss-step equality.
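The two checks could look roughly like the Python sketch below. This is not the test's actual code: the function names, the `None`-means-confined encoding, the tolerance of 1e-10 and the sample loss steps (37 and 38) are all hypothetical, chosen only to mirror the shape of the failing run (8 particles, 1 lost on each path, at different macrosteps).

```python
from typing import Optional, Sequence

# Hypothetical encoding: macrostep at which a particle was lost, None if still confined.
LossStep = Optional[int]


def loss_step_mismatches(cpu: Sequence[LossStep], gpu: Sequence[LossStep]) -> int:
    """Current criterion: count particles whose loss step differs in any way."""
    return sum(c != g for c, g in zip(cpu, gpu))


def classification_flips(cpu: Sequence[LossStep], gpu: Sequence[LossStep]) -> int:
    """Proposed criterion: count particles lost on one path but confined on the other."""
    return sum((c is None) != (g is None) for c, g in zip(cpu, gpu))


def max_abs_diff(z_cpu: Sequence[Sequence[float]], z_gpu: Sequence[Sequence[float]]) -> float:
    """Largest componentwise difference between two sets of particle states."""
    return max(
        (abs(a - b) for row_c, row_g in zip(z_cpu, z_gpu) for a, b in zip(row_c, row_g)),
        default=0.0,
    )


def passes(cpu_loss, gpu_loss, z_cpu_short, z_gpu_short, tol: float) -> bool:
    """Short-horizon state agreement plus long-horizon classification agreement."""
    return max_abs_diff(z_cpu_short, z_gpu_short) <= tol and classification_flips(cpu_loss, gpu_loss) == 0


# Illustrative data only: same particle lost on both paths, one macrostep apart.
cpu_loss = [None] * 7 + [37]
gpu_loss = [None] * 7 + [38]
z_cpu_short = [[0.5, 0.1]]
z_gpu_short = [[0.5, 0.1 + 1e-14]]

print("loss-step mismatches:", loss_step_mismatches(cpu_loss, gpu_loss))
print("lost<->confined flips:", classification_flips(cpu_loss, gpu_loss))
print("passes:", passes(cpu_loss, gpu_loss, z_cpu_short, z_gpu_short, tol=1e-10))
```

With data of this shape the current criterion reports a mismatch while the proposed one does not, and a genuine disagreement about which particles are lost still fails.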

Until the criterion is changed, the test will keep failing sporadically whenever a particle is marginal.
