Skip to content

PIAL Phase 0.1 — Montgomery reduction + property tests for Miller-Rabin kernel #357

@shaal

Description

@shaal

Context

Phase 0 of PIAL (ADR-151) landed in #PR-TBD. The Sinclair-12 Miller-Rabin kernel is correct on all pinned A014233 strong-pseudoprime regressions and runs in ~15.6 µs worst case on M-series (`u64::MAX − 58`).

During Phase 0 we ran an apples-to-apples competitor benchmark — `num-prime 0.4.4`'s `is_prime64` in the same binary on the same machine — and measured 884 ns vs our 15.63 µs, i.e. ~17.7× headroom. The PRD §6 target of 50 ns turned out to be structurally unachievable in safe Rust on Apple-silicon (criterion sanity baseline ≈ 467 ps; the harness is honest). The target was relaxed to ≤ 1 µs M-series / ≤ 4 µs WASM in the PRD §6 amendment and ADR-151 "Phase 0 Findings".

Phase 0.1's job is to close the 17.7× gap by adopting Montgomery-form modular multiplication (the actual hot-path differentiator with `num-prime`), and to harden the test suite with property-based fuzzing before we begin to take consumer dependencies on the kernel.

Scope (in)

  1. Montgomery reduction for `mr_mulmod_u64` / `mr_powmod_u64` in `crates/ruvector-collections/src/primality_kernel.rs`. Replace the current u128-product `% m` with a Montgomery-form CIOS or REDC implementation. Constraints:

    • Same Sinclair-12 witness set — do NOT change the deterministic witness list (see rejection rationale below).
    • All A014233 pinned regressions must still pass.
    • All 114 build-time table entries must still match (table_cross_check).
    • Must still be safe Rust; no inline assembly; no `unsafe` blocks unless absolutely required and justified inline.
  2. Property-based fuzz tests added under `crates/ruvector-collections/tests/`:

    • `primality_proptest.rs`: random odd `n` agreement test against a slow trial-division oracle for `n < 10^9` (or against the existing kernel for larger `n`).
    • 10K iterations minimum on CI; configurable seed for repro.
    • Must run in the regular `cargo test` suite, not behind a feature flag.
  3. Bench gate: `is_prime_u64 worst case` criterion result must drop into the ≤ 1 µs M-series band (PRD §6 revised target). Re-run the num-prime apples-to-apples baseline in the same change to confirm the gap closes.

Out of scope

  • AVX-512 or NEON-specific paths.
  • WASM micro-optimisations (Phase 1 work).
  • Public-API surface changes (the kernel signatures and the public functions in `src/primality.rs` stay unchanged).
  • Touching the build-time prime tables (already validated; should remain byte-identical).

Explicit rejection: 7-witness swap

External Grok review (see `docs/research/miller-rabin-optimizations/GROK-REVIEW-REQUEST.md`) suggested swapping the Sinclair-12 witness set for the empirical 7-witness set found in some literature, claiming a ~40% speedup. Reject this for Phase 0.1 for two reasons:

  1. Correctness pedigree: Sinclair (2011) proves base `{2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37}` is deterministic over all `n < 2^64`. The 7-witness sets in the wild are empirically tested up to `2^64` but not theorem-proven. We are not willing to swap a proven set for an empirical one in a primality utility that may be used in security-adjacent code paths.

  2. Test canary regression: `tests/primality_pseudoprimes.rs::SPP_FIRST_11 = 3_825_123_056_546_413_051` is the A014233(11) entry — the smallest strong pseudoprime to bases 2..37. It was added precisely to make "silently dropping base 37" a hard test failure. Swapping witness sets would force us to delete that canary, weakening regression coverage.

The 17.7× gap is a constant-factor / mulmod issue, not a witness-count issue. Montgomery reduction targets the actual bottleneck.

Acceptance criteria

  • All existing tests still pass (`cargo test -p ruvector-collections --features unstable-u128` green).
  • New `primality_proptest.rs` runs ≥ 10K iterations and passes.
  • Worst-case bench drops to ≤ 1 µs on M-series (re-run on the same machine class as Phase 0).
  • num-prime apples-to-apples re-baseline shows the gap closed (within ~2× is acceptable).
  • Sinclair-12 witness set unchanged.
  • Zero new clippy warnings on the collections crate.
  • No `unsafe` introduced without inline justification comment.

References

  • ADR-151 Phase 0 Findings: `docs/adr/ADR-151-miller-rabin-prime-optimizations.md`
  • PRD §6 / §6.1 revision: `docs/research/miller-rabin-optimizations/PRD.md`
  • Grok review request + response context: `docs/research/miller-rabin-optimizations/GROK-REVIEW-REQUEST.md`
  • Phase 0 PR: TBD (will link once opened)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions