Control flow and memory access targeting differential cases #1164

Draft
greenhat wants to merge 5 commits into next from more-fuzza-cc
greenhat commented Jun 8, 2026

Contributor

TODO:

  • file new issues for the new ignored tests;

greenhat added 2 commits June 5, 2026 13:31
Coverage analysis of the differential corpus showed large untouched areas
in the control flow pipeline: the cf.cond_br and ub.unreachable MASM
lowerings, cfg-to-scf exit multiplexing, scf yield canonicalization, the
spill machinery, and the wide-arithmetic translation chain.

Add thirteen cases shaped to reach those paths through rustc-generated
wasm at opt-level 3, raising compiler region coverage from 30.7% to 32.5%:

- trap_branch / switch_trap_arm / ret_args: dynamically-impossible panic
  guards built on cross-modulus contradictions LLVM cannot fold, leaving
  branches between ret and unreachable exits that survive cfg-to-scf
  (Case 2) and exercise the cf.cond_br and ub.unreachable lowerings,
  including successor-block-argument scheduling.
- multi_exit_loop / continue_paths / midloop_exit / switch_shapes:
  multi-exit and multi-backedge loops plus dense and sparse dispatch,
  driving cfg-to-scf discriminator threading, scf.index_switch region
  interfaces, and the linear/binary-search switch lowerings.
- u64_exits / loop_results / calls_selects / select_sched: u64-returning
  helpers with tail-merged early returns (multi-word successor operands,
  scf yield folding), loop-carried values, non-inlined call signatures,
  and select emitter variants including the 64-bit path.
- stack_pressure / u128_mix: a deep single-use expression tree that
  pushes the operand stack past its 16-slot limit into the spill
  analysis/transform, and u128 arithmetic covering the wide-arithmetic
  wasm ops.
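
The guard shape described in the list above can be sketched as follows. This is a minimal illustration, not one of the thirteen cases: the specific moduli, the loop body, and the `u32` return type are assumptions, and whether LLVM leaves this particular guard unfolded at opt-level 3 has not been checked here. The guard is dynamically impossible because `a % 6 == 1` forces `a` to be odd while `a % 4 == 2` forces it to be even.

```rust
/// Hypothetical case body in the style of `trap_branch`.
/// The function name follows the harness's `entrypoint` convention;
/// the signature is an assumption.
pub fn entrypoint(a: u32, b: u32) -> u32 {
    // Cross-modulus contradiction: no u32 satisfies both conditions,
    // so the panic arm is never taken at runtime. If the optimizer does
    // not prove that, the wasm keeps a branch to an `unreachable` exit.
    if a % 6 == 1 && a % 4 == 2 {
        panic!();
    }

    // Trip count derived from a non-power-of-two modulus (0..=96),
    // rather than a small mask such as `b & 3`.
    let n = b % 97;
    let mut acc = a;
    for i in 0..n {
        acc = acc.wrapping_mul(31).wrapping_add(i);
    }
    acc
}
```

Because the guard can never fire and all arithmetic is wrapping, the function is deterministic and panic-free on every input pair, which is the property the differential harness relies on.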

Loop bounds use moduli like % 97 instead of small power-of-two masks,
which LLVM would otherwise peel away entirely; all cases are
deterministic and panic-free on every u32 input pair.

Add eight differential cases exercising the memory read/write paths of the
compiler: dynamic load/store addressing (prepare_addr, word load/store
emitters), runtime-length memory.copy (memcpy fast path + fallback loop),
the rodata/data-segment pipeline (single static tables and multi-segment
atomic statics), signed sub-word and unaligned loads/stores, and the
memory.size/memory.grow emitters. Together they raise the number of covered
regions in the memory area from 352 to 1115 (~17% -> ~55%).

Three cases are marked #[ignore] as they surface native/MASM divergences:

- mem_overlap: memory.copy with overlapping dst > src ranges copies forward
  instead of honoring memmove semantics.
- mem_grow: ::intrinsics::mem::memory_grow fails in the VM with a non-binary
  if-condition value on every input.
- switch_shapes (pre-existing): flaky divergence on inputs
  (1669775643, 1062584501), plus a separate "value does not fit in i32" VM
  assert.
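
The `mem_overlap` divergence can be illustrated on a plain byte slice. The sketch below is not the compiler's memcpy lowering; the function names are hypothetical and it only contrasts a forward byte loop with the memmove semantics that wasm `memory.copy` specifies for overlapping ranges.

```rust
/// Forward byte-by-byte copy. Correct for disjoint ranges, but when
/// `dst > src` and the ranges overlap it re-reads bytes it has already
/// overwritten.
pub fn copy_forward(mem: &mut [u8], dst: usize, src: usize, len: usize) {
    for i in 0..len {
        mem[dst + i] = mem[src + i];
    }
}

/// memmove semantics: the result is as if the source range had first been
/// copied to a temporary buffer. `slice::copy_within` guarantees this.
pub fn copy_memmove(mem: &mut [u8], dst: usize, src: usize, len: usize) {
    mem.copy_within(src..src + len, dst);
}

/// Runs both strategies on the same overlapping request (dst = 2, src = 0,
/// len = 4) and returns the two resulting buffers.
pub fn overlap_demo() -> ([u8; 8], [u8; 8]) {
    let mut forward = [1, 2, 3, 4, 5, 0, 0, 0];
    let mut memmove = forward;
    copy_forward(&mut forward, 2, 0, 4);
    copy_memmove(&mut memmove, 2, 0, 4);
    (forward, memmove)
}
```

With these inputs the forward loop yields `[1, 2, 1, 2, 1, 2, 0, 0]` while the memmove variant yields `[1, 2, 1, 2, 3, 4, 0, 0]`, which is the kind of native/MASM mismatch the ignored test records.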

greenhat added 2 commits June 9, 2026 11:13
Add area-focused coverage reporting to the fuzza case-generation tooling so an
agent growing coverage in one part of the compiler can judge productivity
against just that area instead of the global numbers.

- cov.py: add `--area <comma-separated path prefixes>`, emitting a "Target
  area" section with an area-only headline, an area-only delta, and the full
  list of cold (untouched / partially-covered) functions in the area. Without
  the flag the output is unchanged.
- Makefile.toml: thread an optional `FUZZA_AREA` env var into the cov.py call
  in `fuzza-cov` and `fuzza-cov-step` (via a non-empty args array, safe under
  `set -u`).
- AGENT-PROMPT.md: the target area is now a free-worded description; the agent
  resolves it to source paths and exports `FUZZA_AREA` as its first step. Also
  document the coverage-accumulation model, the mutable-`static` determinism
  trap, early termination by unreachability argument, and MIDENC_EMIT/TRACE for
  investigating divergences.
- README.md: matching workflow/doc updates.
- scratch/: gitignored working-notes dir (kept via .gitkeep) that lives outside
  target/, so it survives `fuzza-cov-clean`.
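
The area filter described for `cov.py` amounts to a prefix match over source paths followed by an area-only tally. The sketch below restates that idea in Rust for illustration only; it is not the script's code, and the record layout, type names, and function name are all assumptions.

```rust
/// Hypothetical per-function coverage record.
pub struct FnRegions {
    pub name: String,
    pub path: String,
    pub covered: u32,
    pub total: u32,
}

/// Hypothetical area-only summary: region totals plus the cold functions.
pub struct AreaReport {
    pub covered: u32,
    pub total: u32,
    pub cold: Vec<String>,
}

/// Keeps only functions whose source path starts with one of the
/// comma-separated prefixes in `area`, then sums their regions and lists
/// every function that is untouched or only partially covered.
pub fn area_report(funcs: &[FnRegions], area: &str) -> AreaReport {
    let prefixes: Vec<&str> = area
        .split(',')
        .map(str::trim)
        .filter(|p| !p.is_empty())
        .collect();

    let mut report = AreaReport { covered: 0, total: 0, cold: Vec::new() };
    for f in funcs {
        if !prefixes.iter().any(|p| f.path.starts_with(p)) {
            continue;
        }
        report.covered += f.covered;
        report.total += f.total;
        if f.covered < f.total {
            report.cold.push(f.name.clone());
        }
    }
    report
}
```

An area-only delta would then be the difference between two such reports taken before and after adding a case.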

Differential cases whose native `cdylib` retains a panic path (an impossible
trap, a guarded index, …) failed to load on Linux CI with:

    undefined symbol: rust_eh_personality

Even though each case is built with `panic = "abort"`, the precompiled `core`
library is built with `panic = "unwind"`, so referencing `core`'s panic
machinery links in unwind tables that reference `rust_eh_personality`. With no
`std` to define that symbol, the `cdylib` is left with an undefined symbol that
`dlopen` rejects on Linux; macOS tolerates it, which is why it passed locally.

Add a no-op `rust_eh_personality` to the prepended case header so the native
library is self-contained. It is never invoked, because panics abort. The stub
is gated to non-wasm so the cargo-miden (wasm -> MASM) build is unchanged.

Fixes the `ret_args`, `switch_trap_arm`, `trap_branch`, and `u64_exits`
failures in the "midenc integration tests" job.

Factor the differential harness into a shared `run_case_inner` driven by an
`Inputs` enum, and add `run_case_with_inputs`, which compares the native and
MASM `entrypoint` outputs against an explicit, deterministic list of
`(input1, input2)` pairs instead of 16 random proptest draws. The existing
`run_case` is unchanged (it delegates with `Inputs::Random16`).

Use it to pin the `switch_shapes` divergence as its own reproducer,
`switch_shapes_repro`, on the exact input the fuzzer flagged. On that input the
MASM VM aborts with "value does not fit in i32", so the test is `#[ignore]`d
like `switch_shapes` until the bug is fixed; it now fails reliably on that
input rather than only when proptest happens to draw it.
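
The harness split described above might look roughly like the following. This is a sketch under stated assumptions, not the code in this PR: the real functions build and run a named case, whereas here the native and MASM sides are stand-in closures, the `Explicit` variant name is invented, and a fixed-seed xorshift generator replaces the proptest draws.

```rust
/// Source of the `(input1, input2)` pairs.
pub enum Inputs {
    /// 16 pseudo-random pairs (the real harness uses proptest).
    Random16,
    /// Hypothetical variant name: an explicit, deterministic list.
    Explicit(Vec<(u32, u32)>),
}

fn expand(inputs: Inputs) -> Vec<(u32, u32)> {
    match inputs {
        Inputs::Explicit(pairs) => pairs,
        Inputs::Random16 => {
            // Stand-in for proptest: xorshift32 with a fixed seed.
            let mut state: u32 = 0x9E37_79B9;
            let mut next = move || {
                state ^= state << 13;
                state ^= state >> 17;
                state ^= state << 5;
                state
            };
            (0..16).map(|_| (next(), next())).collect()
        }
    }
}

/// Shared comparison loop: report the first input pair on which the two
/// implementations disagree.
pub fn run_case_inner<N, M>(native: N, masm: M, inputs: Inputs) -> Result<(), String>
where
    N: Fn(u32, u32) -> u32,
    M: Fn(u32, u32) -> u32,
{
    for (a, b) in expand(inputs) {
        let (expected, actual) = (native(a, b), masm(a, b));
        if expected != actual {
            return Err(format!(
                "divergence on ({a}, {b}): native={expected}, masm={actual}"
            ));
        }
    }
    Ok(())
}

/// Existing entry point: unchanged behavior, delegates with `Random16`.
pub fn run_case<N, M>(native: N, masm: M) -> Result<(), String>
where
    N: Fn(u32, u32) -> u32,
    M: Fn(u32, u32) -> u32,
{
    run_case_inner(native, masm, Inputs::Random16)
}

/// New entry point: compare on a pinned list of pairs.
pub fn run_case_with_inputs<N, M>(native: N, masm: M, pairs: &[(u32, u32)]) -> Result<(), String>
where
    N: Fn(u32, u32) -> u32,
    M: Fn(u32, u32) -> u32,
{
    run_case_inner(native, masm, Inputs::Explicit(pairs.to_vec()))
}
```

A reproducer in the style of `switch_shapes_repro` would call `run_case_with_inputs` with the single flagged pair, so the failure no longer depends on which inputs the random source happens to produce.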