Control flow and memory access targeting differential cases #1164
Draft
greenhat wants to merge 5 commits into
Coverage analysis of the differential corpus showed large untouched areas in the control flow pipeline: the cf.cond_br and ub.unreachable MASM lowerings, cfg-to-scf exit multiplexing, scf yield canonicalization, the spill machinery, and the wide-arithmetic translation chain.

Add thirteen cases shaped to reach those paths through rustc-generated wasm at opt-level 3, raising compiler region coverage from 30.7% to 32.5%:

- trap_branch / switch_trap_arm / ret_args: dynamically-impossible panic guards built on cross-modulus contradictions LLVM cannot fold, leaving branches between ret and unreachable exits that survive cfg-to-scf (Case 2) and exercise the cf.cond_br and ub.unreachable lowerings, including successor-block-argument scheduling.
- multi_exit_loop / continue_paths / midloop_exit / switch_shapes: multi-exit and multi-backedge loops plus dense and sparse dispatch, driving cfg-to-scf discriminator threading, scf.index_switch region interfaces, and the linear/binary-search switch lowerings.
- u64_exits / loop_results / calls_selects / select_sched: u64-returning helpers with tail-merged early returns (multi-word successor operands, scf yield folding), loop-carried values, non-inlined call signatures, and select emitter variants including the 64-bit path.
- stack_pressure / u128_mix: a deep single-use expression tree that pushes the operand stack past its 16-slot limit into the spill analysis/transform, and u128 arithmetic covering the wide-arithmetic wasm ops.

Loop bounds use moduli like % 97 instead of small power-of-two masks, which LLVM would otherwise peel away entirely; all cases are deterministic and panic-free on every u32 input pair.
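As a rough illustration of the shape described above, the sketch below combines a cross-modulus guard with a `% 97` loop bound. It is not one of the thirteen cases: `impossible` and the body of `entrypoint` are hypothetical, and whether LLVM leaves this particular guard unfolded at opt-level 3 has not been checked here. The guard can never fire because `a % 6 == 1` forces `a` to be odd while `a % 4 == 2` forces it to be even.

```rust
/// Hypothetical guard: the two conditions contradict each other
/// (`a % 6 == 1` implies odd, `a % 4 == 2` implies even), so this is
/// false for every `u32`, but proving it needs reasoning across moduli.
pub fn impossible(a: u32) -> bool {
    a % 6 == 1 && a % 4 == 2
}

/// Hypothetical case body, not taken from the PR.
pub fn entrypoint(a: u32, b: u32) -> u32 {
    if impossible(a) {
        // Dynamically unreachable; if the optimizer keeps the branch,
        // the function has both a `ret` exit and an `unreachable` exit.
        panic!("impossible guard fired");
    }
    // Non-power-of-two bound, as in the PR description, so the trip
    // count is not a small mask the optimizer can peel away.
    let n = b % 97;
    let mut acc = a;
    for i in 0..n {
        acc = acc.wrapping_mul(31).wrapping_add(i);
    }
    acc
}
```

Because all arithmetic is wrapping and the guard is contradictory, the function returns normally for every input pair, which is the property the differential harness relies on.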
Add eight differential cases exercising the memory read/write paths of the compiler: dynamic load/store addressing (prepare_addr, word load/store emitters), runtime-length memory.copy (memcpy fast path + fallback loop), the rodata/data-segment pipeline (single static tables and multi-segment atomic statics), signed sub-word and unaligned loads/stores, and the memory.size/memory.grow emitters. Together they raise memory-area region coverage from 352 to 1115 (~17% -> ~55%).

Three cases are marked #[ignore] as they surface native/MASM divergences:

- mem_overlap: memory.copy with overlapping dst > src ranges copies forward instead of honoring memmove semantics.
- mem_grow: ::intrinsics::mem::memory_grow fails in the VM with a non-binary if-condition value on every input.
- switch_shapes (pre-existing): flaky divergence on inputs (1669775643, 1062584501), plus a separate "value does not fit in i32" VM assert.
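The mem_overlap divergence comes down to copy direction. The sketch below is a stand-alone illustration on a plain byte slice, not the compiler's emitter or the actual test case; `copy_forward` and `copy_memmove` are hypothetical names. It shows why a forward byte-by-byte copy gives the wrong answer when the destination starts inside the source range.

```rust
/// Naive forward copy: reads bytes that earlier iterations may already
/// have overwritten when `dst > src` and the ranges overlap.
pub fn copy_forward(buf: &mut [u8], src: usize, dst: usize, len: usize) {
    for i in 0..len {
        buf[dst + i] = buf[src + i];
    }
}

/// memmove semantics: the result is as if the source range had first
/// been copied to a temporary buffer. `copy_within` guarantees this.
pub fn copy_memmove(buf: &mut [u8], src: usize, dst: usize, len: usize) {
    buf.copy_within(src..src + len, dst);
}
```

Copying 5 bytes from offset 0 to offset 2 in `[1, 2, 3, 4, 5, 0, 0]` yields `[1, 2, 1, 2, 1, 2, 1]` with the forward loop and `[1, 2, 1, 2, 3, 4, 5]` with memmove semantics.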
Add area-focused coverage reporting to the fuzza case-generation tooling so an agent growing coverage in one part of the compiler can judge productivity against just that area instead of the global numbers.

- cov.py: add `--area <comma-separated path prefixes>`, emitting a "Target area" section with an area-only headline, an area-only delta, and the full list of cold (untouched / partially-covered) functions in the area. Without the flag the output is unchanged.
- Makefile.toml: thread an optional `FUZZA_AREA` env var into the cov.py call in `fuzza-cov` and `fuzza-cov-step` (via a non-empty args array, safe under `set -u`).
- AGENT-PROMPT.md: the target area is now a free-worded description; the agent resolves it to source paths and exports `FUZZA_AREA` as its first step. Also document the coverage-accumulation model, the mutable-`static` determinism trap, early termination by unreachability argument, and MIDENC_EMIT/TRACE for investigating divergences.
- README.md: matching workflow/doc updates.
- scratch/: gitignored working-notes dir (kept via .gitkeep) that lives outside target/, so it survives `fuzza-cov-clean`.
Differential cases whose native `cdylib` retains a panic path (an impossible
trap, a guarded index, …) failed to load on Linux CI with:

    undefined symbol: rust_eh_personality

Even though each case is built with `panic = "abort"`, the precompiled `core`
library is built with `panic = "unwind"`, so referencing `core`'s panic
machinery links in unwind tables that reference `rust_eh_personality`. With no
`std` to define that symbol, the `cdylib` is left with an undefined symbol that
`dlopen` rejects on Linux; macOS tolerates it, which is why it passed locally.
Add a no-op `rust_eh_personality` to the prepended case header so the native
library is self-contained. It is never invoked, because panics abort. The stub
is gated to non-wasm so the cargo-miden (wasm -> MASM) build is unchanged.
Fixes the `ret_args`, `switch_trap_arm`, `trap_branch`, and `u64_exits`
failures in the "midenc integration tests" job.
Factor the differential harness into a shared `run_case_inner` driven by an `Inputs` enum, and add `run_case_with_inputs`, which compares the native and MASM `entrypoint` outputs against an explicit, deterministic list of `(input1, input2)` pairs instead of 16 random proptest draws. The existing `run_case` is unchanged (it delegates with `Inputs::Random16`).

Use `run_case_with_inputs` to pin the `switch_shapes` divergence as its own reproducer, `switch_shapes_repro`, on the exact input the fuzzer flagged. On that input the MASM VM aborts with "value does not fit in i32", so the test is `#[ignore]`d like `switch_shapes` until the bug is fixed; it now fails reliably on that input rather than only when proptest happens to draw it.
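The delegation structure described above might look roughly like the following. Only the names `Inputs`, `Inputs::Random16`, `run_case`, `run_case_inner`, and `run_case_with_inputs` come from the commit message; the signatures, the `Explicit` variant name, the closure-based "native" and "MASM" sides, and the small LCG standing in for proptest are all assumptions made to keep the sketch self-contained. The sketch models a divergence as a value mismatch only, whereas the real failure on the pinned input is a VM abort.

```rust
/// Which inputs the harness feeds to both implementations.
pub enum Inputs<'a> {
    /// Sixteen pseudo-random pairs (stand-in for the proptest draws).
    Random16,
    /// Explicit, deterministic `(input1, input2)` pairs (variant name assumed).
    Explicit(&'a [(u32, u32)]),
}

/// Tiny LCG so the sketch needs no external crates; not what proptest does.
fn pseudo_random_pairs(count: usize) -> Vec<(u32, u32)> {
    let mut state: u64 = 0x9E37_79B9_7F4A_7C15;
    let mut next = move || {
        state = state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        (state >> 32) as u32
    };
    (0..count).map(|_| (next(), next())).collect()
}

/// Shared core: run both sides on every pair and report the first mismatch.
fn run_case_inner<N, M>(native: N, masm: M, inputs: Inputs<'_>) -> Result<(), String>
where
    N: Fn(u32, u32) -> u32,
    M: Fn(u32, u32) -> u32,
{
    let pairs: Vec<(u32, u32)> = match inputs {
        Inputs::Random16 => pseudo_random_pairs(16),
        Inputs::Explicit(list) => list.to_vec(),
    };
    for (a, b) in pairs {
        let (expected, actual) = (native(a, b), masm(a, b));
        if expected != actual {
            return Err(format!(
                "divergence on ({a}, {b}): native={expected}, masm={actual}"
            ));
        }
    }
    Ok(())
}

/// Existing entry point: behavior unchanged, now a thin wrapper.
pub fn run_case<N, M>(native: N, masm: M) -> Result<(), String>
where
    N: Fn(u32, u32) -> u32,
    M: Fn(u32, u32) -> u32,
{
    run_case_inner(native, masm, Inputs::Random16)
}

/// New entry point: pins the comparison to a fixed list of pairs.
pub fn run_case_with_inputs<N, M>(
    native: N,
    masm: M,
    inputs: &[(u32, u32)],
) -> Result<(), String>
where
    N: Fn(u32, u32) -> u32,
    M: Fn(u32, u32) -> u32,
{
    run_case_inner(native, masm, Inputs::Explicit(inputs))
}
```

A reproducer in this style would pass a one-element slice such as `&[(1669775643, 1062584501)]`, so the failing pair is exercised on every run instead of depending on a random draw.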
TODO: