
feat: merge-train/barretenberg#21365

Merged
AztecBot merged 11 commits into next from
merge-train/barretenberg
Mar 11, 2026


@AztecBot AztecBot commented Mar 11, 2026

BEGIN_COMMIT_OVERRIDE
chore: Native curve audit - pairing (#21104)
feat: batch hiding kernel and translator proofs (#21246)
fix: remove obsolete KZG:masking_challenge from batched translator expected manifest (#21371)
fix: add TLS alignment pad to fix x86_64-macos segfault (#21372)
END_COMMIT_OVERRIDE

### 🧾 Audit Context

ecc/curve audit: pairing. In this PR we add documentation and testing
for pairings. We also rewrite the implementation of the Miller loop to
follow standard references.

### 🛠️ Changes Made

- Add testing for pairings
- Add benchmarks for pairings
- Add documentation + md file
- Rewrite Miller loop implementation to follow standard reference
(linked in code); no change in performance

Note: I tried implementing cyclotomic squaring, but the result was slower
than the current implementation. The problem is the inversion required to
decompress the squarings, which are computed in compressed form. Even a
batch inversion trick didn't help. The paper I tried to implement is:
https://eprint.iacr.org/2010/542.pdf

### ✅ Checklist

- [X] Audited all methods of the relevant module/class
- [X] Audited the interface of the module/class with other (relevant)
components
- [ ] Documented existing functionality and any changes made (as per
Doxygen requirements)
- [X] Resolved and/or closed all issues/TODOs pertaining to the audited
files
- [ ] Confirmed and documented any security or other issues found (if
applicable)
- [X] Verified that tests cover all critical paths (and added tests if
necessary)
- [X] Updated audit tracking for the files audited (check the start of
each file you audited)

### 📌 Notes for Reviewers

(Optional) Call out anything that reviewers should pay close attention
to — like logic changes, performance implications, or potential
regressions.

iakovenkos and others added 10 commits March 11, 2026 16:11

**Reduce Chonk proof size by combining translator and hiding kernel
sumcheck and PCS rounds.**

Previously, the hiding kernel and translator circuits were proven
independently, each running its own sumcheck and PCS opening phase. This
change merges these phases by proving a **single aggregated relation**
for both circuits.

Let $R_H$ denote the hiding kernel relation polynomial and $R_T$ the
translator relation polynomial, both multivariate. After all commitments
are bound into the transcript, the verifier samples a challenge $\alpha$.
If the hiding kernel has $K_H$ subrelations, we define the joint relation

$$
R_{\text{joint}} = R_H + \alpha^{K_H} R_T.
$$
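
For intuition about the exponent $K_H$, suppose (an assumption; the per-circuit batching is not spelled out here) that each circuit combines its own subrelations with consecutive powers of $\alpha$. Write $R_{H,i}$ and $R_{T,j}$ for the individual subrelations and $K_T$ for the number of translator subrelations; these names are introduced only for this illustration. Then

$$
R_H = \sum_{i=0}^{K_H-1} \alpha^{i} R_{H,i}, \qquad
R_T = \sum_{j=0}^{K_T-1} \alpha^{j} R_{T,j},
$$

$$
R_{\text{joint}} = \sum_{i=0}^{K_H-1} \alpha^{i} R_{H,i} + \sum_{j=0}^{K_T-1} \alpha^{K_H + j} R_{T,j},
$$

so under that assumption the $K_H + K_T$ subrelations receive the distinct powers $\alpha^{0}, \dots, \alpha^{K_H + K_T - 1}$.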

This is equivalent to taking a random linear combination of all
subrelations across both circuits. The prover and verifier then run a
**single sumcheck** on

$$
H(x) = G \cdot R_{\text{joint}},
$$

where $G$ is the usual Honk gate-separator polynomial. Each round
univariate is the sum of the hiding and translator contributions, with
the translator part scaled by $\alpha^{K_H}$. This has the same
distribution as running sumcheck on the concatenation of the two
circuits, but avoids materializing a combined trace.
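
The per-round combination described above can be sketched as follows. This is a toy illustration and not barretenberg code: it works over the small prime field $2^{31} - 1$ instead of the real scalar field, the names `mul_mod`, `pow_mod` and `combine_round_univariates` are hypothetical, and it assumes both round univariates are given as evaluations at the same points (so the vectors have equal length).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy prime modulus (2^31 - 1); an assumption for illustration only.
constexpr std::uint64_t P = 2147483647ULL;

// Modular multiplication; both reduced operands are < 2^31, so the product fits in 64 bits.
std::uint64_t mul_mod(std::uint64_t a, std::uint64_t b)
{
    return (a % P) * (b % P) % P;
}

// Square-and-multiply exponentiation modulo P.
std::uint64_t pow_mod(std::uint64_t base, std::uint64_t exp)
{
    std::uint64_t result = 1;
    base %= P;
    while (exp > 0) {
        if (exp & 1) {
            result = mul_mod(result, base);
        }
        base = mul_mod(base, base);
        exp >>= 1;
    }
    return result;
}

// Joint round univariate: hiding contribution plus alpha^{K_H} times the translator contribution,
// computed pointwise on the shared evaluation points.
std::vector<std::uint64_t> combine_round_univariates(const std::vector<std::uint64_t>& hiding,
                                                     const std::vector<std::uint64_t>& translator,
                                                     std::uint64_t alpha,
                                                     std::uint64_t num_hiding_subrelations)
{
    assert(hiding.size() == translator.size());
    const std::uint64_t scale = pow_mod(alpha, num_hiding_subrelations);
    std::vector<std::uint64_t> joint(hiding.size());
    for (std::size_t i = 0; i < hiding.size(); ++i) {
        joint[i] = (hiding[i] % P + mul_mod(scale, translator[i])) % P;
    }
    return joint;
}
```

For example, with $\alpha = 3$ and $K_H = 2$ the translator evaluations are scaled by $9$ before being added to the hiding evaluations.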

The PCS phase is also shared: commitments and evaluations from both
circuits are aggregated into a single batch opening claim, which is
reduced via the usual Shplonk/Gemini procedure to one pairing check.

Zero-knowledge is preserved by using a **single Libra masking
polynomial** over the joint sumcheck domain. The masking contribution is
added to each round univariate exactly as in the standalone protocol,
ensuring the final relation value remains perfectly masked.

Overall, the protocol produces a single sumcheck transcript and a single
PCS opening for both circuits, reducing proof size while remaining
equivalent to proving the concatenated circuit under a random linear
combination of its constraints.
…pected manifest (#21371)

## Summary
The `BatchedHonkTranslatorTests.ProverManifestConsistency` test was
failing because its hardcoded expected manifest included a
`KZG:masking_challenge` entry in round 26 (the KZG opening round), but
the KZG protocol no longer sends this challenge (removed in #21040).

## Fix
Removed the stale `m.add_challenge(round, "KZG:masking_challenge")` from
`build_expected_batched_manifest()` and updated the corresponding
comment.
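
A minimal sketch of why a stale expected entry breaks a manifest consistency check. `ToyManifest`, `build_prover_manifest`, `build_expected_manifest` and the label `"Toy:challenge"` are hypothetical stand-ins, not the barretenberg types or real transcript entries; only the `add_challenge(round, label)` call shape and the `KZG:masking_challenge` label come from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a transcript manifest: round index -> ordered challenge labels.
struct ToyManifest {
    std::map<std::size_t, std::vector<std::string>> challenges;

    void add_challenge(std::size_t round, const std::string& label)
    {
        assert(!label.empty());
        challenges[round].push_back(label);
    }

    bool operator==(const ToyManifest& other) const { return challenges == other.challenges; }
};

// What the prover emits in this toy model: a single placeholder challenge in the given round.
ToyManifest build_prover_manifest(std::size_t round)
{
    ToyManifest m;
    m.add_challenge(round, "Toy:challenge"); // placeholder label, not a real transcript entry
    return m;
}

// Hardcoded expectation, optionally still containing the entry that the protocol no longer sends.
ToyManifest build_expected_manifest(std::size_t round, bool include_stale_entry)
{
    ToyManifest m;
    m.add_challenge(round, "Toy:challenge");
    if (include_stale_entry) {
        m.add_challenge(round, "KZG:masking_challenge");
    }
    return m;
}
```

With the stale entry the two manifests differ in round 26, so an equality-based consistency test fails; dropping the entry makes them match again.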

## Test plan
- [x] `BatchedHonkTranslatorTests.ProverManifestConsistency` passes
- [x] All 4 `batched_honk_translator_tests` pass locally

ClaudeBox log: https://claudebox.work/s/a26a429f5a7f3d3d?run=1
…s-compilation

LLVM's Mach-O linker (used by both Zig and ld64.lld) misaligns __thread_bss
TLS template offsets when __thread_data is also present from Rust static
libraries. This causes EXC_I386_GPFLT on any thread_local requiring 16-byte
alignment (e.g. std::mutex inside ThreadPool).

Adding an alignas(16) initialized thread_local forces __thread_data section
alignment to 16, which makes the linker pad it to a 16-byte boundary before
__thread_bss, fixing the offset alignment.

Verified on macOS VM: binary no longer segfaults on startup.

## Summary

- Fixes x86_64-macos segfault (`EXC_I386_GPFLT`) when `bb` is
cross-compiled with Zig and linked with a Rust static library (AVM
transpiler)
- Root cause: LLVM's Mach-O linker misaligns `__thread_bss` TLS template
offsets when `__thread_data` (from Rust) is also present, causing
16-byte-aligned `thread_local` objects (like `std::mutex`) to be placed
at 8-byte-aligned addresses
- Fix: a single `alignas(16) thread_local` variable forces
`__thread_data` section alignment to 16, making the linker pad it
correctly

Fixes #21225
Fixes #19769

## Details

Both Zig's built-in Mach-O linker and `ld64.lld-20` share the same LLVM
code for laying out TLS sections. When `__thread_data` (align 8, from
Rust objects) precedes `__thread_bss` (align 16, from C++
`thread_local`), the linker aligns the `__thread_bss` virtual address to
16 but the TLS template offset remains misaligned because
`__thread_data` starts at an 8-aligned VA.

At runtime, `dyld` allocates a 16-aligned TLS block and copies the
template at the recorded offsets. Variables that should be at `block +
0x40` (16-aligned) end up at `block + 0x38` (8-aligned), causing
`MOVAPS` instructions to fault.
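
The offset arithmetic above can be checked with a small sketch. The helper names `is_aligned` and `padded_section_end` are hypothetical and do not correspond to linker code; they only reproduce the modular arithmetic on the two offsets quoted above.

```cpp
#include <cassert>
#include <cstdint>

// True when `offset` is a multiple of `alignment`.
bool is_aligned(std::uint64_t offset, std::uint64_t alignment)
{
    return offset % alignment == 0;
}

// Round `section_end` up to the next multiple of `alignment` (which must be a power of two).
std::uint64_t padded_section_end(std::uint64_t section_end, std::uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (section_end + alignment - 1) & ~(alignment - 1);
}
```

Here `0x38` is 8-aligned but not 16-aligned, and padding it to a 16-byte boundary gives `0x40`, the offset the variables were supposed to have.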

The fix adds an `alignas(16)` initialized `thread_local` that forces the
`__thread_data` section alignment to 16, which makes the linker pad the
section end to a 16-byte boundary.
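
A minimal sketch of what such a pad could look like. The variable name `tls_alignment_pad`, its size, and the helper `tls_pad_is_16_byte_aligned` are assumptions for illustration; the actual declaration in the PR is not shown here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical pad variable. The non-zero initializer matters: an initialized thread_local
// is placed in initialized TLS data (__thread_data on Mach-O) rather than in
// zero-initialized TLS (__thread_bss), so its alignas(16) raises that section's alignment.
alignas(16) thread_local unsigned char tls_alignment_pad[16] = { 1 };

// Runtime sanity check: the pad's address in the current thread's TLS block is 16-byte aligned.
bool tls_pad_is_16_byte_aligned()
{
    const auto address = reinterpret_cast<std::uintptr_t>(&tls_alignment_pad[0]);
    return address % 16 == 0;
}
```

In a real build the variable may also need to be referenced or otherwise marked so that the compiler and linker do not discard it; how the PR handles that is not stated above.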

Upstream bug: https://codeberg.org/ziglang/zig/issues/31461

## Test plan

- [x] Cross-compiled `bb` binary with Zig for x86_64-macos
- [x] Verified TLS section alignment: `__thread_data` align 2^4 (16),
offset to `__thread_bss` is 0x40 (mod 16 = 0)
- [x] Tested on macOS VM: `bb prove --scheme ultra_honk` runs without
segfault
- [x] Previous binary (without pad) segfaults immediately with
`EXC_I386_GPFLT`

Supersedes #21253

@ludamad ludamad left a comment

🤖 Auto-approved

@AztecBot AztecBot added this pull request to the merge queue Mar 11, 2026
Any commits made after this event will not be merged.
@AztecBot

🤖 Auto-merge enabled after 4 hours of inactivity. This PR will be merged automatically once all checks pass.

@AztecBot

Flakey Tests

🤖 says: This CI run detected 1 test that failed but was tolerated due to a .test_patterns.yml entry.

FLAKED (http://ci.aztec-labs.com/c92bfb06a989063d): yarn-project/end-to-end/scripts/run_test.sh simple src/e2e_p2p/reqresp/reqresp_no_handshake.test.ts (236s) (code: 0) group:e2e-p2p-epoch-flakes

Merged via the queue into next with commit 16c83bf Mar 11, 2026
20 checks passed