fix: add TLS alignment pad to fix x86_64-macos segfault #21372

Merged
johnathan79717 merged 5 commits into merge-train/barretenberg from jh/fix-tls-alignment
Mar 11, 2026

@johnathan79717 johnathan79717 commented Mar 11, 2026

Summary

  • Fixes x86_64-macos segfault (EXC_I386_GPFLT) when bb is cross-compiled with Zig and linked with a Rust static library (AVM transpiler)
  • Root cause: LLVM's Mach-O linker misaligns __thread_bss TLS template offsets when __thread_data (from Rust) is also present, causing 16-byte-aligned thread_local objects (like std::mutex) to be placed at 8-byte-aligned addresses
  • Fix: a single alignas(16) thread_local variable forces __thread_data section alignment to 16, making the linker pad it correctly

Fixes #21225
Fixes #19769

Details

Both Zig's built-in Mach-O linker and ld64.lld-20 share the same LLVM code for laying out TLS sections. When __thread_data (align 8, from Rust objects) precedes __thread_bss (align 16, from C++ thread_local), the linker aligns the __thread_bss virtual address to 16 but the TLS template offset remains misaligned because __thread_data starts at an 8-aligned VA.

At runtime, dyld allocates a 16-aligned TLS block and copies the template at the recorded offsets. Variables that should be at block + 0x40 (16-aligned) end up at block + 0x38 (8-aligned), causing MOVAPS instructions to fault.
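The offset arithmetic above can be sketched as follows. This is only an illustration: the 0x38-byte size for `__thread_data` is an assumption inferred from the two offsets quoted in the description, and `align_up`, `is_aligned`, and the other names are hypothetical helpers, not linker or dyld code.

```cpp
#include <cassert>
#include <cstdint>

// Round `offset` up to the next multiple of `align` (align must be a power of two).
constexpr std::uint64_t align_up(std::uint64_t offset, std::uint64_t align) {
    return (offset + align - 1) & ~(align - 1);
}

// Assumption for illustration: __thread_data occupies 0x38 bytes of the TLS template.
constexpr std::uint64_t kThreadDataSize = 0x38;

// Template offset of __thread_bss when the end of __thread_data is only 8-aligned.
constexpr std::uint64_t bss_offset_unpadded() { return align_up(kThreadDataSize, 8); }

// Template offset of __thread_bss when __thread_data is padded to a 16-byte boundary.
constexpr std::uint64_t bss_offset_padded() { return align_up(kThreadDataSize, 16); }

// True if `block_base + offset` lands on an `align`-byte boundary.
constexpr bool is_aligned(std::uint64_t block_base, std::uint64_t offset, std::uint64_t align) {
    return ((block_base + offset) & (align - 1)) == 0;
}

static_assert(bss_offset_unpadded() == 0x38, "unpadded offset stays at 0x38");
static_assert(bss_offset_padded() == 0x40, "padded offset moves to 0x40");
```

With any 16-aligned block base, an offset of 0x38 yields an address that is 8 mod 16, which is the condition under which an aligned SSE load or store such as `MOVAPS` faults.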

The fix adds an alignas(16) initialized thread_local that forces the __thread_data section alignment to 16, which makes the linker pad the section end to a 16-byte boundary.
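A minimal sketch of what such a pad might look like is shown here. The variable name, its size, and the non-zero initializer are assumptions; the identifier used in the actual patch does not appear in this description. The non-zero initializer is there so the compiler emits the variable into initialized TLS data rather than zero-filled TLS.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical name and size; the real patch may differ.
// alignas(16) raises the alignment requirement of the initialized TLS section
// that contains this variable. The non-zero initializer (an assumption) keeps
// it out of the zero-filled TLS section.
alignas(16) thread_local volatile unsigned char tls_alignment_pad[16] = { 1 };

// Returns true if the pad sits on a 16-byte boundary in the calling thread's
// TLS block. Taking the address also references the variable, so the compiler
// keeps it.
inline bool tls_pad_is_aligned() {
    auto addr = reinterpret_cast<std::uintptr_t>(&tls_alignment_pad[0]);
    return addr % 16 == 0;
}
```

Whether an unreferenced pad survives dead-stripping depends on the toolchain and link flags, so a real build may need to reference it from code that is always linked in.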

Upstream bug: https://codeberg.org/ziglang/zig/issues/31461

Test plan

  • Cross-compiled bb binary with Zig for x86_64-macos
  • Verified TLS section alignment: __thread_data align 2^4 (16), offset to __thread_bss is 0x40 (mod 16 = 0)
  • Tested on macOS VM: bb prove --scheme ultra_honk runs without segfault
  • Previous binary (without pad) segfaults immediately with EXC_I386_GPFLT
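Alongside the manual checks above, a runtime self-check along these lines could catch a regression early. It is a sketch under stated assumptions: `initialized_tls_word` stands in for the initialized TLS data contributed by the Rust library, `Aligned16` stands in for a 16-byte-aligned C++ `thread_local`, and neither name comes from the `bb` codebase.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for initialized, 8-byte-aligned TLS data.
thread_local std::uint64_t initialized_tls_word = 0x1234;

// Hypothetical stand-in for a zero-initialized thread_local that needs
// 16-byte alignment.
struct alignas(16) Aligned16 {
    unsigned char bytes[32];
};
thread_local Aligned16 zero_filled_tls_object;

// Returns true if the zero-initialized object is correctly aligned in the
// calling thread's TLS block.
inline bool zero_filled_tls_is_aligned() {
    // Read the initialized word so both kinds of TLS data are referenced.
    volatile std::uint64_t sink = initialized_tls_word;
    (void)sink;
    auto addr = reinterpret_cast<std::uintptr_t>(&zero_filled_tls_object);
    return addr % alignof(Aligned16) == 0;
}
```

Calling `zero_filled_tls_is_aligned()` at startup and aborting with a clear message on failure would turn the opaque `EXC_I386_GPFLT` into a diagnosable error.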

Supersedes #21253

…s-compilation

LLVM's Mach-O linker (used by both Zig and ld64.lld) misaligns __thread_bss
TLS template offsets when __thread_data is also present from Rust static
libraries. This causes EXC_I386_GPFLT on any thread_local requiring 16-byte
alignment (e.g. std::mutex inside ThreadPool).

Adding an alignas(16) initialized thread_local forces __thread_data section
alignment to 16, which makes the linker pad it to a 16-byte boundary before
__thread_bss, fixing the offset alignment.

Verified on macOS VM: binary no longer segfaults on startup.
@johnathan79717 johnathan79717 added the ci-barretenberg-full label (Run all barretenberg checks.) Mar 11, 2026
@johnathan79717 johnathan79717 added the ci-full label (Run all master checks.) and removed the ci-barretenberg-full label (Run all barretenberg checks.) Mar 11, 2026
@johnathan79717 johnathan79717 requested a review from ludamad March 11, 2026 16:14
@johnathan79717 johnathan79717 merged commit c56d02d into merge-train/barretenberg Mar 11, 2026
10 checks passed
@johnathan79717 johnathan79717 deleted the jh/fix-tls-alignment branch March 11, 2026 17:04
github-merge-queue bot pushed a commit that referenced this pull request Mar 11, 2026
BEGIN_COMMIT_OVERRIDE
chore: Native curve audit - pairing (#21104)
feat: batch hiding kernel and translator proofs (#21246)
fix: remove obsolete KZG:masking_challenge from batched translator
expected manifest (#21371)
fix: add TLS alignment pad to fix x86_64-macos segfault (#21372)
END_COMMIT_OVERRIDE
AztecBot pushed a commit that referenced this pull request Mar 13, 2026
@AztecBot

✅ Successfully backported to backport-to-v4-next-staging #21453.

Labels

backport-to-v4-next ci-full Run all master checks.
