fix: add TLS alignment pad to fix x86_64-macos segfault #21372
Merged
johnathan79717 merged 5 commits into merge-train/barretenberg on Mar 11, 2026
…s-compilation

LLVM's Mach-O linker (used by both Zig and ld64.lld) misaligns __thread_bss TLS template offsets when __thread_data is also present from Rust static libraries. This causes EXC_I386_GPFLT on any thread_local requiring 16-byte alignment (e.g. std::mutex inside ThreadPool).

Adding an alignas(16) initialized thread_local forces __thread_data section alignment to 16, which makes the linker pad it to a 16-byte boundary before __thread_bss, fixing the offset alignment.

Verified on macOS VM: binary no longer segfaults on startup.
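A minimal sketch of the kind of pad the commit message describes. The variable name `tls_alignment_pad`, its size, its non-zero initializer, and the helper `tls_pad_is_16_byte_aligned` are assumptions made for illustration; this is not the code merged in the PR. The non-zero initializer is used on the assumption that it keeps the variable in initialized TLS data (`__thread_data`) rather than `__thread_bss`.

```cpp
#include <cstdint>

// Hypothetical pad: name, size and initializer are assumptions, not the merged code.
// alignas(16) raises the alignment requirement of the TLS section that holds it;
// the non-zero initializer is intended to keep it among initialized TLS data.
alignas(16) thread_local unsigned char tls_alignment_pad[16] = {1};

// Hypothetical diagnostic: returns true when the calling thread's copy of the
// pad sits on a 16-byte boundary.
bool tls_pad_is_16_byte_aligned()
{
    auto address = reinterpret_cast<std::uintptr_t>(&tls_alignment_pad[0]);
    return address % 16 == 0;
}
```

A check like this only reports where the pad itself landed at runtime; confirming the section layout still requires inspecting the linked binary.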
ludamad approved these changes Mar 11, 2026
github-merge-queue bot pushed a commit that referenced this pull request Mar 11, 2026
BEGIN_COMMIT_OVERRIDE
chore: Native curve audit - pairing (#21104)
feat: batch hiding kernel and translator proofs (#21246)
fix: remove obsolete KZG:masking_challenge from batched translator expected manifest (#21371)
fix: add TLS alignment pad to fix x86_64-macos segfault (#21372)
END_COMMIT_OVERRIDE
AztecBot pushed a commit that referenced this pull request Mar 13, 2026
## Summary

- Fixes x86_64-macos segfault (`EXC_I386_GPFLT`) when `bb` is cross-compiled with Zig and linked with a Rust static library (AVM transpiler)
- Root cause: LLVM's Mach-O linker misaligns `__thread_bss` TLS template offsets when `__thread_data` (from Rust) is also present, causing 16-byte-aligned `thread_local` objects (like `std::mutex`) to be placed at 8-byte-aligned addresses
- Fix: a single `alignas(16) thread_local` variable forces `__thread_data` section alignment to 16, making the linker pad it correctly

Fixes #21225
Fixes #19769

## Details

Both Zig's built-in Mach-O linker and `ld64.lld-20` share the same LLVM code for laying out TLS sections. When `__thread_data` (align 8, from Rust objects) precedes `__thread_bss` (align 16, from C++ `thread_local`), the linker aligns the `__thread_bss` virtual address to 16 but the TLS template offset remains misaligned because `__thread_data` starts at an 8-aligned VA.

At runtime, `dyld` allocates a 16-aligned TLS block and copies the template at the recorded offsets. Variables that should be at `block + 0x40` (16-aligned) end up at `block + 0x38` (8-aligned), causing `MOVAPS` instructions to fault.

The fix adds an `alignas(16)` initialized `thread_local` that forces the `__thread_data` section alignment to 16, which makes the linker pad the section end to a 16-byte boundary.

Upstream bug: https://codeberg.org/ziglang/zig/issues/31461

## Test plan

- [x] Cross-compiled `bb` binary with Zig for x86_64-macos
- [x] Verified TLS section alignment: `__thread_data` align 2^4 (16), offset to `__thread_bss` is 0x40 (mod 16 = 0)
- [x] Tested on macOS VM: `bb prove --scheme ultra_honk` runs without segfault
- [x] Previous binary (without pad) segfaults immediately with `EXC_I386_GPFLT`

Supersedes #21253
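The two offsets quoted in the Details section can be checked with plain alignment arithmetic. The sketch below is only that arithmetic; the helper names `align_up` and `is_aligned` are hypothetical, and it does not model the linker or `dyld`.

```cpp
#include <cstdint>

// Round `offset` up to the next multiple of `alignment` (must be a power of two).
constexpr std::uint64_t align_up(std::uint64_t offset, std::uint64_t alignment)
{
    return (offset + alignment - 1) & ~(alignment - 1);
}

// True when `offset` is a multiple of `alignment` (must be a power of two).
constexpr bool is_aligned(std::uint64_t offset, std::uint64_t alignment)
{
    return (offset & (alignment - 1)) == 0;
}

// 0x38 satisfies 8-byte alignment but not 16-byte alignment, so a 16-byte
// aligned access at block + 0x38 is invalid even though the block is 16-aligned.
static_assert(is_aligned(0x38, 8), "0x38 is 8-byte aligned");
static_assert(!is_aligned(0x38, 16), "0x38 is not 16-byte aligned");

// Padding the preceding data to a 16-byte boundary moves the offset to 0x40.
static_assert(align_up(0x38, 16) == 0x40, "0x38 rounds up to 0x40");
static_assert(is_aligned(0x40, 16), "0x40 is 16-byte aligned");
```

Because the block base is 16-aligned, the alignment of `block + offset` depends only on the offset, which is why padding to 0x40 is enough.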
✅ Successfully backported to backport-to-v4-next-staging #21453.