
enzyme: opt-in Clang/Enzyme build option + AD smoke test#565

Open
krystophny wants to merge 7 commits into
proximafusion:mainfrom
itpplasma:enzyme-build-option


@krystophny krystophny commented Jun 14, 2026

Contributor

Stacked PR: part 2/19 of the differentiable-VMEC++ series. Merge after #564 (build-clang21).
Diff is cumulative (includes ancestor commits) because the branches are stacked on the fork; review the net change described below.


What

Add an opt-in Clang/Enzyme build path and a toolchain smoke test:

  • VMECPP_ENABLE_ENZYME CMake option, off by default. When on, it requires
    a Clang compiler and -DVMECPP_ENZYME_PLUGIN=/path/to/ClangEnzyme-NN.so, and
    builds the enzyme_smoke test target.
  • common/enzyme/enzyme.h: thin declarations of the Enzyme autodiff intrinsics
    and the activity markers, with the allocation constraint that shapes every
    differentiable kernel in this stack documented in place.
  • common/enzyme/enzyme_smoke_test.cc: differentiates a scalar objective over
    Eigen::Map'd caller buffers and checks reverse- and forward-mode gradients
    against the closed form and central finite differences.
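
The check described in the last bullet can be sketched without Enzyme or Eigen. This is a minimal illustration, not the code in enzyme_smoke_test.cc: the objective `Objective`, the helper names, and the step size are all hypothetical, and the hand-written `AnalyticGradient` stands in for the place where an Enzyme-produced gradient would be compared.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical scalar objective over a caller-owned buffer:
// f(x) = sum_i x_i * sin(x_i). Not the objective used in the PR.
double Objective(const double* x, int n) {
  assert(n > 0);
  double f = 0.0;
  for (int i = 0; i < n; ++i) f += x[i] * std::sin(x[i]);
  return f;
}

// Closed-form gradient: df/dx_i = sin(x_i) + x_i * cos(x_i).
void AnalyticGradient(const double* x, double* grad, int n) {
  for (int i = 0; i < n; ++i) grad[i] = std::sin(x[i]) + x[i] * std::cos(x[i]);
}

// Central finite differences: (f(x + h e_i) - f(x - h e_i)) / (2 h).
void FiniteDiffGradient(const double* x, double* grad, int n, double h) {
  std::vector<double> xp(x, x + n);  // perturbed copy, owned by the checker
  for (int i = 0; i < n; ++i) {
    const double xi = xp[i];
    xp[i] = xi + h;
    const double f_plus = Objective(xp.data(), n);
    xp[i] = xi - h;
    const double f_minus = Objective(xp.data(), n);
    xp[i] = xi;  // restore before moving to the next component
    grad[i] = (f_plus - f_minus) / (2.0 * h);
  }
}

// Largest componentwise absolute difference between two gradients.
double MaxAbsDiff(const double* a, const double* b, int n) {
  double m = 0.0;
  for (int i = 0; i < n; ++i) m = std::max(m, std::abs(a[i] - b[i]));
  return m;
}

// Runs the comparison for n = 8 sample points and returns the max error.
double GradientCheckError() {
  const int n = 8;
  std::vector<double> x(n), g_analytic(n), g_fd(n);
  for (int i = 0; i < n; ++i) x[i] = 0.1 * (i + 1);
  AnalyticGradient(x.data(), g_analytic.data(), n);
  FiniteDiffGradient(x.data(), g_fd.data(), n, 1e-5);  // h is an assumption
  return MaxAbsDiff(g_analytic.data(), g_fd.data(), n);
}
```

A test in this style would compare the returned error against a loose tolerance for the finite-difference comparison and a much tighter one for the AD-versus-closed-form comparison.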

Why

Stacked on #5. Enzyme differentiates LLVM IR through a Clang plugin, so the
differentiable VMEC++ work needs a Clang build and a way to attach the plugin.
This patch adds that switch and a test that fails loudly if the plugin is not
attached or if Enzyme cannot differentiate the Eigen::Map buffer pattern the
later kernels depend on. It adds no production code; with the option off, the
build is byte-for-byte the previous one.

The smoke test also pins down the one hard Enzyme constraint for this codebase:
Enzyme's allocation analysis does not track Eigen's aligned allocator, so a
dynamic-size Eigen heap temporary crossing the differentiated call aborts with
"freeing without malloc". Differentiable kernels therefore operate on
caller-owned buffers via Eigen::Map. The test exercises exactly that pattern.

Verification

Configure and build with Clang 21.1.8 and the ClangEnzyme-21 plugin:

-- Enzyme plugin: .../ClangEnzyme-21.so
cfg=0
build=0 (0 errors)

ctest:

    Start 1: enzyme_smoke
1/1 Test #1: enzyme_smoke .....................   Passed    0.00 sec
100% tests passed, 0 tests failed out of 1

Smoke test output (reverse and forward both exact, agree with finite
differences):

enzyme smoke test (n=8)
  max|reverse - analytic| = 0.000e+00
  max|forward - analytic| = 0.000e+00
  max|reverse - finite-diff| = 2.646e-08
PASS
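
For context on the last error line: a discrepancy of order 1e-8 against finite differences is the size the finite-difference side alone can contribute. As a rough textbook estimate (the step size h used by the test is not stated here, so it is left symbolic), a central difference in double precision carries a truncation term and a rounding term:

$$
\left| \frac{f(x + h e_i) - f(x - h e_i)}{2h} - \frac{\partial f}{\partial x_i} \right|
\;\lesssim\; \frac{h^2}{6} \left| \frac{\partial^3 f}{\partial x_i^3} \right| + \frac{\epsilon \, |f|}{h},
\qquad \epsilon \approx 1.1 \times 10^{-16}
$$

The rounding term does not vanish for any h, so only the AD-versus-closed-form comparisons can be expected to reach exactly zero.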

No regression with the option off. A default configure (GCC, no Enzyme flags)
succeeds and emits no enzyme_smoke target:

cfg=0  enzyme target present? 0

Tracking: #587

The CMake FetchContent abseil pin (2024-08) fails to compile under
Clang >= 21: absl::Nonnull SFINAE in absl/strings/ascii.cc and the
numbers.cc nullability annotations are rejected by the newer frontend.
Bump to the 20260107.1 LTS, which compiles cleanly under Clang 21.1.8
and GCC. Clang is the compiler required for the Enzyme autodiff build.

The Bazel build keeps its own (BCR) abseil pin and is unaffected.

Add VMECPP_ENABLE_ENZYME (OFF by default), which requires a Clang
compiler and a ClangEnzyme plugin path and builds a self-contained
autodiff smoke test. The test differentiates a scalar objective written
over Eigen::Map'd caller buffers and checks reverse- and forward-mode
Enzyme gradients against the closed form and central finite differences.

enzyme.h documents the intrinsic ABI and the allocation constraint that
shapes the differentiable kernels: Enzyme cannot track Eigen's aligned
allocator, so differentiable paths use Eigen::Map over caller-owned
buffers and avoid heap expression temporaries.

With the option off the build is unchanged.

The 'Compare benchmark result' step uses github-action-benchmark with
comment-on-alert and the GITHUB_TOKEN, which is read-only for pull requests from
forks, so the step fails with 'Resource not accessible by integration'. Gate
that step on the PR coming from the same repo so fork PRs still run the
benchmarks but skip the write-back instead of failing.

The pinned vmec-0.0.6 cp310 wheel was f90wrapped against numpy 1.x. Under
the numpy 2.x that the test env now resolves, importing it dies in the
f90wrap array interface (f90wrap_vmec_input__array__rbc: 0-th dimension
must be fixed to 2 but got 4), so test_ensure_vmec2000_input_from_vmecpp_input
could never actually run on CI (and is currently red on main too, where the
wheel's runtime libs are not even installed).

Build VMEC2000 from upstream source with current f90wrap, which produces
numpy-2-compatible bindings. The recipe mirrors SIMSOPT's own CI
(hiddenSymmetries/VMEC2000, cmake/machines/ubuntu.json). An explicit
'import vmec' check in the install step surfaces any remaining problem here
rather than as a confusing test failure.

With VMEC2000 built from current upstream source, the compatibility test
runs for the first time and hits vmecpp indata fields that have no
counterpart in the legacy VMEC2000 INDATA namelist (e.g.
free_boundary_method), which raised AttributeError. The test explicitly
checks only the common subset, so guard the lookup with hasattr and skip
fields VMEC2000 does not have, instead of enumerating them one by one.

@krystophny krystophny marked this pull request as draft June 15, 2026 04:48

…mit pin

Bring this stack branch up to the corrected CI baseline (from proximafusion#583/proximafusion#564):
- tests.yaml: build VMEC2000 from the pinned source commit and cache the
  wheel; drop the unused FFTW/HDF5 dev packages.
- benchmarks.yaml: skip the result upload on fork PRs (read-only token).
- test_simsopt_compat.py: skip vmecpp-only INDATA fields.
- CMakeLists: pin abseil to the 20260107.1 commit hash, not the tag.

@jurasic-pf jurasic-pf left a comment

Collaborator


Cool! I propose to review and approve the stack piece by piece and merge all at once, since including enzyme has no benefit unless we merge the rest.

@krystophny

Contributor Author

Good, then I'll undraft the PRs. I am very excited about this, as it will ultimately beat every other VMEC, including DESC.
