Batch D3 and SevenNetD3Model for Torch-Sim interface by alphalm4 · Pull Request #300 · MDIL-SNU/SevenNet

alphalm4 · 2026-03-25T01:28:23Z

Batch D3 is implemented.

How it works

It flattens all inputs first (avoiding pointer-to-pointer-to-pointer indirection), launches every kernel once, then unflattens the results.
Mixed precision within one batch works.
In TorchSim, SevenNetD3Model can use either serial or batch D3; the default batch-size threshold for choosing between them is 4.
(The threshold is a heuristic. I found some evidence that the current batch D3 is also faster than serial D3 when the system size is smaller than 1k, while for very large cells it is slower than serial D3. So I think a better criterion for choosing serial/batch should also consider the system size, but ...)
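The flatten / single launch / unflatten pattern described above can be sketched in plain Python. This is only an illustration of the data layout, not the actual CUDA code in this PR: `flatten`, `unflatten`, and `fake_kernel` are hypothetical names, and the "kernel" is a stand-in that doubles every element.

```python
from itertools import accumulate


def flatten(per_system):
    """Concatenate per-system lists into one flat list plus start offsets."""
    sizes = [len(s) for s in per_system]
    offsets = [0, *accumulate(sizes)]  # offsets[i]:offsets[i+1] is system i
    flat = [x for s in per_system for x in s]
    return flat, offsets


def unflatten(flat, offsets):
    """Split a flat list back into per-system lists using the offsets."""
    return [flat[a:b] for a, b in zip(offsets, offsets[1:])]


def fake_kernel(flat):
    """Hypothetical stand-in for one launch over the whole flat buffer."""
    return [2.0 * x for x in flat]


systems = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]  # three systems, ragged sizes
flat, offsets = flatten(systems)
out = unflatten(fake_kernel(flat), offsets)
```

A single flat buffer with an offset table replaces nested per-system pointers, so one launch can cover every system in the batch regardless of how many atoms each one has.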

Precision

Batch D3's precision policy is FP32+64 (the same as serial D3), but it uses Kahan summation inside each thread. This greatly reduces precision-related rounding error in the pairwise dispersion energy, bringing results to within |ΔE,F| < 1e-7 and |ΔS| < 3e-7 (in ASE units) of the reference FP64 fortran-dftd3 values.
Because of this difference in summation, the output energy of the batch D3 kernel differs slightly from that of serial D3.
Applying Kahan summation in serial D3 as well nearly removes this serial/batch discrepancy (|ΔE,F,S| < 1e-7 in ASE units), but I think that should be addressed in a separate conversation, since it breaks backward compatibility of calculation results.
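For readers unfamiliar with Kahan (compensated) summation, here is a minimal sketch of the idea in pure Python. It is not the kernel code from this PR: float32 arithmetic is emulated by rounding through `struct`, and the helper names (`f32`, `naive_sum_f32`, `kahan_sum_f32`) and the test data are assumptions made for illustration.

```python
import math
import struct


def f32(x):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def naive_sum_f32(values):
    """Plain running sum with every addition rounded to float32."""
    total = 0.0
    for v in values:
        total = f32(total + v)
    return total


def kahan_sum_f32(values):
    """Kahan summation: carry the rounding error of each add in `comp`."""
    total = 0.0
    comp = 0.0  # running compensation for lost low-order bits
    for v in values:
        y = f32(v - comp)               # apply the correction to the new term
        t = f32(total + y)              # low-order bits of y may be lost here
        comp = f32(f32(t - total) - y)  # recover what was lost
        total = t
    return total


values = [f32(0.1)] * 100_000
reference = math.fsum(values)  # correctly rounded float64 sum
naive_err = abs(naive_sum_f32(values) - reference)
kahan_err = abs(kahan_sum_f32(values) - reference)
```

In this toy case the compensated sum stays within a few float32 ulps of the float64 reference, while the naive float32 sum drifts far more. The sketch shows the general mechanism only and says nothing about the exact error levels of the D3 kernels.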
code review @dambi3613
Float64 wrapper for double-precision MD computation flow
speed benchmark (especially, check nvalchemiops_dftd3) -> done, found custom batch D3 implementation is still valuable
test scripts
replace SevenNetD3Model with batched D3 version then remove torchsim_d3.py
overflow issues (only applied for batched D3, not serial D3)
Doc


alphalm4 · 2026-03-25T08:14:22Z

Something like this might force Torch-Sim to run in float64 while keeping SevenNet itself in float32.

import torch


class Float64Wrapper:
    """Wraps a float32 model so torch-sim runs in float64 precision.

    Casts state tensors to float32 before calling the model, then casts
    outputs back to float64. Reports dtype=float64 to torch-sim so all
    integrator arithmetic is done in double precision.
    """

    def __init__(self, model):
        self._model = model
        self._device = model.device
        self._dtype = torch.float64

    @property
    def device(self):
        return self._device

    @property
    def dtype(self):
        return self._dtype

    @property
    def compute_stress(self):
        return self._model.compute_stress

    @property
    def compute_forces(self):
        return self._model.compute_forces

    @property
    def memory_scales_with(self):
        return getattr(self._model, "memory_scales_with", "n_atoms_x_density")

    def __call__(self, state):
        # Cast state to float32 for the model
        state_f32 = state.to(dtype=torch.float32)
        output = self._model(state_f32)
        # Cast outputs back to float64
        return {
            k: v.to(dtype=torch.float64) if isinstance(v, torch.Tensor) else v
            for k, v in output.items()
        }

    def __getattr__(self, name: str):
        return getattr(self._model, name)

YutackPark · 2026-03-25T12:42:58Z

I don't know the context, but why don't we just cast the 7net model to float64? Is it for speed?

alphalm4 · 2026-03-25T14:45:17Z

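Yes, the reason for not promoting the 7net model to float64 is speed. (Is it really worth it?)

To add context: this item exists because the TorchSim state precision follows the model precision. SevenNet only accepts float32, so as things stand the MD itself is forced to run in float32 as well.
https://github.com/TorchSim/torch-sim/blob/c456ecec0dec1334a13c026dbcf231fa8309c849/torch_sim/runners.py#L297

I have not checked this very rigorously, but after implementing the float64 wrapper, the fluctuation of the ensemble invariant (e.g. npt_nose_hoover_invariant) appears to be smaller than in float32 (that is, the reduction in numerical error is visible). Even apart from that, I think there should be a way to choose the MD precision.
Of course this could be raised as a TorchSim issue instead, but it looked easy to implement, so I added it here.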
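As a rough illustration of why integrator precision can show up in a conserved quantity, here is a toy sketch. It is not TorchSim's NPT Nosé-Hoover integrator. It runs velocity Verlet on a unit harmonic oscillator, for which the quantity 0.5*v^2 + 0.5*x^2*(1 - dt^2/4) is conserved exactly in exact arithmetic, so any deviation comes from rounding. The names `f32`, `identity`, and `invariant_drift` are hypothetical, and float32 is emulated by rounding through `struct`.

```python
import struct


def f32(x):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def identity(x):
    """No rounding: keep full float64 arithmetic."""
    return x


def invariant_drift(rnd, steps=2000, dt=0.01):
    """Velocity Verlet for x'' = -x with every operation rounded by `rnd`.

    Returns the largest deviation of the exactly conserved quantity
    from its initial value over the trajectory.
    """
    x, v = rnd(1.0), rnd(0.0)
    dt = rnd(dt)
    half = rnd(0.5 * dt)

    def shadow(x, v):
        # Evaluated in float64 so only the trajectory carries rounding error.
        return 0.5 * v * v + 0.5 * x * x * (1.0 - 0.25 * dt * dt)

    e0 = shadow(x, v)
    worst = 0.0
    for _ in range(steps):
        v = rnd(v - rnd(half * x))  # half kick (force = -x)
        x = rnd(x + rnd(dt * v))    # drift
        v = rnd(v - rnd(half * x))  # half kick with the updated force
        worst = max(worst, abs(shadow(x, v) - e0))
    return worst


drift32 = invariant_drift(f32)
drift64 = invariant_drift(identity)
```

In this toy setting the float32 trajectory deviates from the invariant by many orders of magnitude more than the float64 one, which is qualitatively the effect described above.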

alphalm4 added 3 commits March 25, 2026 10:13

add pair_d3_for_ts for batch d3

731336b

add SevenNetD3Model in temporal torchsim_d3.py (and also with its serial version in torchsim.py)

f58ba88

update changelog

7010cf4

alphalm4 added 10 commits April 15, 2026 20:29

rollback manual wrapping since torchsim addressed it; add float64wrapper

708fd65

detach is safer

f96a848

fix broken mixed-pbc path and batch-d3 test scripts

759375f

Merge remote-tracking branch 'upstream/main' into d3b-remote

b453d5e

Merge remote-tracking branch 'upstream/main' into d3b-remote

799800d

handle zero-cell molecular batches

c674fa1

zero stress for non-periodic D3 calculations

27cfe65

rename d3-batch scripts and add comments

985ccf2

bump torch-sim-atomistic minimum version

0b7edc4

Merge remote-tracking branch 'upstream/main' into d3b-remote

eed1f15

YutackPark previously approved these changes Jun 9, 2026


alphalm4 added 2 commits June 9, 2026 15:40

refactor and add docs

994ae6f

Merge remote-tracking branch 'upstream/main' into d3b-remote

615cc7b

alphalm4 dismissed YutackPark’s stale review via 615cc7b June 9, 2026 06:45

add

ea843a1

alphalm4 marked this pull request as ready for review June 9, 2026 07:19

alphalm4 requested a review from YutackPark June 9, 2026 07:24

apply kahan sum in stress also

6461b3e

alphalm4 wants to merge 17 commits into MDIL-SNU:main from alphalm4:d3b
