
feat: add double precision (float64) inference support #156

Merged: yanghan234 merged 6 commits into main from feat/double-precision-inference on May 11, 2026

Conversation

@yanghan234
Collaborator

Summary

Adds a dtype parameter ("float32" or "float64") to MatterSimCalculator for double-precision inference. This significantly improves numerical stability for sensitive properties like thermal conductivity.

On equilibrium Si diamond, float64 reduces force noise from ~1e-6 to ~1e-15 eV/Å compared to float32.

Note: This PR depends on #154 and should only be merged after #154 is merged. Once that is done, this branch will be rebased onto main.

Changes

src/mattersim/forcefield/potential.py

  • New dtype parameter on MatterSimCalculator.__init__ with validation
  • Calls model.double() when dtype="float64" is selected
  • Creates position/cell tensors in the chosen dtype
  • Upcasts all float tensors in the input dict before the forward pass
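The dtype plumbing above can be sketched as follows. This is a hypothetical, simplified stand-in (using numpy so it runs anywhere), not the actual MatterSimCalculator code: validate the dtype string, then build the position/cell arrays in the chosen precision. In the real calculator this step also calls model.double() when float64 is selected.

```python
import numpy as np

# Hypothetical sketch of the dtype handling described above (numpy stand-in
# for the PyTorch flow; names are illustrative, not the PR's actual code).
def make_inputs(positions, cell, dtype="float32"):
    """Validate the dtype string and create inputs in the chosen precision."""
    if dtype not in ("float32", "float64"):
        raise ValueError(f"dtype must be 'float32' or 'float64', got {dtype!r}")
    np_dtype = np.float64 if dtype == "float64" else np.float32
    return np.asarray(positions, dtype=np_dtype), np.asarray(cell, dtype=np_dtype)
```

Validating the string up front means a typo like "float16" fails immediately with a clear message rather than surfacing later as a dtype mismatch inside the model.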

src/mattersim/forcefield/m3gnet/m3gnet.py

  • pbc_offsets.float() changed to pbc_offsets.to(pos.dtype) to respect the model's precision
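The point of this one-line change is that a hard-coded downcast silently discards precision once the rest of the pipeline runs in float64. A numpy analogue of the pattern (astype standing in for torch's .float()/.to()):

```python
import numpy as np

# Illustrative numpy analogue of the fix: a hard-coded downcast
# (.float() in torch, .astype(np.float32) here) silently loses precision
# when the surrounding computation is float64. Deriving the target dtype
# from the positions tensor keeps the two consistent in either mode.
pos = np.array([[0.0, 0.0, 0.0]], dtype=np.float64)
offsets = np.zeros((1, 3), dtype=np.float64)

bad = offsets.astype(np.float32)   # old behavior: always float32
good = offsets.astype(pos.dtype)   # new behavior: follow pos's precision
```

With the old behavior, mixing the float32 offsets back into float64 positions reintroduces exactly the rounding noise that dtype="float64" was meant to eliminate.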

tests/applications/test_bte.py

  • BTE tests now use dtype="float64" for reproducibility
  • Reference FC2/FC3 norms and kappa values recalibrated for float64
  • Tolerance tightened from 8% to 2%
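The tightened check has the shape of a simple relative-tolerance comparison. A hypothetical sketch (the helper name and the kappa value are illustrative, not taken from test_bte.py):

```python
# Hypothetical shape of the tightened BTE check; actual reference values
# and helper names in tests/applications/test_bte.py may differ.
def within_rel_tol(value: float, reference: float, rel_tol: float) -> bool:
    """True if value is within rel_tol (fractional) of reference."""
    return abs(value - reference) <= rel_tol * abs(reference)

kappa_ref = 140.0  # illustrative float64 reference, not from the PR

ok_at_2pct = within_rel_tol(141.5, kappa_ref, 0.02)    # passes the new 2% bound
ok_at_old = within_rel_tol(150.0, kappa_ref, 0.02)     # fails at 2%, though it
                                                       # would have passed at 8%
```

The tighter bound is only meaningful because float64 makes the computed force constants reproducible run-to-run; with float32 noise, a 2% window would flake.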

@yanghan234 yanghan234 force-pushed the feat/double-precision-inference branch 2 times, most recently from 52198e1 to 8aa2863 Compare April 30, 2026 13:04
@yanghan234 yanghan234 marked this pull request as ready for review April 30, 2026 13:25
yanghan234 added a commit that referenced this pull request May 11, 2026
float64 support belongs to PR #156. For now use the wrapper's
default float32 in model_loading and doc examples.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
yanghan234 added a commit that referenced this pull request May 11, 2026
* feat: add load_mattersim helper to forcefield

Add a convenience function that wraps Potential.from_checkpoint() with
gradient checkpointing support and automatic version normalization from
checkpoint paths. Exported from mattersim.forcefield.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat: add torch-sim-atomistic as required dependency

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat: add torchsim integration for batch MD and relaxation

Port TorchSim support from the internal repo, providing GPU-accelerated
batch molecular dynamics and structure relaxation via torch-sim.

New package: mattersim.torchsim
- TorchSimWrapper: adapts MatterSim Potential as a TorchSim ModelInterface
- TorchSimBatchMD: batch MD with NVT/NPT integrators, temperature schedules
- TorchSimBatchRelaxer: batch structure optimization with FIRE/etc.
- Settings classes (IntegratorSettings, OptimizerSettings) with validation
- Trajectory loading/saving via TorchSim H5MD format
- Graph construction bridge (build_graph_from_simstate)
- Model loading with AOTI compilation support

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test: add torchsim integration tests

27 tests covering settings validation, TorchSimWrapper forward pass,
batch relaxation (file + in-memory), batch MD with multiple integrators,
trajectory continuation, per-system temperature schedules, and
trajectory loader convenience constructors.

GPU-heavy tests are marked @slow and @requires_gpu so they are skipped
on CPU-only machines.

Also adds mattersim_potential_best_device fixture to conftest.py.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: set version in load_mattersim for autobatcher lookup

load_mattersim now always sets Potential.version so the pre-computed
autobatcher memory scaler table is used instead of falling back to
slow runtime memory estimation.

- Default checkpoint (load_path=None) gets 'mattersim-v1.0.0-1M'
- Explicit paths are normalized to canonical version strings

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: clean up dead python_version < 3.10 conditionals

Since requires-python is now >=3.12 (for torch-sim-atomistic), the
python_version < 3.10 dependency conditionals for emmet-core and numpy
are unreachable. Simplify to unconditional versions.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* ci: update test matrix to Python 3.12+ only

Drop Python 3.10 from CI matrix since requires-python is now >=3.12.
Add Python 3.13 to the test matrix. Update macOS job to 3.12.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* ci: drop Python 3.10/3.11 wheel builds from release pipeline

Only build wheels for Python 3.12 and 3.13 since torch-sim-atomistic
requires Python >=3.12.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* ci: bump gh-pages docs build to Python 3.12

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: update Python version requirement to 3.12+ in README

Update badge, prerequisite, and conda create example to reflect
the new minimum Python 3.12 requirement.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: update Python version to 3.12+ in MODEL_CARD, environment.yaml, cibuildwheel

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor: remove batch MD and relaxation runners from torchsim

Remove the batch runner infrastructure (TorchSimBatchRunner, TorchSimBatchMD,
TorchSimBatchRelaxer), their settings classes (IntegratorSettings,
OptimizerSettings), trajectory loaders, and associated tests.

Keep the core TorchSim wrapper (TorchSimWrapper, graph_construction,
model_loading) that adapts MatterSim potentials as TorchSim ModelInterface.

Deleted files:
- src/mattersim/torchsim/base.py
- src/mattersim/torchsim/batch_relax.py
- src/mattersim/torchsim/md.py
- src/mattersim/torchsim/settings.py
- src/mattersim/torchsim/settings_base.py
- src/mattersim/torchsim/trajectory_loader.py

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: add TorchSim simulation guide

Add user guide for running structure relaxation and molecular dynamics
using the TorchSim backend with MatterSim potentials. Covers wrapper
creation, optimization, MD with various integrators, and trajectory
saving.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: fix convergence function API in TorchSim guide

Use correct parameter name 'force_tol' (not 'fmax') and drop
'include_cell_forces' from the minimal example.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: remove unused emmet-core dependency

emmet-core is not imported anywhere in the codebase. Removing the
dead dependency per review feedback.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* bump torchsim >= 0.6.0

* fix: remove remaining load_mattersim references

Replace load_mattersim() calls in model_loading.py with direct
Potential.from_checkpoint() + enable_gradient_checkpointing() calls.
Remove load_mattersim from forcefield __init__.py exports.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: remove _sanitize_outputs and sanitize_nan plumbing

Remove internal-only NaN sanitization logic per review feedback.
Removed from TorchSimWrapper and get_torchsim_wrapper.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: drop explicit float64, use default float32 dtype

float64 support belongs to PR #156. For now use the wrapper's
default float32 in model_loading and doc examples.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@yanghan234 yanghan234 force-pushed the feat/double-precision-inference branch from 8aa2863 to 8a56b46 Compare May 11, 2026 15:03
yanghan234 and others added 4 commits May 11, 2026 16:10
Add dtype='float64' option to MatterSimCalculator for double-precision
inference. This improves numerical stability for sensitive properties
like thermal conductivity.

Changes:
- MatterSimCalculator: new dtype parameter ('float32' or 'float64'),
  converts model weights via model.double() and creates inputs in the
  matching dtype
- m3gnet.py: replace .float() with .to(pos.dtype) for pbc_offsets so
  it respects the model's precision
- calculate(): upcast legacy-path float tensors to match model dtype

On equilibrium Si diamond, float64 reduces force noise from ~1e-6 to
~1e-15 eV/Å compared to float32.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Switch BTE tests to dtype='float64' for more stable force constants.
Update reference kappa values to float64 means verified across 3 runs
(rel_std < 0.03%). Tighten tolerance from 8% to 2%.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
MPS does not support float64 tensors. This commit:
- Validates dtype+device combo early in MatterSimCalculator.__init__,
  raising ValueError before loading the model
- Falls back to float32 in the basic BTE test on MPS
- Skips the strict BTE test on MPS (reference values require float64)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
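The early dtype+device validation described in this commit can be sketched as a small guard (names hypothetical, not the PR's actual code): reject the unsupported combination before the model is loaded, so the user gets a clear ValueError instead of a deep MPS backend failure.

```python
# Hypothetical sketch of the early validation described above; the real
# check lives in MatterSimCalculator.__init__ and may differ in detail.
def validate_dtype_device(dtype: str, device: str) -> None:
    """Raise early on dtype/device combinations the backend cannot run."""
    if dtype == "float64" and device.startswith("mps"):
        raise ValueError(
            "MPS does not support float64; use dtype='float32' "
            "or run on a CPU/CUDA device."
        )
```

Failing before model loading also keeps the error cheap: no checkpoint download or weight conversion happens for a combination that can never work.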
Use float32 with 10% kappa tolerance and 0.5% FC tolerance on MPS,
while keeping float64 with tight tolerances (2% / 0.1%) on CUDA.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@yanghan234 yanghan234 force-pushed the feat/double-precision-inference branch from 8a56b46 to e145b41 Compare May 11, 2026 15:10
The local 'import torch._inductor.config' statement caused Python to
treat 'torch' as a local variable in __init__, making earlier references
to torch.float32/torch.float64 fail with UnboundLocalError.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
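The scoping bug fixed here is a standard Python gotcha and is easy to reproduce outside MatterSim. A minimal reproduction using the stdlib (os stands in for torch):

```python
import os

# Minimal reproduction of the bug described above: a function-local
# 'import os.path' binds the name 'os' in the function's local scope,
# so the earlier reference raises UnboundLocalError even though 'os'
# is already imported at module level.
def broken():
    cwd = os.getcwd()  # raises UnboundLocalError at call time
    import os.path     # this local binding shadows the module-level 'os'
    return cwd
```

The fix in the commit is the usual one: move the submodule import (torch._inductor.config) out of the function, or import it at module scope, so the module-level name is never shadowed.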
@yanghan234
Collaborator Author

@copilot review this PR

Contributor

Copilot AI commented May 11, 2026

@copilot review this PR

Reviewed and found one regression in the float64 path: pickling/unpickling could restore the model in float32. I fixed it and added a regression test in commit d9bd42d. CodeQL reports no security alerts.
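The class of bug fixed in d9bd42d can be illustrated with a toy pickle round trip. This is a hedged sketch with hypothetical names, not the actual MatterSim classes: if __setstate__ rebuilds the model without re-applying the stored dtype, an unpickled float64 calculator silently comes back in float32. Carrying dtype through the pickled state and re-applying it on restore avoids the regression.

```python
import pickle

# Hypothetical sketch of the pickling fix; 'model_dtype' stands in for
# the effect of calling model.double()/model.float() on the real model.
class Calc:
    def __init__(self, dtype="float32"):
        self.dtype = dtype
        self.model_dtype = dtype  # real code would convert model weights here

    def __getstate__(self):
        return {"dtype": self.dtype}

    def __setstate__(self, state):
        self.dtype = state["dtype"]
        self.model_dtype = state["dtype"]  # re-apply precision on restore

restored = pickle.loads(pickle.dumps(Calc(dtype="float64")))
```

The buggy variant would reconstruct the model in its default precision and skip the last line of __setstate__, which is exactly the kind of silent downgrade a regression test catches.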

@yanghan234 yanghan234 merged commit 2966303 into main May 11, 2026
7 checks passed
@yanghan234 yanghan234 deleted the feat/double-precision-inference branch May 11, 2026 15:45