Benchmarking inference layer by RishikeshRanade · Pull Request #33 · NVIDIA/physicsnemo-cfd

RishikeshRanade · 2026-04-06T14:08:24Z

PhysicsNeMo-CFD Pull Request

Description

PhysicsNeMo-CFD evaluation adds a config-driven benchmarking pipeline: registered model wrappers run inference on dataset adapters, built-in metrics (shared with physicsnemo.cfd.postprocessing_tools) evaluate predictions vs ground truth, and results flow into tabular reports (JSON/CSV/HTML) plus optional PNG report visuals. workflows/evaluation_examples/ is the primary Hydra workflow (main.py, conf/config_surface.yaml / config_volume.yaml).
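To make the flow above concrete, here is a minimal sketch of a registry-driven benchmark loop. It is not the PR's implementation: `MODEL_REGISTRY`, `register_model`, `ConstantBaseline`, `relative_l2`, and `run_benchmark_sketch` are hypothetical names invented for illustration, and the config dict only stands in for what a Hydra YAML might resolve to.

```python
from typing import Callable, Dict, List

# Hypothetical registry mapping a config name to a model-wrapper class.
MODEL_REGISTRY: Dict[str, type] = {}


def register_model(name: str) -> Callable[[type], type]:
    """Class decorator that records a wrapper under a config-visible name."""
    def decorator(cls: type) -> type:
        MODEL_REGISTRY[name] = cls
        return cls
    return decorator


@register_model("constant_baseline")
class ConstantBaseline:
    """Toy wrapper that predicts the same value at every point."""

    def __init__(self, value: float = 0.0) -> None:
        self.value = value

    def predict(self, case: dict) -> List[float]:
        return [self.value] * len(case["pressure"])


def relative_l2(pred: List[float], truth: List[float]) -> float:
    """Relative L2 error between prediction and ground truth."""
    num = sum((p - t) ** 2 for p, t in zip(pred, truth)) ** 0.5
    den = sum(t ** 2 for t in truth) ** 0.5
    return num / den


def run_benchmark_sketch(config: dict, cases: List[dict]) -> List[dict]:
    """Look up the model by name, run it on each case, collect per-case metrics."""
    model_cfg = config["model"]
    model = MODEL_REGISTRY[model_cfg["name"]](**model_cfg.get("kwargs", {}))
    rows = []
    for case in cases:
        pred = model.predict(case)
        rows.append(
            {"case_id": case["case_id"],
             "relative_l2": relative_l2(pred, case["pressure"])}
        )
    return rows


config = {"model": {"name": "constant_baseline", "kwargs": {"value": 0.0}}}
cases = [{"case_id": "run_1", "pressure": [3.0, 4.0]}]
rows = run_benchmark_sketch(config, cases)
```

The point of the registry is that adding a model only requires a new decorated class. The driver and the config schema stay unchanged.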

Scope of the library

Model wrappers with registry-based CFDModel implementations.
Dataset adapters with a canonical case schema for DrivAerML and Ahmed, extensible via an adapter registry
Automated benchmarking with run_benchmark driver, per-case metrics, optional metrics cache, matrix mode (models × datasets), report plugins
Visualization consistent with original physicsnemo-cfd functionality
Multi-GPU functionality to process multiple case ids
Caching to restart benchmarking from where it left off
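The restart-from-cache item could work roughly as in the sketch below. It assumes a JSON file keyed by case id, which may not match the PR's actual cache format. `load_cache`, `evaluate_with_cache`, and `fake_metrics` are hypothetical names.

```python
import json
import tempfile
from pathlib import Path


def load_cache(path: Path) -> dict:
    """Return previously stored per-case metrics, or an empty dict."""
    return json.loads(path.read_text()) if path.exists() else {}


def evaluate_with_cache(case_ids, compute_metrics, cache_path: Path) -> dict:
    """Compute metrics only for cases that are missing from the cache."""
    cache = load_cache(cache_path)
    for case_id in case_ids:
        if case_id in cache:
            continue  # already evaluated in an earlier (possibly interrupted) run
        cache[case_id] = compute_metrics(case_id)
        cache_path.write_text(json.dumps(cache))  # persist after every case
    return cache


calls = []


def fake_metrics(case_id):
    """Stand-in for inference plus metric evaluation; records each call."""
    calls.append(case_id)
    return {"relative_l2": 0.1}


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "metrics_cache.json"
    evaluate_with_cache(["run_1", "run_2"], fake_metrics, cache_file)
    # Second invocation: only the new case triggers computation.
    results = evaluate_with_cache(["run_1", "run_2", "run_3"], fake_metrics, cache_file)
```

Writing the cache after every case means an interrupted run loses at most the case in flight.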

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.
The CHANGELOG.md is up to date with these changes.

Dependencies

Visualization layer into benchmarking layer

RishikeshRanade · 2026-04-06T14:16:05Z

@ram-cherukuri can you please review the readme?

Updated the README to clarify the purpose of the workflow and its usage.

Updated section headers and improved clarity of config descriptions.

Updated README to streamline configuration instructions and extend workflow customization guidelines.

peterdsharpe

Quick second pass after your fixup commits.

Big picture is pretty good. Most of the original 125 line-by-line items are addressed, the architectural concerns from the first review are still resolved via the centralized helpers (inference_seed, metric_exceptions, cuda_bf16_autocast, visual_filenames, natural_sort), and the two recent commits (a2f0861 "guard volume canonical mapping" and 03ed7dc "narrow scaling pickle fallback") match the previous review's asks closely. The _custom/_hf YAML split and the nim_inference extraction are nice reorganizations.

Three things from this round that I'd flag specifically:

One regression: commit f86b34c removed the with cuda_bf16_autocast(self._device): wrapper from TransolverWrapper.predict while leaving the import in place (now flagged as unused by ruff). FIGNet, XMGN, and DoMINO still use bf16 on CUDA; Transolver and GeoTransolver now silently run fp32 instead. If this was intentional for correctness, please call it out in the CHANGELOG since earlier benchmark numbers from this PR were generated with bf16 active on Transolver. If it was an accident during the "transolver bug" cleanup, restoring the wrap puts Transolver back in line with the others.
Tooling suggestion: a handful of items I flagged inline this round (the duplicate test functions, the dead loaded_epoch / os / re, an in_channels parameter accepted but never read on FIGConvUNetDrivAerML) would all be caught by ruff check --select F in milliseconds. Wiring ruff into the pre-commit / CI lint stage would surface this class of thing automatically and free up review for the more interesting cases.
README / YAML drift: a couple of header-vs-body and README-vs-YAML mismatches. The most consequential is the README claim at line 247 that log_env is false "in all example YAMLs", which is still wrong because config_matrix_volume_hf.yaml:59 ships log_env: true. The volume _hf YAML also has the same header-vs-body reports.enabled mismatch I noted on the surface _hf YAML. And the README metrics: example block now claims continuity_residual_l2 / momentum_residual_l2 are in the volume YAMLs; neither YAML actually has them in this round, so the example doesn't reproduce against either file.
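For the autocast regression in the first item, a helper of this kind might look like the sketch below. This is an assumption about its shape, not the PR's actual `cuda_bf16_autocast`. `cuda_bf16_autocast_sketch` and `ToyWrapper` are hypothetical, and `torch` is imported lazily so the CPU path runs without it.

```python
from contextlib import nullcontext


def cuda_bf16_autocast_sketch(device):
    """Return a bf16 autocast context on CUDA devices, a no-op context otherwise.

    `device` may be a torch.device or a string such as "cuda:0" or "cpu".
    """
    device_type = getattr(device, "type", None) or str(device).split(":")[0]
    if device_type != "cuda":
        return nullcontext()
    import torch  # lazy import: only needed on the CUDA path

    return torch.autocast(device_type="cuda", dtype=torch.bfloat16)


class ToyWrapper:
    """Hypothetical wrapper showing where the context manager sits in predict."""

    def __init__(self, device) -> None:
        self._device = device

    def predict(self, values):
        # Dropping this `with` block is the kind of change that silently
        # moves a CUDA run from bf16 back to fp32.
        with cuda_bf16_autocast_sketch(self._device):
            return [2.0 * v for v in values]


cpu_context = cuda_bf16_autocast_sketch("cpu")
prediction = ToyWrapper("cpu").predict([1.0, 2.0])
```

Putting the precision policy in one helper means a wrapper either uses it or visibly does not, which makes a dropped wrap easier to spot in review.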

For the 3 threads I left unresolved on the PR after the previous fixup:

DoMINO global_params indexing - thanks for engaging on that one; happy to keep it open while the fix is in flight.
Transolver _datapipe_resolution dead knob - the assignment was actually removed in f86b34c, so that thread can be re-resolved.
Config dataset /lustre/... root - the config_surface.yaml removal didn't quite cover this; config_matrix_surface_hf.yaml:58 still has root: "/lustre/fsw/portfolios/coreai/users/ktangsali/drivaerml" for datasets[0]. Worth a portable placeholder there too.

Patterns worth flagging at the design level:

Documentation drift is the recurring theme this round. README headers, YAML headers, and YAML bodies disagree with each other in 4-5 places across the recent commits. Each instance is small but the cumulative effect is reader confusion; a one-time alignment sweep before merge would close all of them at once.
Cross-wrapper inconsistencies persist. Even after the centralized helpers, GeoTransolver and Transolver now both lack the autocast wrap; XMGN has a bool(string) pattern that the same codebase fixed for align_ground_truth_to_model via _parse_bool; DoMINO has a bare except Exception hiding YAML errors. The structural fixes are good; the per-wrapper code paths could use a final consistency pass.
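The `bool(string)` pattern is worth spelling out, because any non-empty string is truthy in Python, including `"false"`. The sketch below shows the pitfall and one possible explicit parser. `parse_bool_sketch` is a hypothetical stand-in and is not claimed to match the codebase's `_parse_bool`.

```python
def parse_bool_sketch(value) -> bool:
    """Interpret common config spellings of booleans; reject anything else."""
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        text = value.strip().lower()
        if text in {"true", "1", "yes", "on"}:
            return True
        if text in {"false", "0", "no", "off"}:
            return False
    raise ValueError(f"cannot interpret {value!r} as a boolean")


# The pitfall: a non-empty string is always truthy.
naive = bool("false")

# The explicit parser returns what the config author meant.
parsed = parse_bool_sketch("false")
```

Raising on unrecognized input, rather than defaulting to `False`, surfaces config typos instead of hiding them. The bare `except Exception` concern is the same problem in another form.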

Same bottom line as the first review: PR is materially closer to merge-ready. The remaining work is concentrated and mostly mechanical (1-2 line fixes), with the Transolver autocast item being the one that needs an explicit decision rather than just a code change.

peterdsharpe

We're getting very very close to merging.

2 open comments remain from my side - one new (a regression that was introduced as part of the last commit), and one older one that is not yet fully resolved. Both are straightforward to resolve.

Approving; please address these comments with new commits (or clarifying comments) before merging.

Resolved with branch content preferred; tree unchanged from HEAD.

ktangsali · 2026-05-05T22:32:00Z

/blossom-ci

ktangsali · 2026-05-06T00:32:55Z

/blossom-ci

ktangsali · 2026-05-06T03:04:05Z

Blossom CI is failing because of a warp.device failure. This needs an upstream fix: physicsnemo needs to raise its minimum Warp version here: https://github.com/NVIDIA/physicsnemo/blob/main/pyproject.toml#L25 to 1.11.1 or higher. Since the GitHub CI is passing (it correctly installs the latest Warp), I am merging this. I will submit a PR to the upstream repo separately.

RishikeshRanade and others added 10 commits April 2, 2026 07:39

initial benchmarking and inference layer

06fe5fe

adding visualization layer and improving readme

ba7ed71

adding line plot visualization

35adf9d

fixing issues with visualization and merging workflows

480655c

refactoring nim evaluation

d6915d0

adding headers

37e48ed

adding caching capability and updating docstrings

08166df

refactoring code

f3ba6d5

adding distributed calculation and cleaning up

336f3c1

Merge pull request #2 from RishikeshRanade/visualization-layer

c84c911

Visualization layer into benchmarking layer

RishikeshRanade requested a review from ktangsali April 6, 2026 14:08

RishikeshRanade self-assigned this Apr 6, 2026

RishikeshRanade requested a review from ram-cherukuri April 6, 2026 14:15

renaming example and adding matrix evaluation configs

15296e3

ram-cherukuri reviewed Apr 6, 2026

Comment thread physicsnemo/cfd/evaluation/metrics/mesh_bridge.py

Comment thread workflows/benchmarking_workflow/README.md Outdated

Comment thread pyproject.toml

Comment thread physicsnemo/cfd/evaluation/common/interpolation.py

Comment thread physicsnemo/cfd/evaluation/metrics/builtin/physics.py

ram-cherukuri and others added 3 commits April 9, 2026 17:43

Revise README for model evaluation and benchmarking

fca40cd

Updated the README to clarify the purpose of the workflow and its usage.

Revise README for OOB Benchmarking section

e8769ad

Updated section headers and improved clarity of config descriptions.

domain-scoped metrics, aggregate volume visual, and naming cleanup

4d4703b

ram-cherukuri reviewed Apr 10, 2026

Comment thread workflows/benchmarking_workflow/README.md Outdated

ram-cherukuri and others added 10 commits April 10, 2026 09:19

Revise README for clarity and customization options

362ee52

Updated README to streamline configuration instructions and extend workflow customization guidelines.

Update README for benchmarking workflow sections

cb8f8ac

update api

d0c37f4

remove xmgn and fgnet volume, because they don't exist

0124bee

add notebooks after validation

eb0be99

add last notebook

be2d3f6

use pnemo functionals for knn

a4d49a7

add deprecation notice

d9948fe

add files for DrivAerML

f172a8a

cleaning up readme, adding ci tests and contributing details

b8a3e6e

remove skills from this PR

bb48e1a

peterdsharpe requested changes May 1, 2026

address PR review 2 comments

f121663

peterdsharpe approved these changes May 5, 2026

Comment thread workflows/benchmarking/README.md Outdated

Kaustubh Tangsali and others added 18 commits May 5, 2026 14:05

Address pending comments

be61fe7

initial benchmarking and inference layer

8e6ba1d

adding visualization layer and improving readme

4cdb536

adding line plot visualization

492ee88

fixing issues with visualization and merging workflows

1230715

refactoring nim evaluation

53c6edb

adding headers

938b987

adding caching capability and updating docstrings

84451d9

refactoring code

5c9614d

adding distributed calculation and cleaning up

e3300ca

Merge branch 'main' into benchmarking-inference-layer

824ce43

Resolved with branch content preferred; tree unchanged from HEAD.

update the physicsnemo dependency

e16c5c3

fix CI test version

7be36b5

update the python version required

3e888b1

add pin to physicsnemo version, handle the xvfb bug

d66a042

minor edit to toml

94f2d83

add torch-geometric dependency for xmgn, make ci use uv

667c096

fix ci test?

6b887ac

ktangsali added 3 commits May 5, 2026 23:29

black formatting

a13bcfc

fix interrogate issues

53ba51e

fix markdown linting

39b1546

fix license checks

d644d10

ktangsali merged commit bb4e2c7 into NVIDIA:main May 6, 2026
1 of 2 checks passed

4 participants
