Skip Dynamo graph break for scalar-only bin_ops when tensorify is enabled (#2687)

Open

nandesuka wants to merge 1 commit into pytorch:main from nandesuka:export-D103055794

Conversation

nandesuka commented May 6, 2026

Summary:

X-link: pytorch/pytorch#182026

When `tensorify_python_scalars` is enabled and `dynamic=True`, Dynamo lifts Python float/int arguments as 0-dim tensor placeholders followed by `.item()` calls, producing SymFloat/SymInt values. The tensorify pass (in AOTAutograd) then converts scalar ops back to tensor ops.
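For illustration only, here is a rough sketch of what that lifting amounts to; this is not Dynamo's actual captured graph, and the names are made up:

```python
import torch

# Conceptual sketch: under dynamic=True, a Python float argument is traced
# roughly as if the graph received a 0-dim tensor placeholder and recovered
# the scalar via .item(), producing a SymFloat that flows into later ops.
def lifted_body(s_placeholder: torch.Tensor):  # 0-dim float tensor placeholder
    s = s_placeholder.item()                   # SymFloat during tracing
    return s * 2.0
```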

However, `torch.add/sub/mul/div` on all-scalar args hit a graph break in TorchInGraphFunctionVariable ("Attempted to call torch in-graph function on only torch.SymInt arguments") that was added before the tensorify infrastructure existed. This prevents the graph from reaching AOTAutograd, so tensorify never runs.

Fix: check `_is_tensorify_enabled()` to skip the graph break when tensorify can handle the scalar-to-tensor conversion downstream. This allows scalar-only binary ops to be compiled rather than graph-broken.
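A minimal repro in the spirit of the new tests (a sketch only; the exact flags, values, and the assumption that tensorify is enabled are not taken from the diff):

```python
import torch

def scalar_only_add(a: float, b: float):
    # Both arguments are Python scalars, so with dynamic=True the traced
    # torch.add sees only symbolic scalar inputs -- previously this was an
    # unconditional graph break.
    return torch.add(a, b)

compiled = torch.compile(scalar_only_add, dynamic=True, fullgraph=True)
# With tensorify enabled, this compiles end to end instead of graph-breaking;
# the tensorify pass in AOTAutograd turns the scalar op back into a tensor op.
print(compiled(1.5, 2.5))
```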

The `_is_tensorify_enabled()` helper is moved to `torch/_dynamo/utils.py` so it is shared between the Dynamo tracing check (in `variables/torch.py`) and the tensorify FX pass itself (`torch/fx/passes/_tensorify_python_scalars.py`), eliminating the previous duplication.

A comment at the Dynamo call site documents that this is an intentional abstraction violation: Dynamo peeks at a downstream pass's config to decide whether to graph-break, because when tensorify is enabled these scalar-only ops will be handled later in the pipeline.

Changed files:

  • `torch/_dynamo/utils.py`: Add shared `_is_tensorify_enabled()` (cached env var + JustKnobs check); see the sketch after this list
  • `torch/_dynamo/variables/torch.py`: Import `_is_tensorify_enabled` from utils, add abstraction-violation comment, gate graph break on tensorify being disabled
  • `torch/fx/passes/_tensorify_python_scalars.py`: Replace inline knob logic with import of shared `_is_tensorify_enabled()`
  • `test/dynamo/test_misc.py`: Add `test_tensorify_scalar_only_bin_ops` and `test_tensorify_scalar_only_bin_ops_int` verifying scalar-only binary ops compile without graph break when tensorify is enabled
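
A rough sketch of the shared helper and the Dynamo-side gating described above. The environment-variable name, the knob name, and the surrounding shape are assumptions rather than the actual implementation; only `torch._utils_internal.justknobs_check` is an existing API:

```python
import functools
import os

import torch._utils_internal


@functools.lru_cache(maxsize=1)
def _is_tensorify_enabled() -> bool:
    # Sketch: cached env-var override plus a JustKnobs check.
    # The env-var and knob names below are hypothetical placeholders.
    env = os.environ.get("TORCH_TENSORIFY_PYTHON_SCALARS")
    if env is not None:
        return env == "1"
    return torch._utils_internal.justknobs_check(
        "pytorch/compiler:tensorify_python_scalars"
    )


def should_graph_break_on_scalar_only_binop(all_args_are_sym_scalars: bool) -> bool:
    # Sketch of the gating in variables/torch.py: only graph-break on
    # scalar-only torch.add/sub/mul/div when tensorify will NOT clean the
    # op up downstream. (Intentional abstraction violation: Dynamo peeks at
    # a downstream pass's config.)
    return all_args_are_sym_scalars and not _is_tensorify_enabled()
```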

Differential Revision: D103055794

meta-codesync Bot commented May 6, 2026

@nandesuka has exported this pull request. If you are a Meta employee, you can view the originating Diff in D103055794.

nandesuka added a commit to nandesuka/pytorch that referenced this pull request May 7, 2026
…bled (pytorch#182026)

Summary:
X-link: pytorch/benchmark#2687


Test Plan:
## Unit Tests

Added two tests in `test_misc.py` (`DynamoOpPromotionTests`):
- `test_tensorify_scalar_only_bin_ops`: Verifies `torch.add/sub/mul/div(float, float)` with `dynamic=True` compiles without graph break when tensorify is enabled (see the sketch after this list)
- `test_tensorify_scalar_only_bin_ops_int`: Verifies `torch.add(int, int)` with `dynamic=True` compiles without graph break when tensorify is enabled
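
A hedged sketch of what the float-scalar test might look like; the base class, the `CompileCounter` backend, and the exact assertions are assumptions, not the actual test code:

```python
import torch
import torch._dynamo.test_case
import torch._dynamo.testing


class DynamoOpPromotionTests(torch._dynamo.test_case.TestCase):
    def test_tensorify_scalar_only_bin_ops(self):
        cnt = torch._dynamo.testing.CompileCounter()

        # fullgraph=True means any graph break would raise, so a passing test
        # implies the scalar-only binary ops compiled in a single graph.
        @torch.compile(backend=cnt, dynamic=True, fullgraph=True)
        def fn(a: float, b: float):
            return (torch.add(a, b), torch.sub(a, b),
                    torch.mul(a, b), torch.div(a, b))

        fn(1.5, 2.5)
        self.assertEqual(cnt.frame_count, 1)


if __name__ == "__main__":
    from torch._dynamo.test_case import run_tests
    run_tests()
```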

Differential Revision: D103055794
meta-codesync Bot changed the title from "Skip Dynamo graph break for scalar-only bin_ops when tensorify is enabled" to "Skip Dynamo graph break for scalar-only bin_ops when tensorify is enabled (#2687)" on May 7, 2026
nandesuka force-pushed the export-D103055794 branch from d5fd2d5 to 9b7020e on May 7, 2026 at 06:52
nandesuka had a problem deploying to docker-s3-upload on May 7, 2026 at 06:52 with GitHub Actions (Failure)