Fix pre-commit (ruff) and mypy errors #258
Open
runame wants to merge 6 commits into facebookresearch:main
Conversation
`AmortizedPreconditionerConfig` was renamed to `BaseShampooPreconditionerConfig` and `ShampooPreconditionerConfig` to `ClassicShampooPreconditionerConfig` in `shampoo_types.py`, but `__init__.py` still imported and re-exported the old names. Replace them with the new names in the imports and `__all__`. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
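A minimal sketch of the resulting `__init__.py` surface; the module path and the `__all__` framing are assumptions, only the two renamed names come from the commit message above:

```python
# Sketch only: import path and surrounding entries are assumptions;
# the two renamed config names are from the commit message.
from distributed_shampoo.shampoo_types import (
    BaseShampooPreconditionerConfig,  # was AmortizedPreconditionerConfig
    ClassicShampooPreconditionerConfig,  # was ShampooPreconditionerConfig
)

__all__ = [
    "BaseShampooPreconditionerConfig",
    "ClassicShampooPreconditionerConfig",
]
```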
The CLI overrides `optimizer.grafting_config={...}` use OmegaConf's
default merge semantics, so fields from the base shampoo.yaml
grafting_config (notably `beta2: 0.999`) leak into overrides whose
`_target_` does not accept them — breaking the AdaGrad and SGD
grafting cases. Replace each override with a delete (`~`) followed by
add (`+`) so the new mapping fully replaces the inherited one.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
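A minimal OmegaConf repro of the leak, with illustrative `_target_` names standing in for the repo's real grafting configs:

```python
# Repro sketch: the _target_ names are illustrative, not the repo's real paths.
from omegaconf import OmegaConf

base = OmegaConf.create(
    {"grafting_config": {"_target_": "AdamGraftingConfig", "beta2": 0.999}}
)
# A plain override `optimizer.grafting_config={_target_: SGDGraftingConfig}`
# merges into the base mapping instead of replacing it:
merged = OmegaConf.merge(
    base, {"grafting_config": {"_target_": "SGDGraftingConfig"}}
)
print(OmegaConf.to_yaml(merged))
# grafting_config:
#   _target_: SGDGraftingConfig
#   beta2: 0.999        <- leaked field that SGDGraftingConfig's __init__ rejects
```

Deleting first (`'~optimizer.grafting_config'`) and then adding (`'+optimizer.grafting_config={...}'`) sidesteps the merge entirely, so only the fields in the new mapping survive.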
The `4-core-ubuntu-gpu-t4` runner currently has an outdated NVIDIA driver, so `torch.cuda.is_available()` returns False and the "Run DDP example on GPU." step crashes with `ProcessGroupNCCL is only supported with GPUs, no GPUs found!`. The "Run single GPU example on GPU." step "passes" only because torch silently falls back to CPU. Add a `gpu_check` step that probes `torch.cuda.is_available()` and gate both GPU-only steps on its output. If no GPU is detected, those steps are skipped (with a workflow warning) and the job stays green. When the runner image is fixed and a GPU is actually available, both steps run as before with no other changes needed. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
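A sketch of the probe such a `gpu_check` step can run; the output name `has_gpu` and the exact wiring are assumptions, only the `torch.cuda.is_available()` check is from the description above:

```python
# Hypothetical probe body for the gpu_check step (output name is assumed).
import os

import torch

has_gpu = torch.cuda.is_available()
# GitHub Actions reads step outputs from the file named in $GITHUB_OUTPUT.
with open(os.environ["GITHUB_OUTPUT"], "a") as fh:
    fh.write(f"has_gpu={str(has_gpu).lower()}\n")
if not has_gpu:
    # `::warning::` emits a workflow annotation without failing the step.
    print("::warning::No usable GPU detected; GPU-only steps will be skipped.")
```

The two GPU-only steps then gate on something like `if: steps.gpu_check.outputs.has_gpu == 'true'`.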
E402 errors (~92 total across 12 files): every affected file had a duplicate triple-quoted copyright block at the top followed by the real module docstring (or, in gpa/, a `#`-style copyright). The duplicate triple-quoted block was a non-import statement that triggered E402 for every subsequent import. Drop the duplicate so the real module docstring is the first statement.

E731 errors (2 sites in *_lossless_distributor.py): convert `should_assign_param_idx = lambda i: ...` to a `def`.

Header convention: 79 of 79 untouched files use a single triple-quoted copyright block (some with merged module docstring text). Normalize the 6 gpa files I touched and gpa/gpa_adamw.py to that style: drop the redundant `#`-comment copyright block in favor of triple-quoted.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
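A minimal illustration of the E402 mechanism (file contents invented): Python takes only the first string literal as the module docstring, so the block after it is an ordinary statement and every later import gets flagged.

```python
"""
Copyright (c) Meta Platforms, Inc. and affiliates.
"""  # duplicate block: Python takes this, the first string, as the docstring

"""Real module docstring."""  # now just an expression statement, not a docstring

import os  # ruff flags E402: module level import not at top of file
```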
Pre-commit's pinned `ruff-format` (v0.8.0 in .pre-commit-config.yaml) considers 14 files mis-formatted. The diffs are purely whitespace choices around `assert ..., (...)` wrapping introduced by ruff version changes. Apply the formatter so the `ruff-format` hook passes on the first run instead of failing after auto-modifying files. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
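The churn is shaped like this (illustrative assert, not from the repo): the formatter parenthesizes a long assert message on its own lines instead of leaving it inline.

```python
# Illustrative only; the condition and message text are invented.
# Before:
assert block_index is not None, "expected every parameter to be assigned a block index before merging"

# After ruff-format v0.8.0:
assert block_index is not None, (
    "expected every parameter to be assigned a block index before merging"
)
```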
- gpa/gpa_adamw.py:
  * step(): add @overload pair matching torch.optim.Optimizer (see the sketch after this list)
  * step(): annotate the per-group buffer lists as list[Tensor]
  * step(): rename group-local first_param to group_first_param to avoid mypy union with the Optional[Parameter] from the train-mode check loop, and skip empty parameter groups via `continue`
- gpa/tests/gpa_test_utils.py: annotate `devices` as tuple[device, ...]
- gpa/gpu_tests/gpa_adamw_numerics_test.py: ignore missing `parameterized` stubs (no public stub package)
- gpa/examples/cifar10_example.py: ignore len() on Dataset[Any]
- distributed_shampoo/distributed_shampoo.py:
  * drop redundant re-annotations on AdaGrad branch (no-redef)
  * move stray `# type: ignore` onto the `_pre_load_train_modes` assignment line so it actually applies
- distributed_shampoo/distributor/_shampoo_hybrid_shard_lossless_distributor.py:
  * replace filter(lambda, ...) with a generator expression to dodge a confusing TypeGuard overload mypy picks for filter()
- distributed_shampoo/distributor/shampoo_*_distributor.py (3 files):
  * ignore attr-defined for the private DeviceMesh._get_all_submeshes
- distributed_shampoo/distributor/shampoo_distributor.py: ignore assignment narrowing of map(partial(tuple), ...)
- distributed_shampoo/distributor/gpu_tests/shampoo_checkpoint_test.py: ignore arg-type for partial[FSDPModule] passed to a Callable[[Module], Module] parameter (FSDPModule mixes in but mypy can't see it)
- distributed_shampoo/examples/parallelism.py: same FSDPModule ignore for two fully_shard call sites
- distributed_shampoo/examples/utils.py: annotate `sampler`
- distributed_shampoo/examples/tests/utils_test.py: replace assertIsNotNone with `assert ... is not None` so mypy narrows `_fmt` before assertIn
- distributed_shampoo/utils/load_balancing_utils.py: extend the existing `# type: ignore[misc]` to also cover `call-overload` from max(float, floating[Any])

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
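For the first item, a sketch of the `@overload` pair that mirrors the stubs on `torch.optim.Optimizer.step`; the class name is assumed from the file name and the body is elided:

```python
# Sketch: signatures mirror torch.optim.Optimizer.step; class name assumed.
from typing import Callable, Optional, overload

import torch


class GPAAdamW(torch.optim.Optimizer):
    @overload
    def step(self, closure: None = ...) -> None: ...

    @overload
    def step(self, closure: Callable[[], float]) -> float: ...

    def step(self, closure: Optional[Callable[[], float]] = None) -> Optional[float]:
        ...  # real update logic elided
```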
Force-pushed from 76eb53b to ce5b552.
@hjmshi has imported this pull request. If you are a Meta employee, you can view this in D103082000.
wz337 (Contributor) approved these changes on Apr 30, 2026, leaving a comment:
Review automatically exported from Phabricator review in Meta.
## Summary
Resolves the remaining `pre-commit` and `type-check` job failures on `main` after PRs #256 (init imports) and #257 (examples workflow). Three commits, each separately reviewable:

**Commit 1: lint fixes (ruff E402 + E731) and copyright header normalization**
- **E402** (~92 errors across 12 files): every affected file had a duplicate triple-quoted copyright block at the top followed by the real module docstring (or, in `gpa/`, a `#`-comment copyright). Python only treats the first triple-quoted string as a docstring, so the duplicate counted as a non-import statement and triggered E402 for every subsequent import. Drop the duplicate so the real module docstring is the first statement.
- **E731** at `_shampoo_fully_shard_lossless_distributor.py:69` and `_shampoo_hybrid_shard_lossless_distributor.py:75`: rewrite `should_assign_param_idx = lambda i: ...` as a `def` (see the sketch after this list).
- **Header convention**: 79 of 79 untouched files use a single triple-quoted copyright block (some with merged module docstring text). The 7 files touched here (6 `gpa/` files plus `gpa/gpa_adamw.py`, which had both styles) are normalized to that convention by dropping their `#`-comment copyright in favor of the triple-quoted form.
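The E731 rewrite, with a placeholder predicate body and an assumed parameter type, since the real predicate is elided above:

```python
# Before (ruff E731: do not assign a lambda expression, use a def):
should_assign_param_idx = lambda i: ...  # real predicate elided

# After: same name and call shape, now a proper def (types assumed).
def should_assign_param_idx(i: int) -> bool:
    ...  # same predicate body the lambda had
```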
**Commit 2: ruff-format pass**

The pinned `ruff-format` (v0.8.0 in `.pre-commit-config.yaml`) considers 14 unrelated files mis-formatted (purely whitespace around `assert ..., (...)` wrapping). Apply the formatter so the hook passes on the first run instead of failing after auto-modifying files.

**Commit 3: mypy fixes (33 errors → 0)**
All 27 errors reported by CI plus 6 local-only errors that newer mypy/torch stubs surface:
- `gpa/gpa_adamw.py`:
  * `step()`: add `@overload` pair to match `torch.optim.Optimizer`
  * `step()`: annotate per-group buffer lists as `list[Tensor]`
  * `step()`: skip empty parameter groups via `continue`; rename group-local `first_param` to `group_first_param` to avoid mypy union with the `Optional[Parameter]` from the train-mode check loop
- `gpa/tests/gpa_test_utils.py`: annotate `devices` as `tuple[device, ...]`
- `gpa/gpu_tests/gpa_adamw_numerics_test.py`: ignore missing `parameterized` stubs (no public stub package)
- `gpa/examples/cifar10_example.py`: ignore `len()` on `Dataset[Any]`
- `distributed_shampoo/distributed_shampoo.py`: drop redundant re-annotations on the `AdaGradPreconditionerConfig` branch (no-redef); move the stray `# type: ignore` onto the `_pre_load_train_modes` assignment line so it actually applies
- `distributed_shampoo/distributor/_shampoo_hybrid_shard_lossless_distributor.py`: replace `filter(lambda, ...)` with a generator expression to dodge a confusing `TypeGuard` overload mypy picks for `filter()` (rewrite shape sketched after this list)
- `distributed_shampoo/distributor/shampoo_*_distributor.py` (3 files): ignore `attr-defined` for the private `DeviceMesh._get_all_submeshes` — local-only, surfaces with current torch stubs
- `distributed_shampoo/distributor/shampoo_distributor.py`: ignore the `map(partial(tuple), ...)` assignment narrowing — local-only
- `distributed_shampoo/distributor/gpu_tests/shampoo_checkpoint_test.py` and `distributed_shampoo/examples/parallelism.py`: ignore `arg-type` for `partial[FSDPModule]` / `FSDPModule` passed to a `Module` parameter (`FSDPModule` mixes in but mypy can't see it)
- `distributed_shampoo/examples/utils.py`: annotate `sampler`
- `distributed_shampoo/examples/tests/utils_test.py`: replace `assertIsNotNone` with `assert ... is not None` so mypy narrows `_fmt` before `assertIn`
- `distributed_shampoo/utils/load_balancing_utils.py`: extend the existing `# type: ignore[misc]` to also cover `call-overload` from `max(float, floating[Any])`
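For the `filter()` item, the shape of the rewrite; the predicate and iterable are stand-ins, not the repo's code:

```python
# Stand-in parameter list; only the rewrite's shape matters here.
import torch

params = [torch.nn.Parameter(torch.zeros(1))]

# Before: mypy resolves filter(lambda, ...) against a TypeGuard overload,
# producing a confusing inferred element type.
selected = filter(lambda p: p.requires_grad, params)

# After: a generator expression with the same runtime semantics.
selected = (p for p in params if p.requires_grad)
```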
## Local verification

## Test plan
- `pre-commit` job passes
- `type-check` job passes
- other CI jobs (`tests`, `gpu-tests`, `examples`) unchanged

## Notes
Stacked on top of #256 (`fix/init-stale-rename-imports`) and #257 (`fix/examples-beta2-config`). Until those merge, the diff for this PR will appear to include their commits. Merge order: #256 → #257 → this.

🤖 Generated with Claude Code