Fix pre-commit (ruff) and mypy errors #258
Open
runame wants to merge 6 commits into facebookresearch:main
Conversation
`AmortizedPreconditionerConfig` was renamed to `BaseShampooPreconditionerConfig` and `ShampooPreconditionerConfig` to `ClassicShampooPreconditionerConfig` in `shampoo_types.py`, but `__init__.py` still imported and re-exported the old names. Replace them with the new names in the imports and `__all__`. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
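A minimal sketch of the resulting `__init__.py` surface; the module path and the `__all__` framing are assumptions, only the two renamed names come from the commit message above:

```python
# Sketch only: import path and surrounding entries are assumptions;
# the two renamed config names are from the commit message.
from distributed_shampoo.shampoo_types import (
    BaseShampooPreconditionerConfig,  # was AmortizedPreconditionerConfig
    ClassicShampooPreconditionerConfig,  # was ShampooPreconditionerConfig
)

__all__ = [
    "BaseShampooPreconditionerConfig",
    "ClassicShampooPreconditionerConfig",
]
```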
The CLI overrides `optimizer.grafting_config={...}` use OmegaConf's
default merge semantics, so fields from the base shampoo.yaml
grafting_config (notably `beta2: 0.999`) leak into overrides whose
`_target_` does not accept them — breaking the AdaGrad and SGD
grafting cases. Replace each override with a delete (`~`) followed by
add (`+`) so the new mapping fully replaces the inherited one.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
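A minimal OmegaConf repro of the leak, with illustrative `_target_` names standing in for the repo's real grafting configs:

```python
# Repro sketch: the _target_ names are illustrative, not the repo's real paths.
from omegaconf import OmegaConf

base = OmegaConf.create(
    {"grafting_config": {"_target_": "AdamGraftingConfig", "beta2": 0.999}}
)
# A plain override `optimizer.grafting_config={_target_: SGDGraftingConfig}`
# merges into the base mapping instead of replacing it:
merged = OmegaConf.merge(
    base, {"grafting_config": {"_target_": "SGDGraftingConfig"}}
)
print(OmegaConf.to_yaml(merged))
# grafting_config:
#   _target_: SGDGraftingConfig
#   beta2: 0.999        <- leaked field that SGDGraftingConfig's __init__ rejects
```

Deleting first (`'~optimizer.grafting_config'`) and then adding (`'+optimizer.grafting_config={...}'`) sidesteps the merge entirely, so only the fields in the new mapping survive.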
The `4-core-ubuntu-gpu-t4` runner currently has an outdated NVIDIA driver, so `torch.cuda.is_available()` returns False and the "Run DDP example on GPU." step crashes with `ProcessGroupNCCL is only supported with GPUs, no GPUs found!`. The "Run single GPU example on GPU." step "passes" only because torch silently falls back to CPU. Add a `gpu_check` step that probes `torch.cuda.is_available()` and gate both GPU-only steps on its output. If no GPU is detected, those steps are skipped (with a workflow warning) and the job stays green. When the runner image is fixed and a GPU is actually available, both steps run as before with no other changes needed. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
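A sketch of the probe such a `gpu_check` step can run; the output name `has_gpu` and the exact wiring are assumptions, only the `torch.cuda.is_available()` check is from the description above:

```python
# Hypothetical probe body for the gpu_check step (output name is assumed).
import os

import torch

has_gpu = torch.cuda.is_available()
# GitHub Actions reads step outputs from the file named in $GITHUB_OUTPUT.
with open(os.environ["GITHUB_OUTPUT"], "a") as fh:
    fh.write(f"has_gpu={str(has_gpu).lower()}\n")
if not has_gpu:
    # `::warning::` emits a workflow annotation without failing the step.
    print("::warning::No usable GPU detected; GPU-only steps will be skipped.")
```

The two GPU-only steps then gate on something like `if: steps.gpu_check.outputs.has_gpu == 'true'`.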
E402 errors (~92 total across 12 files): every affected file had a duplicate triple-quoted copyright block at the top followed by the real module docstring (or, in gpa/, a `#`-style copyright). The duplicate triple-quoted block was a non-import statement that triggered E402 for every subsequent import. Drop the duplicate so the real module docstring is the first statement.

E731 errors (2 sites in *_lossless_distributor.py): convert `should_assign_param_idx = lambda i: ...` to a `def`.

Header convention: 79 of 79 untouched files use a single triple-quoted copyright block (some with merged module docstring text). Normalize the 6 gpa files I touched and gpa/gpa_adamw.py to that style: drop the redundant `#`-comment copyright block in favor of triple-quoted.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
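A minimal illustration of the E402 mechanism (file contents invented): Python takes only the first string literal as the module docstring, so the block after it is an ordinary statement and every later import gets flagged.

```python
"""
Copyright (c) Meta Platforms, Inc. and affiliates.
"""  # duplicate block: Python takes this, the first string, as the docstring

"""Real module docstring."""  # now just an expression statement, not a docstring

import os  # ruff flags E402: module level import not at top of file
```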
Pre-commit's pinned `ruff-format` (v0.8.0 in .pre-commit-config.yaml) considers 14 files mis-formatted. The diffs are purely whitespace choices around `assert ..., (...)` wrapping introduced by ruff version changes. Apply the formatter so the `ruff-format` hook passes on the first run instead of failing after auto-modifying files. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
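The churn is shaped like this (illustrative assert, not from the repo): the formatter parenthesizes a long assert message on its own lines instead of leaving it inline.

```python
# Illustrative only; the condition and message text are invented.
# Before:
assert block_index is not None, "expected every parameter to be assigned a block index before merging"

# After ruff-format v0.8.0:
assert block_index is not None, (
    "expected every parameter to be assigned a block index before merging"
)
```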
- gpa/gpa_adamw.py:
  * step(): add @overload pair matching torch.optim.Optimizer (see the sketch after this list)
  * step(): annotate the per-group buffer lists as list[Tensor]
  * step(): rename group-local first_param to group_first_param to avoid mypy union with the Optional[Parameter] from the train-mode check loop, and skip empty parameter groups via `continue`
- gpa/tests/gpa_test_utils.py: annotate `devices` as tuple[device, ...]
- gpa/gpu_tests/gpa_adamw_numerics_test.py: ignore missing `parameterized` stubs (no public stub package)
- gpa/examples/cifar10_example.py: ignore len() on Dataset[Any]
- distributed_shampoo/distributed_shampoo.py:
  * drop redundant re-annotations on AdaGrad branch (no-redef)
  * move stray `# type: ignore` onto the `_pre_load_train_modes` assignment line so it actually applies
- distributed_shampoo/distributor/_shampoo_hybrid_shard_lossless_distributor.py:
  * replace filter(lambda, ...) with a generator expression to dodge a confusing TypeGuard overload mypy picks for filter()
- distributed_shampoo/distributor/shampoo_*_distributor.py (3 files):
  * ignore attr-defined for the private DeviceMesh._get_all_submeshes
- distributed_shampoo/distributor/shampoo_distributor.py: ignore assignment narrowing of map(partial(tuple), ...)
- distributed_shampoo/distributor/gpu_tests/shampoo_checkpoint_test.py: ignore arg-type for partial[FSDPModule] passed to a Callable[[Module], Module] parameter (FSDPModule mixes in but mypy can't see it)
- distributed_shampoo/examples/parallelism.py: same FSDPModule ignore for two fully_shard call sites
- distributed_shampoo/examples/utils.py: annotate `sampler`
- distributed_shampoo/examples/tests/utils_test.py: replace assertIsNotNone with `assert ... is not None` so mypy narrows `_fmt` before assertIn
- distributed_shampoo/utils/load_balancing_utils.py: extend the existing `# type: ignore[misc]` to also cover `call-overload` from max(float, floating[Any])

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
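For the first item, a sketch of the `@overload` pair that mirrors the stubs on `torch.optim.Optimizer.step`; the class name is assumed from the file name and the body is elided:

```python
# Sketch: signatures mirror torch.optim.Optimizer.step; class name assumed.
from typing import Callable, Optional, overload

import torch


class GPAAdamW(torch.optim.Optimizer):
    @overload
    def step(self, closure: None = ...) -> None: ...

    @overload
    def step(self, closure: Callable[[], float]) -> float: ...

    def step(self, closure: Optional[Callable[[], float]] = None) -> Optional[float]:
        ...  # real update logic elided
```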
Force-pushed from 76eb53b to ce5b552.
@hjmshi has imported this pull request. If you are a Meta employee, you can view this in D103082000.
wz337 (Contributor) approved these changes on Apr 30, 2026, leaving a comment:
Review automatically exported from Phabricator review in Meta.
## Summary
Resolves the remaining `pre-commit` and `type-check` job failures on `main` after PRs #256 (init imports) and #257 (examples workflow). Three commits, each separately reviewable:

**Commit 1: lint fixes (ruff E402 + E731) and copyright header normalization**
- **E402** (~92 errors across 12 files): every affected file had a duplicate triple-quoted copyright block at the top followed by the real module docstring (or, in `gpa/`, a `#`-comment copyright). Python only treats the first triple-quoted string as a docstring, so the duplicate counted as a non-import statement and triggered E402 for every subsequent import. Drop the duplicate so the real module docstring is the first statement.
- **E731** at `_shampoo_fully_shard_lossless_distributor.py:69` and `_shampoo_hybrid_shard_lossless_distributor.py:75`: rewrite `should_assign_param_idx = lambda i: ...` as a `def` (see the sketch after this list).
- **Header convention**: 79 of 79 untouched files use a single triple-quoted copyright block (some with merged module docstring text). The 7 files touched here (6 `gpa/` files plus `gpa/gpa_adamw.py`, which had both styles) are normalized to that convention by dropping their `#`-comment copyright in favor of the triple-quoted form.
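The E731 rewrite, with a placeholder predicate body and an assumed parameter type, since the real predicate is elided above:

```python
# Before (ruff E731: do not assign a lambda expression, use a def):
should_assign_param_idx = lambda i: ...  # real predicate elided

# After: same name and call shape, now a proper def (types assumed).
def should_assign_param_idx(i: int) -> bool:
    ...  # same predicate body the lambda had
```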
**Commit 2: ruff-format pass**

The pinned `ruff-format` (v0.8.0 in `.pre-commit-config.yaml`) considers 14 unrelated files mis-formatted (purely whitespace around `assert ..., (...)` wrapping). Apply the formatter so the hook passes on the first run instead of failing after auto-modifying files.

**Commit 3: mypy fixes (33 errors → 0)**
All 27 errors reported by CI plus 6 local-only errors that newer mypy/torch stubs surface:
- `gpa/gpa_adamw.py`:
  * `step()`: add `@overload` pair to match `torch.optim.Optimizer`
  * `step()`: annotate per-group buffer lists as `list[Tensor]`
  * `step()`: skip empty parameter groups via `continue`; rename group-local `first_param` to `group_first_param` to avoid mypy union with the `Optional[Parameter]` from the train-mode check loop
- `gpa/tests/gpa_test_utils.py`: annotate `devices` as `tuple[device, ...]`
- `gpa/gpu_tests/gpa_adamw_numerics_test.py`: ignore missing `parameterized` stubs (no public stub package)
- `gpa/examples/cifar10_example.py`: ignore `len()` on `Dataset[Any]`
- `distributed_shampoo/distributed_shampoo.py`: drop redundant re-annotations on the `AdaGradPreconditionerConfig` branch (no-redef); move the stray `# type: ignore` onto the `_pre_load_train_modes` assignment line so it actually applies
- `distributed_shampoo/distributor/_shampoo_hybrid_shard_lossless_distributor.py`: replace `filter(lambda, ...)` with a generator expression to dodge a confusing `TypeGuard` overload mypy picks for `filter()` (rewrite shape sketched after this list)
- `distributed_shampoo/distributor/shampoo_*_distributor.py` (3 files): ignore `attr-defined` for the private `DeviceMesh._get_all_submeshes` — local-only, surfaces with current torch stubs
- `distributed_shampoo/distributor/shampoo_distributor.py`: ignore the `map(partial(tuple), ...)` assignment narrowing — local-only
- `distributed_shampoo/distributor/gpu_tests/shampoo_checkpoint_test.py` and `distributed_shampoo/examples/parallelism.py`: ignore `arg-type` for `partial[FSDPModule]` / `FSDPModule` passed to a `Module` parameter (`FSDPModule` mixes in but mypy can't see it)
- `distributed_shampoo/examples/utils.py`: annotate `sampler`
- `distributed_shampoo/examples/tests/utils_test.py`: replace `assertIsNotNone` with `assert ... is not None` so mypy narrows `_fmt` before `assertIn`
- `distributed_shampoo/utils/load_balancing_utils.py`: extend the existing `# type: ignore[misc]` to also cover `call-overload` from `max(float, floating[Any])`
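For the `filter()` item, the shape of the rewrite; the predicate and iterable are stand-ins, not the repo's code:

```python
# Stand-in parameter list; only the rewrite's shape matters here.
import torch

params = [torch.nn.Parameter(torch.zeros(1))]

# Before: mypy resolves filter(lambda, ...) against a TypeGuard overload,
# producing a confusing inferred element type.
selected = filter(lambda p: p.requires_grad, params)

# After: a generator expression with the same runtime semantics.
selected = (p for p in params if p.requires_grad)
```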
## Local verification

## Test plan
- `pre-commit` job passes
- `type-check` job passes
- other CI jobs (`tests`, `gpu-tests`, `examples`) unchanged

## Notes
Stacked on top of #256 (`fix/init-stale-rename-imports`) and #257 (`fix/examples-beta2-config`). Until those merge, the diff for this PR will appear to include their commits. Merge order: #256 → #257 → this.

🤖 Generated with Claude Code