@danielvegamyhre (Contributor) commented Dec 20, 2025

Stacked PRs:


[mxfp8 moe training] update readme with kernel microbenchmarks for dsv3

Add mxfp8 kernel microbenchmarks for DeepSeekV3 shapes, grouped by forward/dgrad/wgrad, with environment details and commands to reproduce.

As the benchmarks show, the last weak point that meaningfully affects performance for DSV3 is the per-group blocked layout kernel for groups along M (used in forward). The new CUDA kernel added in this stack operates on groups along K; we should be able to easily adapt it to operate on groups along M instead and resolve this issue.
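To make the groups-along-M case concrete, here is a minimal sketch of the offset arithmetic such a kernel would need. It is not the torchao implementation. It assumes each group's rows are padded independently to a multiple of 128 rows (the row-tile height assumed here for the blocked scale layout); `padded_group_starts` and `BLOCK_ROWS` are hypothetical names introduced for illustration.

```python
from typing import List

# Assumption: the blocked scale layout is tiled in blocks of 128 rows.
BLOCK_ROWS = 128


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def padded_group_starts(group_end_offsets: List[int],
                        block_rows: int = BLOCK_ROWS) -> List[int]:
    """Map cumulative group end offsets along M to padded start rows.

    Returns one start row per group, plus a final entry holding the total
    number of padded rows in the output.
    """
    starts = []
    prev_end = 0
    out_row = 0
    for end in group_end_offsets:
        size = end - prev_end
        starts.append(out_row)
        # Each group is padded independently up to a multiple of block_rows.
        out_row += ceil_div(size, block_rows) * block_rows
        prev_end = end
    starts.append(out_row)
    return starts
```

With groups of 100, 256, and 0 rows (end offsets `[100, 356, 356]`), the first group is padded to 128 rows, the second already fits two blocks, and the empty group takes no space.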

Preview:

[Image: Screenshot 2025-12-20 at 11:50:03 AM]

pytorch-bot bot commented Dec 20, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3521

Note: Links to docs will display an error until the docs builds have been completed.

❌ 5 New Failures

As of commit 98e81e0 with merge base 7035fb7:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

danielvegamyhre added a commit that referenced this pull request Dec 20, 2025
stack-info: PR: #3521, branch: danielvegamyhre/stack/90
@danielvegamyhre danielvegamyhre force-pushed the danielvegamyhre/stack/90 branch from 9953446 to 2070bb7 Compare December 20, 2025 19:40
@meta-cla meta-cla bot added the CLA Signed label Dec 20, 2025
@danielvegamyhre danielvegamyhre marked this pull request as draft December 20, 2025 19:43
@danielvegamyhre danielvegamyhre changed the base branch from danielvegamyhre/stack/89 to main December 20, 2025 19:43
danielvegamyhre added a commit that referenced this pull request Dec 20, 2025
stack-info: PR: #3521, branch: danielvegamyhre/stack/90
@danielvegamyhre danielvegamyhre force-pushed the danielvegamyhre/stack/90 branch from 2070bb7 to ac2f0a5 Compare December 20, 2025 19:43
danielvegamyhre added a commit that referenced this pull request Dec 20, 2025
stack-info: PR: #3521, branch: danielvegamyhre/stack/90
@danielvegamyhre danielvegamyhre force-pushed the danielvegamyhre/stack/90 branch from ac2f0a5 to 16e9bdc Compare December 20, 2025 19:44
@danielvegamyhre danielvegamyhre changed the base branch from main to danielvegamyhre/stack/89 December 20, 2025 19:44
@danielvegamyhre danielvegamyhre added the topic: documentation, mx, and moe labels Dec 20, 2025
@danielvegamyhre danielvegamyhre marked this pull request as ready for review December 20, 2025 19:48
@danielvegamyhre danielvegamyhre marked this pull request as draft December 21, 2025 02:13
@danielvegamyhre danielvegamyhre changed the base branch from danielvegamyhre/stack/89 to main December 21, 2025 02:13
danielvegamyhre added a commit that referenced this pull request Dec 21, 2025
stack-info: PR: #3521, branch: danielvegamyhre/stack/90
@danielvegamyhre danielvegamyhre force-pushed the danielvegamyhre/stack/90 branch from 16e9bdc to 6f6d495 Compare December 21, 2025 02:13
@danielvegamyhre danielvegamyhre changed the base branch from main to danielvegamyhre/stack/89 December 21, 2025 02:13
@danielvegamyhre danielvegamyhre marked this pull request as ready for review December 21, 2025 02:13
@danielvegamyhre danielvegamyhre marked this pull request as draft December 21, 2025 02:16
@danielvegamyhre danielvegamyhre changed the base branch from danielvegamyhre/stack/89 to main December 21, 2025 02:16
danielvegamyhre added a commit that referenced this pull request Dec 21, 2025
stack-info: PR: #3521, branch: danielvegamyhre/stack/90
@danielvegamyhre danielvegamyhre force-pushed the danielvegamyhre/stack/90 branch from 6f6d495 to 23f82b3 Compare December 21, 2025 02:16
@danielvegamyhre danielvegamyhre changed the base branch from main to danielvegamyhre/stack/89 December 21, 2025 02:16
@danielvegamyhre danielvegamyhre marked this pull request as draft December 21, 2025 02:19
@danielvegamyhre danielvegamyhre changed the base branch from danielvegamyhre/stack/89 to main December 21, 2025 02:19
danielvegamyhre added a commit that referenced this pull request Dec 21, 2025
stack-info: PR: #3521, branch: danielvegamyhre/stack/90
@danielvegamyhre danielvegamyhre force-pushed the danielvegamyhre/stack/90 branch from 23f82b3 to c92cb6c Compare December 21, 2025 02:19
@danielvegamyhre danielvegamyhre changed the base branch from main to danielvegamyhre/stack/89 December 21, 2025 02:19
@danielvegamyhre danielvegamyhre marked this pull request as ready for review December 21, 2025 02:20
@danielvegamyhre danielvegamyhre marked this pull request as draft December 21, 2025 02:34
@danielvegamyhre danielvegamyhre changed the base branch from danielvegamyhre/stack/89 to main December 21, 2025 02:34
danielvegamyhre added a commit that referenced this pull request Dec 21, 2025
stack-info: PR: #3521, branch: danielvegamyhre/stack/90
@danielvegamyhre danielvegamyhre force-pushed the danielvegamyhre/stack/90 branch from c92cb6c to 321d1f7 Compare December 21, 2025 02:34
@danielvegamyhre danielvegamyhre changed the base branch from main to danielvegamyhre/stack/89 December 21, 2025 02:34
@danielvegamyhre danielvegamyhre marked this pull request as ready for review December 21, 2025 02:34
@danielvegamyhre danielvegamyhre marked this pull request as draft December 22, 2025 01:19
@danielvegamyhre danielvegamyhre changed the base branch from danielvegamyhre/stack/89 to main December 22, 2025 01:19
danielvegamyhre added a commit that referenced this pull request Dec 22, 2025
stack-info: PR: #3521, branch: danielvegamyhre/stack/90
@danielvegamyhre danielvegamyhre force-pushed the danielvegamyhre/stack/90 branch from 321d1f7 to 322ff0e Compare December 22, 2025 01:19
@danielvegamyhre danielvegamyhre changed the base branch from main to danielvegamyhre/stack/89 December 22, 2025 01:19
@danielvegamyhre danielvegamyhre marked this pull request as ready for review December 22, 2025 01:20
@danielvegamyhre danielvegamyhre marked this pull request as draft December 22, 2025 01:26
@danielvegamyhre danielvegamyhre changed the base branch from danielvegamyhre/stack/89 to main December 22, 2025 01:26
danielvegamyhre added a commit that referenced this pull request Dec 22, 2025
stack-info: PR: #3521, branch: danielvegamyhre/stack/90
@danielvegamyhre danielvegamyhre force-pushed the danielvegamyhre/stack/90 branch from 322ff0e to 2b20010 Compare December 22, 2025 01:26
@danielvegamyhre danielvegamyhre changed the base branch from main to danielvegamyhre/stack/89 December 22, 2025 01:26
@danielvegamyhre danielvegamyhre marked this pull request as ready for review December 22, 2025 01:26
stack-info: PR: #3521, branch: danielvegamyhre/stack/90
@danielvegamyhre danielvegamyhre marked this pull request as draft December 22, 2025 01:29
@danielvegamyhre danielvegamyhre changed the base branch from danielvegamyhre/stack/89 to main December 22, 2025 01:29
@danielvegamyhre danielvegamyhre force-pushed the danielvegamyhre/stack/90 branch from 2b20010 to 98e81e0 Compare December 22, 2025 01:30
@danielvegamyhre danielvegamyhre changed the base branch from main to danielvegamyhre/stack/89 December 22, 2025 01:30
@danielvegamyhre danielvegamyhre marked this pull request as ready for review December 22, 2025 01:30

Labels

CLA Signed, moe, mx, topic: documentation

2 participants