
Add challenge 99: Grouped GEMM (Medium)#264

Open
claude[bot] wants to merge 1 commit into main from add-challenge-99-grouped-gemm

Conversation

Contributor

claude[bot] commented May 8, 2026

Summary

  • Adds challenge 99 (medium): Grouped GEMM, the core building block of Mixture-of-Experts inference.
  • Each contiguous row group of A (defined by a cumulative group_offsets array) is multiplied by its own per-group weight matrix B[g], with output written to the corresponding rows of C. Empty groups are supported.
  • Includes challenge.py (10 functional tests covering empty groups, imbalanced routing, edge sizes, zeros/negatives, and a realistic MoE perf test of 8 experts × 8,192 tokens × 1,024 hidden × 2,048 intermediate), challenge.html with SVG visualization, and starter files for all six frameworks.
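The grouped contract described above can be sketched as a NumPy reference implementation. This is an illustrative sketch, not the challenge's actual starter code: it assumes `group_offsets` has length G+1 with `group_offsets[0] == 0`, `A` is `(M, K)`, and `B` is a stacked `(G, K, N)` array, which may differ from the challenge's exact signature.

```python
import numpy as np

def grouped_gemm_ref(A, B, group_offsets):
    """Reference grouped GEMM: C[lo:hi] = A[lo:hi] @ B[g] per group.

    A: (M, K) input rows, sorted so each group's rows are contiguous.
    B: (G, K, N) per-group weight matrices.
    group_offsets: (G+1,) cumulative row offsets; group g owns rows
    [group_offsets[g], group_offsets[g+1]). Equal consecutive offsets
    denote an empty group, which contributes no output rows.
    """
    G, K, N = B.shape
    C = np.empty((A.shape[0], N), dtype=A.dtype)
    for g in range(G):
        lo, hi = group_offsets[g], group_offsets[g + 1]
        if lo < hi:  # skip empty groups
            C[lo:hi] = A[lo:hi] @ B[g]
    return C
```

A tiled CUDA solution would replace the Python loop with one kernel launch that maps thread blocks to (group, tile) pairs, but the row-range semantics are the same.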

Test plan

  • pre-commit run --all-files passes on the new files
  • scripts/run_challenge.py --action submit on T4: all functional + performance tests pass with a tiled CUDA solution

🤖 Generated with Claude Code

Implements a Grouped General Matrix Multiplication where rows of A are
partitioned into G contiguous groups (via cumulative offsets) and each
group is multiplied by its own per-group weight matrix B[g]. This is the
core building block of Mixture-of-Experts inference: after a router has
dispatched tokens to experts, a single grouped GEMM computes every
expert's projection in one launch with imbalanced per-expert workloads.
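The router-to-GEMM handoff above hinges on the cumulative offsets. A minimal sketch of how they could be derived, assuming tokens have already been sorted by expert id so each expert's rows are contiguous (`build_group_offsets` is a hypothetical helper, not part of the challenge):

```python
import numpy as np

def build_group_offsets(expert_ids, num_experts):
    """Cumulative row offsets from per-token expert assignments.

    expert_ids: (M,) expert index per row of A, assumed pre-sorted so
    each expert's tokens are contiguous. Experts that received no
    tokens produce an empty group (equal consecutive offsets).
    """
    counts = np.bincount(expert_ids, minlength=num_experts)
    offsets = np.zeros(num_experts + 1, dtype=np.int64)
    np.cumsum(counts, out=offsets[1:])
    return offsets
```

With imbalanced routing the `counts` vary per expert, which is exactly the workload shape the performance test exercises.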

Co-Authored-By: Claude Opus 4.7 <[email protected]>
