Merged
19 changes: 4 additions & 15 deletions README.md
@@ -122,26 +122,15 @@ On non-NVIDIA chips, the default dataset is automatically set to `KernelGenBench

### Multi-Source (NVIDIA A100, 210 operators)

All results use Claude Opus-4.6. Acc = accuracy (%), Spd = geometric mean speedup relative to PyTorch/cuBLAS baseline.
Evaluation across 210 operators from three sources (ATen, vLLM, cuBLAS), showing accuracy and speedup by operator source across all generation paradigms.

| Method | Overall Acc | Overall Spd | ATen Acc | ATen Spd | vLLM Acc | vLLM Spd | cuBLAS Acc | cuBLAS Spd |
|--------|:-----------:|:-----------:|:--------:|:--------:|:--------:|:--------:|:----------:|:----------:|
| Pass@1 | 41 | 0.70 | 39 | 0.90 | 20 | 0.76 | 68 | 0.49 |
| Pass@5 | 57 | 0.68 | 62 | 0.79 | 28 | 0.71 | 74 | 0.49 |
| Claude Code | 87 | 0.78 | 92 | 0.86 | 68 | 1.02 | 94 | 0.51 |
| OpenCode | 81 | 0.73 | 92 | 0.82 | 46 | 0.97 | 92 | 0.50 |
| AKO4all | 83 | 0.97 | 91 | 1.00 | 64 | 1.62 | 84 | 0.61 |
![Multi-Source Results](assets/table_multi_source.png)
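
The two reported metrics can be sketched as follows. This is a minimal illustration, not the benchmark's actual scoring code: the function names, the millisecond timing inputs, and the choice to compute speedup only over operators with valid timings are assumptions, since the README does not say how failed operators enter the speedup figure.

```python
import math


def accuracy_percent(passed):
    """Accuracy (%) = share of operators whose generated kernel is correct.

    `passed` is a hypothetical list of booleans, one per operator.
    """
    if not passed:
        raise ValueError("need at least one operator")
    return 100.0 * sum(passed) / len(passed)


def geometric_mean_speedup(baseline_ms, kernel_ms):
    """Geometric mean of per-operator speedups (baseline time / kernel time).

    A value below 1.0 means the generated kernels are, on geometric
    average, slower than the baseline.
    """
    if len(baseline_ms) != len(kernel_ms) or not baseline_ms:
        raise ValueError("inputs must be non-empty and of equal length")
    log_sum = 0.0
    for base, kern in zip(baseline_ms, kernel_ms):
        if base <= 0 or kern <= 0:
            raise ValueError("timings must be positive")
        # Summing logs avoids overflow/underflow of a long product.
        log_sum += math.log(base / kern)
    return math.exp(log_sum / len(baseline_ms))


# Illustrative numbers only, not benchmark data.
example_speedup = geometric_mean_speedup([1.0, 2.0, 4.0], [2.0, 2.0, 2.0])
example_accuracy = accuracy_percent([True, True, False, True])
```

The geometric mean is used here because speedups are ratios: a 2x gain on one operator and a 0.5x loss on another average out to 1.0, where an arithmetic mean would report 1.25.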

### Multi-Chip (110 ATen operators, 6 platforms)

All results use Claude Opus-4.6. Platforms A–E are anonymized vendor hardware.
Cross-platform evaluation on 110 ATen operators across six hardware platforms, showing whether correctness and speedup transfer across heterogeneous hardware backends. Platforms A–E are anonymized vendor hardware.

| Method | NVIDIA Acc/Spd | Platform A Acc/Spd | Platform B Acc/Spd | Platform C Acc/Spd | Platform D Acc/Spd | Platform E Acc/Spd |
|--------|:--------------:|:------------------:|:------------------:|:------------------:|:------------------:|:------------------:|
| Pass@1 | 39 / 0.90 | 46 / 0.19 | 44 / 0.69 | 37 / 0.98 | 38 / 0.89 | 38 / 0.88 |
| Pass@5 | 62 / 0.79 | 63 / 0.15 | 60 / 0.74 | 54 / 0.92 | 65 / 0.68 | 57 / 0.83 |
| Claude Code | 92 / 0.86 | 89 / 0.18 | 93 / 0.80 | 88 / 0.87 | 96 / 0.89 | 83 / 0.83 |
| AKO4all | 89 / 1.00 | 84 / 0.30 | 88 / 1.09 | 88 / 1.08 | 86 / 1.12 | 80 / 1.07 |
![Multi-Chip Results](assets/table_multi_chip.png)

![Cross-platform accuracy, speedup, and ecosystem overhead](assets/figure_crossplatform.png)

Binary file added assets/table_multi_chip.png
Binary file added assets/table_multi_source.png