[WIP]Add FP8KV for DS/QWEN #2367

yiliu30 · 2025-12-18T01:19:06Z

- QWEN
- DS

Depends on:
- Add FP8 KV Support for DS auto-round#1180
- Deepseek MXFP4/MXFP8 + FP8KV yiliu30/vllm-fork#84

cc @thuang6

Signed-off-by: yiliu30 <[email protected]>

PRAgent4INC · 2025-12-18T01:19:46Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Typo
There is a typo in the usage function where `each` should be `echo`.
    each " $0 --kv_cache_dtype fp8 -m /path/to/my/model"

Redundant Comment
The comment "Set environment variables based on quantization type" is repeated twice.
    # Set environment variables based on quantization type
    if [[ "$QUANT_TYPE_UPPER" == "MXFP4" ]]; then

PRAgent4INC · 2025-12-18T01:20:08Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category: General
Suggestion: Remove duplicate section
Remove the duplicate NVFP4 section.
File: examples/pytorch/nlp/huggingface_models/language-modeling/quantization/auto_round/deepseek/README.md [34-40]
Diff: - NVFP4 ```bash bash run_quant.sh --model $MODEL -t nvfp4 --output_dir ./qmodels -+ -+``` -+ -+- NVFP4 -+```bash -+bash run_quant.sh --model $MODEL -t nvfp4 --output_dir ./qmodels +```
Suggestion importance [1-10]: 8
Why: The suggestion is correct and important as it removes a duplicate section, improving the clarity and maintainability of the README file.
Impact: Medium

Category: General
Suggestion: Fix usage example typo
Correct the typo in the usage example.
File: examples/pytorch/nlp/huggingface_models/language-modeling/quantization/auto_round/deepseek/run_generate.sh [26]
Diff:
    -each " $0 --kv_cache_dtype fp8 -m /path/to/my/model"
    +echo " $0 --kv_cache_dtype fp8 -m /path/to/my/model"
Suggestion importance [1-10]: 3
Why: The suggestion corrects a typo in the usage example, improving readability but offering minimal functional impact.
Impact: Low
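
The typo matters slightly more than readability: `each` is not a shell command, so that line would normally fail with a "command not found" error instead of printing the example. Below is a minimal sketch of a corrected usage helper. It is not the actual `run_generate.sh` from this PR: the function name `usage` and the surrounding lines are assumptions for illustration, and only the `--kv_cache_dtype fp8 -m /path/to/my/model` example line comes from the review.

```shell
#!/usr/bin/env bash
# Hypothetical usage helper (illustrative only, not the script from the PR).
usage() {
    echo "Usage: $0 [options]"
    echo "Example:"
    # The flagged line used `each` here; `echo` is what prints the example.
    echo "  $0 --kv_cache_dtype fp8 -m /path/to/my/model"
}
```

Calling `usage` prints three lines, the last of which is the example invocation with the script's own name substituted for `$0`.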


yiliu30 added 8 commits December 8, 2025 18:29

Add ds nvfp4

52f5120

Signed-off-by: yiliu30 <[email protected]>

update example

22fb6fc

Signed-off-by: yiliu30 <[email protected]>

update

2d88b4b

Signed-off-by: yiliu30 <[email protected]>

fix ct version

eee8763

Signed-off-by: yiliu30 <[email protected]>

update

66edd91

Signed-off-by: yiliu30 <[email protected]>

merge

66b647e

Signed-off-by: yiliu30 <[email protected]>

add fp8 kv

1d63d12

Signed-off-by: yiliu30 <[email protected]>

fix

2c8f0a5

Signed-off-by: yiliu30 <[email protected]>

PRAgent4INC added the Review effort 3/5 label Dec 18, 2025

yiliu30 changed the title ~~fp8kv~~ Add FP8KV for DS/QWEN Dec 18, 2025

yiliu30 added 3 commits December 17, 2025 17:39

add for qwen

ffbadd6

Signed-off-by: yiliu30 <[email protected]>

fix

e77b2c2

Signed-off-by: yiliu30 <[email protected]>

fix

488d7c9

Signed-off-by: yiliu30 <[email protected]>

yiliu30 requested a review from chensuyue December 18, 2025 02:36

yiliu30 changed the title ~~Add FP8KV for DS/QWEN~~ [WIP]Add FP8KV for DS/QWEN Dec 18, 2025

yiliu30 added the WIP label Dec 18, 2025

yiliu30 added 6 commits December 21, 2025 19:17

merge master

2eb6c98

Signed-off-by: yiliu30 <[email protected]>

refactor

45bdeab

Signed-off-by: yiliu30 <[email protected]>

refactor

b6a9dd7

Signed-off-by: yiliu30 <[email protected]>

fix

039675a

Signed-off-by: yiliu30 <[email protected]>

update branch

0237780

Signed-off-by: yiliu30 <[email protected]>

fix

b7fb504

Signed-off-by: yiliu30 <[email protected]>


3 participants
