@yiliu30
Contributor

@yiliu30 yiliu30 commented Dec 18, 2025

Signed-off-by: yiliu30 <[email protected]>
@PRAgent4INC
Collaborator

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Typo

The usage function contains a typo: each should be echo.

each "  $0 --kv_cache_dtype fp8 -m /path/to/my/model"
Redundant Comment

The comment "Set environment variables based on quantization type" appears twice.

# Set environment variables based on quantization type
if [[ "$QUANT_TYPE_UPPER" == "MXFP4" ]]; then

@PRAgent4INC
Collaborator

PR Code Suggestions ✨

Explore these optional code suggestions:

Category    Suggestion    Impact
General
Remove duplicate section

Remove the duplicate NVFP4 section.

examples/pytorch/nlp/huggingface_models/language-modeling/quantization/auto_round/deepseek/README.md [34-40]

 - NVFP4
 ```bash
 bash run_quant.sh --model $MODEL -t nvfp4 --output_dir ./qmodels
-+
-+```
-+
-+- NVFP4
-+```bash
-+bash run_quant.sh --model $MODEL -t nvfp4 --output_dir ./qmodels
+```
Suggestion importance[1-10]: 8


Why: The suggestion is correct and important as it removes a duplicate section, improving the clarity and maintainability of the README file.

Medium
Fix usage example typo

Correct the typo in the usage example.

examples/pytorch/nlp/huggingface_models/language-modeling/quantization/auto_round/deepseek/run_generate.sh [26]

-each "  $0 --kv_cache_dtype fp8 -m /path/to/my/model"
+echo "  $0 --kv_cache_dtype fp8 -m /path/to/my/model"
Suggestion importance[1-10]: 3


Why: The suggestion corrects a typo in the usage example, improving readability but offering minimal functional impact.

Low
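
For reference, a minimal sketch of what the corrected usage function might look like. Only the `--kv_cache_dtype fp8 -m /path/to/my/model` example line comes from the thread; the function name `usage` and the "Usage:" header are assumptions, and the real run_generate.sh may differ.

```shell
#!/usr/bin/env bash

# Hypothetical usage helper. Only the example invocation below is taken
# from the review thread; the rest is illustrative.
usage() {
    echo "Usage:"
    echo "  $0 --kv_cache_dtype fp8 -m /path/to/my/model"
}

# Note: a misspelled command such as `each` is not a syntax error, so a
# syntax-only check (bash -n) passes. The problem only shows up when the
# line actually runs, i.e. when usage is called.
```

Calling `usage` prints both lines. With the typo in place, the second line would not be printed, and the shell would normally report that the command could not be found.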

@yiliu30 yiliu30 changed the title fp8kv Add FP8KV for DS/QWEN Dec 18, 2025
Signed-off-by: yiliu30 <[email protected]>
@yiliu30 yiliu30 requested a review from chensuyue December 18, 2025 02:36
@yiliu30 yiliu30 changed the title Add FP8KV for DS/QWEN [WIP]Add FP8KV for DS/QWEN Dec 18, 2025
@yiliu30 yiliu30 added the WIP label Dec 18, 2025
Signed-off-by: yiliu30 <[email protected]>

3 participants