Add ds nvfp4 #2356

yiliu30 · 2025-12-09T02:30:38Z

User description

Signed-off-by: yiliu30 <yi4.liu@intel.com>

PR Type

Enhancement

Description

Added support for NVFP4 quantization scheme
Updated usage instructions and validation checks
Modified environment variable settings for NVFP4

Diagram Walkthrough

flowchart LR
  A["Add NVFP4 config"] -- "Update quantize.py" --> B["Modify run_evaluation.sh"]
  B -- "Update usage and validation" --> C["Adjust run_generate.sh"]
  C -- "Set NVFP4 env vars" --> D["Update README.md"]

File Walkthrough

Relevant files

Enhancement

quantize.py: `Add NVFP4 configuration` (+7/-1)
examples/pytorch/nlp/huggingface_models/language-modeling/quantization/auto_round/deepseek/quantize.py
Added NVFP4 configuration to `config_dict`
Set `enable_torch_compile` to True
Added `low_gpu_mem_usage` parameter

run_evaluation.sh: `Update evaluation script for NVFP4` (+9/-2)
examples/pytorch/nlp/huggingface_models/language-modeling/quantization/auto_round/deepseek/run_evaluation.sh
Updated usage message to include NVFP4
Added NVFP4 condition to set environment variables
Updated error message to include NVFP4

run_generate.sh: `Update generation script for NVFP4` (+10/-3)
examples/pytorch/nlp/huggingface_models/language-modeling/quantization/auto_round/deepseek/run_generate.sh
Updated quantization type validation to include NVFP4
Added NVFP4 condition to set environment variables
Moved common environment variable setting
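
As a rough illustration of the `quantize.py` change, the sketch below shows how an `nvfp4` entry might sit in `config_dict` and be looked up by scheme name. The entry's values are the ones quoted in the reviewer guide for this PR; the `get_scheme_config` helper is hypothetical and is not part of the PR, and the other entries in the real `config_dict` are omitted because their values are not shown in this thread.

```python
# Hypothetical sketch: only the "nvfp4" entry's values come from this PR.
# The real config_dict also holds other schemes, which are not reproduced here.
config_dict = {
    "nvfp4": {
        "scheme": "NVFP4",
        "fp_layers": "lm_head,self_attn",  # layers named in the PR's entry
        "iters": 0,
    },
}


def get_scheme_config(name: str) -> dict:
    """Return the config for a scheme name, ignoring case (illustrative helper)."""
    key = name.lower()
    if key not in config_dict:
        valid = ", ".join(sorted(config_dict))
        raise ValueError(f"Unknown scheme '{name}'. Known schemes: {valid}")
    return config_dict[key]
```

Looking the entry up through one helper keeps the "unknown scheme" error in a single place rather than scattering `KeyError` handling across the script.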

Signed-off-by: yiliu30 <yi4.liu@intel.com>

PRAgent4INC · 2025-12-09T02:31:18Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Default Values
The new `nvfp4` configuration uses default values for `iters` and `fp_layers` that are identical to other schemes. Ensure these defaults are appropriate for `nvfp4`.

    "nvfp4": {
        "scheme": "NVFP4",
        "fp_layers": "lm_head,self_attn",
        "iters": 0,
    },

Environment Variables
The environment variables set for `nvfp4` are different from those for `mxfp4` and `mxfp8`. Verify that these settings are correct and necessary for `nvfp4`.

    elif [[ "$SCHEME" == "nvfp4" ]]; then
        VLLM_AR_MXFP4_MODULAR_MOE=0
        VLLM_MXFP4_PRE_UNPACK_TO_FP8=0
        VLLM_MXFP4_PRE_UNPACK_WEIGHTS=0
        VLLM_ENABLE_STATIC_MOE=0
        VLLM_USE_DEEP_GEMM=0
        VLLM_ENABLE_AR_EXT=0

Error Message
The error message now includes `nvfp4` as a valid option. Ensure that all parts of the script correctly handle `nvfp4` as a valid input.

    echo "Error: Invalid quantization scheme (-s). Must be 'mxfp4', 'nvfp4' or 'mxfp8'."
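
One way to make the scheme check and the `nvfp4` environment settings easier to verify is to express them as data. The sketch below is a hypothetical Python rendering of the shell logic quoted in this comment, not code from the PR: the `env_for_scheme` helper and the `NVFP4_ENV` and `VALID_SCHEMES` names are assumptions, and the `mxfp4`/`mxfp8` settings are left out because their values are not shown in this thread.

```python
import os

# Values copied from the nvfp4 branch quoted in this review comment.
NVFP4_ENV = {
    "VLLM_AR_MXFP4_MODULAR_MOE": "0",
    "VLLM_MXFP4_PRE_UNPACK_TO_FP8": "0",
    "VLLM_MXFP4_PRE_UNPACK_WEIGHTS": "0",
    "VLLM_ENABLE_STATIC_MOE": "0",
    "VLLM_USE_DEEP_GEMM": "0",
    "VLLM_ENABLE_AR_EXT": "0",
}

# Scheme names accepted by the scripts' -s option, per the updated error message.
VALID_SCHEMES = ("mxfp4", "nvfp4", "mxfp8")


def env_for_scheme(scheme, base_env=None):
    """Return a copy of the environment with scheme-specific variables applied.

    Hypothetical helper: only the nvfp4 settings are filled in; the mxfp4 and
    mxfp8 branches are intentionally left untouched in this sketch.
    """
    if scheme not in VALID_SCHEMES:
        raise ValueError(
            "Invalid quantization scheme (-s). Must be 'mxfp4', 'nvfp4' or 'mxfp8'."
        )
    env = dict(os.environ if base_env is None else base_env)
    if scheme == "nvfp4":
        env.update(NVFP4_ENV)
    return env
```

Because validation and the variable table live in one function, a reviewer can check that every accepted scheme name is also handled, which is the concern raised under "Error Message".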

PRAgent4INC · 2025-12-09T02:31:44Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category: General

Suggestion: Make torch compile configurable (Impact: Medium)
Consider making `enable_torch_compile` configurable via command-line arguments instead of hardcoding it.
examples/pytorch/nlp/huggingface_models/language-modeling/quantization/auto_round/deepseek/quantize.py [71]

    -enable_torch_compile=True,
    +enable_torch_compile=args.enable_torch_compile,

Suggestion importance[1-10]: 7
Why: Making `enable_torch_compile` configurable via command-line arguments improves flexibility but does not address a critical issue.

Suggestion: Add NVFP4 support (Impact: Medium)
Update the command to include the new NVFP4 option.
examples/pytorch/nlp/huggingface_models/language-modeling/quantization/auto_round/deepseek/README.md [43]

    -bash run_generate.sh -s [mxfp4|mxfp8] -tp [tensor_parallel_size] -m [model_path]
    +bash run_generate.sh -s [mxfp4|mxfp8|nvfp4] -tp [tensor_parallel_size] -m [model_path]

Suggestion importance[1-10]: 7
Why: The suggestion correctly updates the command to include the new NVFP4 option, improving the documentation's accuracy.

Suggestion: Add NVFP4 examples (Impact: Medium)
Add examples for NVFP4 evaluation.
examples/pytorch/nlp/huggingface_models/language-modeling/quantization/auto_round/deepseek/README.md [71-72]

     bash run_evaluation.sh -s mxfp4 -t piqa,hellaswag,mmlu -tp 8 -b 512 -m /path/to/ds_mxfp4
     bash run_evaluation.sh -s mxfp4 -t gsm8k -tp 8 -b 256 -m /path/to/ds_mxfp4
    +bash run_evaluation.sh -s nvfp4 -t piqa,hellaswag,mmlu -tp 8 -b 512 -m /path/to/ds_nvfp4
    +bash run_evaluation.sh -s nvfp4 -t gsm8k -tp 8 -b 256 -m /path/to/ds_nvfp4

Suggestion importance[1-10]: 7
Why: The suggestion correctly adds examples for NVFP4 evaluation, enhancing the documentation's completeness.

Suggestion: Remove duplicate comment (Impact: Low)
Remove the duplicate comment.
examples/pytorch/nlp/huggingface_models/language-modeling/quantization/auto_round/deepseek/run_generate.sh [85-86]

    -# Set environment variables based on quantization type
     # Set environment variables based on quantization type

Suggestion importance[1-10]: 5
Why: Removing the duplicate comment enhances code readability but offers a minor improvement.
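
For the first suggestion, a minimal sketch of what a configurable flag could look like with the standard library's `argparse` is shown below. The `build_parser` function and the flag spelling are assumptions for illustration; the actual argument parser in `quantize.py` is not shown in this thread. The default is kept at `True` so that behavior matches the value this PR hardcodes.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing torch.compile as a toggle (hypothetical sketch)."""
    parser = argparse.ArgumentParser(description="Quantization example (sketch)")
    # BooleanOptionalAction (Python 3.9+) creates both --enable_torch_compile
    # and --no-enable_torch_compile from the single declaration below.
    parser.add_argument(
        "--enable_torch_compile",
        action=argparse.BooleanOptionalAction,
        default=True,  # matches the value hardcoded in this PR
        help="Enable torch.compile during quantization.",
    )
    return parser
```

With this in place, the call site could pass `enable_torch_compile=args.enable_torch_compile` as the suggestion proposes, and users who hit compile issues could opt out with `--no-enable_torch_compile`.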

Add ds nvfp4

52f5120

Signed-off-by: yiliu30 <yi4.liu@intel.com>

PRAgent4INC added the Review effort 3/5 label Dec 9, 2025

update example

22fb6fc

Signed-off-by: yiliu30 <yi4.liu@intel.com>

3 participants
