Closed
Labels: question (Further information is requested)
Description
Hi TE team,
I'm interested in whether there are plans to implement block-wise quantization for FP8 training, similar to the approach described in the DeepSeek-V3 technical report.
Block-wise quantization could provide better numerical stability and accuracy than tensor-wide quantization, especially when tensors contain outlier values: a single large element only affects the scale of its own block rather than the whole tensor. This could be particularly valuable for large language models, where maintaining precision is crucial.
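To illustrate the idea, here is a minimal NumPy sketch of per-block scaling, not TE's actual implementation. The E4M3 max value (448) is real, but this only simulates the dynamic-range clipping of an FP8 cast, not mantissa rounding, and the block size of 128 is just an illustrative choice:

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest finite magnitude representable in FP8 E4M3


def quantize_blockwise(x, block=128):
    """Scale a 1-D tensor with one scale factor per `block` elements.

    Sketch only: values are scaled into the FP8 E4M3 range and clipped,
    but the mantissa rounding of a real FP8 cast is omitted.
    """
    x = np.asarray(x, dtype=np.float32)
    pad = (-len(x)) % block
    xp = np.pad(x, (0, pad)).reshape(-1, block)
    # One amax (and hence one scale) per block, so an outlier in one
    # block does not shrink the effective precision of the others.
    amax = np.abs(xp).max(axis=1, keepdims=True)
    scale = np.where(amax > 0, FP8_E4M3_MAX / amax, 1.0)
    q = np.clip(xp * scale, -FP8_E4M3_MAX, FP8_E4M3_MAX)
    return q, scale


def dequantize_blockwise(q, scale, n):
    """Invert the per-block scaling and drop padding."""
    return (q / scale).reshape(-1)[:n]
```

With per-tensor scaling, one 100.0 outlier would force a tiny scale for every element; here, a block of small values and a block containing the outlier each get their own scale.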
Some specific questions:
- Is this feature currently on your roadmap?
- If yes, what's the approximate timeline?
- If no, are there technical challenges preventing this implementation?
Thank you for your time!