chai-xiaonan commented Dec 31, 2025

The Nemo-Bridge model has been adapted to FlagScale, so FlagScale can now save and load checkpoints in the Hugging Face safetensors format. This was verified on the Qwen3-0.6B, Deepseek v3-16_a3B, and Qwen-32B models: saving and loading in the safetensors format worked without issues, and accuracy was as expected.

Collaborator

lxd-cumt commented Jan 7, 2026

Please add an argument, hf-save-steps, to control how often a Hugging Face checkpoint is saved during training.
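
For illustration, one possible shape for such an option is sketched below. It is only a sketch under stated assumptions: the helper names add_hf_checkpoint_args and should_save_hf_checkpoint are hypothetical and are not taken from FlagScale or Megatron-LM, and the option is modeled as a plain command-line flag rather than something changed at runtime.

```python
import argparse
from typing import Optional


def add_hf_checkpoint_args(parser: argparse.ArgumentParser) -> argparse.ArgumentParser:
    """Register a hypothetical --hf-save-steps flag (not FlagScale's actual API)."""
    group = parser.add_argument_group("hf checkpoint (hypothetical)")
    group.add_argument(
        "--hf-save-steps",
        type=int,
        default=None,
        help="Save a Hugging Face checkpoint every N iterations; disabled when unset.",
    )
    return parser


def should_save_hf_checkpoint(iteration: int, hf_save_steps: Optional[int]) -> bool:
    """Return True when the given iteration is a multiple of the save interval."""
    # An unset or non-positive interval disables Hugging Face checkpoint saving.
    if hf_save_steps is None or hf_save_steps <= 0:
        return False
    # Iteration 0 is skipped so nothing is saved before any training step has run.
    return iteration > 0 and iteration % hf_save_steps == 0


if __name__ == "__main__":
    parser = add_hf_checkpoint_args(argparse.ArgumentParser())
    args = parser.parse_args(["--hf-save-steps", "100"])
    saved_at = [it for it in range(1, 301) if should_save_hf_checkpoint(it, args.hf_save_steps)]
    print(saved_at)  # [100, 200, 300]
```

Keeping the interval check in a small pure function would let the training loop call it once per iteration, independently of the regular checkpoint interval.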

Collaborator

lxd-cumt commented Jan 8, 2026

Please remove the patches, and open a PR against Megatron-LM-FL for the megatron/core-related modifications.
