Pull requests: NVIDIA/Megatron-LM
Fix missing propagation of RoPE scaling and interpolation arguments in model builders
community-request
#2902
opened Jan 11, 2026 by
taehwakkwon
1 of 6 tasks
Revert "[dev] Add assertion for mxfp8 params without dp overlap (#2270)"
dev2main: mbridge
dev to main: this PR is needed in main for mbridge
Final Review
Apply this label to indicate that your PR is ready for final review.
fix(dist_ckpt): Remove step tensor from sharded param state to fix MoE/EP checkpoint validation
community-request
#2895
opened Jan 10, 2026 by
jreiml
[Megatron-FSDP] Test FP8 activations + parameter sharding with Megatron-FSDP fully-shard. Update README.
Final Review
Support custom Router implementations in MoELayer
community-request
#2891
opened Jan 9, 2026 by
nschank
2 of 6 tasks
Add a logprobs test with real gpt model.
Expert Review
Apply this label to indicate that your PR is ready for expert review.
Run tests
Remove cross-rank synchronization during checkpoint load & deprecate torch.distributed.checkpoint.state_dict_loader.load_state_dict
#2864
opened Jan 8, 2026 by
asolergi-nv
Use global user buffer when the bucket size does not fit FixedPoolAllocator
#2857
opened Jan 7, 2026 by
shengf-nv
6 tasks