
[Draft] Integrate TE permutation #1

Merged
tdophung merged 173 commits into main from
integrate_te_permutation
Feb 11, 2026

Conversation

@tdophung
Owner

@tdophung tdophung commented Jan 21, 2026

Description

This change adds an option to use TE's implementation of the permutation operation for the MoE layer.

The current permutation operation in MaxText takes an array of input tokens and multiplies it by a sparse routing-map array. The result is then sharded under several different parallelization schemes.

TE's implementation is not a multiplication by a sparse array. Instead, it constructs an (n, 2E + 1) row_id_map and uses read-write operations to place each input token at the correct index within its expert, based on the constructed row_id_map. The current TE implementation supports FSDP and EP sharding.
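The contrast between the two strategies can be sketched on a toy example. This is a hypothetical illustration, not MaxText or TE code: the array names, the top-1 routing, and the per-expert scatter loop are all assumptions standing in for the actual row_id_map construction.

```python
import numpy as np

# Toy setup (illustrative only): 4 tokens, 2 experts, hidden dim 3.
n, E, hidden = 4, 2, 3
tokens = np.arange(n * hidden, dtype=np.float32).reshape(n, hidden)
# routing_map[i, e] = 1 if token i is routed to expert e (top-1 routing here).
routing_map = np.array([[1, 0], [0, 1], [1, 0], [0, 1]], dtype=np.float32)

# Strategy 1 (current MaxText, as described above): multiply the tokens
# by the sparse routing map; unrouted slots become zero rows.
permuted_dense = np.einsum("ne,nh->enh", routing_map, tokens)  # (E, n, hidden)

# Strategy 2 (TE-style sketch): compute destination indices per expert and
# write tokens directly into a compact per-expert buffer, approximating the
# row_id_map-driven read-write approach.
capacity = int(routing_map.sum(axis=0).max())
out = np.zeros((E, capacity, hidden), dtype=np.float32)
for e in range(E):
    rows = np.nonzero(routing_map[:, e])[0]  # tokens routed to expert e
    out[e, : len(rows)] = tokens[rows]       # scatter at computed indices

# The non-zero rows of both layouts agree per expert.
for e in range(E):
    mask = routing_map[:, e].astype(bool)
    assert np.allclose(permuted_dense[e][mask], out[e, : mask.sum()])
```

The scatter-based layout avoids materializing the full (E, n, hidden) dense product, which is the motivation for the row_id_map approach when n is large and routing is sparse.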

FIXES NVIDIA/TransformerEngine#2585

Tests

TBD

Checklist

Before submitting this PR, please make sure (put an X in the square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have added necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

ycchenzheng and others added 9 commits December 8, 2025 21:03
…p ; [docs/reference/api.rst] Placeholder for generated API docs ; [pyproject.toml] Uncomment docs deps ; [docs/guides.md] Remove redundant extensions ; [src/MaxText/inference_mlperf/offline_mode.py] Switch from `flags` to `argparse` to overcome duplicate flag error; [.github/workflows/check_docs_build.yml] Use `uv` to resolve max recursive dependency resolution issue ; [src/MaxText/inference_mlperf/matmul/matmul_dtypes.py] Hide under `if __name__ == "__main__"` to resolve hangups on TPUs and CPUs ; [*.py] Various minor fixes to get rid of doc warnings
…uojin-fix-hlo

PiperOrigin-RevId: 859148792
@tdophung
Owner Author

gemini-review

Google-ML-Automation and others added 19 commits January 21, 2026 11:43
…ation_tests_restructure

PiperOrigin-RevId: 859214905
…lation_restructure

PiperOrigin-RevId: 859237519
…x/benchmark-v5p-llama2-70b-configs

PiperOrigin-RevId: 859607658
…vidow_log_config_once

PiperOrigin-RevId: 859657642
Co-authored-by: Eitan Porat <eporat@lightricks.com>
Google-ML-Automation and others added 27 commits February 4, 2026 19:04
…sai/fix-ga-in-sft-trainer

PiperOrigin-RevId: 865712234
…/fix_ungroup

PiperOrigin-RevId: 866061982
…kenizer-path

PiperOrigin-RevId: 866106142
…oguo-utils2

PiperOrigin-RevId: 866156329
PiperOrigin-RevId: 866171684

Co-authored-by: maxtext authors <google-ml-automation@google.com>
PiperOrigin-RevId: 866619829
update

clean up

add attention_out for attention, attention_mla
…/add_checkpoint

PiperOrigin-RevId: 866687753
…er/sharony/exp_sharding_dump

PiperOrigin-RevId: 867643814
@tdophung tdophung marked this pull request as draft February 9, 2026 18:12
@tdophung tdophung force-pushed the integrate_te_permutation branch from 23b11e1 to fb04bc7 Compare February 11, 2026 00:52
@tdophung tdophung merged commit 8434e35 into main Feb 11, 2026
2 of 5 checks passed

Development

Successfully merging this pull request may close these issues.

[JAX] Integrate Permutation to MaxText