[pull] main from NVIDIA:main by pull[bot] · Pull Request #473 · phu0ngng/TransformerEngine

pull · 2026-02-03T04:32:05Z

See Commits and Changes for more details.

Created by pull[bot] (v2.0.0-alpha.4)


@ksivaman

* Support building with headers from nvidia wheels

There are two changes:

1. Handle `import nvidia` returning a namespace package with `__file__` equal to `None`.
2. Add a way to force the use of headers from nvidia wheels. Without that envvar, it's practically impossible with CUDA installed system-wide.

I successfully built the package with torch using the following `uv` configuration:

```
[tool.uv.extra-build-dependencies]
"transformer-engine-torch" = [
    "ninja",
    "nvidia-cuda-crt==13.0.88",
    "nvidia-cuda-cccl==13.0.85",
    { requirement = "torch", match-runtime = true },
    { requirement = "pytorch-triton", match-runtime = true },
    { requirement = "nvidia-cusolver", match-runtime = true },
    { requirement = "nvidia-curand", match-runtime = true },
    { requirement = "nvidia-cublas", match-runtime = true },
    { requirement = "nvidia-cusparse", match-runtime = true },
    { requirement = "nvidia-cudnn-cu13", match-runtime = true },
    { requirement = "nvidia-nvtx", match-runtime = true },
    { requirement = "nvidia-cuda-nvrtc", match-runtime = true },
    { requirement = "nvidia-cuda-runtime", match-runtime = true },
]
```

Signed-off-by: Vadim Markovtsev <vadim@poolside.ai>

* Apply suggestion from @ksivaman

Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>

---------

Signed-off-by: Vadim Markovtsev <vadim@poolside.ai>
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
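For readers unfamiliar with the namespace-package issue in the first commit, here is a minimal sketch of why `__file__` cannot be relied on and how `__path__` can be used instead. It is not the code from this PR: the helper names `package_dirs` and `find_include_dirs`, the throwaway package `demo_ns_pkg`, and the `<package>/<component>/include` layout are all assumptions made for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def package_dirs(module):
    """Return the directories backing a package (hypothetical helper)."""
    file = getattr(module, "__file__", None)
    if file is not None:
        # Regular package: __file__ points at its __init__.py.
        return [Path(file).parent]
    # Namespace package: there is no __init__.py, so __file__ is None.
    # __path__ lists every directory that contributes to the package.
    return [Path(p) for p in getattr(module, "__path__", [])]


def find_include_dirs(module):
    """Collect `include` directories one level below each package dir.

    The <package>/<component>/include layout is an assumption of this sketch.
    """
    return sorted(
        inc
        for root in package_dirs(module)
        for inc in root.glob("*/include")
        if inc.is_dir()
    )


# Build a throwaway namespace package (no __init__.py) to demonstrate.
tmp = Path(tempfile.mkdtemp())
(tmp / "demo_ns_pkg" / "cuda_runtime" / "include").mkdir(parents=True)
sys.path.insert(0, str(tmp))
importlib.invalidate_caches()

demo = importlib.import_module("demo_ns_pkg")
print(getattr(demo, "__file__", None))  # no file backs a namespace package
print(find_include_dirs(demo))
```

Code that computes `os.path.dirname(module.__file__)` fails with a `TypeError` when `__file__` is `None`, which is why falling back to `__path__` is a common way to handle both regular and namespace packages.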

* Fixed scaling-factor computation for FP32 to match the reference implementation.

Signed-off-by: Oleg Goncharov <ogoncharov@nvidia.com>

* Uncommented the tuned kernel path

Signed-off-by: Oleg Goncharov <ogoncharov@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Oleg Goncharov <ogoncharov@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

vmarkovtsev and others added 2 commits February 2, 2026 17:00

pull bot locked and limited conversation to collaborators Feb 3, 2026

pull bot added the ⤵️ pull label Feb 3, 2026

pull bot merged commit 29b84c1 into phu0ngng:main Feb 3, 2026
8 of 10 checks passed

pull bot had a problem deploying to github-pages February 3, 2026 04:33 Failure

pull[bot] merged 2 commits into phu0ngng:main from NVIDIA:main

2 participants

Comments

Conversation

pull bot commented Feb 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

pull bot commented Feb 3, 2026 •

edited

Loading