Commit b0a2071

docs: document perlmutter gasnet requirements (#3073)
1 parent 6f7dd88 commit b0a2071

3 files changed: 18 additions, 3 deletions

conda/conda-build/meta.yaml

Lines changed: 2 additions & 2 deletions
@@ -186,7 +186,7 @@ requirements:
 {% if gpu_enabled_bool %}
 - cuda-version ={{ cuda_version }}
 # these are all constrained by cuda-version
-- nccl
+- nccl >=2.0,<2.28
 - cuda-cudart-dev
 - cuda-nvtx-dev
 - cuda-nvml-dev
@@ -218,7 +218,7 @@ requirements:
 - {{ pin_compatible('cuda-cudart', min_pin='x', max_pin='x') }}
 - {{ pin_compatible('cuda-nvtx', min_pin='x', max_pin='x') }}
 - libcufile >=1.0,<2
-- nccl >=2.0
+- nccl >=2.0,<2.28
 {% endif %}
 - rich
 {% if network == 'ucx' %}

continuous_integration/requirements-build.txt

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 --extra-index-url=https://pypi.nvidia.com
 cmake>=3.26.4,!=3.30.0
 ninja
-nvidia-nccl-cu12<=2.27
+nvidia-nccl-cu12<2.28
 libucx-cu12
 nvidia-libcal-cu12
 scikit-build-core[pyproject]>=0.10.0

docs/legate/source/gasnet.rst

Lines changed: 15 additions & 0 deletions
@@ -201,6 +201,13 @@ install cmake`` or by any other means. Note that when the wrapper is built, the
 final message suggests reactivating the environment, but that is not necessary
 before building the GASNet wrapper:
 
+.. note::
+
+As of October 2025, the GASNet wrapper on Perlmutter only works when the
+NERSC-provided ``mpich`` module is loaded. Attempts to build or use the
+wrapper with ``cray-mpich`` currently fail, so ensure ``module load mpich``
+is issued before running ``build-gex-wrapper.sh``.
+
 .. code-block:: sh
 
 login40:~> /conda/envs/legate-gex-anaconda/gex-wrapper/build-gex-wrapper.sh
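
As a side note to the hunk above (this is not part of the commit), the ``mpich`` requirement could be checked before the wrapper is built. The following is only a minimal sketch: the function name `mpich_module_loaded` is hypothetical, and it assumes the module system exports the colon-separated `LOADEDMODULES` variable, as Lmod and Environment Modules normally do. The module versions in the comments are made up for illustration.

```shell
# Hypothetical helper: succeed only if a module named "mpich" is loaded.
# Assumes LOADEDMODULES looks like "craype/2.7.30:mpich/4.2.0" (example values).
mpich_module_loaded() {
  case ":${LOADEDMODULES:-}:" in
    # Match "mpich" or "mpich/<version>" as a whole entry; "cray-mpich/..."
    # does not match because its entry starts with "cray-", not "mpich".
    *:mpich:*|*:mpich/*) return 0 ;;
  esac
  return 1
}
```

A guard such as `mpich_module_loaded || module load mpich` could then precede the call to `build-gex-wrapper.sh`.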
@@ -339,6 +346,14 @@ actually provide the ``srun`` command around Legate, and then we would use
 launcher. With our options, Legate run results in the following output on
 Perlmutter:
 
+.. warning::
+
+As of October 2025, Perlmutter jobs that request more than 32 GB of frame
+buffer memory (for example, ``--fbmem 64000``) must include ``-gex:bindcuda
+0`` in the options passed through ``REALM_DEFAULT_ARGS``. Otherwise the OFI
+provider aborts with ``Unexpected error 12 (Cannot allocate memory) from
+fi_mr_regattr()``.
+
 .. code-block:: sh
 
 --------------------- CONDA/MPI_WRAPPER/ACTIVATE.SH -----------------------
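
As a side note to the warning added above (again, not part of the commit), the rule could be applied automatically when a job script is assembled. This is a hedged sketch only: `realm_args_for_fbmem` is a hypothetical helper, and it assumes that the ``--fbmem`` value is given in megabytes and that "more than 32 GB" can be approximated as a value above 32000.

```shell
# Hypothetical helper: print REALM_DEFAULT_ARGS, adding "-gex:bindcuda 0"
# when the requested frame buffer size exceeds the assumed 32000 MB threshold.
realm_args_for_fbmem() {
  fbmem_mb=$1
  args=${REALM_DEFAULT_ARGS:-}
  if [ "$fbmem_mb" -gt 32000 ]; then
    case " $args " in
      *" -gex:bindcuda 0 "*) ;;  # option already present, leave args unchanged
      *) args="${args:+$args }-gex:bindcuda 0" ;;
    esac
  fi
  printf '%s\n' "$args"
}
```

A job script might then run `export REALM_DEFAULT_ARGS="$(realm_args_for_fbmem 64000)"` before launching Legate with ``--fbmem 64000``.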
