Description
Something strange happens after I set up NCCL version 2.17.1-1 together with the latest nccl-tests.
These are the steps I ran after cloning the repositories.
For nccl:
    cd nccl
    git tag
    git checkout 2.17.1-1
    git checkout v2.17.1-1
    git apply /home/aiscuser/ymy/NPKit/nccl_samples/npkit-for-nccl-2.17.1-1.diff
    make -j src.build
    sudo apt install build-essential devscripts debhelper fakeroot
    make pkg.debian.build
For nccl-tests:
    cd nccl-tests/
    make NCCL_HOME=$my_path_tonccl$
Then I ran bash npkit_launcher.sh in nccl_samples.
These are the parameters in npkit_launcher.sh:
The npkit_runner.sh file was not changed.
My error is:
    /home/aiscuser/ymy/nccl-tests/build/all_reduce_perf: symbol lookup error: /home/aiscuser/ymy/nccl-tests/build/all_reduce_perf: undefined symbol: ncclCommRegister
    /home/aiscuser/ymy/nccl-tests/build/all_reduce_perf: symbol lookup error: /home/aiscuser/ymy/nccl-tests/build/all_reduce_perf: undefined symbol: ncclCommRegister
    /home/aiscuser/ymy/nccl-tests/build/all_reduce_perf: symbol lookup error: /home/aiscuser/ymy/nccl-tests/build/all_reduce_perf: undefined symbol: ncclCommRegister
    /home/aiscuser/ymy/nccl-tests/build/all_reduce_perf: symbol lookup error: /home/aiscuser/ymy/nccl-tests/build/all_reduce_perf: undefined symbol: ncclCommRegister
    /home/aiscuser/ymy/nccl-tests/build/all_reduce_perf: symbol lookup error: /home/aiscuser/ymy/nccl-tests/build/all_reduce_perf: undefined symbol: ncclCommRegister
    /home/aiscuser/ymy/nccl-tests/build/all_reduce_perf: symbol lookup error: /home/aiscuser/ymy/nccl-tests/build/all_reduce_perf: undefined symbol: ncclCommRegister
    /home/aiscuser/ymy/nccl-tests/build/all_reduce_perf: symbol lookup error: /home/aiscuser/ymy/nccl-tests/build/all_reduce_perf: undefined symbol: ncclCommRegister
I really don't know which step I got wrong. I would really appreciate any advice!
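
For anyone trying to narrow this down, here is a minimal sketch of a check that might help. It rests on an assumption that is not confirmed anywhere in this issue: that ncclCommRegister only exists in NCCL releases newer than 2.17.1, so the error could mean all_reduce_perf was compiled against newer NCCL headers than the libnccl.so it loads at run time. The two helper names below are hypothetical; they only filter the output of the standard ldd and nm tools.

```shell
#!/bin/sh
# Hypothetical helpers (not part of NCCL, nccl-tests, or NPKit).

# Read `ldd` output on stdin and print the path of the libnccl that the
# dynamic loader resolved. Prints nothing if libnccl is missing or "not found".
nccl_path_from_ldd() {
  awk '/libnccl/ && $2 == "=>" && $3 != "not" { print $3 }'
}

# Read `nm -D --defined-only` output on stdin and succeed (exit 0) only if
# the symbol named in $1 is listed. Any "@VERSION" suffix is ignored.
symbol_in_nm_output() {
  awk -v s="$1" '
    { name = $NF; sub(/@.*/, "", name); if (name == s) found = 1 }
    END { exit !found }
  '
}
```

A possible way to use it, run in the same environment (including any LD_LIBRARY_PATH) that npkit_runner.sh sets up: `lib=$(ldd /home/aiscuser/ymy/nccl-tests/build/all_reduce_perf | nccl_path_from_ldd)`, then `nm -D --defined-only "$lib" | symbol_in_nm_output ncclCommRegister && echo exported || echo missing`. If the resolved library is the 2.17.1-1 build and the symbol is reported missing, a header/library version mismatch would be one plausible explanation.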