meeting errors when running nccl_samples #38

@mkbk-with-circle

It's strange: I have set up NCCL version 2.17.1-1 and the latest nccl-tests.
These are the steps I ran after git clone.

For nccl:
    cd nccl
    git tag
    git checkout 2.17.1-1
    git checkout v2.17.1-1
    git apply /home/aiscuser/ymy/NPKit/nccl_samples/npkit-for-nccl-2.17.1-1.diff
    make -j src.build
    sudo apt install build-essential devscripts debhelper fakeroot
    make pkg.debian.build

For nccl-tests:
    cd nccl-tests/
    make NCCL_HOME=$my_path_tonccl$
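
The nccl-tests Makefile takes nccl.h from NCCL_HOME/include at compile time, so it can help to confirm which NCCL version that header declares. The following is a minimal sketch, not a confirmed diagnosis. It assumes that NCCL_HOME points at the NCCL build directory (the default path below is hypothetical) and that nccl.h defines NCCL_MAJOR, NCCL_MINOR and NCCL_PATCH; the helper name nccl_header_version is made up for illustration.

```shell
# Print MAJOR.MINOR.PATCH from an nccl.h-style header read on stdin.
# (Hypothetical helper, not part of NCCL or nccl-tests.)
nccl_header_version() {
  awk '
    $1 == "#define" && $2 == "NCCL_MAJOR" { major = $3 }
    $1 == "#define" && $2 == "NCCL_MINOR" { minor = $3 }
    $1 == "#define" && $2 == "NCCL_PATCH" { patch = $3 }
    END { if (major != "") print major "." minor "." patch }
  '
}

# Assumed location of the NCCL build; adjust to wherever NCCL was built.
NCCL_HOME="${NCCL_HOME:-$HOME/nccl/build}"

if [ -f "$NCCL_HOME/include/nccl.h" ]; then
  nccl_header_version < "$NCCL_HOME/include/nccl.h"
else
  echo "nccl.h not found under $NCCL_HOME/include" >&2
fi
```

If the printed version is not the one that was checked out, nccl-tests was probably compiled against a different NCCL installation than intended.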

Then I just ran bash npkit_launcher.sh in nccl_samples.

These are the parameters in npkit_launcher.sh:

[Image: npkit_launcher.sh parameters]

The npkit_runner.sh file was never changed.

My error is:
/home/aiscuser/ymy/nccl-tests/build/all_reduce_perf: symbol lookup error: /home/aiscuser/ymy/nccl-tests/build/all_reduce_perf: undefined symbol: ncclCommRegister
/home/aiscuser/ymy/nccl-tests/build/all_reduce_perf: symbol lookup error: /home/aiscuser/ymy/nccl-tests/build/all_reduce_perf: undefined symbol: ncclCommRegister
/home/aiscuser/ymy/nccl-tests/build/all_reduce_perf: symbol lookup error: /home/aiscuser/ymy/nccl-tests/build/all_reduce_perf: undefined symbol: ncclCommRegister
/home/aiscuser/ymy/nccl-tests/build/all_reduce_perf: symbol lookup error: /home/aiscuser/ymy/nccl-tests/build/all_reduce_perf: undefined symbol: ncclCommRegister
/home/aiscuser/ymy/nccl-tests/build/all_reduce_perf: symbol lookup error: /home/aiscuser/ymy/nccl-tests/build/all_reduce_perf: undefined symbol: ncclCommRegister
/home/aiscuser/ymy/nccl-tests/build/all_reduce_perf: symbol lookup error: /home/aiscuser/ymy/nccl-tests/build/all_reduce_perf: undefined symbol: ncclCommRegister
/home/aiscuser/ymy/nccl-tests/build/all_reduce_perf: symbol lookup error: /home/aiscuser/ymy/nccl-tests/build/all_reduce_perf: undefined symbol: ncclCommRegister
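
The message means that the dynamic loader could not find ncclCommRegister in the libnccl.so it loaded at run time. One way to narrow this down is to check which libnccl.so the test binary resolves to and whether the freshly built library exports the symbol. This is a sketch under assumptions: the binary path is taken from the error message, the NCCL_LIB default is a hypothetical location for the NPKit-patched build, and exports_symbol is a helper made up for illustration.

```shell
# Succeeds if SYMBOL appears in `nm -D --defined-only` output read on stdin.
# (Hypothetical helper, not part of NCCL or nccl-tests.)
exports_symbol() {
  awk -v sym="$1" '
    { name = $NF; sub(/@.*/, "", name) }   # drop any @VERSION suffix
    name == sym { found = 1 }
    END { exit !found }
  '
}

# Binary path from the error message; NCCL_LIB is an assumed default.
TEST_BIN="${TEST_BIN:-/home/aiscuser/ymy/nccl-tests/build/all_reduce_perf}"
NCCL_LIB="${NCCL_LIB:-$HOME/nccl/build/lib/libnccl.so}"

# Which libnccl.so does the loader pick for the test binary?
if [ -x "$TEST_BIN" ]; then
  ldd "$TEST_BIN" | grep -i nccl || echo "no libnccl entry in ldd output"
fi

# Does the library that was just built export the missing symbol?
if [ -f "$NCCL_LIB" ]; then
  if nm -D --defined-only "$NCCL_LIB" | exports_symbol ncclCommRegister; then
    echo "ncclCommRegister is exported by $NCCL_LIB"
  else
    echo "ncclCommRegister is NOT exported by $NCCL_LIB"
  fi
fi
```

If the library picked up at run time lacks the symbol while the header used at build time declares it, the build and the run are probably using different NCCL installations. LD_LIBRARY_PATH and a system-wide NCCL package are common places to look.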

I really don't know which step I got wrong. I would really appreciate any advice!
