We would like to profile the NCCL calls used in nccl-tests. What are the steps to use the NPKit profiler?