
Conversation

@chenjun2hao

Fix batchnorm data type bug

@DingXiaoH
Owner

I understand the np.float32 part, thanks for the suggestion, but why was the cuda part deleted? Without it, id_tensor ends up on the CPU while the weights are on the GPU, which causes a device mismatch (RuntimeError: expected device cpu but got device cuda:0) whenever the equivalent kernel is computed during inference or training. id_tensor should be on the same device as the weights. The latest version (self.id_tensor = torch.from_numpy(kernel_value).to(branch.weight.device)) looks good and works fine.
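The fix under discussion can be sketched as follows. This is a minimal, hypothetical helper (the function name and signature are assumptions, not code from the repository) showing the pattern: build the identity kernel as np.float32 so it matches the weight dtype, then move it to branch.weight.device so the tensor lives on the same device as the model, whether that is CPU or GPU.

```python
import numpy as np
import torch
import torch.nn as nn


def make_id_tensor(branch: nn.BatchNorm2d, in_channels: int, groups: int = 1) -> torch.Tensor:
    """Build the 3x3 identity kernel for a BN-only branch.

    Hypothetical helper illustrating the discussed fix: the kernel is
    created as np.float32 (matching the float32 weights) and placed on
    branch.weight.device to avoid a cpu/cuda device mismatch when the
    equivalent kernel is computed.
    """
    input_dim = in_channels // groups
    kernel_value = np.zeros((in_channels, input_dim, 3, 3), dtype=np.float32)
    for i in range(in_channels):
        # A 1 at the spatial center makes each 3x3 kernel an identity map.
        kernel_value[i, i % input_dim, 1, 1] = 1.0
    # The key line from the discussion: keep id_tensor on the BN weight's device.
    return torch.from_numpy(kernel_value).to(branch.weight.device)


bn = nn.BatchNorm2d(4)
id_tensor = make_id_tensor(bn, in_channels=4)
```

If the model is later moved with .cuda() or .to(device), id_tensor created this way follows the branch weights, so no RuntimeError about mismatched devices occurs during fusion.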
