Order of computations in ResNet blocks #2

@Paandaman

out = F.relu(self.bn1(x))
shortcut = self.shortcut(out) if hasattr(self, 'shortcut') else x
out = self.conv1(out)
out = self.conv2(F.relu(self.bn2(out)))
return out + shortcut

What is the motivation for applying batch norm and ReLU before the data is passed into the convolutional layer?

The implementation at https://github.com/kuangliu/pytorch-cifar does the computation in the following order, which seems more conventional, so I am curious why it was changed here!

out = F.relu(self.bn1(self.conv1(x)))
out = self.bn2(self.conv2(out))
shortcut = self.shortcut(x) if hasattr(self, 'shortcut') else x
out += shortcut
out = F.relu(out)
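
To make the comparison concrete, here is a minimal sketch that wraps each of the two orderings above in a self-contained module. The class names (`PreActBlock`, `PostActBlock`), the constructor arguments, and the layer shapes (3x3 convolutions, a 1x1 convolution on the shortcut) are my own assumptions for illustration and are not taken from either repository; only the bodies of the `forward` methods follow the snippets in this issue.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PreActBlock(nn.Module):
    """BN -> ReLU -> conv ordering, as in the first snippet (layer shapes assumed)."""

    def __init__(self, in_planes, planes, stride=1):
        super().__init__()
        self.bn1 = nn.BatchNorm2d(in_planes)
        self.conv1 = nn.Conv2d(in_planes, planes, 3, stride=stride, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, 3, stride=1, padding=1, bias=False)
        # Only define a projection shortcut when the shape changes (assumption).
        if stride != 1 or in_planes != planes:
            self.shortcut = nn.Conv2d(in_planes, planes, 1, stride=stride, bias=False)

    def forward(self, x):
        out = F.relu(self.bn1(x))
        shortcut = self.shortcut(out) if hasattr(self, 'shortcut') else x
        out = self.conv1(out)
        out = self.conv2(F.relu(self.bn2(out)))
        return out + shortcut  # no activation after the addition


class PostActBlock(nn.Module):
    """conv -> BN -> ReLU ordering, as in the second snippet (layer shapes assumed)."""

    def __init__(self, in_planes, planes, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_planes, planes, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, 3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        if stride != 1 or in_planes != planes:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_planes, planes, 1, stride=stride, bias=False),
                nn.BatchNorm2d(planes),
            )

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        shortcut = self.shortcut(x) if hasattr(self, 'shortcut') else x
        out = out + shortcut
        return F.relu(out)  # activation applied after the addition


if __name__ == "__main__":
    x = torch.randn(2, 16, 8, 8)
    print(PreActBlock(16, 16)(x).shape)             # same shape as the input
    print(PostActBlock(16, 32, stride=2)(x).shape)  # channels doubled, spatial size halved
```

One difference that is visible directly from the code: in the second ordering the sum passes through a ReLU, so the block output is never negative, whereas in the first ordering the sum is returned as-is and the identity shortcut carries `x` through unchanged. As far as I can tell, the first ordering corresponds to what is usually called a "pre-activation" residual block.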
