If I haven't miscalculated, training this model on CIFAR-10 takes 46.5 hours on 8×H100 GPUs? That is a very large computational cost, right?
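
To put that in GPU-hours, here is a quick sketch of the arithmetic. It assumes the 46.5 hours is wall-clock time and that all 8 GPUs are busy for the whole run; both are my assumptions, and the function name is just illustrative.

```python
def gpu_hours(wall_clock_hours: float, num_gpus: int) -> float:
    """Total GPU-hours, assuming every GPU is used for the full wall-clock time."""
    return wall_clock_hours * num_gpus


# Figures from the question above: 46.5 hours on 8 H100 GPUs.
total = gpu_hours(46.5, 8)
print(f"{total} H100 GPU-hours")  # 372.0 H100 GPU-hours
```

So under those assumptions the run costs about 372 H100 GPU-hours, which does seem like a lot for CIFAR-10.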