Open
Description
Calling Ranger21 with mostly default parameters:
optimizer = ranger21.Ranger21(
    net.parameters(), lr=0.001, num_epochs=50, weight_decay=1e-5,
    num_batches_per_epoch=len(train_loader)
)
Training runs fine for about half a day, with decent progress on all loss metrics, but then halts with this error:
  File "./train_pt.py", line 727, in <module>
    main(sys.argv[1:])
  File "./train_pt.py", line 612, in main
    optimizer.step()
  File "/home/morbo/git/sjeng/train/venv19/lib/python3.8/site-packages/torch/optim/optimizer.py", line 88, in wrapper
    return func(*args, **kwargs)
  File "/home/morbo/git/sjeng/train/venv19/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/home/morbo/git/Ranger21/ranger21/ranger21.py", line 714, in step
    raise RuntimeError("hit nan for variance_normalized")
RuntimeError: hit nan for variance_normalized
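
One way to narrow this down is to check the gradients for NaN/inf values right before `optimizer.step()`, so a bad batch shows up in the training loop rather than deep inside the optimizer. This is only a minimal sketch: `grads_are_finite` is a hypothetical helper and not part of Ranger21, the small `Linear` model stands in for `net`, and it is an assumption (not confirmed by the traceback) that non-finite gradients are what triggers the error.

```python
import torch


def grads_are_finite(parameters):
    """Return True if every existing gradient contains only finite values."""
    for p in parameters:
        if p.grad is not None and not torch.isfinite(p.grad).all():
            return False
    return True


# Tiny stand-in model; in the report above this would be `net`.
model = torch.nn.Linear(4, 1)
loss = model(torch.randn(8, 4)).pow(2).mean()
loss.backward()
ok_before = grads_are_finite(model.parameters())

# Simulate a diverged batch by poisoning one gradient entry with NaN.
model.weight.grad[0, 0] = float("nan")
ok_after = grads_are_finite(model.parameters())

print(ok_before, ok_after)
```

In a real loop the check would sit between `loss.backward()` and `optimizer.step()`, for example by skipping the step or dumping the offending batch when it returns `False`. If the gradients turn out to be finite when the error fires, the NaN is probably produced inside the optimizer state instead, which would be useful to know for this report.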