Using our own EHR data with the default med2vec parameters, the cost became NaN in epoch 1. Which parameter should I adjust to prevent this? Should I increase L2 or set a larger log_eps? We have over 100,000 batches in total; do we also need a larger batch_size?
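
For context, here is a minimal sketch of how I understand a log_eps-style constant is meant to guard the logarithm in a cross-entropy cost. This is not med2vec's actual code: the function name `binary_cross_entropy` and the toy values are made up for illustration, and I am assuming the constant is simply added inside each log.

```python
import math

def binary_cross_entropy(targets, preds, log_eps=1e-8):
    """Mean binary cross-entropy with a small constant added inside each log.

    Hypothetical helper for illustration only; it assumes log_eps is added
    to the argument of every logarithm.
    """
    total = 0.0
    for t, p in zip(targets, preds):
        total -= t * math.log(p + log_eps) + (1.0 - t) * math.log(1.0 - p + log_eps)
    return total / len(targets)

# Saturated predictions, including one confident wrong prediction (target 1, pred 0).
targets = [1.0, 0.0, 1.0]
preds = [1.0, 0.0, 0.0]

# With a positive log_eps the cost stays finite.
cost = binary_cross_entropy(targets, preds, log_eps=1e-8)
print(math.isfinite(cost), cost)

# With log_eps=0 the same inputs hit log(0). Plain Python raises ValueError here;
# array libraries typically return -inf instead, and 0 * -inf then becomes NaN.
try:
    binary_cross_entropy(targets, preds, log_eps=0.0)
except ValueError as err:
    print("log(0) without eps:", err)
```

If this picture is right, a larger log_eps would only help when the NaN comes from log(0); it would not help if the weights themselves overflow first.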