Can you share some training details? For example, the batch size, how much the loss decreased over training, and the GPU model used.