about train loss #84

@foralliance

Description

@zisianw @sfzhang15

To express this clearly, I'm writing in Chinese here!

Regarding the oscillation in the train loss.

Based on the provided code, I ran training 3 times with modified hyperparameters; the 3 runs differ only in batch_size and LR. The total number of iterations was kept the same for every run (120K).
The only other difference is that I trained on a single GPU.

Training with the default parameters, i.e. batch_size=32, learning rate=0.01, epochs=300. Final results: PASCAL: 0.9630; AFW: 0.9838; FDDB: 0.954/0.725.
The cls loss and reg loss from training are shown in the figures below:
trian_loss_C_FaceBoxes1.pdf
trian_loss_L_FaceBoxes1.pdf

With modified parameters, i.e. batch_size=16, learning rate=0.005, epochs=150. Final results: PASCAL: 0.9628; AFW: 0.9826; FDDB: 0.954/0.724.
The resulting cls loss and reg loss are shown in the figures below:
trian_loss_C_FaceBoxes4.pdf
trian_loss_L_FaceBoxes4.pdf

With modified parameters, i.e. batch_size=8, learning rate=0.0025, epochs=75. Final results: PASCAL: 0.9567; AFW: 0.9839; FDDB: 0.945/0.719.
The resulting cls loss and reg loss are shown in the figures below:
trian_loss_C_FaceBoxes12.pdf
trian_loss_L_FaceBoxes12.pdf
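As a side note, the three configurations above are consistent with the linear scaling rule: the learning rate is halved each time the batch size is halved, and the epoch count is chosen so that total iterations stay near 120K. A minimal sketch of that relationship (the dataset size of 12,880 is an assumption, taken from the WIDER FACE train split commonly used for FaceBoxes; `scaled_config` is a hypothetical helper, not part of the repo):

```python
BASE_BATCH, BASE_LR, TOTAL_ITERS = 32, 0.01, 120_000

def scaled_config(batch_size, dataset_size=12_880):
    """Return (lr, epochs) for a batch size under linear LR scaling.

    dataset_size=12,880 assumes the WIDER FACE train split; epochs are
    whatever keeps the total iteration count at roughly TOTAL_ITERS.
    """
    lr = BASE_LR * batch_size / BASE_BATCH
    iters_per_epoch = dataset_size // batch_size
    epochs = TOTAL_ITERS // iters_per_epoch
    return lr, epochs

for bs in (32, 16, 8):
    lr, epochs = scaled_config(bs)
    print(f"batch_size={bs}: lr={lr}, epochs≈{epochs}")
```

Under these assumptions the sketch reproduces the reported settings: batch_size=16 gives lr=0.005 with roughly 150 epochs, and batch_size=8 gives lr=0.0025 with roughly 75 epochs.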

As you can see, under every parameter configuration the loss oscillates severely. In principle, such oscillation should hurt the model, yet the final results are very good. How should this be understood?!
Thanks!!
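One way to see why the oscillation can be harmless: the plotted loss is a mini-batch estimate of the true loss, so its per-iteration variance is dominated by sampling noise, which grows as the batch shrinks even when the underlying model is unchanged. A minimal sketch of this effect (synthetic per-sample losses, not data from the FaceBoxes runs; the distribution is purely illustrative):

```python
import random
import statistics

random.seed(0)

# Fixed per-sample loss distribution: the "model" never changes, so any
# fluctuation in the per-iteration curve is pure mini-batch sampling noise.
per_sample_losses = [random.gauss(2.0, 1.0) for _ in range(100_000)]

def batch_mean_std(losses, batch_size):
    """Std of the mini-batch mean loss across consecutive batches."""
    means = [
        statistics.fmean(losses[i:i + batch_size])
        for i in range(0, len(losses) - batch_size + 1, batch_size)
    ]
    return statistics.stdev(means)

for bs in (8, 16, 32):
    print(f"batch_size={bs}: per-iteration loss std ≈ {batch_mean_std(per_sample_losses, bs):.3f}")
```

The std scales roughly as 1/sqrt(batch_size), so a batch of 8 or 16 produces a visibly noisy curve even at a fixed model quality. What matters for the final metrics is the trend of the (smoothed) expected loss, not the per-batch jitter; this is an interpretation, not something verified against your specific curves.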
