@zihangJiang

Fix weight decay in param_optimizer to agree with the original (Hugging Face's) implementation.

(The current implementation appears to apply a weight decay of 0.01 to all parameters, because the condition "n not in no_decay" is always True.)
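A minimal sketch of why the condition is always True, assuming `n` is a full dotted parameter name and `no_decay` is a list of short name fragments. The parameter names, the `no_decay` entries, and the helper functions `split_buggy` and `split_fixed` are hypothetical and are not taken from this repository. With exact list membership, a full name never equals a fragment, so every parameter lands in the decay group. A substring check sorts them as intended.

```python
# Hypothetical stand-in for model.named_parameters(): (name, parameter) pairs.
named_params = [
    ("encoder.layer.0.attention.query.weight", "w0"),
    ("encoder.layer.0.attention.query.bias", "b0"),
    ("encoder.layer.0.LayerNorm.weight", "ln_w"),
    ("encoder.layer.0.LayerNorm.bias", "ln_b"),
]

# Assumed fragment list; the real list in the repository may differ.
no_decay = ["bias", "LayerNorm.bias", "LayerNorm.weight"]


def split_buggy(named_params, no_decay):
    """Exact list membership: a full dotted name never equals a fragment."""
    decay = [n for n, _ in named_params if n not in no_decay]
    skip = [n for n, _ in named_params if n in no_decay]
    return decay, skip


def split_fixed(named_params, no_decay):
    """Substring match: a name is excluded if any fragment occurs inside it."""
    decay = [n for n, _ in named_params if not any(nd in n for nd in no_decay)]
    skip = [n for n, _ in named_params if any(nd in n for nd in no_decay)]
    return decay, skip


buggy_decay, buggy_skip = split_buggy(named_params, no_decay)
fixed_decay, fixed_skip = split_fixed(named_params, no_decay)

print("buggy:", len(buggy_decay), "decayed,", len(buggy_skip), "excluded")
print("fixed:", len(fixed_decay), "decayed,", len(fixed_skip), "excluded")
```

In the buggy split the excluded group is empty, so bias and LayerNorm parameters receive the same decay as every other weight.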
