Gradient calculation formula of word2vec #34

@DamirTenishev

Description

In line 523 of word2vec there is a formula:

g = (1 - vocab[word].code[d] - f) * alpha;

Can you please help me understand its logic?

Since f is the sigmoid of the dot product of the embedding and the context node's vector, in the case of hierarchical softmax we want it to be as close as possible to the branch (0 or 1) that the Huffman tree takes at this node for the current word, given the previous word's embedding. In that case we would only need

g = (vocab[word].code[d] - f) * alpha;

Given that vocab[word].code[d] can only be 0 or 1, the "1 - vocab[word].code[d]" term simply flips which branch is labeled 0 and which is labeled 1; what is its purpose?

I summed up some details here: https://datascience.stackexchange.com/questions/129865/intuition-behind-g-variable-calculation-in-the-original-word2vec-implementation
