The description of the video links to this article for an explanation of the positional encoding used for timesteps. However, I've noticed a difference between how the function is implemented in this repo and the definition in the article. In the article, the sin and cos terms are interleaved, so indices 2i hold sin terms and indices 2i + 1 hold cos terms. In the code, as far as I can tell, the cos terms are instead appended after the sin terms rather than interleaved.
I figure this may not matter, and that the neural network / training process is indifferent to where each term sits in the vector? But I asked ChatGPT and it seemed to think it mattered a lot, so I thought it was at least worth asking.
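For reference, the two layouts can be compared directly. A minimal NumPy sketch (the function names and the exact frequency schedule are my own illustration, not code from this repo) showing that the interleaved and concatenated embeddings contain the same values in a different, fixed order:

```python
import numpy as np

def sinusoidal_interleaved(t, dim):
    # Layout from the article: index 2i -> sin, index 2i + 1 -> cos.
    half = dim // 2
    freqs = 1.0 / (10000 ** (np.arange(half) / half))  # assumed frequency schedule
    angles = t * freqs
    emb = np.empty(dim)
    emb[0::2] = np.sin(angles)
    emb[1::2] = np.cos(angles)
    return emb

def sinusoidal_concatenated(t, dim):
    # Layout common in code: all sin terms first, then all cos terms.
    half = dim // 2
    freqs = 1.0 / (10000 ** (np.arange(half) / half))
    angles = t * freqs
    return np.concatenate([np.sin(angles), np.cos(angles)])

a = sinusoidal_interleaved(5.0, 8)
b = sinusoidal_concatenated(5.0, 8)

# A fixed permutation maps one layout onto the other:
# even indices (sin) first, then odd indices (cos).
perm = np.concatenate([np.arange(0, 8, 2), np.arange(1, 8, 2)])
assert np.allclose(a[perm], b)
```

Since the timestep embedding is typically fed into a learned linear layer, a fixed permutation of its dimensions can in principle be absorbed into that layer's weight matrix during training, which is the intuition behind the "maybe it doesn't matter" guess above.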