The description of the video links to this article for an explanation of the positional encoding used for timesteps. However, I've noticed a difference between how the function is implemented in this repo and the definition in the article. In the article, the sin and cos terms are interleaved, so indices 2i hold sin terms and indices 2i + 1 hold cos terms. In the code, as far as I can tell, the cos terms are instead appended after the sin terms rather than interleaved.
I figure this may not matter, and that the neural network / training process is indifferent to where each term sits in the vector? But I asked ChatGPT and it seemed to think it mattered a lot, so I thought it was at least worth asking.
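For reference, the two layouts can be compared directly. A minimal NumPy sketch (the function names and the exact frequency schedule are my own illustration, not code from this repo) showing that the interleaved and concatenated embeddings contain the same values in a different, fixed order:

```python
import numpy as np

def sinusoidal_interleaved(t, dim):
    # Layout from the article: index 2i -> sin, index 2i + 1 -> cos.
    half = dim // 2
    freqs = 1.0 / (10000 ** (np.arange(half) / half))  # assumed frequency schedule
    angles = t * freqs
    emb = np.empty(dim)
    emb[0::2] = np.sin(angles)
    emb[1::2] = np.cos(angles)
    return emb

def sinusoidal_concatenated(t, dim):
    # Layout common in code: all sin terms first, then all cos terms.
    half = dim // 2
    freqs = 1.0 / (10000 ** (np.arange(half) / half))
    angles = t * freqs
    return np.concatenate([np.sin(angles), np.cos(angles)])

a = sinusoidal_interleaved(5.0, 8)
b = sinusoidal_concatenated(5.0, 8)

# A fixed permutation maps one layout onto the other:
# even indices (sin) first, then odd indices (cos).
perm = np.concatenate([np.arange(0, 8, 2), np.arange(1, 8, 2)])
assert np.allclose(a[perm], b)
```

Since the timestep embedding is typically fed into a learned linear layer, a fixed permutation of its dimensions can in principle be absorbed into that layer's weight matrix during training, which is the intuition behind the "maybe it doesn't matter" guess above.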