Hello :) In these lines https://github.com/explainingai-code/DiT-PyTorch/blob/ed9c0bd29f2c2b2a64fad8c5b759b834f8c1c4c5/model/transformer_layer.py#L61C1-L70C82 it seems you are not implementing a residual connection. Is this on purpose? Cheers