🐛 Bug
We are seeing mixed-precision accuracy regressions on several large models, including Mixtral and Llama 4.
We have narrowed the regression down to change #9663.
To Reproduce
We will try to reproduce the regression with the public tutorial: https://awsdocs-neuron.readthedocs-hosted.com/en/latest/libraries/nxd-inference/tutorials/trn1-llama3.1-70b-instruct-accuracy-eval-tutorial.html
Expected behavior
No accuracy regression compared to torch_xla 2.8.
Environment
- Reproducible on XLA backend [CPU/TPU]: Neuron
- torch_xla version: 2.9
Additional context