Standardizes parameter naming across attention functions
Renames parameters to follow a single convention across the CUDA, Triton, and Flex attention implementations.
Changes `softmax_scale` to `scale` and converts positional arguments to keyword arguments for better API consistency and clarity.
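A minimal sketch of the signature change, assuming a generic attention wrapper; the function name and default behavior are illustrative, and only the `scale` parameter name comes from this change:

```python
import math
import torch

# Hypothetical before/after: the old signature accepted `softmax_scale`
# positionally; the updated one takes a keyword-only `scale`.
def attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor,
              *, scale: float | None = None) -> torch.Tensor:
    # Assumed default: fall back to 1/sqrt(head_dim) when no scale is given.
    if scale is None:
        scale = 1.0 / math.sqrt(q.size(-1))
    scores = (q @ k.transpose(-2, -1)) * scale
    return torch.softmax(scores, dim=-1) @ v

# Callers now pass the scale by keyword:
# out = attention(q, k, v, scale=0.125)   # previously: attention(q, k, v, 0.125)
```

Making `scale` keyword-only means call sites stay valid even if further parameters are added or reordered later.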
Fixes tensor dimension ordering in Flex attention by adding transpose operations to match the expected input layout, as sketched below.
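A sketch of the layout fix, assuming callers hold tensors in `(batch, seq_len, num_heads, head_dim)` order while PyTorch's `flex_attention` expects `(batch, num_heads, seq_len, head_dim)`; the wrapper name is hypothetical:

```python
import torch
from torch.nn.attention.flex_attention import flex_attention

def flex_attention_bshd(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor,
                        *, scale: float | None = None) -> torch.Tensor:
    # Transpose from the assumed (B, S, H, D) caller layout to the
    # (B, H, S, D) layout that flex_attention expects.
    q, k, v = (t.transpose(1, 2) for t in (q, k, v))
    out = flex_attention(q, k, v, scale=scale)
    # Transpose the result back to the caller's layout.
    return out.transpose(1, 2)
```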