
Commit 152c73a

Adds flash sparse attention interface
Enables calling sparse Flash attention CUDA kernels through custom autograd helpers. Registers fake implementations and padding logic so torch.compile stays compatible with varying head shapes.
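The padding logic described above can be sketched as a small helper that rounds the head dimension up to an alignment the CUDA kernels can handle. This is an illustrative assumption, not code from the commit: the helper name `pad_to_multiple` and the multiple of 8 are hypothetical.

```python
import math

def pad_to_multiple(head_dim: int, multiple: int = 8) -> int:
    # Hypothetical helper: round head_dim up to the next multiple
    # so the kernel sees an aligned shape; 8 is an assumed alignment.
    return math.ceil(head_dim / multiple) * multiple

# e.g. pad_to_multiple(100) -> 104, pad_to_multiple(64) -> 64
```

Presumably the autograd helpers apply this kind of rounding before the kernel call and slice the output back to the original head shape afterwards.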
1 parent 6bf01c4 commit 152c73a

File tree

1 file changed: +760 −0 lines


0 commit comments
