Commit 6bf01c4

Introduces Triton sparse attention kernels
Adds fused forward and backward kernels written in Triton to accelerate sparse attention, with support for masking, bias, and grouped-query attention (GQA), for integration with PyTorch.
1 parent ea4350a commit 6bf01c4
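
The diff itself is not reproduced here, so the following is only a dense NumPy sketch of the computation the commit message describes, useful as a correctness reference when checking a fused kernel. The function name `reference_attention`, the mask convention (`True` means "attend"), the order of operations (bias added before masking), and the mapping of consecutive query heads to one key/value head are all assumptions for illustration, not the commit's actual API or layout.

```python
import numpy as np

def reference_attention(q, k, v, mask=None, bias=None):
    """Dense reference for masked, biased attention with grouped-query heads.

    Hypothetical helper, not taken from the commit.
    q:    (num_q_heads, q_len, head_dim)
    k, v: (num_kv_heads, kv_len, head_dim)
    mask: optional bool array broadcastable to (num_q_heads, q_len, kv_len);
          True means the query may attend to that key (assumed convention).
    bias: optional float array broadcastable to the same shape, added to scores.
    """
    num_q_heads, _, head_dim = q.shape
    num_kv_heads = k.shape[0]
    assert num_q_heads % num_kv_heads == 0, "query heads must divide evenly"
    group = num_q_heads // num_kv_heads

    # GQA: each group of consecutive query heads shares one key/value head.
    k = np.repeat(k, group, axis=0)
    v = np.repeat(v, group, axis=0)

    scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)
    if bias is not None:
        scores = scores + bias
    if mask is not None:
        scores = np.where(mask, scores, -np.inf)

    # Numerically stable softmax; fully masked rows produce zeros, not NaN.
    row_max = np.max(scores, axis=-1, keepdims=True)
    row_max = np.where(np.isfinite(row_max), row_max, 0.0)
    weights = np.exp(scores - row_max)
    denom = weights.sum(axis=-1, keepdims=True)
    weights = np.divide(weights, denom, out=np.zeros_like(weights), where=denom > 0)
    return weights @ v
```

A sparse kernel would skip the masked-out blocks instead of materializing the full score matrix; comparing its output against a dense reference like this one on small inputs is a common way to validate both the forward result and, via finite differences or autograd on an equivalent PyTorch version, the backward pass.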

1 file changed, with 1,246 additions and 0 deletions.