Commit 6bf01c4
Introduces Triton sparse attention kernels
Adds fused forward/backward kernels in Triton to accelerate sparse attention with masking, bias, and GQA support for PyTorch integration.

1 parent ea4350a · commit 6bf01c4
1 file changed: +1246 −0 lines changed
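The Triton kernels themselves are not reproduced here. As a rough illustration of the operation the commit description says they fuse, the sketch below is a plain-PyTorch reference for sparse attention with a boolean mask, an additive bias, and grouped-query attention (GQA) head sharing. All names, shapes, and arguments (`sparse_attention_reference`, `num_kv_heads`, the mask/bias layouts) are illustrative assumptions, not the API added in this commit.

```python
# Illustrative pure-PyTorch reference (NOT the Triton kernels from this commit).
# Argument names and tensor layouts are assumptions for demonstration only.
import math
import torch


def sparse_attention_reference(
    q: torch.Tensor,                     # (batch, num_q_heads, seq_q, head_dim)
    k: torch.Tensor,                     # (batch, num_kv_heads, seq_k, head_dim)
    v: torch.Tensor,                     # (batch, num_kv_heads, seq_k, head_dim)
    mask: torch.Tensor | None = None,    # bool, broadcastable to (batch, heads, seq_q, seq_k); True = keep
    bias: torch.Tensor | None = None,    # float, broadcastable to (batch, heads, seq_q, seq_k)
) -> torch.Tensor:
    """Unfused reference: a fused Triton kernel would perform these steps in one pass."""
    b, hq, sq, d = q.shape
    hkv = k.shape[1]
    assert hq % hkv == 0, "GQA requires query heads to be a multiple of KV heads"
    # GQA: each KV head is shared by hq // hkv query heads.
    k = k.repeat_interleave(hq // hkv, dim=1)
    v = v.repeat_interleave(hq // hkv, dim=1)

    scores = torch.matmul(q, k.transpose(-2, -1)) / math.sqrt(d)
    if bias is not None:
        scores = scores + bias
    if mask is not None:
        scores = scores.masked_fill(~mask, float("-inf"))
    probs = torch.softmax(scores, dim=-1)
    return torch.matmul(probs, v)


if __name__ == "__main__":
    b, hq, hkv, sq, sk, d = 2, 8, 2, 16, 16, 64
    q = torch.randn(b, hq, sq, d)
    k = torch.randn(b, hkv, sk, d)
    v = torch.randn(b, hkv, sk, d)
    causal = torch.ones(sq, sk, dtype=torch.bool).tril().view(1, 1, sq, sk)
    bias = torch.zeros(1, 1, sq, sk)
    out = sparse_attention_reference(q, k, v, mask=causal, bias=bias)
    print(out.shape)  # torch.Size([2, 8, 16, 64])
```

Relative to a reference like this, the usual gain from fused forward/backward attention kernels is avoiding materialization of the full (seq_q, seq_k) score matrix and applying the mask and bias inside the softmax pass in both directions; the specifics of this commit's kernels are in the changed file.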