Commit 6bf01c4
Introduces Triton sparse attention kernels
Adds fused forward/backward kernels in Triton to accelerate sparse attention with masking, bias, and GQA support for PyTorch integration.

1 parent ea4350a · commit 6bf01c4
1 file changed: +1246 −0 lines changed
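The Triton kernels themselves are not reproduced here. As a rough illustration of the operation the commit description says they fuse, the sketch below is a plain-PyTorch reference for sparse attention with a boolean mask, an additive bias, and grouped-query attention (GQA) head sharing. All names, shapes, and arguments (`sparse_attention_reference`, `num_kv_heads`, the mask/bias layouts) are illustrative assumptions, not the API added in this commit.

```python
# Illustrative pure-PyTorch reference (NOT the Triton kernels from this commit).
# Argument names and tensor layouts are assumptions for demonstration only.
import math
import torch


def sparse_attention_reference(
    q: torch.Tensor,                     # (batch, num_q_heads, seq_q, head_dim)
    k: torch.Tensor,                     # (batch, num_kv_heads, seq_k, head_dim)
    v: torch.Tensor,                     # (batch, num_kv_heads, seq_k, head_dim)
    mask: torch.Tensor | None = None,    # bool, broadcastable to (batch, heads, seq_q, seq_k); True = keep
    bias: torch.Tensor | None = None,    # float, broadcastable to (batch, heads, seq_q, seq_k)
) -> torch.Tensor:
    """Unfused reference: a fused Triton kernel would perform these steps in one pass."""
    b, hq, sq, d = q.shape
    hkv = k.shape[1]
    assert hq % hkv == 0, "GQA requires query heads to be a multiple of KV heads"
    # GQA: each KV head is shared by hq // hkv query heads.
    k = k.repeat_interleave(hq // hkv, dim=1)
    v = v.repeat_interleave(hq // hkv, dim=1)

    scores = torch.matmul(q, k.transpose(-2, -1)) / math.sqrt(d)
    if bias is not None:
        scores = scores + bias
    if mask is not None:
        scores = scores.masked_fill(~mask, float("-inf"))
    probs = torch.softmax(scores, dim=-1)
    return torch.matmul(probs, v)


if __name__ == "__main__":
    b, hq, hkv, sq, sk, d = 2, 8, 2, 16, 16, 64
    q = torch.randn(b, hq, sq, d)
    k = torch.randn(b, hkv, sk, d)
    v = torch.randn(b, hkv, sk, d)
    causal = torch.ones(sq, sk, dtype=torch.bool).tril().view(1, 1, sq, sk)
    bias = torch.zeros(1, 1, sq, sk)
    out = sparse_attention_reference(q, k, v, mask=causal, bias=bias)
    print(out.shape)  # torch.Size([2, 8, 16, 64])
```

Relative to a reference like this, the usual gain from fused forward/backward attention kernels is avoiding materialization of the full (seq_q, seq_k) score matrix and applying the mask and bias inside the softmax pass in both directions; the specifics of this commit's kernels are in the changed file.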