CUDA: Accelerate MXFP4 table lookup using __byte_perm (#15451) #230

Job run times
7m 17s
13m 7s
57m 41s
11m 21s
10m 16s
7m 27s
15m 28s
3m 30s
4m 21s
6m 21s
18m 17s
11m 1s
7m 3s
9m 27s
5m 12s
4m 40s
9m 47s
8m 35s
2m 22s
14m 23s
5m 48s
9m 45s
2m 43s
19m 21s
48m 14s
9m 18s
5m 38s
3m 22s
7m 59s
13m 45s
7m 7s
1m 55s
1m 54s
20m 6s
9m 31s
21m 21s
3m 23s
13m 10s
4m 46s
4m 56s
7m 35s
3m 7s
Total: 7h 32m 20s