Overview
- Flash attention is now enabled by default
- Performance improvements
M1 Pro

| CPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| M1 Pro | METAL | tiny | 1 | 0 | 32.44 | 1.71 | 0.43 | 0.04 | 8a67c55 |
| M1 Pro | METAL | base | 1 | 0 | 63.54 | 2.62 | 0.71 | 0.06 | 8a67c55 |
| M1 Pro | METAL | small | 1 | 0 | 200.30 | 5.34 | 1.72 | 0.17 | 8a67c55 |
| M1 Pro | METAL | medium | 1 | 0 | 580.06 | 11.71 | 4.18 | 0.45 | 8a67c55 |

| CPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| M1 Pro | METAL | tiny | 1 | 1 | 22.09 | 1.84 | 0.43 | 0.03 | 8a67c55 |
| M1 Pro | METAL | base | 1 | 1 | 40.57 | 2.22 | 0.44 | 0.04 | 8a67c55 |
| M1 Pro | METAL | small | 1 | 1 | 135.15 | 4.23 | 0.95 | 0.12 | 8a67c55 |
| M1 Pro | METAL | medium | 1 | 1 | 395.18 | 9.14 | 2.21 | 0.30 | 8a67c55 |

M2 Ultra

| CPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| M2 ULTRA | METAL | tiny | 1 | 0 | 8.63 | 1.09 | 0.27 | 0.01 | b57b9d3 |
| M2 ULTRA | METAL | tiny-q5_0 | 1 | 0 | 9.04 | 1.06 | 0.28 | 0.01 | b57b9d3 |
| M2 ULTRA | METAL | tiny-q5_1 | 1 | 0 | 8.98 | 1.06 | 0.28 | 0.01 | b57b9d3 |
| M2 ULTRA | METAL | tiny-q8_0 | 1 | 0 | 8.69 | 1.06 | 0.27 | 0.01 | b57b9d3 |
| M2 ULTRA | METAL | base | 1 | 0 | 15.39 | 1.54 | 0.43 | 0.02 | b57b9d3 |
| M2 ULTRA | METAL | base-q5_0 | 1 | 0 | 16.50 | 1.50 | 0.42 | 0.02 | b57b9d3 |
| M2 ULTRA | METAL | base-q5_1 | 1 | 0 | 16.45 | 1.49 | 0.43 | 0.02 | b57b9d3 |
| M2 ULTRA | METAL | base-q8_0 | 1 | 0 | 15.62 | 1.51 | 0.42 | 0.02 | b57b9d3 |
| M2 ULTRA | METAL | small | 1 | 0 | 45.99 | 2.99 | 0.90 | 0.05 | b57b9d3 |
| M2 ULTRA | METAL | small-q5_0 | 1 | 0 | 50.65 | 2.98 | 0.92 | 0.06 | b57b9d3 |
| M2 ULTRA | METAL | small-q5_1 | 1 | 0 | 50.74 | 2.96 | 0.92 | 0.06 | b57b9d3 |
| M2 ULTRA | METAL | small-q8_0 | 1 | 0 | 47.16 | 2.83 | 0.89 | 0.06 | b57b9d3 |
| M2 ULTRA | METAL | medium | 1 | 0 | 132.78 | 6.46 | 2.02 | 0.13 | b57b9d3 |
| M2 ULTRA | METAL | medium-q5_0 | 1 | 0 | 149.35 | 6.11 | 2.09 | 0.14 | b57b9d3 |
| M2 ULTRA | METAL | medium-q5_1 | 1 | 0 | 149.11 | 6.09 | 2.11 | 0.14 | b57b9d3 |
| M2 ULTRA | METAL | medium-q8_0 | 1 | 0 | 137.37 | 6.05 | 2.03 | 0.13 | b57b9d3 |
| M2 ULTRA | METAL | medium-dis | 1 | 0 | 121.60 | 0.90 | 0.25 | 0.02 | b57b9d3 |
| M2 ULTRA | METAL | large-v2 | 1 | 0 | 231.19 | 9.40 | 3.10 | 0.22 | b57b9d3 |
| M2 ULTRA | METAL | large-v2-q5_0 | 1 | 0 | 265.90 | 8.98 | 3.11 | 0.25 | b57b9d3 |
| M2 ULTRA | METAL | large-v2-q5_1 | 1 | 0 | 265.18 | 8.92 | 3.13 | 0.25 | b57b9d3 |
| M2 ULTRA | METAL | large-v2-q8_0 | 1 | 0 | 240.23 | 9.06 | 2.98 | 0.23 | b57b9d3 |
| M2 ULTRA | METAL | large-v2-dis | 1 | 0 | 210.25 | 0.99 | 0.28 | 0.02 | b57b9d3 |
| M2 ULTRA | METAL | large-v3-turbo | 1 | 0 | 211.72 | 1.52 | 0.46 | 0.03 | b57b9d3 |
| M2 ULTRA | METAL | large-v3-turbo-q5_0 | 1 | 0 | 242.17 | 1.40 | 0.47 | 0.04 | b57b9d3 |
| M2 ULTRA | METAL | large-v3-turbo-q8_0 | 1 | 0 | 219.75 | 1.40 | 0.45 | 0.04 | b57b9d3 |

| CPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| M2 ULTRA | METAL | tiny | 1 | 1 | 6.28 | 0.96 | 0.22 | 0.01 | a77d11d |
| M2 ULTRA | METAL | tiny-q5_0 | 1 | 1 | 6.69 | 0.92 | 0.22 | 0.01 | a77d11d |
| M2 ULTRA | METAL | tiny-q5_1 | 1 | 1 | 6.67 | 0.91 | 0.22 | 0.01 | a77d11d |
| M2 ULTRA | METAL | tiny-q8_0 | 1 | 1 | 6.34 | 0.92 | 0.21 | 0.01 | a77d11d |
| M2 ULTRA | METAL | base | 1 | 1 | 10.77 | 1.30 | 0.32 | 0.02 | a77d11d |
| M2 ULTRA | METAL | base-q5_0 | 1 | 1 | 11.84 | 1.23 | 0.33 | 0.02 | a77d11d |
| M2 ULTRA | METAL | base-q5_1 | 1 | 1 | 11.95 | 1.24 | 0.33 | 0.02 | a77d11d |
| M2 ULTRA | METAL | base-q8_0 | 1 | 1 | 11.14 | 1.23 | 0.32 | 0.02 | a77d11d |
| M2 ULTRA | METAL | small | 1 | 1 | 32.12 | 2.43 | 0.65 | 0.04 | a77d11d |
| M2 ULTRA | METAL | small-q5_0 | 1 | 1 | 36.95 | 2.42 | 0.68 | 0.04 | a77d11d |
| M2 ULTRA | METAL | small-q5_1 | 1 | 1 | 37.40 | 2.42 | 0.68 | 0.04 | a77d11d |
| M2 ULTRA | METAL | small-q8_0 | 1 | 1 | 33.48 | 2.30 | 0.65 | 0.04 | a77d11d |
| M2 ULTRA | METAL | medium | 1 | 1 | 89.28 | 5.05 | 1.46 | 0.09 | a77d11d |
| M2 ULTRA | METAL | medium-q5_0 | 1 | 1 | 105.24 | 4.89 | 1.48 | 0.11 | a77d11d |
| M2 ULTRA | METAL | medium-q5_1 | 1 | 1 | 105.28 | 4.98 | 1.49 | 0.11 | a77d11d |
| M2 ULTRA | METAL | medium-q8_0 | 1 | 1 | 93.61 | 4.89 | 1.43 | 0.10 | a77d11d |
| M2 ULTRA | METAL | medium-dis | 1 | 1 | 78.44 | 0.81 | 0.20 | 0.01 | a77d11d |
| M2 ULTRA | METAL | large-v2 | 1 | 1 | 165.69 | 7.50 | 2.16 | 0.17 | a77d11d |
| M2 ULTRA | METAL | large-v2-q5_0 | 1 | 1 | 199.40 | 7.37 | 2.18 | 0.20 | a77d11d |
| M2 ULTRA | METAL | large-v2-q5_1 | 1 | 1 | 199.29 | 7.37 | 2.21 | 0.20 | a77d11d |
| M2 ULTRA | METAL | large-v2-q8_0 | 1 | 1 | 174.60 | 6.87 | 2.16 | 0.18 | a77d11d |
| M2 ULTRA | METAL | large-v2-dis | 1 | 1 | 145.80 | 0.90 | 0.22 | 0.02 | a77d11d |
| M2 ULTRA | METAL | large-v3-turbo | 1 | 1 | 146.98 | 1.31 | 0.34 | 0.03 | a77d11d |
| M2 ULTRA | METAL | large-v3-turbo-q5_0 | 1 | 1 | 176.77 | 1.19 | 0.35 | 0.03 | a77d11d |
| M2 ULTRA | METAL | large-v3-turbo-q8_0 | 1 | 1 | 154.73 | 1.20 | 0.33 | 0.03 | a77d11d |

M4 Max

| CPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| M4 Max | METAL | tiny | 1 | 0 | 10.51 | 0.86 | 0.23 | 0.01 | 47fcd7d |
| M4 Max | METAL | tiny-q8_0 | 1 | 0 | 10.73 | 0.84 | 0.24 | 0.01 | 47fcd7d |
| M4 Max | METAL | base | 1 | 0 | 19.50 | 1.34 | 0.36 | 0.02 | 47fcd7d |
| M4 Max | METAL | base-q8_0 | 1 | 0 | 20.17 | 1.25 | 0.36 | 0.02 | 47fcd7d |
| M4 Max | METAL | small | 1 | 0 | 61.91 | 2.77 | 0.78 | 0.06 | 47fcd7d |
| M4 Max | METAL | small-q8_0 | 1 | 0 | 64.17 | 2.43 | 0.78 | 0.06 | 47fcd7d |
| M4 Max | METAL | medium | 1 | 0 | 181.50 | 6.44 | 1.85 | 0.15 | 47fcd7d |
| M4 Max | METAL | medium-q8_0 | 1 | 0 | 187.71 | 5.80 | 1.84 | 0.15 | 47fcd7d |
| M4 Max | METAL | large-v2 | 1 | 0 | 335.49 | 10.49 | 3.01 | 0.26 | 47fcd7d |
| M4 Max | METAL | large-v2-q8_0 | 1 | 0 | 349.89 | 8.65 | 2.97 | 0.27 | 47fcd7d |
| M4 Max | METAL | large-v3-turbo | 1 | 0 | 301.34 | 1.83 | 0.49 | 0.04 | 47fcd7d |

| CPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| M4 Max | METAL | tiny | 1 | 1 | 8.23 | 0.71 | 0.16 | 0.01 | 47fcd7d |
| M4 Max | METAL | tiny-q8_0 | 1 | 1 | 8.47 | 0.67 | 0.16 | 0.01 | 47fcd7d |
| M4 Max | METAL | base | 1 | 1 | 15.47 | 1.12 | 0.26 | 0.02 | 47fcd7d |
| M4 Max | METAL | base-q8_0 | 1 | 1 | 15.70 | 1.05 | 0.27 | 0.02 | 47fcd7d |
| M4 Max | METAL | small | 1 | 1 | 49.82 | 2.37 | 0.53 | 0.05 | 47fcd7d |
| M4 Max | METAL | small-q8_0 | 1 | 1 | 51.76 | 1.99 | 0.53 | 0.05 | 47fcd7d |
| M4 Max | METAL | medium | 1 | 1 | 147.76 | 5.52 | 1.27 | 0.12 | 47fcd7d |
| M4 Max | METAL | medium-q8_0 | 1 | 1 | 153.98 | 4.59 | 1.24 | 0.13 | 47fcd7d |
| M4 Max | METAL | large-v2 | 1 | 1 | 282.89 | 9.06 | 2.11 | 0.22 | 47fcd7d |
| M4 Max | METAL | large-v2-q8_0 | 1 | 1 | 296.43 | 7.44 | 2.09 | 0.23 | 47fcd7d |
| M4 Max | METAL | large-v3-turbo | 1 | 1 | 249.91 | 1.65 | 0.38 | 0.04 | 47fcd7d |

RTX 5090

| GPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RTX 5090 | CUDA | tiny | 1 | 0 | 2.06 | 0.55 | 0.13 | 0.00 | e4bf87b |
| RTX 5090 | CUDA | tiny-q8_0 | 1 | 0 | 2.50 | 0.55 | 0.14 | 0.01 | e4bf87b |
| RTX 5090 | CUDA | base | 1 | 0 | 3.72 | 0.81 | 0.19 | 0.01 | e4bf87b |
| RTX 5090 | CUDA | base-q8_0 | 1 | 0 | 4.35 | 0.79 | 0.20 | 0.01 | e4bf87b |
| RTX 5090 | CUDA | small | 1 | 0 | 11.24 | 1.55 | 0.38 | 0.02 | e4bf87b |
| RTX 5090 | CUDA | small-q8_0 | 1 | 0 | 12.69 | 1.69 | 0.40 | 0.02 | e4bf87b |
| RTX 5090 | CUDA | medium | 1 | 0 | 31.16 | 3.19 | 0.79 | 0.04 | e4bf87b |
| RTX 5090 | CUDA | medium-q8_0 | 1 | 0 | 32.74 | 3.43 | 0.80 | 0.05 | e4bf87b |
| RTX 5090 | CUDA | large-v2 | 1 | 0 | 50.09 | 4.55 | 1.14 | 0.05 | e4bf87b |
| RTX 5090 | CUDA | large-v2-q8_0 | 1 | 0 | 52.44 | 4.76 | 1.11 | 0.07 | e4bf87b |
| RTX 5090 | CUDA | large-v3-turbo | 1 | 0 | 46.78 | 0.70 | 0.17 | 0.01 | e4bf87b |
| RTX 5090 | CUDA | large-v3-turbo-q8_0 | 1 | 0 | 48.57 | 0.70 | 0.16 | 0.01 | e4bf87b |

| GPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RTX 5090 | CUDA | tiny | 1 | 1 | 1.39 | 0.47 | 0.11 | 0.00 | e4bf87b |
| RTX 5090 | CUDA | tiny-q8_0 | 1 | 1 | 1.83 | 0.48 | 0.12 | 0.01 | e4bf87b |
| RTX 5090 | CUDA | base | 1 | 1 | 2.17 | 0.70 | 0.16 | 0.01 | e4bf87b |
| RTX 5090 | CUDA | base-q8_0 | 1 | 1 | 2.78 | 0.68 | 0.17 | 0.01 | e4bf87b |
| RTX 5090 | CUDA | small | 1 | 1 | 5.02 | 1.33 | 0.32 | 0.01 | e4bf87b |
| RTX 5090 | CUDA | small-q8_0 | 1 | 1 | 6.39 | 1.46 | 0.34 | 0.02 | e4bf87b |
| RTX 5090 | CUDA | medium | 1 | 1 | 13.89 | 2.68 | 0.64 | 0.03 | e4bf87b |
| RTX 5090 | CUDA | medium-q8_0 | 1 | 1 | 15.40 | 2.92 | 0.67 | 0.04 | e4bf87b |
| RTX 5090 | CUDA | large-v2 | 1 | 1 | 21.24 | 3.88 | 0.96 | 0.04 | e4bf87b |
| RTX 5090 | CUDA | large-v2-q8_0 | 1 | 1 | 23.54 | 4.01 | 0.93 | 0.05 | e4bf87b |
| RTX 5090 | CUDA | large-v3-turbo | 1 | 1 | 18.18 | 0.62 | 0.15 | 0.01 | e4bf87b |
| RTX 5090 | CUDA | large-v3-turbo-q8_0 | 1 | 1 | 19.89 | 0.61 | 0.14 | 0.01 | e4bf87b |

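
To put a rough number on the flash-attention effect, the sketch below divides the FA=0 encoder value by the FA=1 value for the `medium` model on each device. The figures are copied from the benchmark rows above. The dictionary layout and the `speedup` helper are illustrative names, not part of whisper.cpp, and the ratio reads as a speedup only under the assumption that the Enc. column is a timing where lower is better. The two M2 Ultra tables were recorded at different commits (b57b9d3 and a77d11d), so that ratio may reflect more than the FA flag alone.

```python
# Enc. column for the `medium` model, copied from the benchmark tables:
# (value with FA=0, value with FA=1).
ENC_MEDIUM = {
    "M1 Pro": (580.06, 395.18),
    "M2 Ultra": (132.78, 89.28),   # FA=0 and FA=1 rows come from different commits
    "M4 Max": (181.50, 147.76),
    "RTX 5090": (31.16, 13.89),
}


def speedup(without_fa: float, with_fa: float) -> float:
    """Ratio of the FA=0 value to the FA=1 value (assumes lower is faster)."""
    return without_fa / with_fa


# Hypothetical summary structure: device name -> encoder ratio.
speedups = {dev: speedup(off, on) for dev, (off, on) in ENC_MEDIUM.items()}

for dev, ratio in speedups.items():
    print(f"{dev}: {ratio:.2f}x")
```

The same helper can be pointed at the Dec., Bch5, or PP columns by swapping in the corresponding pairs of values.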
What's Changed
New Contributors
Full Changelog: v1.7.6...v1.8.0