Commit 46ef280

[Doc] Add model feature matrix table. (#4040)
### What this PR does / why we need it?
Add model feature matrix table.

- vLLM version: v0.11.0
- vLLM main: vllm-project/vllm@83f478b

Signed-off-by: menogrey <1299267905@qq.com>
1 parent 22286fc commit 46ef280

1 file changed: +67 -67 lines changed

docs/source/user_guide/support_matrix/supported_models.md

Lines changed: 67 additions & 67 deletions
@@ -6,78 +6,78 @@ Get the latest info here: https://github.com/vllm-project/vllm-ascend/issues/160

 ### Generative Models

-| Model | Support | Note |
-|-------------------------------|-----------|----------------------------------------------------------------------|
-| DeepSeek V3/3.1 || |
-| DeepSeek V3.2 EXP || |
-| DeepSeek R1 || |
-| DeepSeek Distill (Qwen/LLama) || |
-| Qwen3 || |
-| Qwen3-based || |
-| Qwen3-Coder || |
-| Qwen3-Moe || |
-| Qwen3-Next || |
-| Qwen2.5 || |
-| Qwen2 || |
-| Qwen2-based || |
-| QwQ-32B || |
-| LLama2/3/3.1 || |
-| Internlm || [#1962](https://github.com/vllm-project/vllm-ascend/issues/1962) |
-| Baichuan || |
-| Baichuan2 || |
-| Phi-4-mini || |
-| MiniCPM || |
-| MiniCPM3 || |
-| Ernie4.5 || |
-| Ernie4.5-Moe || |
-| Gemma-2 || |
-| Gemma-3 || |
-| Phi-3/4 || |
-| Mistral/Mistral-Instruct || |
-| GLM-4.5 || |
-| GLM-4 || [#2255](https://github.com/vllm-project/vllm-ascend/issues/2255) |
-| GLM-4-0414 || [#2258](https://github.com/vllm-project/vllm-ascend/issues/2258) |
-| ChatGLM || [#554](https://github.com/vllm-project/vllm-ascend/issues/554) |
-| DeepSeek V2.5 | 🟡 | Need test |
-| Mllama | 🟡 | Need test |
-| MiniMax-Text | 🟡 | Need test |
+| Model | Support | Note | BF16 | Supported Hardware | W8A8 | Chunked Prefill | Automatic Prefix Cache | LoRA | Speculative Decoding | Async Scheduling | Tensor Parallel | Pipeline Parallel | Expert Parallel | Data Parallel | Prefill-decode Disaggregation | Piecewise AclGraph | Fullgraph AclGraph | max-model-len | MLP Weight Prefetch | Doc |
+|-------------------------------|-----------|----------------------------------------------------------------------|------|--------------------|------|-----------------|------------------------|------|----------------------|------------------|-----------------|-------------------|-----------------|---------------|-------------------------------|--------------------|--------------------|---------------|---------------------|-----|
+| DeepSeek V3/3.1 || |||||||||||||||||||
+| DeepSeek V3.2 EXP || || A2/A3 |||||| |||||| | | 163840 | | [DeepSeek-V3.2-Exp tutorial](../../tutorials/DeepSeek-V3.2-Exp.md) |
+| DeepSeek R1 || |||||||||||||||||||
+| DeepSeek Distill (Qwen/LLama) || |||||||||||||||||||
+| Qwen3 || |||||||||||||||||||
+| Qwen3-based || |||||||||||||||||||
+| Qwen3-Coder || |||||||||||||||||||
+| Qwen3-Moe || |||||||||||||||||||
+| Qwen3-Next || |||||||||||||||||||
+| Qwen2.5 || |||||||||||||||||||
+| Qwen2 || |||||||||||||||||||
+| Qwen2-based || |||||||||||||||||||
+| QwQ-32B || |||||||||||||||||||
+| LLama2/3/3.1 || |||||||||||||||||||
+| Internlm || [#1962](https://github.com/vllm-project/vllm-ascend/issues/1962) |||||||||||||||||||
+| Baichuan || |||||||||||||||||||
+| Baichuan2 || |||||||||||||||||||
+| Phi-4-mini || |||||||||||||||||||
+| MiniCPM || |||||||||||||||||||
+| MiniCPM3 || |||||||||||||||||||
+| Ernie4.5 || |||||||||||||||||||
+| Ernie4.5-Moe || |||||||||||||||||||
+| Gemma-2 || |||||||||||||||||||
+| Gemma-3 || |||||||||||||||||||
+| Phi-3/4 || |||||||||||||||||||
+| Mistral/Mistral-Instruct || |||||||||||||||||||
+| GLM-4.5 || |||||||||||||||||||
+| GLM-4 || [#2255](https://github.com/vllm-project/vllm-ascend/issues/2255) |||||||||||||||||||
+| GLM-4-0414 || [#2258](https://github.com/vllm-project/vllm-ascend/issues/2258) |||||||||||||||||||
+| ChatGLM || [#554](https://github.com/vllm-project/vllm-ascend/issues/554) |||||||||||||||||||
+| DeepSeek V2.5 | 🟡 | Need test |||||||||||||||||||
+| Mllama | 🟡 | Need test |||||||||||||||||||
+| MiniMax-Text | 🟡 | Need test |||||||||||||||||||

 ### Pooling Models

-| Model | Support | Note |
-|-------------------------------|-----------|----------------------------------------------------------------------|
-| Qwen3-Embedding || |
-| Molmo || [1942](https://github.com/vllm-project/vllm-ascend/issues/1942) |
-| XLM-RoBERTa-based || [1960](https://github.com/vllm-project/vllm-ascend/issues/1960) |
+| Model | Support | Note | BF16 | Supported Hardware | W8A8 | Chunked Prefill | Automatic Prefix Cache | LoRA | Speculative Decoding | Async Scheduling | Tensor Parallel | Pipeline Parallel | Expert Parallel | Data Parallel | Prefill-decode Disaggregation | Piecewise AclGraph | Fullgraph AclGraph | max-model-len | MLP Weight Prefetch | Doc |
+|-------------------------------|-----------|----------------------------------------------------------------------|------|--------------------|------|-----------------|------------------------|------|----------------------|------------------|-----------------|-------------------|-----------------|---------------|-------------------------------|--------------------|--------------------|---------------|---------------------|-----|
+| Qwen3-Embedding || |||||||||||||||||||
+| Molmo || [1942](https://github.com/vllm-project/vllm-ascend/issues/1942) |||||||||||||||||||
+| XLM-RoBERTa-based || [1960](https://github.com/vllm-project/vllm-ascend/issues/1960) |||||||||||||||||||

 ## Multimodal Language Models

 ### Generative Models

-| Model | Support | Note |
-|--------------------------------|---------------|----------------------------------------------------------------------|
-| Qwen2-VL || |
-| Qwen2.5-VL || |
-| Qwen3-VL || |
-| Qwen3-VL-MOE || |
-| Qwen2.5-Omni || [1760](https://github.com/vllm-project/vllm-ascend/issues/1760) |
-| QVQ || |
-| LLaVA 1.5/1.6 || [1962](https://github.com/vllm-project/vllm-ascend/issues/1962) |
-| InternVL2 || |
-| InternVL2.5 || |
-| Qwen2-Audio || |
-| Aria || |
-| LLaVA-Next || |
-| LLaVA-Next-Video || |
-| MiniCPM-V || |
-| Mistral3 || |
-| Phi-3-Vison/Phi-3.5-Vison || |
-| Gemma3 || |
-| LLama4 || [1972](https://github.com/vllm-project/vllm-ascend/issues/1972) |
-| LLama3.2 || [1972](https://github.com/vllm-project/vllm-ascend/issues/1972) |
-| Keye-VL-8B-Preview || [1963](https://github.com/vllm-project/vllm-ascend/issues/1963) |
-| Florence-2 || [2259](https://github.com/vllm-project/vllm-ascend/issues/2259) |
-| GLM-4V || [2260](https://github.com/vllm-project/vllm-ascend/issues/2260) |
-| InternVL2.0/2.5/3.0<br>InternVideo2.5/Mono-InternVL || [2064](https://github.com/vllm-project/vllm-ascend/issues/2064) |
-| Whisper || [2262](https://github.com/vllm-project/vllm-ascend/issues/2262) |
-| Ultravox | 🟡 | Need test |
+| Model | Support | Note | BF16 | Supported Hardware | W8A8 | Chunked Prefill | Automatic Prefix Cache | LoRA | Speculative Decoding | Async Scheduling | Tensor Parallel | Pipeline Parallel | Expert Parallel | Data Parallel | Prefill-decode Disaggregation | Piecewise AclGraph | Fullgraph AclGraph | max-model-len | MLP Weight Prefetch | Doc |
+|--------------------------------|---------------|----------------------------------------------------------------------|------|--------------------|------|-----------------|------------------------|------|----------------------|------------------|-----------------|-------------------|-----------------|---------------|-------------------------------|--------------------|--------------------|---------------|---------------------|-----|
+| Qwen2-VL || |||||||||||||||||||
+| Qwen2.5-VL || |||||||||||||||||||
+| Qwen3-VL || |||||||||||||||||||
+| Qwen3-VL-MOE || |||||||||||||||||||
+| Qwen2.5-Omni || [1760](https://github.com/vllm-project/vllm-ascend/issues/1760) |||||||||||||||||||
+| QVQ || |||||||||||||||||||
+| LLaVA 1.5/1.6 || [1962](https://github.com/vllm-project/vllm-ascend/issues/1962) |||||||||||||||||||
+| InternVL2 || |||||||||||||||||||
+| InternVL2.5 || |||||||||||||||||||
+| Qwen2-Audio || |||||||||||||||||||
+| Aria || |||||||||||||||||||
+| LLaVA-Next || |||||||||||||||||||
+| LLaVA-Next-Video || |||||||||||||||||||
+| MiniCPM-V || |||||||||||||||||||
+| Mistral3 || |||||||||||||||||||
+| Phi-3-Vison/Phi-3.5-Vison || |||||||||||||||||||
+| Gemma3 || |||||||||||||||||||
+| LLama4 || [1972](https://github.com/vllm-project/vllm-ascend/issues/1972) |||||||||||||||||||
+| LLama3.2 || [1972](https://github.com/vllm-project/vllm-ascend/issues/1972) |||||||||||||||||||
+| Keye-VL-8B-Preview || [1963](https://github.com/vllm-project/vllm-ascend/issues/1963) |||||||||||||||||||
+| Florence-2 || [2259](https://github.com/vllm-project/vllm-ascend/issues/2259) |||||||||||||||||||
+| GLM-4V || [2260](https://github.com/vllm-project/vllm-ascend/issues/2260) |||||||||||||||||||
+| InternVL2.0/2.5/3.0<br>InternVideo2.5/Mono-InternVL || [2064](https://github.com/vllm-project/vllm-ascend/issues/2064) |||||||||||||||||||
+| Whisper || [2262](https://github.com/vllm-project/vllm-ascend/issues/2262) |||||||||||||||||||
+| Ultravox | 🟡 | Need test |||||||||||||||||||
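
Reviewer note: the widened table has 21 columns, and most rows leave the new feature cells empty, so it can be hard to see by eye which column a value such as `163840` belongs to. The sketch below maps one row onto the header names to check the alignment. It is only an illustration: `split_cells` and `parse_row` are hypothetical helpers (not part of vllm-ascend), the header and row strings are copied from the diff above, and the naive split on `|` assumes no cell contains a literal pipe.

```python
# Header and row copied from the added lines of the diff (assumed unchanged).
HEADER = (
    "| Model | Support | Note | BF16 | Supported Hardware | W8A8 "
    "| Chunked Prefill | Automatic Prefix Cache | LoRA | Speculative Decoding "
    "| Async Scheduling | Tensor Parallel | Pipeline Parallel | Expert Parallel "
    "| Data Parallel | Prefill-decode Disaggregation | Piecewise AclGraph "
    "| Fullgraph AclGraph | max-model-len | MLP Weight Prefetch | Doc |"
)
ROW = (
    "| DeepSeek V3.2 EXP || || A2/A3 |||||| |||||| | | 163840 | | "
    "[DeepSeek-V3.2-Exp tutorial](../../tutorials/DeepSeek-V3.2-Exp.md) |"
)


def split_cells(line: str) -> list[str]:
    """Split one markdown table line into trimmed cell strings.

    Removes exactly one leading and one trailing pipe so that empty
    cells at either end of the row are preserved.
    """
    line = line.strip()
    if line.startswith("|"):
        line = line[1:]
    if line.endswith("|"):
        line = line[:-1]
    return [cell.strip() for cell in line.split("|")]


def parse_row(header: str, row: str) -> dict[str, str]:
    """Pair each cell of `row` with the column name from `header`."""
    names = split_cells(header)
    values = split_cells(row)
    if len(names) != len(values):
        raise ValueError(
            f"column count mismatch: {len(names)} headers, {len(values)} cells"
        )
    return dict(zip(names, values))


parsed = parse_row(HEADER, ROW)
# Keep only the columns this row actually fills in.
filled = {name: value for name, value in parsed.items() if value}
for name, value in filled.items():
    print(f"{name}: {value}")
```

For this row only `Model`, `Supported Hardware`, `max-model-len`, and `Doc` come back non-empty; a `ValueError` from `parse_row` would indicate a row whose pipe count does not match the header.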
