Commit 34ef192

Update models README
1 parent bd58a23 commit 34ef192

1 file changed, with 9 additions and 1 deletion

vec_inf/config/README.md

Lines changed: 9 additions & 1 deletion
@@ -211,14 +211,21 @@ More profiling metrics coming soon!
 
 | Variant | Suggested resource allocation | Avg prompt throughput | Avg generation throughput |
 |:----------:|:----------:|:----------:|:----------:|
-| [`InternVL2_5-8B`](https://huggingface.co/OpenGVLab/InternVL2_5-8B) | 2x a40 | - tokens/s | - tokens/s |
+| [`InternVL2_5-8B`](https://huggingface.co/OpenGVLab/InternVL2_5-8B) | 1x a40 | - tokens/s | - tokens/s |
+| [`InternVL2_5-26B`](https://huggingface.co/OpenGVLab/InternVL2_5-26B) | 2x a40 | - tokens/s | - tokens/s |
+| [`InternVL2_5-38B`](https://huggingface.co/OpenGVLab/InternVL2_5-38B) | 4x a40 | - tokens/s | - tokens/s |
 
 ### [THUDM: GLM-4](https://huggingface.co/collections/THUDM/glm-4-665fcf188c414b03c2f7e3b7)
 
 | Variant | Suggested resource allocation | Avg prompt throughput | Avg generation throughput |
 |:----------:|:----------:|:----------:|:----------:|
 | [`glm-4v-9b`](https://huggingface.co/THUDM/glm-4v-9b) | 1x a40 | - tokens/s | - tokens/s |
 
+### [DeepSeek: DeepSeek-VL2](https://huggingface.co/collections/deepseek-ai/deepseek-vl2-675c22accc456d3beb4613ab)
+| Variant | Suggested resource allocation | Avg prompt throughput | Avg generation throughput |
+|:----------:|:----------:|:----------:|:----------:|
+| [`deepseek-vl2`](https://huggingface.co/deepseek-ai/deepseek-vl2) | 2x a40 | - tokens/s | - tokens/s |
+| [`deepseek-vl2-small`](https://huggingface.co/deepseek-ai/deepseek-vl2-small) | 1x a40 | - tokens/s | - tokens/s |
 
 
 ## Text Embedding Models
@@ -247,3 +254,4 @@ More profiling metrics coming soon!
 | Variant | Suggested resource allocation | Avg prompt throughput | Avg generation throughput |
 |:----------:|:----------:|:----------:|:----------:|
 | [`Qwen2.5-Math-RM-72B`](https://huggingface.co/Qwen/Qwen2.5-Math-RM-72B) | 4x a40 | - tokens/s | - tokens/s |
+| [`Qwen2.5-Math-PRM-7B`](https://huggingface.co/Qwen/Qwen2.5-Math-PRM-7B) | 1x a40 | - tokens/s | - tokens/s |
