@@ -211,14 +211,21 @@ More profiling metrics coming soon!
 
 | Variant | Suggested resource allocation | Avg prompt throughput | Avg generation throughput |
 |:----------:|:----------:|:----------:|:----------:|
-| [`InternVL2_5-8B`](https://huggingface.co/OpenGVLab/InternVL2_5-8B) | 2x a40 | - tokens/s | - tokens/s |
+| [`InternVL2_5-8B`](https://huggingface.co/OpenGVLab/InternVL2_5-8B) | 1x a40 | - tokens/s | - tokens/s |
+| [`InternVL2_5-26B`](https://huggingface.co/OpenGVLab/InternVL2_5-26B) | 2x a40 | - tokens/s | - tokens/s |
+| [`InternVL2_5-38B`](https://huggingface.co/OpenGVLab/InternVL2_5-38B) | 4x a40 | - tokens/s | - tokens/s |
 
 ### [THUDM: GLM-4](https://huggingface.co/collections/THUDM/glm-4-665fcf188c414b03c2f7e3b7)
 
 | Variant | Suggested resource allocation | Avg prompt throughput | Avg generation throughput |
 |:----------:|:----------:|:----------:|:----------:|
 | [`glm-4v-9b`](https://huggingface.co/THUDM/glm-4v-9b) | 1x a40 | - tokens/s | - tokens/s |
 
+### [DeepSeek: DeepSeek-VL2](https://huggingface.co/collections/deepseek-ai/deepseek-vl2-675c22accc456d3beb4613ab)
+| Variant | Suggested resource allocation | Avg prompt throughput | Avg generation throughput |
+|:----------:|:----------:|:----------:|:----------:|
+| [`deepseek-vl2`](https://huggingface.co/deepseek-ai/deepseek-vl2) | 2x a40 | - tokens/s | - tokens/s |
+| [`deepseek-vl2-small`](https://huggingface.co/deepseek-ai/deepseek-vl2-small) | 1x a40 | - tokens/s | - tokens/s |
 
 
 ## Text Embedding Models
@@ -247,3 +254,4 @@ More profiling metrics coming soon!
 | Variant | Suggested resource allocation | Avg prompt throughput | Avg generation throughput |
 |:----------:|:----------:|:----------:|:----------:|
 | [`Qwen2.5-Math-RM-72B`](https://huggingface.co/Qwen/Qwen2.5-Math-RM-72B) | 4x a40 | - tokens/s | - tokens/s |
+| [`Qwen2.5-Math-PRM-7B`](https://huggingface.co/Qwen/Qwen2.5-Math-PRM-7B) | 1x a40 | - tokens/s | - tokens/s |