2 files changed: +89 -12 lines changed

@@ -24,12 +24,6 @@ More profiling metrics coming soon!
 | [`CodeLlama-70b-hf`](https://huggingface.co/meta-llama/CodeLlama-70b-hf) | 4x a40 | - tokens/s | - tokens/s |
 | [`CodeLlama-70b-Instruct-hf`](https://huggingface.co/meta-llama/CodeLlama-70b-Instruct-hf) | 4x a40 | - tokens/s | - tokens/s |
 
-### [Databricks: DBRX](https://huggingface.co/collections/databricks/dbrx-6601c0852a0cdd3c59f71962)
-
-| Variant | Suggested resource allocation | Avg prompt throughput | Avg generation throughput |
-| :----------:| :----------:| :----------:| :----------:|
-| [`dbrx-instruct`](https://huggingface.co/databricks/dbrx-instruct) | 8x a40 (2 nodes, 4 a40/node) | 107 tokens/s | 904 tokens/s |
-
 ### [Google: Gemma 2](https://huggingface.co/collections/google/gemma-2-release-667d6600fd5220e7b967f315)
 
 | Variant | Suggested resource allocation | Avg prompt throughput | Avg generation throughput |
@@ -104,12 +98,6 @@ More profiling metrics coming soon!
 | :----------:| :----------:| :----------:| :----------:|
 | [`Phi-3-medium-128k-instruct`](https://huggingface.co/microsoft/Phi-3-medium-128k-instruct) | 2x a40 | - tokens/s | - tokens/s |
 
-### [Aaditya Ura: Llama3-OpenBioLLM](https://huggingface.co/aaditya/Llama3-OpenBioLLM-70B)
-
-| Variant | Suggested resource allocation | Avg prompt throughput | Avg generation throughput |
-| :----------:| :----------:| :----------:| :----------:|
-| [`Llama3-OpenBioLLM-70B`](https://huggingface.co/aaditya/Llama3-OpenBioLLM-70B) | 4x a40 | - tokens/s | - tokens/s |
-
 ### [Nvidia: Llama-3.1-Nemotron](https://huggingface.co/collections/nvidia/llama-31-nemotron-70b-670e93cd366feea16abc13d8)
 
 | Variant | Suggested resource allocation | Avg prompt throughput | Avg generation throughput |