
Commit 618e18f

Change command output screenshots to actual output tables (except for the `list` command) for a more consistent visual and to hide user names
1 parent f55a5f7 commit 618e18f

File tree

1 file changed: +81 -6 lines changed

docs/user_guide.md

Lines changed: 81 additions & 6 deletions
@@ -13,7 +13,28 @@ vec-inf launch Meta-Llama-3.1-8B-Instruct
 ```
 You should see an output like the following:
 
-<img width="600" alt="launch_img" src="https://github.com/user-attachments/assets/62fa818b-57dd-47de-b094-18aa91747f2d">
+```
+┏━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
+┃ Job Config              ┃ Value                                     ┃
+┡━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
+│ Slurm Job ID            │ 16060964                                  │
+│ Job Name                │ Meta-Llama-3.1-8B-Instruct                │
+│ Model Type              │ LLM                                       │
+│ Vocabulary Size         │ 128256                                    │
+│ Partition               │ a40                                       │
+│ QoS                     │ m2                                        │
+│ Time Limit              │ 08:00:00                                  │
+│ Num Nodes               │ 1                                         │
+│ GPUs/Node               │ 1                                         │
+│ CPUs/Task               │ 16                                        │
+│ Memory/Node             │ 64G                                       │
+│ Model Weights Directory │ /model-weights/Meta-Llama-3.1-8B-Instruct │
+│ Log Directory           │ /h/vi_user/.vec-inf-logs/Meta-Llama-3.1   │
+│ vLLM Arguments:         │                                           │
+│ --max-model-len:        │ 131072                                    │
+│ --max-num-seqs:         │ 256                                       │
+└─────────────────────────┴───────────────────────────────────────────┘
+```
 
 #### Overrides
 
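The table above reflects the configuration the launch used; the `#### Overrides` section of the guide covers changing these values at launch time. As a hedged sketch, assuming `vec-inf launch` accepts flags named after the job-config fields shown above (the flag names are assumptions, not taken from this diff):

```
# Flag names below are hypothetical, mirroring the table fields above;
# see the Overrides section of the user guide for the supported options.
vec-inf launch Meta-Llama-3.1-8B-Instruct \
    --partition a40 \
    --max-model-len 131072
```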
@@ -70,7 +91,7 @@ You would then set the `VEC_INF_CONFIG` path using:
 export VEC_INF_CONFIG=/h/<username>/my-model-config.yaml
 ```
 
-Note that there are other parameters that can also be added to the config but are not shown in this example; check the [`ModelConfig`](vec_inf/client/config.py) for details.
+Note that there are other parameters that can also be added to the config but are not shown in this example; check the [`ModelConfig`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/client/config.py) for details.
 
 ### `status` command
 
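The updated link above points at `ModelConfig`, whose field names match the keys printed by `vec-inf list <model-name>` further down. A minimal sketch of writing such a config, assuming the YAML keys mirror those field names (the top-level `models` key and the exact schema are assumptions; `vec_inf/client/config.py` is the authoritative reference):

```
# Sketch only: the YAML layout is assumed, not taken from this diff.
cat > /h/<username>/my-model-config.yaml <<'EOF'
models:
  Meta-Llama-3.1-8B-Instruct:
    model_family: Meta-Llama-3.1
    model_variant: 8B-Instruct
    gpus_per_node: 1
    num_nodes: 1
    time: "08:00:00"
EOF
export VEC_INF_CONFIG=/h/<username>/my-model-config.yaml
```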
@@ -82,11 +103,28 @@ vec-inf status 15373800
 
 If the server is pending for resources, you should see an output like this:
 
-<img width="400" alt="status_pending_img" src="https://github.com/user-attachments/assets/b659c302-eae1-4560-b7a9-14eb3a822a2f">
+```
+┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
+┃ Job Status     ┃ Value                      ┃
+┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
+│ Model Name     │ Meta-Llama-3.1-8B-Instruct │
+│ Model Status   │ PENDING                    │
+│ Pending Reason │ Resources                  │
+│ Base URL       │ UNAVAILABLE                │
+└────────────────┴────────────────────────────┘
+```
 
 When the server is ready, you should see an output like this:
 
-<img width="400" alt="status_ready_img" src="https://github.com/user-attachments/assets/672986c2-736c-41ce-ac7c-1fb585cdcb0d">
+```
+┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
+┃ Job Status   ┃ Value                      ┃
+┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
+│ Model Name   │ Meta-Llama-3.1-8B-Instruct │
+│ Model Status │ READY                      │
+│ Base URL     │ http://gpu042:8080/v1      │
+└──────────────┴────────────────────────────┘
+```
 
 There are 5 possible states:
 
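The full list of states falls outside this hunk. Since a PENDING job only flips to READY once Slurm allocates the requested resources, it can be convenient to poll the status command; a small sketch using the standard `watch` utility (the 30-second interval is arbitrary):

```
# Re-run the status check every 30 seconds until the job reports READY.
watch -n 30 vec-inf status 15373800
```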
@@ -107,7 +145,23 @@ vec-inf metrics 15373800
 
 And you will see the performance metrics streamed to your console; note that the metrics are updated at a 2-second interval.
 
-<img width="400" alt="metrics_img" src="https://github.com/user-attachments/assets/3ee143d0-1a71-4944-bbd7-4c3299bf0339">
+```
+┏━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
+┃ Metric                  ┃ Value           ┃
+┡━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
+│ Prompt Throughput       │ 10.9 tokens/s   │
+│ Generation Throughput   │ 34.2 tokens/s   │
+│ Requests Running        │ 1 reqs          │
+│ Requests Waiting        │ 0 reqs          │
+│ Requests Swapped        │ 0 reqs          │
+│ GPU Cache Usage         │ 0.1%            │
+│ CPU Cache Usage         │ 0.0%            │
+│ Avg Request Latency     │ 2.6 s           │
+│ Total Prompt Tokens     │ 441 tokens      │
+│ Total Generation Tokens │ 1748 tokens     │
+│ Successful Requests     │ 14 reqs         │
+└─────────────────────────┴─────────────────┘
+```
 
 ### `shutdown` command
 
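The `shutdown` section body itself is unchanged by this commit; for completeness, a sketch of the invocation, assuming `shutdown` takes the Slurm job ID the same way `status` and `metrics` do:

```
# Stop the inference server job (job ID comes from the launch output).
vec-inf shutdown 15373800
```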
@@ -135,7 +189,28 @@ You can also view the default setup for a specific supported model by providing
 vec-inf list Meta-Llama-3.1-70B-Instruct
 ```
 
-<img width="500" alt="list_model_img" src="https://github.com/user-attachments/assets/34e53937-2d86-443e-85f6-34e408653ddb">
+```
+┏━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
+┃ Model Config             ┃ Value                      ┃
+┡━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
+│ model_name               │ Meta-Llama-3.1-8B-Instruct │
+│ model_family             │ Meta-Llama-3.1             │
+│ model_variant            │ 8B-Instruct                │
+│ model_type               │ LLM                        │
+│ gpus_per_node            │ 1                          │
+│ num_nodes                │ 1                          │
+│ cpus_per_task            │ 16                         │
+│ mem_per_node             │ 64G                        │
+│ vocab_size               │ 128256                     │
+│ qos                      │ m2                         │
+│ time                     │ 08:00:00                   │
+│ partition                │ a40                        │
+│ model_weights_parent_dir │ /model-weights             │
+│ vLLM Arguments:          │                            │
+│ --max-model-len:         │ 131072                     │
+│ --max-num-seqs:          │ 256                        │
+└──────────────────────────┴────────────────────────────┘
+```
 
 The `launch`, `list`, and `status` commands support `--json-mode`, where the command output is structured as a JSON string.
 
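As an illustration of the `--json-mode` flag mentioned above: the invocation below is a sketch, and the key names in the sample output are assumptions mirroring the status table fields rather than output captured from the tool:

```
vec-inf status 15373800 --json-mode
# Hypothetical output shape; the actual keys may differ:
# {"model_name": "Meta-Llama-3.1-8B-Instruct", "model_status": "READY",
#  "base_url": "http://gpu042:8080/v1"}
```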