The model is launched using the [default parameters](vec_inf/models/models.csv). You can override these values by providing additional parameters; use `--help` to see the full list. You can also launch your own customized model as long as the model architecture is [supported by vLLM](https://docs.vllm.ai/en/stable/models/supported_models.html). You will need to specify the model launch parameters for custom models.
+
The model is launched using the [default parameters](vec_inf/models/models.csv). You can override these values by providing additional parameters; use `--help` to see the full list. You can also launch your own customized model as long as the model architecture is [supported by vLLM](https://docs.vllm.ai/en/stable/models/supported_models.html). When doing so, follow the instructions below:
+
* Your model weights directory name should follow the convention `$MODEL_FAMILY-$MODEL_VARIANT`.
+
* Your model weights directory should contain weights in Hugging Face (HF) format.
+
* The following launch parameters fall back to their default values if not specified: `--max-num-seqs`, `--partition`, `--data-type`, `--venv`, `--log-dir`, `--model-weights-parent-dir`, `--pipeline-parallelism`. All other launch parameters must be specified for custom models.
+
* Example for setting the model weights parent directory: `--model-weights-parent-dir /h/user_name/my_weights`.
+
* For other model launch parameters, you can refer to the default values of similar models using the [`list` command](#list-command).
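
The directory requirements above can be checked before launching. The following is a minimal sketch, not part of `vec-inf` itself: the helper names `weights_dir_for` and `looks_like_hf_weights` and the example family and variant names are hypothetical, and the check assumes that an HF-format weights directory contains a top-level `config.json`.

```bash
#!/usr/bin/env bash
# Sketch only: helper names and example values are hypothetical.

# Build the expected weights directory path: <parent>/<family>-<variant>
weights_dir_for() {
    local parent="$1" family="$2" variant="$3"
    # Strip one trailing slash from the parent so the path has no "//".
    printf '%s/%s-%s\n' "${parent%/}" "$family" "$variant"
}

# Return 0 if the directory exists and contains config.json.
# Assumption: HF-format weights ship a config.json at the top level.
looks_like_hf_weights() {
    local dir="$1"
    [ -d "$dir" ] && [ -f "$dir/config.json" ]
}

# Example with made-up family and variant names.
model_dir="$(weights_dir_for /h/user_name/my_weights MyFamily 7B-Chat)"
echo "Expected weights directory: $model_dir"
if looks_like_hf_weights "$model_dir"; then
    echo "Looks like HF weights; use --model-weights-parent-dir /h/user_name/my_weights"
else
    echo "No HF-format weights found at $model_dir"
fi
```

The family and variant are passed separately rather than parsed from the directory name, since either part may itself contain hyphens.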
+
### `status` command
You can check the inference server status by providing the Slurm job ID to the `status` command:
```bash
vec-inf status 13014393
@@ -38,6 +45,7 @@ There are 5 possible states:
Note that the base URL is only available when the model is in the `READY` state. If you've changed the Slurm log directory path, you also need to specify it when using the `status` command.
+
### `metrics` command
Once your server is ready, you can check performance metrics by providing the Slurm job ID to the `metrics` command:
```bash
vec-inf metrics 13014393
@@ -47,13 +55,15 @@ And you will see the performance metrics streamed to your console, note that the