You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This repository provides an easy-to-use solution to run inference servers on [Slurm](https://slurm.schedmd.com/overview.html)-managed computing clusters using [vLLM](https://docs.vllm.ai/en/latest/). **All scripts in this repository runs natively on the Vector Institute cluster environment**. To adapt to other environments, update the environment variables in [`vec_inf/client/slurm_vars.py`](vec_inf/client/slurm_vars.py), and the model config for cached model weights in [`vec_inf/config/models.yaml`](vec_inf/config/models.yaml) accordingly.
13
+
This repository provides an easy-to-use solution to run inference servers on [Slurm](https://slurm.schedmd.com/overview.html)-managed computing clusters using [vLLM](https://docs.vllm.ai/en/latest/). **All scripts in this repository runs natively on the Vector Institute cluster environment**. To adapt to other environments, follow the instructions in [Installation](#installation).
14
14
15
15
## Installation
16
16
If you are using the Vector cluster environment, and you don't need any customization to the inference server environment, run the following to install package:
@@ -20,6 +20,11 @@ pip install vec-inf
20
20
```
21
21
Otherwise, we recommend using the provided [`Dockerfile`](Dockerfile) to set up your own environment with the package. The latest image has `vLLM` version `0.8.5.post1`.
22
22
23
+
If you'd like to use `vec-inf` on your own Slurm cluster, you would need to update the configuration files, there are 3 ways to do it:
24
+
* Clone the repository and update the `environment.yaml` and the `models.yaml` file in [`vec_inf/config`](vec_inf/config/), then install from source by running `pip install .`.
25
+
* The package would try to look for cached configuration files in your environment before using the default configuration. The default cached configuration directory path points to `/model-weights/vec-inf-shared`, you would need to create an `environment.yaml` and a `models.yaml` following the format of these files in [`vec_inf/config`](vec_inf/config/).
26
+
* The package would also look for an enviroment variable `VEC_INF_CONFIG_DIR`. You can put your `environment.yaml` and `models.yaml` in a directory of your choice and set the enviroment variable `VEC_INF_CONFIG_DIR` to point to that location.
27
+
23
28
## Usage
24
29
25
30
Vector Inference provides 2 user interfaces, a CLI and an API
@@ -61,7 +66,8 @@ You can also launch your own custom model as long as the model architecture is [
61
66
* Your model weights directory naming convention should follow `$MODEL_FAMILY-$MODEL_VARIANT` ($MODEL_VARIANT is OPTIONAL).
62
67
* Your model weights directory should contain HuggingFace format weights.
63
68
* You should specify your model configuration by:
64
-
* Creating a custom configuration file for your model and specify its path via setting the environment variable `VEC_INF_CONFIG`. Check the [default parameters](vec_inf/config/models.yaml) file for the format of the config file. All the parameters for the model should be specified in that config file.
69
+
* Creating a custom configuration file for your model and specify its path via setting the environment variable `VEC_INF_MODEL_CONFIG` (This one will supersede `VEC_INF_CONFIG_DIR` if that is also set). Check the [default parameters](vec_inf/config/models.yaml) file for the format of the config file. All the parameters for the model should be specified in that config file.
70
+
* Add your model configuration to the cached `models.yaml` in your cluster environment (if you have write access to the cached configuration directory).
65
71
* Using launch command options to specify your model setup.
66
72
* For other model launch parameters you can reference the default values for similar models using the [`list` command ](#list-command).
67
73
@@ -89,7 +95,7 @@ models:
89
95
--compilation-config: 3
90
96
```
91
97
92
-
You would then set the `VEC_INF_CONFIG` path using:
98
+
You would then set the `VEC_INF_MODEL_CONFIG` path using:
0 commit comments