# Vector Inference: Easy inference on Slurm clusters
This repository provides an easy-to-use solution to run inference servers on [Slurm](https://slurm.schedmd.com/overview.html)-managed computing clusters using [vLLM](https://docs.vllm.ai/en/latest/). **All scripts in this repository run natively on the Vector Institute cluster environment**. To adapt to other environments, update [`launch_server.sh`](vec-inf/launch_server.sh), [`vllm.slurm`](vec-inf/vllm.slurm), [`multinode_vllm.slurm`](vec-inf/multinode_vllm.slurm), and [`models.csv`](vec-inf/models/models.csv) accordingly.
## Installation
If you are using the Vector cluster environment and you don't need any customization to the inference server environment, run the following to install the package:
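A minimal sketch of that step, assuming the package is published on PyPI under the name `vec-inf` (the same name as the repository's source directory):

```bash
pip install vec-inf  # assumed package name; adjust if the published name differs
```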
The model will be launched using the [default parameters](vec-inf/models/models.csv); you can override these values by providing additional options (use `--help` to see the full list). You can also launch your own customized model as long as the model architecture is [supported by vLLM](https://docs.vllm.ai/en/stable/models/supported_models.html), but you'll need to specify all model launch options yourself for a successful run.
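As a hedged illustration of overriding a default, the sketch below assumes a `launch` subcommand and uses placeholder names; the actual subcommand and flag names may differ, so check `--help` first:

```bash
# Hypothetical invocation: <model-name> and the flag shown are placeholders
vec-inf launch <model-name> --max-model-len 8192
```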
You can check the inference server status by providing the Slurm job ID to the `status` command:
```bash
vec-inf status <slurm-job-id>  # assumed command form; substitute your actual Slurm job ID
```

There are 5 possible states:
* **PENDING**: Job submitted to Slurm, but not executed yet. The job's pending reason will be shown.
* **LAUNCHING**: Job is running, but the server is not ready yet.
* **READY**: Inference server is running and ready to take requests.
* **FAILED**: Inference server is in an unhealthy state. The job's failure reason will be shown.
* **SHUTDOWN**: Inference server is shut down/cancelled.
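When scripting against these states, a small polling loop can wait for the server to come up. This is a sketch under the assumption that the `status` command's output contains one of the five state names:

```bash
# Hypothetical polling helper; assumes `vec-inf status` output contains the state name
JOB_ID=$1
while true; do
  state=$(vec-inf status "$JOB_ID" | grep -oE 'PENDING|LAUNCHING|READY|FAILED|SHUTDOWN' | head -n1)
  case "$state" in
    READY) echo "Server is ready"; break ;;
    FAILED|SHUTDOWN) echo "Server ended in state: $state" >&2; exit 1 ;;
    *) sleep 30 ;;  # still PENDING or LAUNCHING; check again shortly
  esac
done
```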
**`examples/README.md`**
- [`llm/completions.sh`](inference/llm/completions.sh): Bash example of sending completion requests to an OpenAI-compatible server; supports JSON mode (see the sketch after this list)
- [`vlm/vision_completions.py`](inference/vlm/vision_completions.py): Python example of sending chat completion requests, with an image attached to the prompt, to an OpenAI-compatible server for vision language models
- [`logits`](logits): Example of logits generation
- [`logits.py`](logits/logits.py): Python example of getting logits from a hosted model.
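To give a flavor of what the completions example sends, here is a minimal sketch of a request to a vLLM OpenAI-compatible endpoint; the host, port, and model name are placeholders rather than values taken from this repository:

```bash
# Placeholder host, port, and model; point these at your running inference server
curl "http://<server-host>:8080/v1/completions" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "<model-name>",
        "prompt": "What is the capital of Canada?",
        "max_tokens": 20
      }'
```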