Commit f4810ae

Merge branch 'main' into bugfix/broken-throughput
2 parents b345546 + 8442168 commit f4810ae

2 files changed: 19 additions & 1 deletion

README.md

Lines changed: 18 additions & 0 deletions
@@ -74,6 +74,11 @@ Example:
 >>> status = client.get_status(job_id)
 >>> if status.status == ModelStatus.READY:
 ...     print(f"Model is ready at {status.base_url}")
+>>> # Alternatively, use wait_until_ready which will either return a StatusResponse or throw a ServerError
+>>> try:
+>>>     status = wait_until_ready(job_id)
+>>> except ServerError as e:
+>>>     print(f"Model launch failed: {e}")
 >>> client.shutdown_model(job_id)
 ```
 

@@ -127,3 +132,16 @@ If you want to run inference from your local device, you can open a SSH tunnel t
 ssh -L 8081:10.1.1.29:8081 username@v.vectorinstitute.ai -N
 ```
 The example provided above is for the Vector Killarney cluster, change the variables accordingly for your environment. The IP address for the compute nodes on Killarney follow `10.1.1.XX` pattern, where `XX` is the GPU number (`kn029` -> `29` in this example).
+
+## Reference
+If you found Vector Inference useful in your research or applications, please cite using the following BibTeX template:
+```
+@software{vector_inference,
+title = {Vector Inference: Efficient LLM inference on Slurm clusters using vLLM},
+author = {Wang, Marshall},
+organization = {Vector Institute},
+year = {<YEAR_OF_RELEASE>},
+version = {<VERSION_TAG>},
+url = {https://github.com/VectorInstitute/vector-inference}
+}
+```
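
The lines added in the first README.md hunk introduce `wait_until_ready` as a blocking alternative to checking `get_status` by hand. As committed, the example marks its indented continuation lines with `>>>` rather than `...` and calls `wait_until_ready` without the `client.` prefix used by the neighboring calls, so it may not run if pasted as-is. The sketch below illustrates the general poll-until-ready pattern the README describes. It does not import `vec_inf`: `FakeClient`, the `PENDING` and `FAILED` states, the function signature, and the timeout behavior are all hypothetical stand-ins. Only `ModelStatus.READY`, `StatusResponse`, `ServerError`, `get_status`, and `base_url` are names taken from the diff.

```python
import time
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ModelStatus(Enum):
    # Stand-in enum; only READY appears in the diff, the rest are assumed.
    PENDING = "PENDING"
    READY = "READY"
    FAILED = "FAILED"


@dataclass
class StatusResponse:
    # Stand-in for the library's response object.
    status: ModelStatus
    base_url: Optional[str] = None


class ServerError(Exception):
    """Stand-in for the exception named in the README example."""


class FakeClient:
    """Hypothetical client that replays a scripted sequence of statuses."""

    def __init__(self, scripted):
        self._scripted = list(scripted)

    def get_status(self, job_id):
        # Hand out scripted statuses in order; repeat the last one forever.
        if len(self._scripted) > 1:
            return self._scripted.pop(0)
        return self._scripted[0]


def wait_until_ready(client, job_id, timeout_seconds=60.0, poll_interval_seconds=0.0):
    """Poll get_status until READY; raise ServerError on failure or timeout."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = client.get_status(job_id)
        if status.status == ModelStatus.READY:
            return status
        if status.status == ModelStatus.FAILED:
            raise ServerError(f"job {job_id} failed to launch")
        if time.monotonic() >= deadline:
            raise ServerError(f"job {job_id} not ready after {timeout_seconds}s")
        time.sleep(poll_interval_seconds)


if __name__ == "__main__":
    # The URL is a placeholder, not a real endpoint.
    client = FakeClient([
        StatusResponse(ModelStatus.PENDING),
        StatusResponse(ModelStatus.READY, "http://example.invalid/v1"),
    ])
    try:
        status = wait_until_ready(client, "12345")
        print(f"Model is ready at {status.base_url}")
    except ServerError as e:
        print(f"Model launch failed: {e}")
```

With the real library the call would presumably be `client.wait_until_ready(job_id)` inside the same `try`/`except ServerError` block; that is an assumption, so check the installed version's API before relying on it.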

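The SSH tunnel context lines in the second README.md hunk describe how a Killarney node name maps to an IP address (`kn029` -> `10.1.1.29`). A small helper can make that mapping explicit when building the tunnel command. This is a sketch under assumptions: the function names are hypothetical, the `kn` prefix and the dropping of leading zeros are inferred from the single example in the README, and other clusters will need different rules.

```python
import re


def node_to_ip(node_name: str) -> str:
    """Map a node name such as 'kn029' to '10.1.1.29' (assumed pattern)."""
    match = re.fullmatch(r"kn(\d+)", node_name)
    if match is None:
        raise ValueError(f"unexpected node name: {node_name!r}")
    number = int(match.group(1))  # int() drops leading zeros: '029' -> 29
    if number > 255:
        raise ValueError(f"node number {number} does not fit in an IPv4 octet")
    return f"10.1.1.{number}"


def tunnel_command(node_name: str, port: int, user: str, login_host: str) -> list:
    """Build the argument list for the ssh tunnel shown in the README."""
    ip = node_to_ip(node_name)
    # -L forwards local <port> to <ip>:<port>; -N opens no remote shell.
    return ["ssh", "-L", f"{port}:{ip}:{port}", f"{user}@{login_host}", "-N"]


if __name__ == "__main__":
    print(" ".join(tunnel_command("kn029", 8081, "username", "v.vectorinstitute.ai")))
```

Returning an argument list rather than a single string lets the result be passed directly to `subprocess.run` without shell quoting concerns.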
pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [project]
 name = "vec-inf"
-version = "0.7.0"
+version = "0.7.1"
 description = "Efficient LLM inference on Slurm clusters using vLLM."
 readme = "README.md"
 authors = [{name = "Marshall Wang", email = "marshall.wang@vectorinstitute.ai"}]
