
Commit 9c3a166

Merge pull request #54 from rohan-uiuc/programmatic_access
Add python API for programmatic access
2 parents 38a4fa7 + 1e9f8d7 commit 9c3a166

34 files changed (+2099 −1002 lines)

README.md

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@
 [![codecov](https://codecov.io/github/VectorInstitute/vector-inference/branch/develop/graph/badge.svg?token=NI88QSIGAC)](https://app.codecov.io/github/VectorInstitute/vector-inference/tree/develop)
 ![GitHub License](https://img.shields.io/github/license/VectorInstitute/vector-inference)

-This repository provides an easy-to-use solution to run inference servers on [Slurm](https://slurm.schedmd.com/overview.html)-managed computing clusters using [vLLM](https://docs.vllm.ai/en/latest/). **All scripts in this repository runs natively on the Vector Institute cluster environment**. To adapt to other environments, update the environment variables in [`cli/_helper.py`](vec_inf/cli/_helper.py), [`cli/_config.py`](vec_inf/cli/_config.py), [`vllm.slurm`](vec_inf/vllm.slurm), [`multinode_vllm.slurm`](vec_inf/multinode_vllm.slurm) and [`models.yaml`](vec_inf/config/models.yaml) accordingly.
+This repository provides an easy-to-use solution to run inference servers on [Slurm](https://slurm.schedmd.com/overview.html)-managed computing clusters using [vLLM](https://docs.vllm.ai/en/latest/). **All scripts in this repository runs natively on the Vector Institute cluster environment**. To adapt to other environments, update the environment variables in [`shared/utils.py`](vec_inf/shared/utils.py), [`shared/config.py`](vec_inf/shared/config.py), [`vllm.slurm`](vec_inf/vllm.slurm), [`multinode_vllm.slurm`](vec_inf/multinode_vllm.slurm) and [`models.yaml`](vec_inf/config/models.yaml) accordingly.

 ## Installation
 If you are using the Vector cluster environment, and you don't need any customization to the inference server environment, run the following to install package:

docs/source/conf.py

Lines changed: 10 additions & 3 deletions
@@ -7,7 +7,6 @@

 import os
 import sys
-from typing import List


 sys.path.insert(0, os.path.abspath("../../vec_inf"))
@@ -51,8 +50,16 @@
 copybutton_prompt_text = r">>> |\.\.\. "
 copybutton_prompt_is_regexp = True

+apidoc_module_dir = "../../vec_inf"
+apidoc_excluded_paths = ["tests", "cli", "shared"]
+exclude_patterns = ["reference/api/vec_inf.rst"]
+apidoc_output_dir = "reference/api"
+apidoc_separate_modules = True
+apidoc_extra_args = ["-f", "-M", "-T", "--implicit-namespaces"]
+suppress_warnings = ["ref.python"]
+
 intersphinx_mapping = {
-    "python": ("https://docs.python.org/3.9/", None),
+    "python": ("https://docs.python.org/3.10/", None),
 }

 # Add any paths that contain templates here, relative to this directory.
@@ -61,7 +68,7 @@
 # List of patterns, relative to source directory, that match files and
 # directories to ignore when looking for source files.
 # This pattern also affects html_static_path and html_extra_path.
-exclude_patterns: List[str] = []
+exclude_patterns = ["reference/api/vec_inf.rst"]

 # -- Options for Markdown files ----------------------------------------------
 #

docs/source/index.md

Lines changed: 2 additions & 1 deletion
@@ -8,10 +8,11 @@ hide-toc: true
 :hidden:

 user_guide
+reference/api/index

 ```

-This repository provides an easy-to-use solution to run inference servers on [Slurm](https://slurm.schedmd.com/overview.html)-managed computing clusters using [vLLM](https://docs.vllm.ai/en/latest/). **All scripts in this repository runs natively on the Vector Institute cluster environment**. To adapt to other environments, update the environment variables in [`cli/_helper.py`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/cli/_helper.py), [`cli/_config.py`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/cli/_config_.py), [`vllm.slurm`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/vllm.slurm), [`multinode_vllm.slurm`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/multinode_vllm.slurm), and model configurations in [`models.yaml`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/config/models.yaml) accordingly.
+This repository provides an easy-to-use solution to run inference servers on [Slurm](https://slurm.schedmd.com/overview.html)-managed computing clusters using [vLLM](https://docs.vllm.ai/en/latest/). **All scripts in this repository runs natively on the Vector Institute cluster environment**. To adapt to other environments, update the environment variables in [`shared/utils.py`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/shared/utils.py), [`shared/config.py`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/shared/config_.py), [`vllm.slurm`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/vllm.slurm), [`multinode_vllm.slurm`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/multinode_vllm.slurm), and model configurations in [`models.yaml`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/config/models.yaml) accordingly.

 ## Installation

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+Python API
+==========
+
+This section documents the Python API for the `vec_inf` package.
+
+.. toctree::
+   :maxdepth: 4
+
+   vec_inf.api
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+vec\_inf.api.client module
+==========================
+
+.. automodule:: vec_inf.api.client
+   :members:
+   :undoc-members:
+   :show-inheritance:
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+vec\_inf.api.models module
+==========================
+
+.. automodule:: vec_inf.api.models
+   :members:
+   :undoc-members:
+   :show-inheritance:
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+vec\_inf.api package
+====================
+
+.. automodule:: vec_inf.api
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+Submodules
+----------
+
+.. toctree::
+   :maxdepth: 4
+
+   vec_inf.api.client
+   vec_inf.api.models
+   vec_inf.api.utils
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+vec\_inf.api.utils module
+=========================
+
+.. automodule:: vec_inf.api.utils
+   :members:
+   :undoc-members:
+   :show-inheritance:
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+vec\_inf package
+================
+
+.. automodule:: vec_inf
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+Subpackages
+-----------
+
+.. toctree::
+   :maxdepth: 4
+
+   vec_inf.api

docs/source/user_guide.md

Lines changed: 10 additions & 3 deletions
@@ -1,6 +1,6 @@
 # User Guide

-## Usage
+## CLI Usage

 ### `launch` command

@@ -17,7 +17,7 @@ You should see an output like the following:

 #### Overrides

-Models that are already supported by `vec-inf` would be launched using the [default parameters](vec_inf/config/models.yaml). You can override these values by providing additional parameters. Use `vec-inf launch --help` to see the full list of parameters that can be overriden. For example, if `qos` is to be overriden:
+Models that are already supported by `vec-inf` would be launched using the [default parameters](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/config/models.yaml). You can override these values by providing additional parameters. Use `vec-inf launch --help` to see the full list of parameters that can be overriden. For example, if `qos` is to be overriden:

 ```bash
 vec-inf launch Meta-Llama-3.1-8B-Instruct --qos <new_qos>
@@ -29,7 +29,7 @@ You can also launch your own custom model as long as the model architecture is [
 * Your model weights directory naming convention should follow `$MODEL_FAMILY-$MODEL_VARIANT` ($MODEL_VARIANT is OPTIONAL).
 * Your model weights directory should contain HuggingFace format weights.
 * You should specify your model configuration by:
-  * Creating a custom configuration file for your model and specify its path via setting the environment variable `VEC_INF_CONFIG`. Check the [default parameters](vec_inf/config/models.yaml) file for the format of the config file. All the parameters for the model should be specified in that config file.
+  * Creating a custom configuration file for your model and specify its path via setting the environment variable `VEC_INF_CONFIG`. Check the [default parameters](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/config/models.yaml) file for the format of the config file. All the parameters for the model should be specified in that config file.
 * Using launch command options to specify your model setup.
 * For other model launch parameters you can reference the default values for similar models using the [`list` command ](#list-command).

@@ -179,3 +179,10 @@ If you want to run inference from your local device, you can open a SSH tunnel t
 ssh -L 8081:172.17.8.29:8081 username@v.vectorinstitute.ai -N
 ```
 Where the last number in the URL is the GPU number (gpu029 in this case). The example provided above is for the vector cluster, change the variables accordingly for your environment
+
+## Python API Usage
+
+You can also use the `vec_inf` Python API to launch and manage inference servers.
+
+Check out the [Python API documentation](reference/api/index) for more details. There
+are also Python API usage examples in the [`examples`](https://github.com/VectorInstitute/vector-inference/tree/develop/examples/api) folder.
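As a rough illustration of the programmatic usage this commit enables, here is a minimal sketch. The module path `vec_inf.api.client` comes from the API docs added in this commit; the class and method names (`VecInfClient`, `launch_model`, `get_status`, `shutdown_server`) and the `slurm_job_id` attribute are hypothetical placeholders, not the confirmed interface — see the `examples/api` folder in the repository for the actual usage examples.

```python
# Minimal sketch only: the client class, method names, and attributes below
# are hypothetical placeholders, not confirmed by this commit. Refer to the
# examples/api folder in the repository for real usage examples.
from vec_inf.api.client import VecInfClient  # module added in this commit

client = VecInfClient()

# Launch an inference server for a supported model (hypothetical method name).
job = client.launch_model("Meta-Llama-3.1-8B-Instruct")

# Check the Slurm job / server status (hypothetical method and attribute names).
status = client.get_status(job.slurm_job_id)
print(status)

# Shut the server down when finished (hypothetical method name).
client.shutdown_server(job.slurm_job_id)
```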
