diff --git a/README.md b/README.md
Lines changed: 1 addition & 1 deletion
diff --git a/docs/source/conf.py b/docs/source/conf.py
Lines changed: 10 additions & 3 deletions
diff --git a/docs/source/index.md b/docs/source/index.md
Lines changed: 2 additions & 1 deletion
diff --git a/docs/source/reference/api/index.rst b/docs/source/reference/api/index.rst
Lines changed: 9 additions & 0 deletions
diff --git a/docs/source/reference/api/vec_inf.api.client.rst b/docs/source/reference/api/vec_inf.api.client.rst
Lines changed: 7 additions & 0 deletions
diff --git a/docs/source/reference/api/vec_inf.api.models.rst b/docs/source/reference/api/vec_inf.api.models.rst
Lines changed: 7 additions & 0 deletions
diff --git a/docs/source/reference/api/vec_inf.api.rst b/docs/source/reference/api/vec_inf.api.rst
Lines changed: 17 additions & 0 deletions
diff --git a/docs/source/reference/api/vec_inf.api.utils.rst b/docs/source/reference/api/vec_inf.api.utils.rst
Lines changed: 7 additions & 0 deletions
diff --git a/docs/source/reference/api/vec_inf.rst b/docs/source/reference/api/vec_inf.rst
Lines changed: 15 additions & 0 deletions
diff --git a/docs/source/user_guide.md b/docs/source/user_guide.md
Lines changed: 10 additions & 3 deletions
@@ -8,7 +8,7 @@
 [![codecov](https://codecov.io/github/VectorInstitute/vector-inference/branch/develop/graph/badge.svg?token=NI88QSIGAC)](https://app.codecov.io/github/VectorInstitute/vector-inference/tree/develop)
 ![GitHub License](https://img.shields.io/github/license/VectorInstitute/vector-inference)
 
-This repository provides an easy-to-use solution to run inference servers on [Slurm](https://slurm.schedmd.com/overview.html)-managed computing clusters using [vLLM](https://docs.vllm.ai/en/latest/). **All scripts in this repository runs natively on the Vector Institute cluster environment**. To adapt to other environments, update the environment variables in [`cli/_helper.py`](vec_inf/cli/_helper.py), [`cli/_config.py`](vec_inf/cli/_config.py), [`vllm.slurm`](vec_inf/vllm.slurm), [`multinode_vllm.slurm`](vec_inf/multinode_vllm.slurm) and [`models.yaml`](vec_inf/config/models.yaml) accordingly.
+This repository provides an easy-to-use solution to run inference servers on [Slurm](https://slurm.schedmd.com/overview.html)-managed computing clusters using [vLLM](https://docs.vllm.ai/en/latest/). **All scripts in this repository run natively on the Vector Institute cluster environment**. To adapt to other environments, update the environment variables in [`shared/utils.py`](vec_inf/shared/utils.py), [`shared/config.py`](vec_inf/shared/config.py), [`vllm.slurm`](vec_inf/vllm.slurm), [`multinode_vllm.slurm`](vec_inf/multinode_vllm.slurm) and [`models.yaml`](vec_inf/config/models.yaml) accordingly.
 
 ## Installation
 If you are using the Vector cluster environment, and you don't need any customization to the inference server environment, run the following to install package:
 
@@ -7,7 +7,6 @@
 
 import os
 import sys
-from typing import List
 
 
 sys.path.insert(0, os.path.abspath("../../vec_inf"))
@@ -51,8 +50,16 @@
 copybutton_prompt_text = r">>> |\.\.\. "
 copybutton_prompt_is_regexp = True
 
+apidoc_module_dir = "../../vec_inf"
+apidoc_excluded_paths = ["tests", "cli", "shared"]
+exclude_patterns = ["reference/api/vec_inf.rst"]
+apidoc_output_dir = "reference/api"
+apidoc_separate_modules = True
+apidoc_extra_args = ["-f", "-M", "-T", "--implicit-namespaces"]
+suppress_warnings = ["ref.python"]
+
 intersphinx_mapping = {
-    "python": ("https://docs.python.org/3.9/", None),
+    "python": ("https://docs.python.org/3.10/", None),
 }
 
 # Add any paths that contain templates here, relative to this directory.
@@ -61,7 +68,7 @@
 # List of patterns, relative to source directory, that match files and
 # directories to ignore when looking for source files.
 # This pattern also affects html_static_path and html_extra_path.
-exclude_patterns: List[str] = []
+exclude_patterns = ["reference/api/vec_inf.rst"]
 
 # -- Options for Markdown files ----------------------------------------------
 #
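The `apidoc_*` settings added above can be read as a description of a `sphinx-apidoc` run. The sketch below spells that out, assuming the settings are consumed by the `sphinxcontrib-apidoc` extension (the `extensions` list is not part of this diff) and that excluded paths are resolved relative to `apidoc_module_dir`. The helper `apidoc_argv` is hypothetical and exists only for illustration; the extension's real argument order may differ.

```python
import posixpath


def apidoc_argv(module_dir, output_dir, excluded_paths, separate_modules, extra_args):
    """Build an approximate sphinx-apidoc command line (illustrative only)."""
    argv = ["sphinx-apidoc"]
    if separate_modules:
        # -e / --separate: write one page per module
        argv.append("-e")
    # Extra flags are passed through unchanged, e.g. -f (force overwrite),
    # -M (module docs before submodule docs), -T (no modules.rst TOC file)
    argv.extend(extra_args)
    argv.extend(["-o", output_dir, module_dir])
    # Assumption: exclusions are given relative to the module directory
    argv.extend(posixpath.join(module_dir, path) for path in excluded_paths)
    return argv


# Values copied from the conf.py hunk above
argv = apidoc_argv(
    module_dir="../../vec_inf",
    output_dir="reference/api",
    excluded_paths=["tests", "cli", "shared"],
    separate_modules=True,
    extra_args=["-f", "-M", "-T", "--implicit-namespaces"],
)

if __name__ == "__main__":
    print(" ".join(argv))
```

Under these assumptions, only the `vec_inf.api` subpackage ends up with generated pages, which matches the `.rst` files added under `reference/api/` in this change.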
 
@@ -8,10 +8,11 @@ hide-toc: true
 :hidden:
 
 user_guide
+reference/api/index
 
 ```
 
-This repository provides an easy-to-use solution to run inference servers on [Slurm](https://slurm.schedmd.com/overview.html)-managed computing clusters using [vLLM](https://docs.vllm.ai/en/latest/). **All scripts in this repository runs natively on the Vector Institute cluster environment**. To adapt to other environments, update the environment variables in [`cli/_helper.py`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/cli/_helper.py), [`cli/_config.py`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/cli/_config_.py), [`vllm.slurm`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/vllm.slurm), [`multinode_vllm.slurm`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/multinode_vllm.slurm), and model configurations in [`models.yaml`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/config/models.yaml) accordingly.
+This repository provides an easy-to-use solution to run inference servers on [Slurm](https://slurm.schedmd.com/overview.html)-managed computing clusters using [vLLM](https://docs.vllm.ai/en/latest/). **All scripts in this repository run natively on the Vector Institute cluster environment**. To adapt to other environments, update the environment variables in [`shared/utils.py`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/shared/utils.py), [`shared/config.py`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/shared/config.py), [`vllm.slurm`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/vllm.slurm), [`multinode_vllm.slurm`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/multinode_vllm.slurm), and model configurations in [`models.yaml`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/config/models.yaml) accordingly.
 
 ## Installation
 
 
@@ -0,0 +1,9 @@
+Python API
+==========
+
+This section documents the Python API for the ``vec_inf`` package.
+
+.. toctree::
+   :maxdepth: 4
+
+   vec_inf.api
@@ -0,0 +1,7 @@
+vec\_inf.api.client module
+==========================
+
+.. automodule:: vec_inf.api.client
+   :members:
+   :undoc-members:
+   :show-inheritance:
@@ -0,0 +1,7 @@
+vec\_inf.api.models module
+==========================
+
+.. automodule:: vec_inf.api.models
+   :members:
+   :undoc-members:
+   :show-inheritance:
@@ -0,0 +1,17 @@
+vec\_inf.api package
+====================
+
+.. automodule:: vec_inf.api
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+Submodules
+----------
+
+.. toctree::
+   :maxdepth: 4
+
+   vec_inf.api.client
+   vec_inf.api.models
+   vec_inf.api.utils
@@ -0,0 +1,7 @@
+vec\_inf.api.utils module
+=========================
+
+.. automodule:: vec_inf.api.utils
+   :members:
+   :undoc-members:
+   :show-inheritance:
@@ -0,0 +1,15 @@
+vec\_inf package
+================
+
+.. automodule:: vec_inf
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+Subpackages
+-----------
+
+.. toctree::
+   :maxdepth: 4
+
+   vec_inf.api
@@ -1,6 +1,6 @@
 # User Guide
 
-## Usage
+## CLI Usage
 
 ### `launch` command
 
@@ -17,7 +17,7 @@ You should see an output like the following:
 
 #### Overrides
 
-Models that are already supported by `vec-inf` would be launched using the [default parameters](vec_inf/config/models.yaml). You can override these values by providing additional parameters. Use `vec-inf launch --help` to see the full list of parameters that can be overriden. For example, if `qos` is to be overriden:
+Models that are already supported by `vec-inf` are launched using the [default parameters](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/config/models.yaml). You can override these values by providing additional parameters. Use `vec-inf launch --help` to see the full list of parameters that can be overridden. For example, if `qos` is to be overridden:
 
 ```bash
 vec-inf launch Meta-Llama-3.1-8B-Instruct --qos <new_qos>
@@ -29,7 +29,7 @@ You can also launch your own custom model as long as the model architecture is [
 * Your model weights directory naming convention should follow `$MODEL_FAMILY-$MODEL_VARIANT` ($MODEL_VARIANT is OPTIONAL).
 * Your model weights directory should contain HuggingFace format weights.
 * You should specify your model configuration by:
-  * Creating a custom configuration file for your model and specify its path via setting the environment variable `VEC_INF_CONFIG`. Check the [default parameters](vec_inf/config/models.yaml) file for the format of the config file. All the parameters for the model should be specified in that config file.
+  * Creating a custom configuration file for your model and specifying its path by setting the environment variable `VEC_INF_CONFIG`. Check the [default parameters](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/config/models.yaml) file for the format of the config file. All the parameters for the model should be specified in that config file.
   * Using launch command options to specify your model setup.
 * For other model launch parameters you can reference the default values for similar models using the [`list` command ](#list-command).
 
@@ -179,3 +179,10 @@ If you want to run inference from your local device, you can open a SSH tunnel t
 ssh -L 8081:172.17.8.29:8081 username@v.vectorinstitute.ai -N
 ```
 Where the last number in the URL is the GPU number (gpu029 in this case). The example provided above is for the vector cluster, change the variables accordingly for your environment
+
+## Python API Usage
+
+You can also use the `vec_inf` Python API to launch and manage inference servers.
+
+Check out the [Python API documentation](reference/api/index) for more details. There
+are also Python API usage examples in the [`examples`](https://github.com/VectorInstitute/vector-inference/tree/develop/examples/api) folder.
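
For scripts that connect from a local machine, the SSH tunnel command shown earlier in this guide can also be assembled in Python. This is a minimal sketch of the node-to-address mapping the guide describes (gpu029 corresponds to a last octet of 29). The helper `tunnel_command` is hypothetical, and the `172.17.8` prefix, port `8081`, and login host are copied from the Vector cluster example, so treat them as assumptions to change for your environment.

```python
import re


def tunnel_command(node, port=8081, user="username",
                   login_host="v.vectorinstitute.ai", subnet="172.17.8"):
    """Return the ssh argv that forwards `port` to the given GPU node."""
    match = re.fullmatch(r"gpu(\d+)", node)
    if match is None:
        raise ValueError(f"unexpected node name: {node!r}")
    # The node number becomes the last octet of the address: gpu029 -> 29
    last_octet = int(match.group(1))
    if not 0 <= last_octet <= 255:
        raise ValueError(f"node number out of range: {last_octet}")
    target = f"{subnet}.{last_octet}"
    # -L forwards the local port to the node; -N opens no remote shell
    return ["ssh", "-L", f"{port}:{target}:{port}", f"{user}@{login_host}", "-N"]


if __name__ == "__main__":
    print(" ".join(tunnel_command("gpu029")))
```

The returned list can be passed to `subprocess.run` as is. Keeping it as a list rather than a single string avoids shell quoting issues.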