Commit c89f198

Add model type field to models, added 2 text embedding models, updated list command based on model type, updated READMEs, removed debugging code
1 parent f5d2260 commit c89f198

6 files changed, +111 -87 lines changed

README.md

Lines changed: 11 additions & 2 deletions

@@ -38,6 +38,15 @@ There are 5 possible states:
 
 Note that the base URL is only available when the model is in `READY` state, and if you've changed the Slurm log directory path, you also need to specify it when using the `status` command.
 
+Once your server is ready, you can check performance metrics by providing the Slurm job ID to the `metrics` command:
+```bash
+vec-inf metrics 13014393
+```
+
+You will see the performance metrics streamed to your console; note that the metrics are updated at a 10-second interval.
+
+<img width="400" alt="metrics_img" src="https://github.com/user-attachments/assets/6732215b-96f3-407c-ba45-6334b2061706">
+
 Finally, when you're finished using a model, you can shut it down by providing the Slurm job ID:
 ```bash
 vec-inf shutdown 13014393
@@ -49,13 +58,13 @@ You can view the full list of available models by running the `list` command:
 ```bash
 vec-inf list
 ```
-<img width="1200" alt="list_img" src="https://github.com/user-attachments/assets/a4f0d896-989d-43bf-82a2-6a6e5d0d288f">
+<img width="1200" alt="list_img" src="https://github.com/user-attachments/assets/50b12ca4-2adc-4b2b-8a40-543b6cda0b1a">
 
 You can also view the default setup for a specific supported model by providing the model name, for example `Meta-Llama-3.1-70B-Instruct`:
 ```bash
 vec-inf list Meta-Llama-3.1-70B-Instruct
 ```
-<img width="400" alt="list_model_img" src="https://github.com/user-attachments/assets/5dec7a33-ba6b-490d-af47-4cf7341d0b42">
+<img width="400" alt="list_model_img" src="https://github.com/user-attachments/assets/30e42ab7-dde2-4d20-85f0-187adffefc3d">
 
 The `launch`, `list`, and `status` commands support `--json-mode`, where the command output is structured as a JSON string.
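
Since `--json-mode` prints a plain JSON string, the output is easy to consume from a script. As a minimal sketch (not part of this commit, and assuming the `vec-inf` CLI is installed and on PATH), the model list can be read programmatically; per the `list` changes in `vec_inf/cli/_cli.py` below, the bare `list` command in JSON mode emits a JSON array of model names:

```python
# Sketch: parse the output of `vec-inf list --json-mode`.
# Assumes the vec-inf CLI is installed and available on PATH.
import json
import subprocess

result = subprocess.run(
    ["vec-inf", "list", "--json-mode"],
    capture_output=True,
    text=True,
    check=True,
)

# With no model name given, the command prints a JSON array of model names.
model_names = json.loads(result.stdout)
print(model_names)
```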

vec_inf/README.md

Lines changed: 1 addition & 0 deletions

@@ -2,6 +2,7 @@
 
 * `launch`: Specify a model family and other optional parameters to launch an OpenAI-compatible inference server, `--json-mode` supported. Check [`here`](./models/README.md) for a complete list of available options.
 * `list`: List all available model names, `--json-mode` supported.
+* `metrics`: Stream performance metrics to the console.
 * `status`: Check the model status by providing its Slurm job ID, `--json-mode` supported.
 * `shutdown`: Shutdown a model by providing its Slurm job ID.

vec_inf/cli/_cli.py

Lines changed: 29 additions & 15 deletions

@@ -3,6 +3,7 @@
 from typing import Optional
 
 import click
+import pandas as pd
 from rich.columns import Columns
 from rich.console import Console
 from rich.live import Live
@@ -111,13 +112,12 @@ def launch(
     else:
         model_args = models_df.columns.tolist()
         model_args.remove("model_name")
+        model_args.remove("model_type")
         for arg in model_args:
             if locals()[arg] is not None:
                 renamed_arg = arg.replace("_", "-")
                 launch_cmd += f" --{renamed_arg} {locals()[arg]}"
 
-    print(launch_cmd)
-
     output = utils.run_bash_command(launch_cmd)
 
     slurm_job_id = output.split(" ")[-1].strip().strip("\n")
@@ -242,17 +242,15 @@ def list(model_name: Optional[str] = None, json_mode: bool = False) -> None:
     """
     List all available models, or get default setup of a specific model
     """
-    models_df = utils.load_models_df()
 
-    if model_name:
+    def list_model(model_name: str, models_df: pd.DataFrame, json_mode: bool):
         if model_name not in models_df["model_name"].values:
             raise ValueError(f"Model name {model_name} not found in available models")
 
         excluded_keys = {"venv", "log_dir"}
         model_row = models_df.loc[models_df["model_name"] == model_name]
 
         if json_mode:
-            # click.echo(model_row.to_json(orient='records'))
             filtered_model_row = model_row.drop(columns=excluded_keys, errors="ignore")
             click.echo(filtered_model_row.to_json(orient="records"))
             return
@@ -262,16 +260,32 @@ def list(model_name: Optional[str] = None, json_mode: bool = False) -> None:
             if key not in excluded_keys:
                 table.add_row(key, str(value))
         CONSOLE.print(table)
-        return
 
-    if json_mode:
-        click.echo(models_df["model_name"].to_json(orient="records"))
-        return
-    panels = []
-    for _, row in models_df.iterrows():
-        styled_text = f"[magenta]{row['model_family']}[/magenta]-{row['model_variant']}"
-        panels.append(Panel(styled_text, expand=True))
-    CONSOLE.print(Columns(panels, equal=True))
+    def list_all(models_df: pd.DataFrame, json_mode: bool):
+        if json_mode:
+            click.echo(models_df["model_name"].to_json(orient="records"))
+            return
+        panels = []
+        model_type_colors = {
+            "LLM": "cyan",
+            "VLM": "blue",
+            "Text Embedding": "purple",
+        }
+        custom_order = ["LLM", "VLM", "Text Embedding"]
+        models_df["model_type"] = pd.Categorical(models_df["model_type"], categories=custom_order, ordered=True)
+        models_df = models_df.sort_values(by="model_type")
+        for _, row in models_df.iterrows():
+            panel_color = model_type_colors.get(row["model_type"], "white")
+            styled_text = f"[magenta]{row['model_family']}[/magenta]-{row['model_variant']}"
+            panels.append(Panel(styled_text, expand=True, border_style=panel_color))
+        CONSOLE.print(Columns(panels, equal=True))
+
+    models_df = utils.load_models_df()
+
+    if model_name:
+        list_model(model_name, models_df, json_mode)
+    else:
+        list_all(models_df, json_mode)
 
 
 @cli.command("metrics")
@@ -283,7 +297,7 @@ def list(model_name: Optional[str] = None, json_mode: bool = False) -> None:
 )
 def metrics(slurm_job_id: int, log_dir: Optional[str] = None) -> None:
     """
-    Get metrics of a running model on the cluster
+    Stream performance metrics to the console
     """
     status_cmd = f"scontrol show job {slurm_job_id} --oneliner"
     output = utils.run_bash_command(status_cmd)
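
The reworked `list` command sorts models by type before rendering them as rich panels. In case the `pd.Categorical` trick is unfamiliar, here is a small standalone sketch of the same technique on toy data (the model names below are partly hypothetical, not from this commit): an ordered categorical column makes `sort_values` follow the custom LLM → VLM → Text Embedding order, and a color map drives each panel's border.

```python
# Standalone sketch of the ordering and color-coding used by list_all above.
import pandas as pd
from rich.columns import Columns
from rich.console import Console
from rich.panel import Panel

# Toy stand-in for the real models table; the rows here are hypothetical.
models_df = pd.DataFrame(
    {
        "model_family": ["bge", "llava", "Meta-Llama"],
        "model_variant": ["base-en-v1.5", "1.5-13b", "3.1-70B-Instruct"],
        "model_type": ["Text Embedding", "VLM", "LLM"],
    }
)

custom_order = ["LLM", "VLM", "Text Embedding"]
model_type_colors = {"LLM": "cyan", "VLM": "blue", "Text Embedding": "purple"}

# An ordered categorical makes sort_values group rows by the custom order.
models_df["model_type"] = pd.Categorical(
    models_df["model_type"], categories=custom_order, ordered=True
)
models_df = models_df.sort_values(by="model_type")

panels = []
for _, row in models_df.iterrows():
    border = model_type_colors.get(row["model_type"], "white")
    styled_text = f"[magenta]{row['model_family']}[/magenta]-{row['model_variant']}"
    panels.append(Panel(styled_text, expand=True, border_style=border))

Console().print(Columns(panels, equal=True))
```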

vec_inf/cli/_utils.py

Lines changed: 1 addition & 0 deletions

@@ -134,6 +134,7 @@ def load_default_args(models_df: pd.DataFrame, model_name: str) -> dict:
     row_data = models_df.loc[models_df["model_name"] == model_name]
     default_args = row_data.iloc[0].to_dict()
     default_args.pop("model_name")
+    default_args.pop("model_type")
     return default_args
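
For context, `model_type` is metadata rather than a launch argument, so it is popped alongside `model_name` and never ends up among the defaults passed to the launch command. A toy illustration (hypothetical values, not from this commit; the `num_gpus` and `max_model_len` column names come from the launch script below):

```python
# Sketch: mirror load_default_args on a toy row to show the effect of pop().
import pandas as pd

# Hypothetical models table; only the column layout matters here.
models_df = pd.DataFrame(
    [
        {
            "model_name": "Meta-Llama-3.1-70B-Instruct",
            "model_type": "LLM",
            "num_gpus": 4,
            "max_model_len": 8192,
        }
    ]
)

row_data = models_df.loc[models_df["model_name"] == "Meta-Llama-3.1-70B-Instruct"]
default_args = row_data.iloc[0].to_dict()
default_args.pop("model_name")
default_args.pop("model_type")
print(default_args)  # only launchable arguments remain: num_gpus, max_model_len
```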


vec_inf/launch_server.sh

Lines changed: 2 additions & 5 deletions

@@ -22,11 +22,11 @@ while [[ "$#" -gt 0 ]]; do
     shift
 done
 
-required_vars=(model_family model_variant partition qos walltime num_nodes num_gpus max_model_len vocab_size, pipeline_parallelism)
+required_vars=(model_family model_variant partition qos walltime num_nodes num_gpus max_model_len vocab_size pipeline_parallelism)
 
 for var in "${required_vars[@]}"; do
     if [ -z "${!var}" ]; then
-        echo "Error: Missing required --${var//_/-} argument."
+        echo "Error: Missing required --$var argument."
         exit 1
     fi
 done
@@ -41,9 +41,6 @@ export NUM_GPUS=$num_gpus
 export VLLM_MAX_MODEL_LEN=$max_model_len
 export VLLM_MAX_LOGPROBS=$vocab_size
 export PIPELINE_PARALLELISM=$pipeline_parallelism
-
-echo Pipeline Parallelism: $PIPELINE_PARALLELISM
-
 # For custom models, the following are set to default if not specified
 export VLLM_DATA_TYPE="auto"
 export VENV_BASE="singularity"