Commit 75c1d1b

Author: Paweł Kędzia (committed)

Merge branch 'features/refactor-new-repo'

# Conflicts:
#   README.md
#   services/workers/hosts/192.168.100.71/vllm/0/run-gemma-3-12b-it-vllm.sh
#   services/workers/hosts/192.168.100.71/vllm/1/run-gemma-3-12b-it-vllm.sh

2 parents: 3eb12b7 + 7315d85

File tree

152 files changed: +99 -9987 lines

.version

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-0.3.1
+0.4.0

CHANGELOG.md

Lines changed: 2 additions & 1 deletion
@@ -14,4 +14,5 @@
 | 0.2.3 | New web configurator: Handling projects, configs for each user separately. First Available strategy is more powerful, a lot of improvements to efficiency. |
 | 0.2.4 | Anonymizer module, integration anonymization with any endpoint (using dynamic payload analysis and full payload anonymisation), dedicated `/api/anonymize_text` endpoint as memory only anonymization. Whole router may be run in `FORCE_ANONYMISATION` mode. |
 | 0.3.0 | Anonymization available with three strategies: `fast_masker`, `genai`, `prov_masker`. |
-| 0.3.1 | Refactoring `lb.strategies` to be more flexible modular. Introduced `MaskerPipeline` and `GuardrailPipeline` both configured via env. Removed genai-based masking endpoint. |
+| 0.3.1 | Refactoring `lb.strategies` to be more flexible modular. Introduced `MaskerPipeline` and `GuardrailPipeline` both configured via env. Removed genai-based masking endpoint. |
+| 0.4.0 | The main repository is divided into dedicated ones: plugins, services, web — separate repositories. Clean up the whole repository. Examples of integration with llamaindex, langchain, openai, litellm and haystack. |
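For context on the 0.4.0 entry's integration examples: a minimal sketch of what the openai-style integration might look like, assuming the router exposes an OpenAI-compatible chat endpoint. The base URL, token, and endpoint path here are illustrative assumptions, not taken from this commit.

```python
# Hypothetical sketch: assumes an OpenAI-compatible endpoint on the router.
# The base URL, token, and endpoint path below are illustrative placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/api",  # assumed router base URL
    api_key="YOUR_ROUTER_TOKEN",           # assumed token
)

resp = client.chat.completions.create(
    model="google/gemma-3-12b-it",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)
print(resp.choices[0].message.content)
```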

README.md

Lines changed: 10 additions & 20 deletions
@@ -12,11 +12,16 @@ a ready‑made image in your own infrastructure.
   and optional Prometheus metrics.
 - **llm_router_lib** is a Python SDK that wraps the API with typed request/response models, automatic retries, token
   handling and a rich exception hierarchy, letting developers focus on application logic rather than raw HTTP calls.
-- **llm_router_web** offers ready‑to‑use Flask UIs – an anonymizer UI that masks sensitive data and a configuration
+- [**llm_router_web**](https://github.com/radlab-dev-group/llm-router-web) offers ready‑to‑use Flask UIs – an anonymizer
+  UI that masks sensitive data and a configuration
   manager for model/user settings – demonstrating how to consume the router from a browser.
-- **llm_router_plugins** (e.g., the **fast_masker** plugin) deliver a rule‑based text anonymisation engine with
+- [**llm_router_plugins**](https://github.com/radlab-dev-group/llm-router-plugins) (e.g., the **fast_masker** plugin)
+  deliver a rule‑based text anonymisation engine with
   a comprehensive set of Polish‑specific masking rules (emails, IPs, URLs, phone numbers, PESEL, NIP, KRS, REGON,
   monetary amounts, dates, etc.) and an extensible architecture for custom rules and validators.
+- [**llm_router_services**](https://github.com/radlab-dev-group/llm-router-services) provides HTTP services that
+  implement the core functionality used by the LLM‑Router’s plugin system. The services expose guardrail and masking
+  capabilities through Flask applications.
 
 All components run on Python 3.10+ using `virtualenv` and require only the listed dependencies, making the suite easy to
 install, extend, and deploy in both development and production environments.

@@ -62,21 +67,6 @@ project README:
 
 #### Base requirements
 
-> **Prerequisite**: `radlab-ml-utils`
->
-> This project uses the
-> [radlab-ml-utils](https://github.com/radlab-dev-group/ml-utils)
-> library for machine learning utilities
-> (e.g., experiment/result logging with Weights & Biases/wandb).
-> Install it before working with ML-related parts:
->
-> ```bash
-> pip install git+https://github.com/radlab-dev-group/ml-utils.git
-> ```
->
-> For more options and details, see the library README:
-> https://github.com/radlab-dev-group/ml-utils
-
 ```shell script
 python3 -m venv .venv
 source .venv/bin/activate

@@ -118,7 +108,7 @@ metrics for monitoring and alerting.
 LLM_ROUTER_MINIMUM=1 python3 -m llm_router_api.rest_api
 ```
 
-### 📦 Docker
+## 📦 Docker
 
 Run the container with the default configuration:
 

@@ -157,7 +147,7 @@ docker run \
 
 ---
 
-### Configuration (via environment)
+## 🛠️ Configuration (via environment)
 
 A full list of environment variables is available at the link
 [.env list](llm_router_api/README.md#environment-variables)

@@ -194,7 +184,7 @@ a description of the streaming mechanisms can be found at the link:
 
 ---
 
-## 🛠️ Development
+## 🔧 Development
 
 - **Python**3.10+ (project is tested on 3.10.6)
 - All dependencies are listed in `requirements.txt`. Install them inside the virtualenv.
install.md

Lines changed: 0 additions & 14 deletions
This file was deleted.

llm_router_lib/README.md

Lines changed: 75 additions & 124 deletions
@@ -1,168 +1,119 @@
-# llm‑router-LIB — Python client library
-
-**llm‑router** is a lightweight Python client for interacting with the LLM‑Router API.
-It provides typed request models, convenient service wrappers, and robust error handling so you can focus on building
-LLM‑driven applications rather than dealing with raw HTTP calls.
-
----
+# llm_router_lib
 
 ## Overview
 
-`llm_router_lib` is the official Python SDK for the **LLM‑Router**
-project <https://github.com/radlab-dev-group/llm-router>.
-
-It abstracts the HTTP layer behind a small, well‑typed API:
-
-* **Typed payloads** built with *pydantic* (e.g., `GenerativeConversationModel`).
-* **Service objects** that know the endpoint URL and the model class they expect.
-* **Automatic token handling**, request retries, and exponential back‑off.
-* **Rich exception hierarchy** (`LLMRouterError`, `AuthenticationError`, `RateLimitError`, `ValidationError`).
-
----
+`llm_router_lib` is **a collection of data‑model definitions**.
+It supplies the **foundation** for request/response structures used by the
+`llm_router_api` package **and** provides a **thin, opinionated client wrapper**
+that makes interacting with the LLM Router service straightforward.
 
-## Features
+Key components:
 
-| Feature | Description |
-|------------------------------------|--------------------------------------------------------------------------------|
-| **Typed request/response models** | Guarantees payload correctness at runtime using Pydantic. |
-| **Built‑in conversation services** | Simple `conversation_with_model` and `extended_conversation_with_model` calls. |
-| **Retry & timeout** | Configurable request timeout and automatic retries with exponential back‑off. |
-| **Authentication** | Bearer‑token support; raises `AuthenticationError` on 401/403. |
-| **Rate‑limit handling** | Detects HTTP 429 and raises `RateLimitError`. |
-| **Extensible** | Add custom services or models by extending the base classes. |
-| **Test suite** | Ready‑to‑run unit tests in `llm_router_lib/tests`. |
+| Package | Purpose |
+|---------------------|---------|
+| **`data_models`** | `pydantic` models that define the shape of payloads sent to the router (e.g. `GenerativeConversationModel`, `ExtendedGenerativeConversationModel`, utility models for question generation, translation, etc.). These models are shared with the API side, ensuring both client and server speak the same contract. |
+| **`client.py`** | `LLMRouterClient` – a lightweight wrapper around the router’s HTTP API. It offers high‑level methods (`conversation_with_model`, `extended_conversation_with_model`) that accept either plain dictionaries **or** the aforementioned data‑model instances. The client handles payload validation, provider selection, error mapping, and response parsing. |
+| **`services`** | Low‑level service classes (`ConversationService`, `ExtendedConversationService`) that perform the actual HTTP calls via `HttpRequester`. They are used internally by the client but can be reused directly if finer‑grained control is needed. |
+| **`exceptions.py`** | Custom exception hierarchy (`LLMRouterError`, `AuthenticationError`, `RateLimitError`, `ValidationError`) that mirrors the router’s error semantics, making error handling in user code clean and explicit. |
+| **`utils/http.py`** | `HttpRequester` – a small wrapper around `requests` providing retries, time‑outs and logging. It is the networking backbone for the client wrapper. |
 
----
+In short, `llm_router_lib` provides **both** the data contract (the “schema”) **and** a convenient Pythonic client to
+consume the router service.
 
 ## Installation
 
-The library is pure Python and works with **Python 3.10+**.
+The library targets **Python 3.10.6** and uses a `virtualenv`. Install it in editable mode for development:
 
-```shell script
-# Create a virtualenv (recommended)
-python -m venv .venv
+```bash
+# Clone the repository (if you haven't already)
+git clone https://github.com/radlab-dev-group/llm-router.git
+cd llm-router/llm_router_lib
+
+# Create and activate a virtual environment
+python3 -m venv .venv
 source .venv/bin/activate
 
-# Install from the repository (editable mode)
+# Install the package and its dependencies
 pip install -e .
 ```
 
-If you prefer a regular installation from a wheel or source distribution, use:
-
-```shell script
-pip install .
-```
-
-> **Note** – The project relies only on the packages listed in the repository’s `requirements.txt`
-> (pydantic, requests, etc.), all of which are installed automatically by `pip`.
-
----
+All runtime dependencies (`requests`, `pydantic`, `rdl_ml_utils`) are declared in the project’s `requirements.txt`.
 
 ## Quick start
 
-```python
-from llm_router_lib.client import LLMRouterClient
-from llm_router_lib.data_models.builtin_chat import GenerativeConversationModel
+```python
+from llm_router_lib import LLMRouterClient
 
-# Initialise the client (replace with your own endpoint and token)
+# Initialise the client – point it at the router’s base URL
 client = LLMRouterClient(
-    api="https://api.your-llm-router.com",
-    token="YOUR_ACCESS_TOKEN"
+    api="http://localhost:8080/api",  # router base URL
+    token="YOUR_ROUTER_TOKEN",  # optional, if router requires auth
 )
 
-# Build a request payload
-payload = GenerativeConversationModel(
-    model_name="google/gemma-3-12b-it",
-    user_last_statement="Hello, how are you?",
-    historical_messages=[{"user": "Hi"}],
-    temperature=0.7,
-    max_new_tokens=128,
-)
+# Build a payload as a plain dict (validation is automatic)
+payload = {
+    "model_name": "google/gemma-3-12b-it",
+    "user_last_statement": "Hello, how are you?",
+    "temperature": 0.7,
+    "max_new_tokens": 128,
+}
 
-# Call the API
+# Call the standard conversation endpoint
 response = client.conversation_with_model(payload)
 
-print(response)  # dict with the model's answer and metadata
+print(response)  # {'status': True, 'body': {...}}
 ```
 
-### Extended conversation
+You can also pass a `pydantic` model instance directly:
 
-```python
-from llm_router_lib.data_models.builtin_chat import ExtendedGenerativeConversationModel
+```python
+from llm_router_lib.data_models.builtin_chat import GenerativeConversationModel
 
-payload = ExtendedGenerativeConversationModel(
+model = GenerativeConversationModel(
     model_name="google/gemma-3-12b-it",
-    user_last_statement="Explain quantum entanglement.",
-    system_prompt="Answer as a friendly professor.",
-    temperature=0.6,
-    max_new_tokens=256,
+    user_last_statement="Hello, how are you?",
+    temperature=0.7,
+    max_new_tokens=128,
 )
 
-response = client.extended_conversation_with_model(payload)
-print(response)
+response = client.conversation_with_model(model)
 ```
 
----
-
-## Core concepts
-
-### Client
-
-`LLMRouterClient` is the entry point. It handles:
-
-* Base URL normalization.
-* Optional bearer token injection.
-* Construction of the internal `HttpRequester`.
+## Data models
 
-All public methods accept either a **dict** or a **pydantic model**; models are automatically serialized with
-`.model_dump()`.
+All request payloads are defined in `llm_router_lib/data_models`.
+Common base:
 
-### Data models
-
-Located in `llm_router_lib/data_models/`.
-Key models:
-
-| Model | Purpose |
-|----------------------------------------------|-------------------------------------------------------------------|
-| `GenerativeConversationModel` | Simple chat payload (model name, user message, optional history). |
-| `ExtendedGenerativeConversationModel` | Same as above, plus a `system_prompt`. |
-| `GenerateQuestionFromTextsModel` | Generate questions from a list of texts. |
-| `TranslateTextModel`, `SimplifyTextModel`, … | Various utility models for text transformation. |
-| `OpenAIChatModel` | Payload for direct OpenAI‑compatible chat calls. |
-
-All models inherit from a common `_GenerativeOptions` base that defines temperature, token limits, language, etc.
-
-### Services
-
-Implemented in `llm_router_lib/services/`.
-Each service extends `_BaseConversationService` and defines:
-
-* `endpoint` – the API path (e.g., `/api/conversation_with_model`).
-* `model_cls` – the Pydantic model class used for validation.
-
-The service’s `call()` method performs the HTTP POST and returns a parsed JSON dictionary, raising `LLMRouterError` on
-malformed responses.
+```python
+class BaseModelOptions(BaseModel):
+    """Options shared across many endpoint models."""
+    mask_payload: bool = False
+    masker_pipeline: Optional[List[str]] = None
+```
 
-### Utilities
+### Conversation models
 
-* `llm_router_lib/utils/http.py` – thin wrapper around `requests` with retry logic, response validation, and logging.
-* Logging is integrated via the standard library `logging` module; you can inject your own logger when constructing the
-  client.
+| Model | Required fields | Optional / extra fields |
+|---------------------------------------|-------------------------------------|-----------------------------------------------------------|
+| `GenerativeConversationModel` | `model_name`, `user_last_statement` | `temperature`, `max_new_tokens`, `historical_messages`, … |
+| `ExtendedGenerativeConversationModel` | All of the above + `system_prompt` | |
 
-### Error handling
+Utility models for other built‑in endpoints (question generation, translation,
+article creation, context‑based answering, etc.) follow the same pattern and
+inherit from `BaseModelOptions`.
 
-| Exception | When raised |
-|-----------------------|-----------------------------------------------------------|
-| `LLMRouterError` | Generic SDK‑level error (e.g., non‑JSON response). |
-| `AuthenticationError` | HTTP 401/403 – missing or invalid token. |
-| `RateLimitError` | HTTP 429 – the server throttled the request. |
-| `ValidationError` | HTTP 400 – request payload failed server‑side validation. |
+## Thin client wrapper (`LLMRouterClient`)
 
-All exceptions inherit from `LLMRouterError`, allowing a single `except LLMRouterError:` clause to catch any SDK‑related
-problem.
+`LLMRouterClient` offers a **high‑level API** that abstracts away the low‑level
+HTTP details:
 
----
+| Method | Description |
+|---------------------------------------------|-----------------------------------------------------------------------------------------------------------------|
+| `conversation_with_model(payload)` | Calls `/api/conversation_with_model`. Accepts a dict **or** a `GenerativeConversationModel`. |
+| `extended_conversation_with_model(payload)` | Calls `/api/extended_conversation_with_model`. Accepts a dict **or** an `ExtendedGenerativeConversationModel`. |
 
-## License
+Internally the client:
 
-`llm_router_lib` is released under the **MIT License**. See the `LICENSE` file for details.
+1. **Validates** the payload (via the corresponding `pydantic` model if a model instance is supplied).
+2. **Selects** an appropriate provider using the router’s load‑balancing
File renamed without changes.
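To round out the `llm_router_lib` section above: the new README names an exception hierarchy (`LLMRouterError`, `AuthenticationError`, `RateLimitError`, `ValidationError`) but the old error-handling example was removed. A compact sketch of how that hierarchy would be used follows; the class, method, and exception names come from the diff, while the `llm_router_lib.exceptions` import path is an assumption based on the `exceptions.py` entry in the components table.

```python
# Sketch only: the exception import path is an assumption; class, method,
# and exception names are taken from the llm_router_lib README diff above.
from llm_router_lib import LLMRouterClient
from llm_router_lib.exceptions import (
    LLMRouterError,
    AuthenticationError,
    RateLimitError,
    ValidationError,
)

client = LLMRouterClient(api="http://localhost:8080/api", token="YOUR_ROUTER_TOKEN")

try:
    response = client.conversation_with_model({
        "model_name": "google/gemma-3-12b-it",
        "user_last_statement": "Hello, how are you?",
    })
except AuthenticationError:
    ...  # missing or invalid token
except RateLimitError:
    ...  # the server throttled the request; retry later
except ValidationError:
    ...  # the payload failed server-side validation
except LLMRouterError as exc:
    # Catch-all: every SDK exception inherits from LLMRouterError
    print(f"router call failed: {exc}")
```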
