Update README – reformat description, rename **Plugins** to **llm_router_plugins**, tidy the environment‑variable table alignment, and label the dynamic_weighted strategy as beta.
-|`LLM_ROUTER_EP_PREFIX`| Prefix for all API endpoints. |`/api`|
-|`LLM_ROUTER_MINIMUM`| Run service in proxy‑only mode (boolean). |`False`|
-|`LLM_ROUTER_IN_DEBUG`| Run server in debug mode (boolean). |`False`|
-|`LLM_ROUTER_BALANCE_STRATEGY`| Strategy used to balance routing between LLM providers. Allowed values are `balanced`, `weighted`, `dynamic_weighted` and `first_available` as defined in `constants_base.py`. |`balanced`|
-|`LLM_ROUTER_REDIS_HOST`| Redis host for load‑balancing when a multi‑provider model is available. |`<empty string>`|
-|`LLM_ROUTER_REDIS_PORT`| Redis port for load‑balancing when a multi‑provider model is available. |`6379`|
-|`LLM_ROUTER_SERVER_TYPE`| Server implementation to use (`flask`, `gunicorn`, `waitress`). |`flask`|
-|`LLM_ROUTER_SERVER_PORT`| Port on which the server listens. |`8080`|
-|`LLM_ROUTER_SERVER_HOST`| Host address for the server. |`0.0.0.0`|
-|`LLM_ROUTER_SERVER_WORKERS_COUNT`| Number of workers (used incase when the selected server type supports multiworkers) |`2`|
-|`LLM_ROUTER_SERVER_THREADS_COUNT`| Number of workers threads (used incase when the selected server type supports multithreading) |`8`|
-|`LLM_ROUTER_SERVER_WORKER_CLASS`| If server accepts workers type, its able to set worker class by this environment. |`None`|
-|`LLM_ROUTER_USE_PROMETHEUS`| Enable Prometheus metrics collection. When set to `True`, the router registers a `/metrics` endpoint exposing Prometheus‑compatible metrics for monitoring. |`False`|
-|`LLM_ROUTER_FORCE_ANONYMISATION`| Enable whole payload anonymisation. Each key and value is aut-anonymized before sending to model provider. |`False`|
-|`LLM_ROUTER_ENABLE_GENAI_ANONYMIZE_TEXT_EP`| Enable builtin endpoint `/api/anonymize_text_genai` which uses genai to anonymize text |`False`|
+|`LLM_ROUTER_EP_PREFIX`| Prefix for all API endpoints. |`/api`|
+|`LLM_ROUTER_MINIMUM`| Run service in proxy‑only mode (boolean). |`False`|
+|`LLM_ROUTER_IN_DEBUG`| Run server in debug mode (boolean). |`False`|
+|`LLM_ROUTER_BALANCE_STRATEGY`| Strategy used to balance routing between LLM providers. Allowed values are `balanced`, `weighted`, `dynamic_weighted` (beta), `first_available` and `first_available_optim`, as defined in `constants_base.py`. |`balanced`|
+|`LLM_ROUTER_REDIS_HOST`| Redis host for load‑balancing when a multi‑provider model is available. |`<empty string>`|
+|`LLM_ROUTER_REDIS_PORT`| Redis port for load‑balancing when a multi‑provider model is available. |`6379`|
+|`LLM_ROUTER_SERVER_TYPE`| Server implementation to use (`flask`, `gunicorn`, `waitress`). |`flask`|
+|`LLM_ROUTER_SERVER_PORT`| Port on which the server listens. |`8080`|
+|`LLM_ROUTER_SERVER_HOST`| Host address for the server. |`0.0.0.0`|
+|`LLM_ROUTER_SERVER_WORKERS_COUNT`| Number of workers (used when the selected server type supports multiple workers). |`2`|
+|`LLM_ROUTER_SERVER_THREADS_COUNT`| Number of worker threads (used when the selected server type supports multithreading). |`8`|
+|`LLM_ROUTER_SERVER_WORKER_CLASS`| Worker class to use when the selected server type supports one. |`None`|
+|`LLM_ROUTER_USE_PROMETHEUS`| Enable Prometheus metrics collection. When set to `True`, the router registers a `/metrics` endpoint exposing Prometheus‑compatible metrics for monitoring. |`False`|
+|`LLM_ROUTER_FORCE_ANONYMISATION`| Enable whole‑payload anonymisation. Each key and value is auto‑anonymised before being sent to the model provider. |`False`|
+|`LLM_ROUTER_ENABLE_GENAI_ANONYMIZE_TEXT_EP`| Enable the built‑in endpoint `/api/anonymize_text_genai`, which uses GenAI to anonymise text. |`False`|
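For orientation, here is a minimal, hypothetical sketch of how a few of the variables from the table above might be set before starting the router locally. The variable names come from the table; the specific values, and the choice to set them from Python rather than a shell or `.env` file, are illustrative assumptions and not part of this change.

```python
import os

# Example-only values; every variable name below is taken from the README table,
# but none of these settings are required or recommended by this commit.
os.environ["LLM_ROUTER_SERVER_TYPE"] = "gunicorn"        # flask | gunicorn | waitress
os.environ["LLM_ROUTER_SERVER_PORT"] = "8080"
os.environ["LLM_ROUTER_SERVER_WORKERS_COUNT"] = "2"
os.environ["LLM_ROUTER_BALANCE_STRATEGY"] = "weighted"   # balanced | weighted | dynamic_weighted (beta) | first_available | first_available_optim
os.environ["LLM_ROUTER_REDIS_HOST"] = "localhost"        # only needed when a model has multiple providers
os.environ["LLM_ROUTER_REDIS_PORT"] = "6379"
os.environ["LLM_ROUTER_USE_PROMETHEUS"] = "True"         # registers a /metrics endpoint

# The router itself would then be launched with the project's usual entry point,
# which is not shown here because it is outside the scope of this diff.
```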
### 4️⃣ Run the REST API
@@ -208,7 +208,7 @@ and reliable routing of requests. The available strategies are:
   characteristics, and you want to prioritize certain providers without needing dynamic adjustments.
 * **Implementation:** Implemented in `llm_router_api.base.lb.weighted.WeightedStrategy`.

-### 3. `dynamic_weighted`
+### 3. `dynamic_weighted` (beta)

 * **Description:** An extension of the `weighted` strategy. It not only uses weights
   but also tracks the latency between successive selections of the same provider.
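The hunk above only renames the heading, but as a rough illustration of the idea described here (static weights combined with observed latency), a toy latency‑damped weighted pick could look like the sketch below. This is an assumption‑laden example, not the project's implementation in `llm_router_api.base.lb`, and the provider fields are invented for the illustration.

```python
import random

# Toy "dynamic weighted" selection: start from static weights and reduce the
# effective weight of providers whose most recent call was slow.
# The dictionary fields are invented for this example.
providers = [
    {"name": "provider_a", "weight": 3, "last_latency_s": 0.4},
    {"name": "provider_b", "weight": 1, "last_latency_s": 2.5},
]

def dynamic_weighted_pick(candidates):
    # Effective weight shrinks as recent latency grows, so a heavily weighted
    # but currently slow provider is picked less often.
    effective = [c["weight"] / (1.0 + c["last_latency_s"]) for c in candidates]
    return random.choices(candidates, weights=effective, k=1)[0]

print(dynamic_weighted_pick(providers)["name"])
```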