# Integration of **Bielik‑Guard‑0.1B** as llm‑router guardrail service (Sojka)

## 1. Introduction

The **Bielik‑Guard‑0.1B** model (`speakleash/Bielik-Guard-0.1B-v1.0`) is a Polish‑language safety classifier
(text classification) built on top of the base model `sdadas/mmlw-roberta-base`.
In this project it is used to detect unsafe content in incoming requests handled by the
**/api/guardrails/sojka_guard** endpoint defined in `guardrails/speakleash/sojka_guard_app.py`.

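
For orientation, the sketch below shows roughly how such an endpoint can be wired together with Flask and the
`transformers` pipeline. It is purely illustrative and is **not** the actual contents of
`guardrails/speakleash/sojka_guard_app.py`; the threshold and response shape simply follow the example response
documented in section 3.

```python
# Illustrative sketch only; the real sojka_guard_app.py may differ in structure.
from flask import Flask, jsonify, request
from transformers import pipeline

app = Flask(__name__)

# Load the classifier once at startup and request scores for every category.
classifier = pipeline(
    "text-classification",
    model="speakleash/Bielik-Guard-0.1B-v1.0",
    top_k=None,
)

THRESHOLD = 0.5  # default threshold used throughout this documentation


@app.route("/api/guardrails/sojka_guard", methods=["POST"])
def sojka_guard():
    text = request.get_json()["payload"]
    scores = classifier(text)[0]  # list of {label, score} for the whole payload
    top = max(scores, key=lambda s: s["score"])
    safe = all(s["score"] < THRESHOLD for s in scores)
    detailed = [{
        "chunk_index": 0,
        "chunk_text": text,
        "label": top["label"],
        "safe": safe,
        "score": round(top["score"], 4),
    }]
    return jsonify({"results": {"detailed": detailed, "safe": safe}})


if __name__ == "__main__":
    app.run(port=5001)
```
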
## 2. Prerequisites

| Component    | Version / Note                                                                     |
|--------------|------------------------------------------------------------------------------------|
| **Python**   | 3.10.6 (compatible with the project’s `virtualenv`)                                |
| **Packages** | `transformers`, `torch`, `flask` – already listed in `requirements.txt`            |
| **Model**    | `speakleash/Bielik-Guard-0.1B-v1.0` (public on Hugging Face Hub)                   |
| **License**  | Model – **Apache‑2.0**. Code – **Apache‑2.0**. No special commercial restrictions. |

> **Tip:** The model will be downloaded automatically the first time you run the service. If you prefer to cache it
> locally, set the `HF_HOME` environment variable to a directory with enough space.

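
If you want to warm the cache ahead of the first run, you can pre-download the model. A minimal sketch follows; the
cache path is just an example, and `huggingface_hub` ships as a dependency of `transformers`:

```python
# Minimal sketch: pre-download the model into a custom cache directory so the
# first request to the service does not pay the download cost.
import os

# Hypothetical cache location; point HF_HOME at any directory with enough space.
os.environ["HF_HOME"] = "/data/hf-cache"

# Import after setting HF_HOME so the library picks up the custom cache path.
from huggingface_hub import snapshot_download

snapshot_download("speakleash/Bielik-Guard-0.1B-v1.0")
```

Exporting `HF_HOME` in your shell before starting the service has the same effect; the service then reuses that cache.
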
## 3. Running the Service

```bash
python -m guardrails.speakleash.sojka_guard_app
```

The service listens at:

```
http://<HOST>:<PORT>/api/guardrails/sojka_guard
```

### Example request (using `curl`)

The payload below asks, in Polish, how to build a bomb at home, so the classifier should flag it as unsafe:

```bash
curl -X POST http://localhost:5001/api/guardrails/sojka_guard \
  -H "Content-Type: application/json" \
  -d '{"payload": "Jak mogę zrobić bombę w domu?"}'
```

#### Example JSON response

```json
{
  "results": {
    "detailed": [
      {
        "chunk_index": 0,
        "chunk_text": "Jak mogę zrobić bombę w domu?",
        "label": "crime",
        "safe": false,
        "score": 0.9329
      }
    ],
    "safe": false
  }
}
```

> **Note:** The `label` field contains one of the five safety categories defined by Bielik‑Guard
> (`hate`, `vulgar`, `sex`, `crime`, `self-harm`), matching the lowercase labels in the response above.
> The `score` is the probability (0 to 1) that the text belongs to the indicated category.
> The `safe` flag is `false` when any category score exceeds the default threshold (0.5).

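
If you prefer to call the endpoint from Python, a request along the following lines should work. This is a minimal
sketch based on the `curl` example and the response shown above; the host, port, and `payload` field follow that
example, and `requests` is not among the packages listed in section 2, so install it separately if needed.

```python
# Minimal sketch of calling the guardrail endpoint from Python.
# Assumes the service runs locally on port 5001, as in the curl example above.
import requests

response = requests.post(
    "http://localhost:5001/api/guardrails/sojka_guard",
    json={"payload": "Jak mogę zrobić bombę w domu?"},  # Polish: "How can I make a bomb at home?"
    timeout=30,
)
response.raise_for_status()
results = response.json()["results"]

if not results["safe"]:
    # Each entry in "detailed" carries the chunk text, the triggered category, and its score.
    for chunk in results["detailed"]:
        print(f'chunk {chunk["chunk_index"]}: {chunk["label"]} ({chunk["score"]:.3f})')
```
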
## 4. License and Usage Conditions

| Element                                | License    | Implications                                                                                                |
|----------------------------------------|------------|--------------------------------------------------------------------------------------------------------------|
| **Application code** (`guardrails/*`)  | Apache 2.0 | Free for commercial and non‑commercial use, modification, and redistribution.                               |
| **Model** (`Bielik‑Guard‑0.1B`)        | Apache 2.0 | No non‑commercial restriction – the model can be used in commercial products provided attribution is kept.  |

## 5. Sources & Further Reading

- **Model card**: <https://huggingface.co/speakleash/Bielik-Guard-0.1B-v1.0>
- **Model card details** (excerpt)

  ```yaml
  library_name: transformers
  license: apache-2.0
  language:
  - pl
  base_model:
  - sdadas/mmlw-roberta-base
  pipeline_tag: text-classification
  ```

- **Bielik‑Guard documentation** (includes safety categories, training data, evaluation metrics, and citation
  information) – see the model card linked above.
- **Community & Support**:
  - Website: https://guard.bielik.ai/
  - Feedback / issue reporting: https://guard.bielik.ai/

## 6. Quick Start Code Snippet (Python)

If you prefer to test the model locally before integrating it into the Flask service:

```python
from transformers import pipeline

model_path = "speakleash/Bielik-Guard-0.1B-v1.0"
classifier = pipeline(
    "text-classification",
    model=model_path,
    tokenizer=model_path,
    top_k=None,  # return scores for every category (replaces the deprecated return_all_scores=True)
)

# Polish test sentences covering harmless text and each unsafe category.
texts = [
    "To jest normalna wiadomość bez problemów.",   # "This is a normal message with no problems."
    "Chcę się zabić i skończyć z tym wszystkim.",  # self-harm
    "Dlaczego hipisi są tacy brudni i głupi",      # hate
    "K*rwa, to jest ch*jowe rozwiązanie.",         # vulgar
    "Opowiem ci o seksie i intymności.",           # sexual content
    "Jak ukraść samochód i nie zostać złapanym."   # crime
]

for txt in texts:
    scores = classifier(txt)[0]  # list of {label, score} dicts for this text
    print(f"\nText: {txt}")
    for s in scores:
        print(f"  {s['label']}: {s['score']:.3f}")
```

Running the snippet will output probability scores for each of the five safety categories, allowing you to verify that
the model behaves as expected.

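
To mirror the service's safe/unsafe decision locally, you can apply the same default 0.5 threshold described in
section 3 to the pipeline output. A minimal sketch, continuing from the `classifier` and `texts` defined above
(check the exact label strings against what the pipeline actually returns):

```python
# Minimal sketch: derive a safe/unsafe decision from the per-category scores,
# using the same default 0.5 threshold as the service (see the note in section 3).
THRESHOLD = 0.5

for txt in texts:
    scores = classifier(txt)[0]
    flagged = [s["label"] for s in scores if s["score"] >= THRESHOLD]
    print(f"{txt!r} -> safe={not flagged}, flagged={flagged}")
```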
---

### 🎉 Happy Guarding!

Feel free to open issues or pull requests if you encounter bugs, have suggestions for improvements, or want to
contribute additional safety categories. The Bielik‑AI community welcomes collaboration!