Istio RAG Service

A Kubernetes-based Retrieval-Augmented Generation (RAG) service that scrapes social media content, indexes it in a vector database, and provides an API for querying with an LLM.

This service has been enhanced with modern Python development practices including:

  • Configuration management using environment variables
  • Comprehensive logging with detailed error reporting
  • Rate limiting for API calls
  • Data deduplication using SHA checksums (sketched below)
  • Concurrent processing for improved performance
  • Batch processing for efficient indexing
  • Pydantic models for API response validation
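
As a concrete illustration of the checksum-based deduplication, the sketch below shows roughly how duplicate posts can be dropped before indexing. The helper names and the stored checksum field are illustrative, not the exact code in services/scraper_service:

import hashlib

def post_checksum(text: str) -> str:
    """Stable SHA-256 digest of a post's text, used to detect duplicates."""
    return hashlib.sha256(text.strip().lower().encode("utf-8")).hexdigest()

def deduplicate(posts: list[dict]) -> list[dict]:
    """Keep only the first occurrence of each distinct post in a batch."""
    seen: set[str] = set()
    unique: list[dict] = []
    for post in posts:
        digest = post_checksum(post["text"])
        if digest in seen:
            continue  # duplicate content, skip it
        seen.add(digest)
        post["checksum"] = digest  # persisted with the point so later runs can skip it
        unique.append(post)
    return unique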

Architecture

The system consists of multiple microservices orchestrated with Kubernetes and Istio service mesh:

  1. Scraper Service: Collects social media posts from Twitter, Threads, and Bluesky
  2. RAG Service: Handles user queries by retrieving relevant documents and generating responses
  3. Qdrant: Vector database for storing and searching document embeddings
  4. vLLM: High-throughput LLM inference engine
  5. Embedding Service: Generates vector embeddings for text (referenced but not included in this repo)

The Kubernetes configuration includes:

  • Namespace and RBAC definitions
  • Istio Gateway and security configurations
  • Network policies for secure communication
  • Virtual services for traffic routing
  • Observability components (Prometheus, Grafana, Jaeger, Kiali)
  • Service definitions for all components

Services

Scraper Service

  • Collects posts from multiple social media platforms
  • Generates embeddings for posts using the embedding service
  • Stores posts and their embeddings in Qdrant

RAG Service

  • Accepts user queries
  • Retrieves relevant documents from Qdrant
  • Generates responses using vLLM (see the sketch below)
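
The query path through the RAG service can be pictured with the minimal sketch below. The in-cluster URLs, the embedding endpoint, the payload field names, and the prompt format are placeholder assumptions for illustration, not the exact code in services/rag_service:

import httpx

QDRANT_URL = "http://qdrant:6333"        # placeholder in-cluster addresses
EMBEDDING_URL = "http://embedding:8080"
VLLM_URL = "http://vllm:8000"

async def answer(query: str, max_results: int = 5) -> str:
    async with httpx.AsyncClient(timeout=30) as client:
        # 1. Embed the user query via the embedding service (endpoint shape assumed)
        emb_resp = await client.post(f"{EMBEDDING_URL}/embed", json={"text": query})
        vector = emb_resp.json()["embedding"]

        # 2. Retrieve the most similar posts from Qdrant
        search = await client.post(
            f"{QDRANT_URL}/collections/posts/points/search",
            json={"vector": vector, "limit": max_results, "with_payload": True},
        )
        context = "\n".join(hit["payload"]["text"] for hit in search.json()["result"])

        # 3. Generate an answer with vLLM's OpenAI-compatible completions API
        completion = await client.post(
            f"{VLLM_URL}/v1/completions",
            json={
                "model": "facebook/opt-125m",
                "prompt": f"Context:\n{context}\n\nQuestion: {query}\nAnswer:",
                "max_tokens": 500,
            },
        )
        return completion.json()["choices"][0]["text"]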

Prerequisites

For Kubernetes Deployment

  • A Kubernetes cluster (the Makefile targets assume minikube)
  • kubectl for applying the manifests in k8s/
  • Istio (installed via make install-istio)
  • Docker for building the service images

For Local Development with Docker Compose

  • Docker and Docker Compose

Project Architecture

graph TD
    A[Client] --> B[Istio Gateway]
    B --> C[VirtualService]
    C --> D[RAG Service]
    C --> E[Scraper Service]
    
    D --> F[Qdrant DB]
    D --> G[vLLM Service]
    D --> H[Embedding Service]
    
    E --> F
    E --> H
    E --> I[Twitter API]
    E --> J[Threads API]
    E --> K[Bluesky API]
    
    subgraph Kubernetes Cluster
        B
        C
        D
        E
        F
        G
        H
    end
    
    subgraph External Services
        I
        J
        K
    end

Local Development with Docker Compose

For local development without Kubernetes, you can use Docker Compose:

# Start all services
docker-compose up -d

# Check service status
docker-compose ps

# View logs
docker-compose logs -f

# Stop services
docker-compose down

Note: The docker-compose setup uses a lightweight model (facebook/opt-125m) for local development. For production, you would use a more capable model.

Development Setup

Using UV (Recommended)

  1. Install UV (Python package manager)

  2. Install dependencies for each service:

    # Install base dependencies
    uv pip install -e ./services/rag_service -e ./services/scraper_service
    
    # Or install with specific extras
    uv pip install -e ./services/rag_service[test] -e ./services/scraper_service[test]  # For testing
    uv pip install -e ./services/rag_service[security] -e ./services/scraper_service[security]  # For security scanning
    uv pip install -e ./services/rag_service[all] -e ./services/scraper_service[all]  # All dependencies
  3. UV manages a virtual environment automatically, so there is no need to activate or deactivate one

Traditional pip Setup

  1. Create a virtual environment:
    python -m venv .venv
    source .venv/bin/activate
  2. Install dependencies for each service:
    # Install base dependencies
    pip install -e ./services/rag_service -e ./services/scraper_service
    
    # Or install with specific extras
    pip install -e ./services/rag_service[test] -e ./services/scraper_service[test]  # For testing
    pip install -e ./services/rag_service[security] -e ./services/scraper_service[security]  # For security scanning
    pip install -e ./services/rag_service[all] -e ./services/scraper_service[all]  # All dependencies

Note: Dependencies are now managed through pyproject.toml files with optional dependency groups. Separate requirements.txt files have been removed.

Installing Test Dependencies

To run tests, install the test dependencies:

# Using UV (recommended)
uv pip install -e ./services/rag_service[test] -e ./services/scraper_service[test]

# Or using traditional pip
pip install -e ./services/rag_service[test] -e ./services/scraper_service[test]

Deployment

Option 1: Using Makefile (Recommended)

# Start minikube with appropriate resources
make start-minikube

# Install Istio
make install-istio

# Build container images
make microservice-container-build

# Deploy all services
make deploy

# Check deployment status
make health-check

Option 2: Using Deployment Script

./deploy-rag.sh

Option 3: Manual Deployment

  1. Apply Kubernetes configurations in order:

    kubectl apply -f k8s/0-namespace.yaml
    kubectl apply -f k8s/1-istio-mtls.yaml
    kubectl apply -f k8s/2-vllm.yaml
    kubectl apply -f k8s/3-qdrant.yaml
    kubectl apply -f k8s/4-scraper-service.yaml
    kubectl apply -f k8s/5-rag-service.yaml
    kubectl apply -f k8s/6-observability.yaml
    kubectl apply -f k8s/7-network-policies.yaml
    kubectl apply -f k8s/8-virtual-services.yaml
  2. Build and push Docker images for the services:

    docker build -t rag-service ./services/rag_service
    docker build -t scraper-service ./services/scraper_service
  3. Update the Kubernetes deployment files with the correct image names.

  4. Initialize the Qdrant collection:

    kubectl exec -it -n rag-system deploy/qdrant -- curl -X PUT "http://localhost:6333/collections/posts" -H "Content-Type: application/json" -d '{
        "vectors": {
            "size": 384,
            "distance": "Cosine"
        }
    }'
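
Equivalently, the collection can be created from Python with the qdrant-client package. This is an optional sketch, not part of the repo; it assumes Qdrant is reachable locally, for example via kubectl port-forward -n rag-system deploy/qdrant 6333:6333:

from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams

# Assumes a local port-forward to the in-cluster Qdrant instance
client = QdrantClient(url="http://localhost:6333")

# Same collection parameters as the curl command above
client.create_collection(
    collection_name="posts",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
)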

API Endpoints

Scraper Service

  • POST /scrape - Start scraping social media platforms for posts matching a query
    {
      "query": "technology",
      "platforms": ["twitter", "bluesky", "threads"]
    }

RAG Service

  • POST /query - Query the RAG system with a question
    {
      "query": "What are the latest trends in AI?",
      "max_results": 5
    }
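
For example, once the Istio gateway is reachable, both endpoints can be exercised from Python as in the sketch below (the base URL is a placeholder for your ingress address):

import requests

BASE = "http://<istio-ingress-host>"  # placeholder: substitute your gateway address

# Start scraping posts about a topic across all supported platforms
requests.post(
    f"{BASE}/scrape",
    json={"query": "technology", "platforms": ["twitter", "bluesky", "threads"]},
    timeout=30,
).raise_for_status()

# Ask the RAG service a question and print the generated answer payload
resp = requests.post(
    f"{BASE}/query",
    json={"query": "What are the latest trends in AI?", "max_results": 5},
    timeout=60,
)
resp.raise_for_status()
print(resp.json())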

Configuration

Environment Variables

The services support comprehensive configuration through environment variables. All settings can also be configured via .env files.

Scraper Service Configuration:

  • TWITTER_TOKEN - Bearer token for Twitter API v2 access
  • TWITTER_MAX_REQUESTS - Maximum Twitter API requests per time window (default: 300)
  • TWITTER_TIME_WINDOW - Twitter rate-limit window in seconds (default: 900)
  • THREADS_MAX_REQUESTS - Maximum Threads API requests per time window (default: 100)
  • THREADS_TIME_WINDOW - Threads rate-limit window in seconds (default: 3600)
  • BLUESKY_MAX_REQUESTS - Maximum Bluesky API requests per time window (default: 3000)
  • BLUESKY_TIME_WINDOW - Bluesky rate-limit window in seconds (default: 300)
  • BATCH_SIZE - Number of posts to process per batch (default: 10)
  • HTTP_TIMEOUT - HTTP request timeout in seconds (default: 30)
  • LOG_LEVEL - Logging level (default: INFO)

RAG Service Configuration:

  • HTTP_TIMEOUT - HTTP request timeout in seconds (default: 30)
  • MAX_TOKENS - Maximum tokens for LLM responses (default: 500)
  • LOG_LEVEL - Logging level (default: INFO)
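
As an illustration, these variables map naturally onto a Pydantic settings class like the sketch below (field names mirror the variables above; the actual settings classes in the services may differ):

from pydantic_settings import BaseSettings, SettingsConfigDict

class RagSettings(BaseSettings):
    """Loads RAG service configuration from the environment or a .env file."""
    model_config = SettingsConfigDict(env_file=".env")

    http_timeout: int = 30   # HTTP_TIMEOUT
    max_tokens: int = 500    # MAX_TOKENS
    log_level: str = "INFO"  # LOG_LEVEL

# Environment variables override the defaults, e.g. MAX_TOKENS=800
settings = RagSettings()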

Kubernetes Secrets

Create a Kubernetes secret for API tokens:

kubectl create secret generic api-tokens -n rag-system \
  --from-literal=TWITTER_TOKEN=your_twitter_bearer_token

Monitoring and Observability

The system includes observability configurations for monitoring the services:

  • Prometheus metrics collection
  • Grafana dashboards
  • Jaeger distributed tracing
  • Kiali service mesh visualization

Security Scanning

The project includes automated security scanning in the CI pipeline:

  • Bandit: Static analysis for common Python security issues
  • Safety: Checks dependencies for known security vulnerabilities

To run security scans locally:

# Using UV (recommended)
uv pip install -e ./services/rag_service[security] -e ./services/scraper_service[security]

# Or using traditional pip
pip install -e ./services/rag_service[security] -e ./services/scraper_service[security]

# Run bandit scan
bandit -r services/

# Run safety check
# Note: safety checks pinned versions; since requirements.txt files were removed,
# export one first, e.g. pip freeze > requirements.txt
safety check -r requirements.txt

Running Tests

To run the tests for both services:

# Run all tests with UV (recommended)
uv run python run_tests.py

# Or run tests manually with pytest using UV
uv run python -m pytest tests/ -v

# Run tests for a specific service with UV
uv run python -m pytest tests/test_rag_service.py -v
uv run python -m pytest tests/test_scraper_service.py -v

# Or use traditional methods
python run_tests.py
pytest tests/ -v
pytest tests/test_rag_service.py -v
pytest tests/test_scraper_service.py -v

# Check test file syntax (without running tests)
python check_test_syntax.py

Configuration Verification

Before deploying, you can verify that all configuration files are properly structured:

# Using UV (recommended)
uv pip install -e ./services/rag_service[verify] -e ./services/scraper_service[verify]
python verify-config.py

# Or using traditional pip
pip install -e ./services/rag_service[verify] -e ./services/scraper_service[verify]
python verify-config.py

# Or install all dependencies
uv pip install -e ./services/rag_service[all] -e ./services/scraper_service[all]
python verify-config.py

Troubleshooting

  • Check pod status: kubectl get pods -n rag-system
  • Check service logs: kubectl logs -n rag-system deploy/<service-name>
  • Check service status: kubectl get services -n rag-system
  • Check Istio sidecar status: istioctl proxy-status
  • Check virtual services: kubectl get virtualservices -n rag-system
  • Check destination rules: kubectl get destinationrules -n rag-system
