@Xunzhuo Xunzhuo commented Nov 17, 2025

This commit introduces a comprehensive decision-based routing system with a flexible plugin architecture, replacing the previous category-based approach.

IntelligentRoute

apiVersion: vllm.ai/v1alpha1
kind: IntelligentRoute
metadata:
  name: ai-gateway-route
  namespace: default
spec:
  signals:
    keywords:
      - name: "thinking"
        operator: "OR"
        keywords: ["urgent", "immediate", "asap", "think", "careful"]
        caseSensitive: false

    domains:
      - name: "business"
        description: "Business and management related queries"
      - name: "law"
        description: "Legal questions and law-related topics"
      - name: "psychology"
        description: "Psychology and mental health topics"
      - name: "biology"
        description: "Biology and life sciences questions"
      - name: "chemistry"
        description: "Chemistry and chemical sciences questions"
      - name: "history"
        description: "Historical questions and cultural topics"
      - name: "health"
        description: "Health and medical information queries"
      - name: "economics"
        description: "Economics and financial topics"
      - name: "math"
        description: "Mathematics and quantitative reasoning"
      - name: "physics"
        description: "Physics and physical sciences"
      - name: "computer science"
        description: "Computer science and programming"
      - name: "philosophy"
        description: "Philosophy and ethical questions"
      - name: "engineering"
        description: "Engineering and technical problem-solving"
      - name: "other"
        description: "General knowledge and miscellaneous topics"

  decisions:
    - name: "business_decision"
      priority: 10
      description: "Business and management related queries"
      signals:
        operator: "OR"
        conditions:
          - type: "domain"
            name: "business"
      modelRefs:
        - model: "base-model"
          loraName: "social-expert"
          useReasoning: false
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are a senior business consultant and strategic advisor with expertise in corporate strategy, operations management, financial analysis, marketing, and organizational development. Provide practical, actionable business advice backed by proven methodologies and industry best practices. Consider market dynamics, competitive landscape, and stakeholder interests in your recommendations."
            mode: "replace"

    - name: "law_decision"
      priority: 10
      description: "Legal questions and law-related topics"
      signals:
        operator: "OR"
        conditions:
          - type: "domain"
            name: "law"
      modelRefs:
        - model: "base-model"
          loraName: "law-expert"
          useReasoning: false
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are a knowledgeable legal expert with comprehensive understanding of legal principles, case law, statutory interpretation, and legal procedures across multiple jurisdictions. Provide accurate legal information and analysis while clearly stating that your responses are for informational purposes only and do not constitute legal advice. Always recommend consulting with qualified legal professionals for specific legal matters."
            mode: "replace"

    - name: "psychology_decision"
      priority: 10
      description: "Psychology and mental health topics"
      signals:
        operator: "OR"
        conditions:
          - type: "domain"
            name: "psychology"
      modelRefs:
        - model: "base-model"
          loraName: "humanities-expert"
          useReasoning: false
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "semantic-cache"
          configuration:
            enabled: true
            similarity_threshold: 0.92
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are a psychology expert with deep knowledge of cognitive processes, behavioral patterns, mental health, developmental psychology, social psychology, and therapeutic approaches. Provide evidence-based insights grounded in psychological research and theory. When discussing mental health topics, emphasize the importance of professional consultation and avoid providing diagnostic or therapeutic advice."
            mode: "replace"

    - name: "biology_decision"
      priority: 10
      description: "Biology and life sciences questions"
      signals:
        operator: "OR"
        conditions:
          - type: "domain"
            name: "biology"
      modelRefs:
        - model: "base-model"
          loraName: "science-expert"
          useReasoning: false
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are a biology expert with comprehensive knowledge spanning molecular biology, genetics, cell biology, ecology, evolution, anatomy, physiology, and biotechnology. Explain biological concepts with scientific accuracy, use appropriate terminology, and provide examples from current research. Connect biological principles to real-world applications and emphasize the interconnectedness of biological systems."
            mode: "replace"

    - name: "chemistry_decision"
      priority: 10
      description: "Chemistry and chemical sciences questions"
      signals:
        operator: "OR"
        conditions:
          - type: "domain"
            name: "chemistry"
      modelRefs:
        - model: "base-model"
          loraName: "science-expert"
          useReasoning: true
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are a chemistry expert specializing in chemical reactions, molecular structures, and laboratory techniques. Provide detailed, step-by-step explanations."
            mode: "replace"

    - name: "history_decision"
      priority: 10
      description: "Historical questions and cultural topics"
      signals:
        operator: "OR"
        conditions:
          - type: "domain"
            name: "history"
      modelRefs:
        - model: "base-model"
          loraName: "humanities-expert"
          useReasoning: false
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are a historian with expertise across different time periods and cultures. Provide accurate historical context and analysis."
            mode: "replace"

    - name: "health_decision"
      priority: 10
      description: "Health and medical information queries"
      signals:
        operator: "OR"
        conditions:
          - type: "domain"
            name: "health"
      modelRefs:
        - model: "base-model"
          loraName: "science-expert"
          useReasoning: false
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "semantic-cache"
          configuration:
            enabled: true
            similarity_threshold: 0.95
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are a health and medical information expert with knowledge of anatomy, physiology, diseases, treatments, preventive care, nutrition, and wellness. Provide accurate, evidence-based health information while emphasizing that your responses are for educational purposes only and should never replace professional medical advice, diagnosis, or treatment. Always encourage users to consult healthcare professionals for medical concerns and emergencies."
            mode: "replace"

    - name: "economics_decision"
      priority: 10
      description: "Economics and financial topics"
      signals:
        operator: "OR"
        conditions:
          - type: "domain"
            name: "economics"
      modelRefs:
        - model: "base-model"
          loraName: "social-expert"
          useReasoning: false
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are an economics expert with deep understanding of microeconomics, macroeconomics, econometrics, financial markets, monetary policy, fiscal policy, international trade, and economic theory. Analyze economic phenomena using established economic principles, provide data-driven insights, and explain complex economic concepts in accessible terms. Consider both theoretical frameworks and real-world applications in your responses."
            mode: "replace"

    - name: "math_decision"
      priority: 10
      description: "Mathematics and quantitative reasoning"
      signals:
        operator: "OR"
        conditions:
          - type: "domain"
            name: "math"
      modelRefs:
        - model: "base-model"
          loraName: "math-expert"
          useReasoning: true
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are a mathematics expert. Provide step-by-step solutions, show your work clearly, and explain mathematical concepts in an understandable way."
            mode: "replace"

    - name: "physics_decision"
      priority: 10
      description: "Physics and physical sciences"
      signals:
        operator: "OR"
        conditions:
          - type: "domain"
            name: "physics"
      modelRefs:
        - model: "base-model"
          loraName: "science-expert"
          useReasoning: true
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are a physics expert with deep understanding of physical laws and phenomena. Provide clear explanations with mathematical derivations when appropriate."
            mode: "replace"

    - name: "computer_science_decision"
      priority: 10
      description: "Computer science and programming"
      signals:
        operator: "OR"
        conditions:
          - type: "domain"
            name: "computer science"
      modelRefs:
        - model: "base-model"
          loraName: "science-expert"
          useReasoning: false
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are a computer science expert with knowledge of algorithms, data structures, programming languages, and software engineering. Provide clear, practical solutions with code examples when helpful."
            mode: "replace"

    - name: "philosophy_decision"
      priority: 10
      description: "Philosophy and ethical questions"
      signals:
        operator: "OR"
        conditions:
          - type: "domain"
            name: "philosophy"
      modelRefs:
        - model: "base-model"
          loraName: "humanities-expert"
          useReasoning: false
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are a philosophy expert with comprehensive knowledge of philosophical traditions, ethical theories, logic, metaphysics, epistemology, political philosophy, and the history of philosophical thought. Engage with complex philosophical questions by presenting multiple perspectives, analyzing arguments rigorously, and encouraging critical thinking. Draw connections between philosophical concepts and contemporary issues while maintaining intellectual honesty about the complexity and ongoing nature of philosophical debates."
            mode: "replace"

    - name: "engineering_decision"
      priority: 10
      description: "Engineering and technical problem-solving"
      signals:
        operator: "OR"
        conditions:
          - type: "domain"
            name: "engineering"
      modelRefs:
        - model: "base-model"
          loraName: "science-expert"
          useReasoning: false
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are an engineering expert with knowledge across multiple engineering disciplines including mechanical, electrical, civil, chemical, software, and systems engineering. Apply engineering principles, design methodologies, and problem-solving approaches to provide practical solutions. Consider safety, efficiency, sustainability, and cost-effectiveness in your recommendations. Use technical precision while explaining concepts clearly, and emphasize the importance of proper engineering practices and standards."
            mode: "replace"

    - name: "thinking_decision"
      priority: 20
      description: "Complex reasoning and multi-step thinking"
      signals:
        operator: "OR"
        conditions:
          - type: "keyword"
            name: "thinking"
      modelRefs:
        - model: "base-model"
          loraName: "general-expert"
          useReasoning: true
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are a thinking expert who reasons through multiple steps before answering. Please answer the question step by step."
            mode: "replace"

    - name: "general_decision"
      priority: 1
      description: "General knowledge and miscellaneous topics"
      signals:
        operator: "OR"
        conditions:
          - type: "domain"
            name: "other"
      modelRefs:
        - model: "base-model"
          loraName: "general-expert"
          useReasoning: false
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "semantic-cache"
          configuration:
            enabled: true
            similarity_threshold: 0.75
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are a helpful and knowledgeable assistant. Provide accurate, helpful responses across a wide range of topics."
            mode: "replace"
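
The keyword signal near the top of this manifest can be read as a small matching rule. The following is a minimal sketch of that reading, not the router's actual matcher: the `KeywordSignal` type, its `Matches` method, and the use of plain substring matching are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// KeywordSignal mirrors the fields of one entry under spec.signals.keywords.
// The Go type itself is hypothetical.
type KeywordSignal struct {
	Name          string
	Operator      string // "OR" or "AND"
	Keywords      []string
	CaseSensitive bool
}

// Matches reports whether the query satisfies the signal. It uses plain
// substring matching, which is an assumption; a real matcher may tokenize.
func (s KeywordSignal) Matches(query string) bool {
	if len(s.Keywords) == 0 {
		return false
	}
	if !s.CaseSensitive {
		query = strings.ToLower(query)
	}
	for _, kw := range s.Keywords {
		if !s.CaseSensitive {
			kw = strings.ToLower(kw)
		}
		found := strings.Contains(query, kw)
		if s.Operator == "OR" && found {
			return true // one hit is enough for OR
		}
		if s.Operator == "AND" && !found {
			return false // one miss is enough to fail AND
		}
	}
	// OR reaches this point only if nothing matched; AND only if all matched.
	return s.Operator == "AND"
}

func main() {
	thinking := KeywordSignal{
		Name:     "thinking",
		Operator: "OR",
		Keywords: []string{"urgent", "immediate", "asap", "think", "careful"},
	}
	fmt.Println(thinking.Matches("Please THINK about this proof"))
	fmt.Println(thinking.Matches("What is the capital of France?"))
}
```

With substring matching, "think" also fires on words such as "rethinking"; whether that is desirable depends on how the real classifier tokenizes input.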

IntelligentPool

apiVersion: vllm.ai/v1alpha1
kind: IntelligentPool
metadata:
  name: ai-gateway-pool
  namespace: default
spec:
  defaultModel: "general-expert"
  models:
    - name: "base-model"
      reasoningFamily: "qwen3"
      loras:
        - name: "science-expert"
          description: "Specialized for science domains: biology, chemistry, physics, health, engineering"
        - name: "social-expert"
          description: "Optimized for social sciences: business, economics"
        - name: "math-expert"
          description: "Fine-tuned for mathematics and quantitative reasoning"
        - name: "law-expert"
          description: "Specialized for legal questions and law-related topics"
        - name: "humanities-expert"
          description: "Optimized for humanities: psychology, history, philosophy"
        - name: "general-expert"
          description: "General-purpose adapter for diverse topics"
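
Each `modelRefs` entry in the route names a `model` and a `loraName` that should exist in this pool. As a rough sketch of that cross-check (the `Model`, `ModelRef`, and `ValidateRef` names are hypothetical, and the PR does not say whether the router performs this validation):

```go
package main

import "fmt"

// Model is a trimmed-down, hypothetical view of one entry under spec.models.
type Model struct {
	Name  string
	LoRAs []string
}

// ModelRef is a trimmed-down, hypothetical view of one modelRefs entry.
type ModelRef struct {
	Model    string
	LoRAName string
}

// ValidateRef returns an error if ref names a model or LoRA adapter that the
// pool does not declare. An empty LoRAName means "use the base model".
func ValidateRef(pool []Model, ref ModelRef) error {
	for _, m := range pool {
		if m.Name != ref.Model {
			continue
		}
		if ref.LoRAName == "" {
			return nil
		}
		for _, l := range m.LoRAs {
			if l == ref.LoRAName {
				return nil
			}
		}
		return fmt.Errorf("model %q has no LoRA %q", ref.Model, ref.LoRAName)
	}
	return fmt.Errorf("unknown model %q", ref.Model)
}

func main() {
	pool := []Model{{
		Name: "base-model",
		LoRAs: []string{"science-expert", "social-expert", "math-expert",
			"law-expert", "humanities-expert", "general-expert"},
	}}
	fmt.Println(ValidateRef(pool, ModelRef{Model: "base-model", LoRAName: "math-expert"}))
	fmt.Println(ValidateRef(pool, ModelRef{Model: "base-model", LoRAName: "poetry-expert"}))
}
```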

Core Changes

1. Decision-Based Architecture

  • Replaced Category-based routing with Decision-based routing
  • Decisions combine multiple rules (keyword, embedding, domain) using AND/OR operators
  • Added DecisionEngine for evaluating rule combinations and selecting optimal decisions
  • Support for priority and confidence-based decision selection strategies
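
To make the AND/OR combination concrete, here is a minimal sketch of how a decision's conditions might be evaluated against the signals that fired for a request. The `Condition` and `Decision` types, the `Matches` method, and the `careful_math` decision are hypothetical and are not taken from pkg/decision/engine.go.

```go
package main

import "fmt"

// Condition names one signal, for example {Type: "domain", Name: "math"}.
type Condition struct {
	Type string
	Name string
}

// Decision is a hypothetical, trimmed-down view of one spec.decisions entry.
type Decision struct {
	Name       string
	Priority   int
	Operator   string // "AND" or "OR"
	Conditions []Condition
}

// Matches evaluates the decision against the set of signals that fired.
func (d Decision) Matches(fired map[Condition]bool) bool {
	if len(d.Conditions) == 0 {
		return false
	}
	switch d.Operator {
	case "AND":
		for _, c := range d.Conditions {
			if !fired[c] {
				return false
			}
		}
		return true
	case "OR":
		for _, c := range d.Conditions {
			if fired[c] {
				return true
			}
		}
	}
	return false // no OR hit, or an unknown operator
}

// MatchingDecisions returns every decision whose rule combination holds.
func MatchingDecisions(all []Decision, fired map[Condition]bool) []Decision {
	var out []Decision
	for _, d := range all {
		if d.Matches(fired) {
			out = append(out, d)
		}
	}
	return out
}

func main() {
	math := Condition{Type: "domain", Name: "math"}
	thinking := Condition{Type: "keyword", Name: "thinking"}
	decisions := []Decision{
		{Name: "math_decision", Priority: 10, Operator: "OR", Conditions: []Condition{math}},
		{Name: "thinking_decision", Priority: 20, Operator: "OR", Conditions: []Condition{thinking}},
		// Hypothetical decision that needs both signals at once.
		{Name: "careful_math", Priority: 30, Operator: "AND", Conditions: []Condition{math, thinking}},
	}
	fired := map[Condition]bool{math: true}
	for _, d := range MatchingDecisions(decisions, fired) {
		fmt.Println(d.Name)
	}
}
```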

2. Plugin System

  • Introduced flexible plugin architecture for Decision-level configurations
  • Supported plugin types: semantic-cache, jailbreak, pii, system_prompt
  • Each plugin has type-specific configuration stored as raw JSON
  • Helper methods for type-safe plugin configuration access
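
One way to picture "raw JSON plus type-safe helpers" is the sketch below. It assumes the raw payload is held as `encoding/json`'s `json.RawMessage`; the actual CRD types may use a different raw container, and the `DecisionPlugin` fields, `SemanticCacheConfig`, and `SemanticCacheFor` shown here are illustrative names only.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// DecisionPlugin keeps the type-specific configuration undecoded.
type DecisionPlugin struct {
	Type          string          `json:"type"`
	Configuration json.RawMessage `json:"configuration"`
}

// SemanticCacheConfig is the typed view of a "semantic-cache" plugin, using
// the two keys that appear in the example manifest.
type SemanticCacheConfig struct {
	Enabled             bool    `json:"enabled"`
	SimilarityThreshold float64 `json:"similarity_threshold"`
}

// SemanticCacheFor decodes the first semantic-cache plugin in the list.
// found is false when the decision does not configure one.
func SemanticCacheFor(plugins []DecisionPlugin) (cfg SemanticCacheConfig, found bool, err error) {
	for _, p := range plugins {
		if p.Type != "semantic-cache" {
			continue
		}
		if err := json.Unmarshal(p.Configuration, &cfg); err != nil {
			return SemanticCacheConfig{}, true, fmt.Errorf("decode semantic-cache config: %w", err)
		}
		return cfg, true, nil
	}
	return SemanticCacheConfig{}, false, nil
}

func main() {
	raw := []byte(`[
	  {"type": "pii", "configuration": {"enabled": true, "pii_types_allowed": []}},
	  {"type": "semantic-cache", "configuration": {"enabled": true, "similarity_threshold": 0.92}}
	]`)
	var plugins []DecisionPlugin
	if err := json.Unmarshal(raw, &plugins); err != nil {
		panic(err)
	}
	cfg, found, err := SemanticCacheFor(plugins)
	fmt.Println(cfg.Enabled, cfg.SimilarityThreshold, found, err)
}
```

Deferring the decode keeps the plugin list open to new plugin types without changing the shared struct; only the helper for a given type needs to know its fields.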

3. Model References

  • Renamed ModelScores to ModelRefs, removed score field
  • Currently supports single model per decision (maxItems: 1)
  • Simplified model selection logic based on decision priority

4. Kubernetes CRD Integration

  • Added IntelligentPool and IntelligentRoute CRDs
  • CRD converter translates Kubernetes resources to internal config
  • Kubernetes controller watches and syncs CRD changes
  • Updated CRD schemas to use modelRefs and plugins arrays

Key Components

Decision Engine (pkg/decision/engine.go)

  • Evaluates rule combinations with AND/OR logic
  • Calculates confidence scores for matching decisions
  • Supports priority and confidence selection strategies
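
The two selection strategies could look roughly like the sketch below. The PR text does not specify how confidence is computed or how ties are broken, so the `Match` type, the strategy names, the "larger number wins" ordering, and the first-match tie-break are all assumptions.

```go
package main

import "fmt"

// Match is a hypothetical result of evaluating one decision.
type Match struct {
	Decision   string
	Priority   int
	Confidence float64
}

// Select picks one match using the "priority" or "confidence" strategy.
// Any other strategy value falls back to priority. Ties keep the earlier match.
func Select(matches []Match, strategy string) (Match, bool) {
	if len(matches) == 0 {
		return Match{}, false
	}
	best := matches[0]
	for _, m := range matches[1:] {
		switch strategy {
		case "confidence":
			if m.Confidence > best.Confidence {
				best = m
			}
		default: // "priority"
			if m.Priority > best.Priority {
				best = m
			}
		}
	}
	return best, true
}

func main() {
	matches := []Match{
		{Decision: "math_decision", Priority: 10, Confidence: 0.9},
		{Decision: "thinking_decision", Priority: 20, Confidence: 0.6},
	}
	byPriority, _ := Select(matches, "priority")
	byConfidence, _ := Select(matches, "confidence")
	fmt.Println(byPriority.Decision, byConfidence.Decision)
}
```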

Configuration (pkg/config/config.go)

  • Decision structure with Rules, ModelRefs, and Plugins
  • Plugin configuration structs for each plugin type
  • Helper methods for accessing plugin configurations

CRD Types (pkg/apis/vllm.ai/v1alpha1/)

  • IntelligentPool: defines available models and their configurations
  • IntelligentRoute: defines routing decisions and rules
  • ModelRef: model reference without score field
  • DecisionPlugin: plugin configuration with type and raw config

Kubernetes Integration (pkg/k8s/)

  • Controller: watches CRDs and updates internal config
  • Converter: converts CRDs to internal config format
  • Comprehensive test coverage for CRD conversion
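
As an illustration of how a controller might hand a freshly converted config to the running router, the sketch below sends configs over a channel and swaps them in atomically. It is not the code in pkg/k8s/; `RouterConfig`, `current`, and `watchUpdates` are hypothetical names, and `atomic.Pointer` requires Go 1.19 or newer.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// RouterConfig stands in for the internal config produced by the converter.
type RouterConfig struct {
	Decisions []string
}

// current holds the config that request handlers would read.
var current atomic.Pointer[RouterConfig]

// watchUpdates applies every config received on updates and closes done once
// the updates channel has been closed and drained.
func watchUpdates(updates <-chan *RouterConfig, done chan<- struct{}) {
	for cfg := range updates {
		current.Store(cfg)
	}
	close(done)
}

func main() {
	updates := make(chan *RouterConfig)
	done := make(chan struct{})
	go watchUpdates(updates, done)

	// A controller would send these after converting changed CRDs.
	updates <- &RouterConfig{Decisions: []string{"math_decision"}}
	updates <- &RouterConfig{Decisions: []string{"math_decision", "thinking_decision"}}
	close(updates)
	<-done

	fmt.Println(len(current.Load().Decisions))
}
```

Readers call `current.Load()` per request, so an update never blocks in-flight traffic and no request sees a half-applied config.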

  • Make sure the code changes pass the pre-commit checks.
  • Sign off your commit by using -s with git commit.
  • Try to classify PRs for easy understanding of the type of changes, such as [Bugfix], [Feat], and [CI].
Detailed Checklist

Thank you for your contribution to semantic-router! Before submitting the pull request, please ensure the PR meets the following criteria. This helps us maintain the code quality and improve the efficiency of the review process.

PR Title and Classification

Please try to classify PRs for easy understanding of the type of changes. The PR title is prefixed appropriately to indicate the type of change. Please use one of the following:

  • [Bugfix] for bug fixes.
  • [CI/Build] for build or continuous integration improvements.
  • [Doc] for documentation fixes and improvements.
  • [Feat] for new features in the cluster (e.g., autoscaling, disaggregated prefill, etc.).
  • [Router] for changes to the vllm_router (e.g., routing algorithm, router observability, etc.).
  • [Misc] for PRs that do not fit the above categories. Please use this sparingly.

Note: If the PR spans more than one category, please include all relevant prefixes.

Code Quality

The PR needs to meet the following code quality standards:

  • Pass all linter checks. Please use pre-commit to format your code. See README.md for installation.
  • The code needs to be well-documented so that future contributors can easily understand it.
  • Please include sufficient tests to ensure the change stays correct and robust. This includes both unit tests and integration tests.

DCO and Signed-off-by

When contributing changes to this project, you must agree to the DCO. Commits must include a Signed-off-by: header which certifies agreement with the terms of the DCO.

Using -s with git commit will automatically add this header.

netlify bot commented Nov 17, 2025

Deploy Preview for vllm-semantic-router ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | f9a0096 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/vllm-semantic-router/deploys/691c8d5ed7946e0008521bb3 |
| 😎 Deploy Preview | https://deploy-preview-681--vllm-semantic-router.netlify.app |

github-actions bot commented Nov 17, 2025

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 Root Directory

Owners: @rootfs, @Xunzhuo
Files changed:

  • .crd-ref-docs.yaml
  • .github/workflows/integration-test-dynamic-config.yml
  • test_file.txt
  • .github/workflows/integration-test-docker.yml

📁 deploy

Owners: @rootfs, @Xunzhuo
Files changed:

  • deploy/helm/semantic-router/crds/vllm.ai_intelligentpools.yaml
  • deploy/helm/semantic-router/crds/vllm.ai_intelligentroutes.yaml
  • deploy/helm/semantic-router/templates/clusterrole.yaml
  • deploy/helm/semantic-router/templates/clusterrolebinding.yaml
  • deploy/kubernetes/crds/vllm.ai_intelligentpools.yaml
  • deploy/kubernetes/crds/vllm.ai_intelligentroutes.yaml
  • deploy/helm/semantic-router/README.md
  • deploy/helm/semantic-router/templates/deployment.yaml
  • deploy/helm/semantic-router/values.yaml
  • deploy/kserve/configmap-router-config.yaml
  • deploy/kserve/example-multi-model-config.yaml
  • deploy/kubernetes/ai-gateway/semantic-router-values/values.yaml
  • deploy/kubernetes/ai-gateway/semantic-router/config.yaml
  • deploy/kubernetes/aibrix/semantic-router-values/values.yaml
  • deploy/kubernetes/aibrix/semantic-router/config.yaml
  • deploy/kubernetes/istio/config.yaml
  • deploy/kubernetes/observability/dashboard/config.yaml
  • deploy/openshift/config-openshift.yaml

📁 e2e

Owners: @Xunzhuo
Files changed:

  • e2e/profiles/ai-gateway/values.yaml
  • e2e/profiles/dynamic-config/crds/intelligentpool.yaml
  • e2e/profiles/dynamic-config/crds/intelligentroute.yaml
  • e2e/profiles/dynamic-config/profile.go
  • e2e/profiles/dynamic-config/values.yaml
  • e2e/cmd/e2e/main.go
  • e2e/profiles/ai-gateway/profile.go

📁 src

Owners: @rootfs, @Xunzhuo, @wangchen615
Files changed:

  • src/semantic-router/deploy/helm/semantic-router/templates/vllm.ai_intelligentpools.yaml
  • src/semantic-router/deploy/helm/semantic-router/templates/vllm.ai_intelligentroutes.yaml
  • src/semantic-router/deploy/kubernetes/crds/deploy/vllm.ai_intelligentpools.yaml
  • src/semantic-router/deploy/kubernetes/crds/deploy/vllm.ai_intelligentroutes.yaml
  • src/semantic-router/examples/decision-based-routing.yaml
  • src/semantic-router/pkg/apis/vllm.ai/v1alpha1/types_route.go
  • src/semantic-router/pkg/decision/engine.go
  • src/semantic-router/pkg/decision/engine_test.go
  • src/semantic-router/pkg/extproc/req_filter_header_mutation.go
  • src/semantic-router/pkg/k8s/controller.go
  • src/semantic-router/pkg/k8s/converter.go
  • src/semantic-router/pkg/k8s/converter_test.go
  • src/semantic-router/pkg/k8s/reconciler.go
  • src/semantic-router/pkg/k8s/testdata/README.md
  • src/semantic-router/pkg/k8s/testdata/base-config.yaml
  • src/semantic-router/pkg/k8s/testdata/input/01-basic.yaml
  • src/semantic-router/pkg/k8s/testdata/input/02-keyword-only.yaml
  • src/semantic-router/pkg/k8s/testdata/input/03-embedding-only.yaml
  • src/semantic-router/pkg/k8s/testdata/input/04-domain-only.yaml
  • src/semantic-router/pkg/k8s/testdata/input/05-keyword-embedding.yaml
  • src/semantic-router/pkg/k8s/testdata/input/06-keyword-domain.yaml
  • src/semantic-router/pkg/k8s/testdata/input/07-domain-embedding.yaml
  • src/semantic-router/pkg/k8s/testdata/input/08-keyword-embedding-domain.yaml
  • src/semantic-router/pkg/k8s/testdata/input/09-keyword-plugin.yaml
  • src/semantic-router/pkg/k8s/testdata/input/10-embedding-plugin.yaml
  • src/semantic-router/pkg/k8s/testdata/input/11-domain-plugin.yaml
  • src/semantic-router/pkg/k8s/testdata/input/12-keyword-embedding-plugin.yaml
  • src/semantic-router/pkg/k8s/testdata/input/13-keyword-domain-plugin.yaml
  • src/semantic-router/pkg/k8s/testdata/input/14-domain-embedding-plugin.yaml
  • src/semantic-router/pkg/k8s/testdata/input/15-keyword-embedding-domain-plugin.yaml
  • src/semantic-router/pkg/k8s/testdata/input/16-keyword-embedding-domain-no-plugin.yaml
  • src/semantic-router/pkg/k8s/testdata/output/01-basic.yaml
  • src/semantic-router/pkg/k8s/testdata/output/02-keyword-only.yaml
  • src/semantic-router/pkg/k8s/testdata/output/03-embedding-only.yaml
  • src/semantic-router/pkg/k8s/testdata/output/04-domain-only.yaml
  • src/semantic-router/pkg/k8s/testdata/output/05-keyword-embedding.yaml
  • src/semantic-router/pkg/k8s/testdata/output/06-keyword-domain.yaml
  • src/semantic-router/pkg/k8s/testdata/output/07-domain-embedding.yaml
  • src/semantic-router/pkg/k8s/testdata/output/08-keyword-embedding-domain.yaml
  • src/semantic-router/pkg/k8s/testdata/output/09-keyword-plugin.yaml
  • src/semantic-router/pkg/k8s/testdata/output/10-embedding-plugin.yaml
  • src/semantic-router/pkg/k8s/testdata/output/11-domain-plugin.yaml
  • src/semantic-router/pkg/k8s/testdata/output/12-keyword-embedding-plugin.yaml
  • src/semantic-router/pkg/k8s/testdata/output/13-keyword-domain-plugin.yaml
  • src/semantic-router/pkg/k8s/testdata/output/14-domain-embedding-plugin.yaml
  • src/semantic-router/pkg/k8s/testdata/output/15-keyword-embedding-domain-plugin.yaml
  • src/semantic-router/pkg/k8s/testdata/output/16-keyword-embedding-domain-no-plugin.yaml
  • src/semantic-router/cmd/main.go
  • src/semantic-router/go.mod
  • src/semantic-router/go.sum
  • src/semantic-router/pkg/apis/vllm.ai/v1alpha1/register.go
  • src/semantic-router/pkg/apis/vllm.ai/v1alpha1/types.go
  • src/semantic-router/pkg/apis/vllm.ai/v1alpha1/zz_generated.deepcopy.go
  • src/semantic-router/pkg/apiserver/route_system_prompt.go
  • src/semantic-router/pkg/apiserver/server.go
  • src/semantic-router/pkg/apiserver/server_test.go
  • src/semantic-router/pkg/classification/classifier.go
  • src/semantic-router/pkg/classification/classifier_test.go
  • src/semantic-router/pkg/classification/embedding_classifier.go
  • src/semantic-router/pkg/classification/keyword_classifier.go
  • src/semantic-router/pkg/classification/keyword_entropy_test.go
  • src/semantic-router/pkg/classification/mcp_classifier.go
  • src/semantic-router/pkg/config/config.go
  • src/semantic-router/pkg/config/config_test.go
  • src/semantic-router/pkg/config/helper.go
  • src/semantic-router/pkg/config/loader.go
  • src/semantic-router/pkg/config/validator.go
  • src/semantic-router/pkg/extproc/extproc_test.go
  • src/semantic-router/pkg/extproc/processor_req_body.go
  • src/semantic-router/pkg/extproc/processor_req_header.go
  • src/semantic-router/pkg/extproc/processor_res_header.go
  • src/semantic-router/pkg/extproc/recorder.go
  • src/semantic-router/pkg/extproc/req_filter_cache.go
  • src/semantic-router/pkg/extproc/req_filter_classification.go
  • src/semantic-router/pkg/extproc/req_filter_jailbreak.go
  • src/semantic-router/pkg/extproc/req_filter_pii.go
  • src/semantic-router/pkg/extproc/req_filter_reason.go
  • src/semantic-router/pkg/extproc/req_filter_sys_prompt.go
  • src/semantic-router/pkg/extproc/router.go
  • src/semantic-router/pkg/extproc/server.go
  • src/semantic-router/pkg/headers/headers.go
  • src/semantic-router/pkg/utils/pii/policy.go

📁 website

Owners: @Xunzhuo, @rootfs, @yuluo-yx
Files changed:

  • website/docs/api/crd-reference.md
  • website/docs/api/classification.md
  • website/docs/installation/configuration.md
  • website/docs/training/model-performance-eval.md
  • website/docs/tutorials/intelligent-route/domain-routing.md
  • website/docs/tutorials/intelligent-route/embedding-routing.md
  • website/sidebars.ts

📁 config

Owners: @rootfs, @Xunzhuo
Files changed:

  • config/config.yaml
  • config/intelligent-routing/in-tree/bert_classification.yaml
  • config/intelligent-routing/in-tree/embedding.yaml
  • config/intelligent-routing/in-tree/generic_categories.yaml
  • config/intelligent-routing/in-tree/keyword.yaml
  • config/intelligent-routing/in-tree/lora_routing.yaml
  • config/intelligent-routing/out-tree/config-mcp-classifier.yaml
  • config/observability/config.tracing.yaml
  • config/prompt-guard/jailbreak_domain.yaml
  • config/prompt-guard/pii_domain.yaml
  • config/semantic-cache/config.hybrid.yaml
  • config/testing/config.e2e.yaml
  • config/testing/config.testing.yaml

📁 tools

Owners: @yuluo-yx, @rootfs, @Xunzhuo
Files changed:

  • tools/linter/yaml/.yamllint
  • tools/make/docs.mk
  • tools/make/golang.mk
  • tools/make/linter.mk

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

@Xunzhuo Xunzhuo force-pushed the feat/decision-based-routing-with-plugins branch from 51c60a3 to eb0d095 Compare November 17, 2025 07:45
@Xunzhuo Xunzhuo changed the title feat: implement decision-based routing with plugin architecture [Feat]: Implement signal/decision-based routing with dynamic plugin architecture Nov 17, 2025
@Xunzhuo Xunzhuo force-pushed the feat/decision-based-routing-with-plugins branch from 25f435e to 51cefd8 Compare November 17, 2025 10:52
@Xunzhuo Xunzhuo changed the title [Feat]: Implement signal/decision-based routing with dynamic plugin architecture [Feat]: Implement Signal Decision-based routing with dynamic plugin architecture Nov 17, 2025
@Xunzhuo Xunzhuo force-pushed the feat/decision-based-routing-with-plugins branch from fa33e47 to d7b2c50 Compare November 17, 2025 12:16
@Xunzhuo Xunzhuo changed the title [Feat]: Implement Signal Decision-based routing with dynamic plugin architecture [Feat]: Implement Signal-Decision Driven Routing with Dynamic Plugin Architecture Nov 18, 2025
@Xunzhuo Xunzhuo force-pushed the feat/decision-based-routing-with-plugins branch 2 times, most recently from f70cd77 to 316ccb7 Compare November 18, 2025 09:31
@Xunzhuo Xunzhuo marked this pull request as ready for review November 18, 2025 09:45
@Xunzhuo Xunzhuo changed the title [Feat]: Implement Signal-Decision Driven Routing with Dynamic Plugin Architecture [Feat]: Signal-Decision Driven Semantic Routing with Dynamic Plugin Architecture Nov 18, 2025
@Xunzhuo Xunzhuo force-pushed the feat/decision-based-routing-with-plugins branch 3 times, most recently from 7e1d896 to 57046e1 Compare November 18, 2025 11:09
Signed-off-by: bitliu <bitliu@tencent.com>
@Xunzhuo Xunzhuo force-pushed the feat/decision-based-routing-with-plugins branch from 57046e1 to 91171f6 Compare November 18, 2025 11:53
@rootfs rootfs requested a review from Copilot November 18, 2025 13:56

Copilot AI left a comment

Pull Request Overview

This PR introduces a comprehensive decision-based routing system with a flexible plugin architecture, replacing the previous category-based approach. The changes enable signal-driven semantic routing with dynamic plugin configurations through Kubernetes CRDs (IntelligentRoute and IntelligentPool).

Key changes:

  • Replaced Category-based routing with Decision-based routing that combines multiple rules using AND/OR operators
  • Introduced a flexible plugin architecture for decision-level configurations (semantic-cache, jailbreak, pii, system_prompt)
  • Renamed ModelScores to ModelRefs and removed the score field, simplifying model selection
  • Added Kubernetes CRD integration with IntelligentPool and IntelligentRoute resources
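
The AND/OR rule combination described in the key changes can be sketched roughly as follows. This is an illustrative sketch only, not the code in pkg/decision/engine.go: the type names, the "type:name" signal key format, and the rule that a higher priority value wins are all assumptions, and the "thinking_law" decision is hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// Condition references one signal by type and name,
// e.g. type "domain", name "business".
type Condition struct {
	Type string
	Name string
}

// Decision combines conditions with "AND" or "OR".
// Assumption: a higher Priority value is evaluated first.
type Decision struct {
	Name       string
	Priority   int
	Operator   string
	Conditions []Condition
}

// matches reports whether the decision's conditions hold for the matched
// signals. The "type:name" key format is a convention of this sketch.
func (d Decision) matches(signals map[string]bool) bool {
	if len(d.Conditions) == 0 {
		return false
	}
	for _, c := range d.Conditions {
		hit := signals[c.Type+":"+c.Name]
		if d.Operator == "OR" && hit {
			return true // one hit is enough for OR
		}
		if d.Operator == "AND" && !hit {
			return false // one miss fails AND
		}
	}
	// Reaching this point means: AND saw no miss, OR saw no hit.
	return d.Operator == "AND"
}

// SelectDecision returns the highest-priority matching decision,
// or nil when no decision matches.
func SelectDecision(decisions []Decision, signals map[string]bool) *Decision {
	sorted := append([]Decision(nil), decisions...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Priority > sorted[j].Priority
	})
	for i := range sorted {
		if sorted[i].matches(signals) {
			return &sorted[i]
		}
	}
	return nil
}

func main() {
	decisions := []Decision{
		{Name: "business_decision", Priority: 10, Operator: "OR",
			Conditions: []Condition{{"domain", "business"}}},
		{Name: "thinking_law", Priority: 20, Operator: "AND",
			Conditions: []Condition{{"domain", "law"}, {"keyword", "thinking"}}},
	}
	signals := map[string]bool{"domain:business": true, "keyword:thinking": true}
	if d := SelectDecision(decisions, signals); d != nil {
		fmt.Println(d.Name)
	}
}
```

In this sketch a request that matches no decision yields nil, which is the case the follow-up commits referenced later in this conversation have to handle.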

Reviewed Changes

Copilot reviewed 79 out of 146 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| pkg/extproc/extproc_test.go | Removed deprecated PII policy tests and category-based test cases |
| pkg/decision/engine.go | New decision engine for evaluating rule combinations with AND/OR logic |
| pkg/decision/engine_test.go | Tests for decision engine evaluation logic |
| pkg/config/validator.go | Updated validation to check decisions instead of categories |
| pkg/config/loader.go | Added config update notification channel for dynamic updates |
| pkg/config/helper.go | Refactored helpers to use decisions instead of categories |
| pkg/config/config_test.go | Updated tests to use decisions and removed category-based tests |
| pkg/config/config.go | Added Decision, ModelRef, and plugin configuration structures |
| pkg/classification/mcp_classifier.go | Updated to use decisions instead of categories |
| pkg/classification/keyword_entropy_test.go | Removed deprecated keyword entropy test file |
| pkg/classification/keyword_classifier.go | Changed Category field to Name in keyword rules |
| pkg/classification/embedding_classifier.go | Renamed Keywords to Candidates in embedding rules |
| pkg/classification/classifier_test.go | Removed category-based test cases |
| pkg/classification/classifier.go | Added decision evaluation engine and refactored to use decisions |
| pkg/apiserver/server_test.go | Removed deprecated system prompt endpoint tests |
| pkg/apiserver/server.go | Updated to use global config instead of loading from file |
| pkg/apiserver/route_system_prompt.go | Updated to work with decisions instead of categories |
| pkg/apis/vllm.ai/v1alpha1/zz_generated.deepcopy.go | Generated deepcopy methods for new CRD types |
| pkg/apis/vllm.ai/v1alpha1/types_route.go | New file defining IntelligentRoute CRD structure |
| pkg/apis/vllm.ai/v1alpha1/types.go | Replaced SemanticRoute with IntelligentPool and IntelligentRoute |
| pkg/apis/vllm.ai/v1alpha1/register.go | Updated to register new CRD types |
| pkg/apis/vllm.ai/v1alpha1/filter_types.go | Removed deprecated filter types file |
| cmd/main.go | Added Kubernetes controller initialization for CRD-based config |
| deploy/kubernetes/observability/dashboard/config.yaml | Removed deprecated pii_policy configuration |

@Xunzhuo Xunzhuo force-pushed the feat/decision-based-routing-with-plugins branch from 2672c15 to f9a0096 Compare November 18, 2025 15:14
@rootfs rootfs merged commit fa74d0e into main Nov 18, 2025
31 of 32 checks passed
@rootfs
Collaborator

rootfs commented Nov 18, 2025

thank you! this is really impressive!

yossiovadia added a commit to yossiovadia/semantic-router that referenced this pull request Nov 18, 2025
…lm-project#681 merge

After the PR vllm-project#681 merge, Categories no longer have a ModelScores field.
The reasoning config moved to Decisions.ModelRefs, but there is no
direct mapping from category names to decision names.

Set useReasoning=false as a safe default until a proper category-to-decision
mapping is implemented.

Related: PR vllm-project#648, PR vllm-project#681
Signed-off-by: Yossi Ovadia <yovadia@redhat.com>
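
The fallback described in this commit message can be sketched as below. This is a hedged illustration, not the commit's code: the ModelRef and Decision field names, the useReasoningFor helper, and the categoryToDecision map are all hypothetical. The map stands in for the category-to-decision mapping that the message says does not exist yet, so passing nil reproduces the safe default.

```go
package main

import "fmt"

// ModelRef is a hypothetical stand-in for an entry in Decisions.ModelRefs.
type ModelRef struct {
	Model        string
	UseReasoning bool
}

// Decision is a hypothetical, trimmed-down decision with its model references.
type Decision struct {
	Name      string
	ModelRefs []ModelRef
}

// useReasoningFor returns the reasoning flag for a category and model.
// categoryToDecision is an assumed mapping; when it has no entry for the
// category, the function falls back to false, the safe default.
func useReasoningFor(category, model string, categoryToDecision map[string]string, decisions []Decision) bool {
	name, ok := categoryToDecision[category] // lookups on a nil map are safe in Go
	if !ok {
		return false
	}
	for _, d := range decisions {
		if d.Name != name {
			continue
		}
		for _, ref := range d.ModelRefs {
			if ref.Model == model {
				return ref.UseReasoning
			}
		}
	}
	return false
}

func main() {
	decisions := []Decision{
		{Name: "math_decision", ModelRefs: []ModelRef{{Model: "base-model", UseReasoning: true}}},
	}
	// No mapping available: the safe default applies.
	fmt.Println(useReasoningFor("math", "base-model", nil, decisions))
	// With a (hypothetical) mapping, the decision's ModelRef decides.
	mapping := map[string]string{"math": "math_decision"}
	fmt.Println(useReasoningFor("math", "base-model", mapping, decisions))
}
```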
yossiovadia added a commit to yossiovadia/semantic-router that referenced this pull request Nov 19, 2025
After PR vllm-project#681 introduced decision-based routing, PII detection requires a
decision to be selected. Using model="MoM" triggers domain classification,
but PII test data is domain-agnostic, so no domain matches, no decision is
selected, and PII detection gets disabled.

Solution: Use model="base-model" directly, which matches all decisions in
the CRD. This ensures a decision is selected and PII detection is enabled.

This still tests LoRA PII auto-detection as configured in the classifier
settings, but ensures the decision-based PII plugin is activated.

Signed-off-by: Yossi Ovadia <yovadia@redhat.com>
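
The behaviour this commit message works around can be sketched as follows. This is a simplified illustration under stated assumptions, not the router's code: the Decision fields, the resolveDecision and piiEnabled helpers, and the single-domain match are hypothetical. It only shows why a request with no selected decision ends up with PII detection disabled.

```go
package main

import "fmt"

// PIIPlugin is a hypothetical decision-level PII plugin configuration.
type PIIPlugin struct {
	Enabled bool
}

// Decision is a hypothetical, trimmed-down decision: one domain it matches,
// the models it references, and an optional PII plugin.
type Decision struct {
	Name   string
	Domain string
	Models []string
	PII    *PIIPlugin
}

// resolveDecision picks a decision. With model "MoM" it relies on the
// classified domain; with any other model it matches on the model name.
func resolveDecision(decisions []Decision, model, domain string) *Decision {
	for i := range decisions {
		d := &decisions[i]
		if model == "MoM" {
			if domain != "" && d.Domain == domain {
				return d
			}
			continue
		}
		for _, m := range d.Models {
			if m == model {
				return d
			}
		}
	}
	return nil
}

// piiEnabled is false whenever no decision was selected.
func piiEnabled(d *Decision) bool {
	return d != nil && d.PII != nil && d.PII.Enabled
}

func main() {
	decisions := []Decision{
		{Name: "business_decision", Domain: "business",
			Models: []string{"base-model"}, PII: &PIIPlugin{Enabled: true}},
	}
	// Domain-agnostic PII test data: no domain is classified, so no decision.
	fmt.Println(piiEnabled(resolveDecision(decisions, "MoM", "")))
	// Addressing the model directly selects a decision and enables PII.
	fmt.Println(piiEnabled(resolveDecision(decisions, "base-model", "")))
}
```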
szedan-rh pushed a commit to szedan-rh/semantic-router that referenced this pull request Nov 19, 2025
…rchitecture (vllm-project#681)

* feat: implement decision-based routing with plugin architecture

Signed-off-by: bitliu <bitliu@tencent.com>

* fix ci

Signed-off-by: bitliu <bitliu@tencent.com>

---------

Signed-off-by: bitliu <bitliu@tencent.com>
yossiovadia added a commit to yossiovadia/semantic-router that referenced this pull request Nov 20, 2025
…lm-project#681 merge

After the PR vllm-project#681 merge, Categories no longer have a ModelScores field.
The reasoning config moved to Decisions.ModelRefs, but there is no
direct mapping from category names to decision names.

Set useReasoning=false as a safe default until a proper category-to-decision
mapping is implemented.

Related: PR vllm-project#648, PR vllm-project#681
Signed-off-by: Yossi Ovadia <yovadia@redhat.com>
yossiovadia added a commit to yossiovadia/semantic-router that referenced this pull request Nov 20, 2025
After PR vllm-project#681 introduced decision-based routing, PII detection requires a
decision to be selected. Using model="MoM" triggers domain classification,
but PII test data is domain-agnostic, so no domain matches, no decision is
selected, and PII detection gets disabled.

Solution: Use model="base-model" directly, which matches all decisions in
the CRD. This ensures a decision is selected and PII detection is enabled.

This still tests LoRA PII auto-detection as configured in the classifier
settings, but ensures the decision-based PII plugin is activated.

Signed-off-by: Yossi Ovadia <yovadia@redhat.com>

4 participants