[Feat]: Signal-Decision Driven Semantic Routing with Dynamic Plugin Architecture #681

Xunzhuo · 2025-11-17T07:42:37Z

This commit introduces a comprehensive decision-based routing system with a flexible plugin architecture, replacing the previous category-based approach.

IntelligentRoute

apiVersion: vllm.ai/v1alpha1
kind: IntelligentRoute
metadata:
  name: ai-gateway-route
  namespace: default
spec:
  signals:
    keywords:
      - name: "thinking"
        operator: "OR"
        keywords: ["urgent", "immediate", "asap", "think", "careful"]
        caseSensitive: false

    domains:
      - name: "business"
        description: "Business and management related queries"
      - name: "law"
        description: "Legal questions and law-related topics"
      - name: "psychology"
        description: "Psychology and mental health topics"
      - name: "biology"
        description: "Biology and life sciences questions"
      - name: "chemistry"
        description: "Chemistry and chemical sciences questions"
      - name: "history"
        description: "Historical questions and cultural topics"
      - name: "health"
        description: "Health and medical information queries"
      - name: "economics"
        description: "Economics and financial topics"
      - name: "math"
        description: "Mathematics and quantitative reasoning"
      - name: "physics"
        description: "Physics and physical sciences"
      - name: "computer science"
        description: "Computer science and programming"
      - name: "philosophy"
        description: "Philosophy and ethical questions"
      - name: "engineering"
        description: "Engineering and technical problem-solving"
      - name: "other"
        description: "General knowledge and miscellaneous topics"

  decisions:
    - name: "business_decision"
      priority: 10
      description: "Business and management related queries"
      signals:
        operator: "OR"
        conditions:
          - type: "domain"
            name: "business"
      modelRefs:
        - model: "base-model"
          loraName: "social-expert"
          useReasoning: false
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are a senior business consultant and strategic advisor with expertise in corporate strategy, operations management, financial analysis, marketing, and organizational development. Provide practical, actionable business advice backed by proven methodologies and industry best practices. Consider market dynamics, competitive landscape, and stakeholder interests in your recommendations."
            mode: "replace"

    - name: "law_decision"
      priority: 10
      description: "Legal questions and law-related topics"
      signals:
        operator: "OR"
        conditions:
          - type: "domain"
            name: "law"
      modelRefs:
        - model: "base-model"
          loraName: "law-expert"
          useReasoning: false
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are a knowledgeable legal expert with comprehensive understanding of legal principles, case law, statutory interpretation, and legal procedures across multiple jurisdictions. Provide accurate legal information and analysis while clearly stating that your responses are for informational purposes only and do not constitute legal advice. Always recommend consulting with qualified legal professionals for specific legal matters."
            mode: "replace"

    - name: "psychology_decision"
      priority: 10
      description: "Psychology and mental health topics"
      signals:
        operator: "OR"
        conditions:
          - type: "domain"
            name: "psychology"
      modelRefs:
        - model: "base-model"
          loraName: "humanities-expert"
          useReasoning: false
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "semantic-cache"
          configuration:
            enabled: true
            similarity_threshold: 0.92
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are a psychology expert with deep knowledge of cognitive processes, behavioral patterns, mental health, developmental psychology, social psychology, and therapeutic approaches. Provide evidence-based insights grounded in psychological research and theory. When discussing mental health topics, emphasize the importance of professional consultation and avoid providing diagnostic or therapeutic advice."
            mode: "replace"

    - name: "biology_decision"
      priority: 10
      description: "Biology and life sciences questions"
      signals:
        operator: "OR"
        conditions:
          - type: "domain"
            name: "biology"
      modelRefs:
        - model: "base-model"
          loraName: "science-expert"
          useReasoning: false
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are a biology expert with comprehensive knowledge spanning molecular biology, genetics, cell biology, ecology, evolution, anatomy, physiology, and biotechnology. Explain biological concepts with scientific accuracy, use appropriate terminology, and provide examples from current research. Connect biological principles to real-world applications and emphasize the interconnectedness of biological systems."
            mode: "replace"

    - name: "chemistry_decision"
      priority: 10
      description: "Chemistry and chemical sciences questions"
      signals:
        operator: "OR"
        conditions:
          - type: "domain"
            name: "chemistry"
      modelRefs:
        - model: "base-model"
          loraName: "science-expert"
          useReasoning: true
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are a chemistry expert specializing in chemical reactions, molecular structures, and laboratory techniques. Provide detailed, step-by-step explanations."
            mode: "replace"

    - name: "history_decision"
      priority: 10
      description: "Historical questions and cultural topics"
      signals:
        operator: "OR"
        conditions:
          - type: "domain"
            name: "history"
      modelRefs:
        - model: "base-model"
          loraName: "humanities-expert"
          useReasoning: false
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are a historian with expertise across different time periods and cultures. Provide accurate historical context and analysis."
            mode: "replace"

    - name: "health_decision"
      priority: 10
      description: "Health and medical information queries"
      signals:
        operator: "OR"
        conditions:
          - type: "domain"
            name: "health"
      modelRefs:
        - model: "base-model"
          loraName: "science-expert"
          useReasoning: false
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "semantic-cache"
          configuration:
            enabled: true
            similarity_threshold: 0.95
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are a health and medical information expert with knowledge of anatomy, physiology, diseases, treatments, preventive care, nutrition, and wellness. Provide accurate, evidence-based health information while emphasizing that your responses are for educational purposes only and should never replace professional medical advice, diagnosis, or treatment. Always encourage users to consult healthcare professionals for medical concerns and emergencies."
            mode: "replace"

    - name: "economics_decision"
      priority: 10
      description: "Economics and financial topics"
      signals:
        operator: "OR"
        conditions:
          - type: "domain"
            name: "economics"
      modelRefs:
        - model: "base-model"
          loraName: "social-expert"
          useReasoning: false
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are an economics expert with deep understanding of microeconomics, macroeconomics, econometrics, financial markets, monetary policy, fiscal policy, international trade, and economic theory. Analyze economic phenomena using established economic principles, provide data-driven insights, and explain complex economic concepts in accessible terms. Consider both theoretical frameworks and real-world applications in your responses."
            mode: "replace"

    - name: "math_decision"
      priority: 10
      description: "Mathematics and quantitative reasoning"
      signals:
        operator: "OR"
        conditions:
          - type: "domain"
            name: "math"
      modelRefs:
        - model: "base-model"
          loraName: "math-expert"
          useReasoning: true
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are a mathematics expert. Provide step-by-step solutions, show your work clearly, and explain mathematical concepts in an understandable way."
            mode: "replace"

    - name: "physics_decision"
      priority: 10
      description: "Physics and physical sciences"
      signals:
        operator: "OR"
        conditions:
          - type: "domain"
            name: "physics"
      modelRefs:
        - model: "base-model"
          loraName: "science-expert"
          useReasoning: true
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are a physics expert with deep understanding of physical laws and phenomena. Provide clear explanations with mathematical derivations when appropriate."
            mode: "replace"

    - name: "computer_science_decision"
      priority: 10
      description: "Computer science and programming"
      signals:
        operator: "OR"
        conditions:
          - type: "domain"
            name: "computer science"
      modelRefs:
        - model: "base-model"
          loraName: "science-expert"
          useReasoning: false
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are a computer science expert with knowledge of algorithms, data structures, programming languages, and software engineering. Provide clear, practical solutions with code examples when helpful."
            mode: "replace"

    - name: "philosophy_decision"
      priority: 10
      description: "Philosophy and ethical questions"
      signals:
        operator: "OR"
        conditions:
          - type: "domain"
            name: "philosophy"
      modelRefs:
        - model: "base-model"
          loraName: "humanities-expert"
          useReasoning: false
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are a philosophy expert with comprehensive knowledge of philosophical traditions, ethical theories, logic, metaphysics, epistemology, political philosophy, and the history of philosophical thought. Engage with complex philosophical questions by presenting multiple perspectives, analyzing arguments rigorously, and encouraging critical thinking. Draw connections between philosophical concepts and contemporary issues while maintaining intellectual honesty about the complexity and ongoing nature of philosophical debates."
            mode: "replace"

    - name: "engineering_decision"
      priority: 10
      description: "Engineering and technical problem-solving"
      signals:
        operator: "OR"
        conditions:
          - type: "domain"
            name: "engineering"
      modelRefs:
        - model: "base-model"
          loraName: "science-expert"
          useReasoning: false
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are an engineering expert with knowledge across multiple engineering disciplines including mechanical, electrical, civil, chemical, software, and systems engineering. Apply engineering principles, design methodologies, and problem-solving approaches to provide practical solutions. Consider safety, efficiency, sustainability, and cost-effectiveness in your recommendations. Use technical precision while explaining concepts clearly, and emphasize the importance of proper engineering practices and standards."
            mode: "replace"

    - name: "thinking_decision"
      priority: 20
      description: "Complex reasoning and multi-step thinking"
      signals:
        operator: "OR"
        conditions:
          - type: "keyword"
            name: "thinking"
      modelRefs:
        - model: "base-model"
          loraName: "general-expert"
          useReasoning: true
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are a thinking expert who reasons through multiple steps before answering. Please answer the question step by step."
            mode: "replace"

    - name: "general_decision"
      priority: 1
      description: "General knowledge and miscellaneous topics"
      signals:
        operator: "OR"
        conditions:
          - type: "domain"
            name: "other"
      modelRefs:
        - model: "base-model"
          loraName: "general-expert"
          useReasoning: false
      plugins:
        - type: "pii"
          configuration:
            enabled: true
            pii_types_allowed: []
        - type: "semantic-cache"
          configuration:
            enabled: true
            similarity_threshold: 0.75
        - type: "system_prompt"
          configuration:
            enabled: true
            system_prompt: "You are a helpful and knowledgeable assistant. Provide accurate, helpful responses across a wide range of topics."
            mode: "replace"

IntelligentPool

apiVersion: vllm.ai/v1alpha1
kind: IntelligentPool
metadata:
  name: ai-gateway-pool
  namespace: default
spec:
  defaultModel: "general-expert"
  models:
    - name: "base-model"
      reasoningFamily: "qwen3"
      loras:
        - name: "science-expert"
          description: "Specialized for science domains: biology, chemistry, physics, health, engineering"
        - name: "social-expert"
          description: "Optimized for social sciences: business, economics"
        - name: "math-expert"
          description: "Fine-tuned for mathematics and quantitative reasoning"
        - name: "law-expert"
          description: "Specialized for legal questions and law-related topics"
        - name: "humanities-expert"
          description: "Optimized for humanities: psychology, history, philosophy"
        - name: "general-expert"
          description: "General-purpose adapter for diverse topics"

Core Changes

1. Decision-Based Architecture

Replaced Category-based routing with Decision-based routing
Decisions combine multiple rules (keyword, embedding, domain) using AND/OR operators
Added DecisionEngine for evaluating rule combinations and selecting optimal decisions
Support for priority and confidence-based decision selection strategies
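
As a rough illustration of the AND/OR evaluation and priority-based selection described above, here is a minimal Go sketch. The `Condition` and `Decision` types and the `matches` and `selectByPriority` helpers are hypothetical names invented for this example and are not the actual code in pkg/decision/engine.go. It also assumes that a larger `priority` value wins, which matches the example route (the keyword decision uses 20, the fallback uses 1) but is not stated explicitly in this PR.

```go
package main

import (
	"fmt"
	"sort"
)

// Condition references a named signal of a given type, e.g. domain/"math".
type Condition struct {
	Type string
	Name string
}

// Decision mirrors the fields shown in the IntelligentRoute example.
// The Go type itself is hypothetical.
type Decision struct {
	Name       string
	Priority   int
	Operator   string // "AND" or "OR"
	Conditions []Condition
}

// matches reports whether the decision's conditions hold for the set of
// signals that fired on a request.
func matches(d Decision, fired map[Condition]bool) bool {
	if len(d.Conditions) == 0 {
		return false
	}
	if d.Operator == "AND" {
		for _, c := range d.Conditions {
			if !fired[c] {
				return false
			}
		}
		return true
	}
	// Any other operator value is treated as OR in this sketch.
	for _, c := range d.Conditions {
		if fired[c] {
			return true
		}
	}
	return false
}

// selectByPriority returns the matching decision with the highest priority.
// Assumption: a larger number means a higher priority.
func selectByPriority(decisions []Decision, fired map[Condition]bool) (Decision, bool) {
	var matched []Decision
	for _, d := range decisions {
		if matches(d, fired) {
			matched = append(matched, d)
		}
	}
	if len(matched) == 0 {
		return Decision{}, false
	}
	sort.SliceStable(matched, func(i, j int) bool {
		return matched[i].Priority > matched[j].Priority
	})
	return matched[0], true
}

func main() {
	decisions := []Decision{
		{Name: "math_decision", Priority: 10, Operator: "OR",
			Conditions: []Condition{{Type: "domain", Name: "math"}}},
		{Name: "thinking_decision", Priority: 20, Operator: "OR",
			Conditions: []Condition{{Type: "keyword", Name: "thinking"}}},
	}
	// Both the math domain signal and the thinking keyword signal fired.
	fired := map[Condition]bool{
		{Type: "domain", Name: "math"}:      true,
		{Type: "keyword", Name: "thinking"}: true,
	}
	if d, ok := selectByPriority(decisions, fired); ok {
		fmt.Println(d.Name) // prints thinking_decision
	}
}
```

Under these assumptions, a request that matches both the math domain and the thinking keyword is routed by `thinking_decision`, because its priority of 20 outranks the domain decisions at 10.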

2. Plugin System

Introduced flexible plugin architecture for Decision-level configurations
Supported plugin types: semantic-cache, jailbreak, pii, system_prompt
Each plugin has type-specific configuration stored as raw JSON
Helper methods for type-safe plugin configuration access
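
To make the "raw JSON plus type-safe helper" idea concrete, here is a minimal Go sketch. The `DecisionPlugin` field layout, the `SemanticCacheConfig` struct, and the `semanticCacheConfig` helper are assumptions made for illustration. The real structs and helper names in pkg/config may differ. Only the plugin type `semantic-cache` and the keys `enabled` and `similarity_threshold` come from the example route above.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// DecisionPlugin keeps the type tag and defers decoding of the payload
// until a caller asks for a specific plugin type.
type DecisionPlugin struct {
	Type          string          `json:"type"`
	Configuration json.RawMessage `json:"configuration"`
}

// SemanticCacheConfig is a hypothetical typed view of the "semantic-cache"
// plugin configuration.
type SemanticCacheConfig struct {
	Enabled             bool    `json:"enabled"`
	SimilarityThreshold float64 `json:"similarity_threshold"`
}

var errPluginNotFound = errors.New("plugin not configured")

// semanticCacheConfig finds the first "semantic-cache" plugin and decodes
// its raw configuration into the typed struct.
func semanticCacheConfig(plugins []DecisionPlugin) (SemanticCacheConfig, error) {
	for _, p := range plugins {
		if p.Type != "semantic-cache" {
			continue
		}
		var cfg SemanticCacheConfig
		if err := json.Unmarshal(p.Configuration, &cfg); err != nil {
			return SemanticCacheConfig{}, fmt.Errorf("decode %s plugin: %w", p.Type, err)
		}
		return cfg, nil
	}
	return SemanticCacheConfig{}, errPluginNotFound
}

func main() {
	raw := `[
	  {"type": "pii", "configuration": {"enabled": true, "pii_types_allowed": []}},
	  {"type": "semantic-cache", "configuration": {"enabled": true, "similarity_threshold": 0.92}}
	]`
	var plugins []DecisionPlugin
	if err := json.Unmarshal([]byte(raw), &plugins); err != nil {
		panic(err)
	}
	cfg, err := semanticCacheConfig(plugins)
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg.Enabled, cfg.SimilarityThreshold) // prints true 0.92
}
```

Deferring the decode with `json.RawMessage` lets the plugin list hold heterogeneous configurations while each consumer only parses the type it understands.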

3. Model References

Renamed ModelScores to ModelRefs, removed score field
Currently supports a single model per decision (maxItems: 1)
Simplified model selection logic based on decision priority
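
A minimal Go sketch of how a `modelRefs` entry could be checked against the pool, assuming the shapes shown in the IntelligentRoute and IntelligentPool examples. The `ModelRef` and `Pool` types and the `validateModelRefs` function are hypothetical and are not taken from this PR. Rejecting an empty list is also an assumption, since the schema note above only states `maxItems: 1`.

```go
package main

import "fmt"

// ModelRef mirrors the modelRefs entries in the route example.
type ModelRef struct {
	Model        string
	LoRAName     string
	UseReasoning bool
}

// Pool maps each base model name to the set of LoRA adapter names it offers.
type Pool map[string]map[string]bool

// validateModelRefs enforces the single-reference limit (maxItems: 1) and
// checks that the model and the optional LoRA adapter exist in the pool.
func validateModelRefs(refs []ModelRef, pool Pool) error {
	if len(refs) == 0 {
		// Assumption: a decision without any model reference is an error.
		return fmt.Errorf("no modelRef given")
	}
	if len(refs) > 1 {
		return fmt.Errorf("at most 1 modelRef is supported, got %d", len(refs))
	}
	ref := refs[0]
	loras, ok := pool[ref.Model]
	if !ok {
		return fmt.Errorf("unknown model %q", ref.Model)
	}
	if ref.LoRAName != "" && !loras[ref.LoRAName] {
		return fmt.Errorf("model %q has no LoRA %q", ref.Model, ref.LoRAName)
	}
	return nil
}

func main() {
	pool := Pool{
		"base-model": {"science-expert": true, "math-expert": true},
	}
	refs := []ModelRef{{Model: "base-model", LoRAName: "math-expert", UseReasoning: true}}
	fmt.Println(validateModelRefs(refs, pool)) // prints <nil>

	bad := []ModelRef{{Model: "base-model", LoRAName: "poetry-expert"}}
	fmt.Println(validateModelRefs(bad, pool)) // prints a "has no LoRA" error
}
```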

4. Kubernetes CRD Integration

Added IntelligentPool and IntelligentRoute CRDs
CRD converter translates Kubernetes resources to internal config
Kubernetes controller watches and syncs CRD changes
Updated CRD schemas to use modelRefs and plugins arrays

Key Components

Decision Engine (pkg/decision/engine.go)

Evaluates rule combinations with AND/OR logic
Calculates confidence scores for matching decisions
Supports priority and confidence selection strategies

Configuration (pkg/config/config.go)

Decision structure with Rules, ModelRefs, and Plugins
Plugin configuration structs for each plugin type
Helper methods for accessing plugin configurations

CRD Types (pkg/apis/vllm.ai/v1alpha1/)

IntelligentPool: defines available models and their configurations
IntelligentRoute: defines routing decisions and rules
ModelRef: model reference without score field
DecisionPlugin: plugin configuration with type and raw config

Kubernetes Integration (pkg/k8s/)

Controller: watches CRDs and updates internal config
Converter: converts CRDs to internal config format
Comprehensive test coverage for CRD conversion

Make sure the code changes pass the pre-commit checks.
Sign-off your commit by using -s when doing git commit
Try to classify PRs for easy understanding of the type of changes, such as [Bugfix], [Feat], and [CI].


netlify · 2025-11-17T07:42:43Z

✅ Deploy Preview for vllm-semantic-router ready!

Name	Link
🔨 Latest commit	`f9a0096`
🔍 Latest deploy log	https://app.netlify.com/projects/vllm-semantic-router/deploys/691c8d5ed7946e0008521bb3
😎 Deploy Preview	https://deploy-preview-681--vllm-semantic-router.netlify.app

github-actions · 2025-11-17T07:42:51Z

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 `Root Directory`

Owners: @rootfs, @Xunzhuo
Files changed:

.crd-ref-docs.yaml
.github/workflows/integration-test-dynamic-config.yml
test_file.txt
.github/workflows/integration-test-docker.yml

📁 `deploy`

Owners: @rootfs, @Xunzhuo
Files changed:

deploy/helm/semantic-router/crds/vllm.ai_intelligentpools.yaml
deploy/helm/semantic-router/crds/vllm.ai_intelligentroutes.yaml
deploy/helm/semantic-router/templates/clusterrole.yaml
deploy/helm/semantic-router/templates/clusterrolebinding.yaml
deploy/kubernetes/crds/vllm.ai_intelligentpools.yaml
deploy/kubernetes/crds/vllm.ai_intelligentroutes.yaml
deploy/helm/semantic-router/README.md
deploy/helm/semantic-router/templates/deployment.yaml
deploy/helm/semantic-router/values.yaml
deploy/kserve/configmap-router-config.yaml
deploy/kserve/example-multi-model-config.yaml
deploy/kubernetes/ai-gateway/semantic-router-values/values.yaml
deploy/kubernetes/ai-gateway/semantic-router/config.yaml
deploy/kubernetes/aibrix/semantic-router-values/values.yaml
deploy/kubernetes/aibrix/semantic-router/config.yaml
deploy/kubernetes/istio/config.yaml
deploy/kubernetes/observability/dashboard/config.yaml
deploy/openshift/config-openshift.yaml

📁 `e2e`

Owners: @Xunzhuo
Files changed:

e2e/profiles/ai-gateway/values.yaml
e2e/profiles/dynamic-config/crds/intelligentpool.yaml
e2e/profiles/dynamic-config/crds/intelligentroute.yaml
e2e/profiles/dynamic-config/profile.go
e2e/profiles/dynamic-config/values.yaml
e2e/cmd/e2e/main.go
e2e/profiles/ai-gateway/profile.go

📁 `src`

Owners: @rootfs, @Xunzhuo, @wangchen615
Files changed:

src/semantic-router/deploy/helm/semantic-router/templates/vllm.ai_intelligentpools.yaml
src/semantic-router/deploy/helm/semantic-router/templates/vllm.ai_intelligentroutes.yaml
src/semantic-router/deploy/kubernetes/crds/deploy/vllm.ai_intelligentpools.yaml
src/semantic-router/deploy/kubernetes/crds/deploy/vllm.ai_intelligentroutes.yaml
src/semantic-router/examples/decision-based-routing.yaml
src/semantic-router/pkg/apis/vllm.ai/v1alpha1/types_route.go
src/semantic-router/pkg/decision/engine.go
src/semantic-router/pkg/decision/engine_test.go
src/semantic-router/pkg/extproc/req_filter_header_mutation.go
src/semantic-router/pkg/k8s/controller.go
src/semantic-router/pkg/k8s/converter.go
src/semantic-router/pkg/k8s/converter_test.go
src/semantic-router/pkg/k8s/reconciler.go
src/semantic-router/pkg/k8s/testdata/README.md
src/semantic-router/pkg/k8s/testdata/base-config.yaml
src/semantic-router/pkg/k8s/testdata/input/01-basic.yaml
src/semantic-router/pkg/k8s/testdata/input/02-keyword-only.yaml
src/semantic-router/pkg/k8s/testdata/input/03-embedding-only.yaml
src/semantic-router/pkg/k8s/testdata/input/04-domain-only.yaml
src/semantic-router/pkg/k8s/testdata/input/05-keyword-embedding.yaml
src/semantic-router/pkg/k8s/testdata/input/06-keyword-domain.yaml
src/semantic-router/pkg/k8s/testdata/input/07-domain-embedding.yaml
src/semantic-router/pkg/k8s/testdata/input/08-keyword-embedding-domain.yaml
src/semantic-router/pkg/k8s/testdata/input/09-keyword-plugin.yaml
src/semantic-router/pkg/k8s/testdata/input/10-embedding-plugin.yaml
src/semantic-router/pkg/k8s/testdata/input/11-domain-plugin.yaml
src/semantic-router/pkg/k8s/testdata/input/12-keyword-embedding-plugin.yaml
src/semantic-router/pkg/k8s/testdata/input/13-keyword-domain-plugin.yaml
src/semantic-router/pkg/k8s/testdata/input/14-domain-embedding-plugin.yaml
src/semantic-router/pkg/k8s/testdata/input/15-keyword-embedding-domain-plugin.yaml
src/semantic-router/pkg/k8s/testdata/input/16-keyword-embedding-domain-no-plugin.yaml
src/semantic-router/pkg/k8s/testdata/output/01-basic.yaml
src/semantic-router/pkg/k8s/testdata/output/02-keyword-only.yaml
src/semantic-router/pkg/k8s/testdata/output/03-embedding-only.yaml
src/semantic-router/pkg/k8s/testdata/output/04-domain-only.yaml
src/semantic-router/pkg/k8s/testdata/output/05-keyword-embedding.yaml
src/semantic-router/pkg/k8s/testdata/output/06-keyword-domain.yaml
src/semantic-router/pkg/k8s/testdata/output/07-domain-embedding.yaml
src/semantic-router/pkg/k8s/testdata/output/08-keyword-embedding-domain.yaml
src/semantic-router/pkg/k8s/testdata/output/09-keyword-plugin.yaml
src/semantic-router/pkg/k8s/testdata/output/10-embedding-plugin.yaml
src/semantic-router/pkg/k8s/testdata/output/11-domain-plugin.yaml
src/semantic-router/pkg/k8s/testdata/output/12-keyword-embedding-plugin.yaml
src/semantic-router/pkg/k8s/testdata/output/13-keyword-domain-plugin.yaml
src/semantic-router/pkg/k8s/testdata/output/14-domain-embedding-plugin.yaml
src/semantic-router/pkg/k8s/testdata/output/15-keyword-embedding-domain-plugin.yaml
src/semantic-router/pkg/k8s/testdata/output/16-keyword-embedding-domain-no-plugin.yaml
src/semantic-router/cmd/main.go
src/semantic-router/go.mod
src/semantic-router/go.sum
src/semantic-router/pkg/apis/vllm.ai/v1alpha1/register.go
src/semantic-router/pkg/apis/vllm.ai/v1alpha1/types.go
src/semantic-router/pkg/apis/vllm.ai/v1alpha1/zz_generated.deepcopy.go
src/semantic-router/pkg/apiserver/route_system_prompt.go
src/semantic-router/pkg/apiserver/server.go
src/semantic-router/pkg/apiserver/server_test.go
src/semantic-router/pkg/classification/classifier.go
src/semantic-router/pkg/classification/classifier_test.go
src/semantic-router/pkg/classification/embedding_classifier.go
src/semantic-router/pkg/classification/keyword_classifier.go
src/semantic-router/pkg/classification/keyword_entropy_test.go
src/semantic-router/pkg/classification/mcp_classifier.go
src/semantic-router/pkg/config/config.go
src/semantic-router/pkg/config/config_test.go
src/semantic-router/pkg/config/helper.go
src/semantic-router/pkg/config/loader.go
src/semantic-router/pkg/config/validator.go
src/semantic-router/pkg/extproc/extproc_test.go
src/semantic-router/pkg/extproc/processor_req_body.go
src/semantic-router/pkg/extproc/processor_req_header.go
src/semantic-router/pkg/extproc/processor_res_header.go
src/semantic-router/pkg/extproc/recorder.go
src/semantic-router/pkg/extproc/req_filter_cache.go
src/semantic-router/pkg/extproc/req_filter_classification.go
src/semantic-router/pkg/extproc/req_filter_jailbreak.go
src/semantic-router/pkg/extproc/req_filter_pii.go
src/semantic-router/pkg/extproc/req_filter_reason.go
src/semantic-router/pkg/extproc/req_filter_sys_prompt.go
src/semantic-router/pkg/extproc/router.go
src/semantic-router/pkg/extproc/server.go
src/semantic-router/pkg/headers/headers.go
src/semantic-router/pkg/utils/pii/policy.go

📁 `website`

Owners: @Xunzhuo, @rootfs, @yuluo-yx
Files changed:

website/docs/api/crd-reference.md
website/docs/api/classification.md
website/docs/installation/configuration.md
website/docs/training/model-performance-eval.md
website/docs/tutorials/intelligent-route/domain-routing.md
website/docs/tutorials/intelligent-route/embedding-routing.md
website/sidebars.ts

📁 `config`

Owners: @rootfs, @Xunzhuo
Files changed:

config/config.yaml
config/intelligent-routing/in-tree/bert_classification.yaml
config/intelligent-routing/in-tree/embedding.yaml
config/intelligent-routing/in-tree/generic_categories.yaml
config/intelligent-routing/in-tree/keyword.yaml
config/intelligent-routing/in-tree/lora_routing.yaml
config/intelligent-routing/out-tree/config-mcp-classifier.yaml
config/observability/config.tracing.yaml
config/prompt-guard/jailbreak_domain.yaml
config/prompt-guard/pii_domain.yaml
config/semantic-cache/config.hybrid.yaml
config/testing/config.e2e.yaml
config/testing/config.testing.yaml

📁 `tools`

Owners: @yuluo-yx, @rootfs, @Xunzhuo
Files changed:

tools/linter/yaml/.yamllint
tools/make/docs.mk
tools/make/golang.mk
tools/make/linter.mk

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

Signed-off-by: bitliu <bitliu@tencent.com>

Copilot

Pull Request Overview

This PR introduces a comprehensive decision-based routing system with a flexible plugin architecture, replacing the previous category-based approach. The changes enable signal-driven semantic routing with dynamic plugin configurations through Kubernetes CRDs (IntelligentRoute and IntelligentPool).

Key changes:

Replaced Category-based routing with Decision-based routing that combines multiple rules using AND/OR operators
Introduced a flexible plugin architecture for decision-level configurations (semantic-cache, jailbreak, pii, system_prompt)
Renamed ModelScores to ModelRefs and removed the score field, simplifying model selection
Added Kubernetes CRD integration with IntelligentPool and IntelligentRoute resources

Reviewed Changes

Copilot reviewed 79 out of 146 changed files in this pull request and generated no comments.

File	Description
pkg/extproc/extproc_test.go	Removed deprecated PII policy tests and category-based test cases
pkg/decision/engine.go	New decision engine for evaluating rule combinations with AND/OR logic
pkg/decision/engine_test.go	Tests for decision engine evaluation logic
pkg/config/validator.go	Updated validation to check decisions instead of categories
pkg/config/loader.go	Added config update notification channel for dynamic updates
pkg/config/helper.go	Refactored helpers to use decisions instead of categories
pkg/config/config_test.go	Updated tests to use decisions and removed category-based tests
pkg/config/config.go	Added Decision, ModelRef, and plugin configuration structures
pkg/classification/mcp_classifier.go	Updated to use decisions instead of categories
pkg/classification/keyword_entropy_test.go	Removed deprecated keyword entropy test file
pkg/classification/keyword_classifier.go	Changed Category field to Name in keyword rules
pkg/classification/embedding_classifier.go	Renamed Keywords to Candidates in embedding rules
pkg/classification/classifier_test.go	Removed category-based test cases
pkg/classification/classifier.go	Added decision evaluation engine and refactored to use decisions
pkg/apiserver/server_test.go	Removed deprecated system prompt endpoint tests
pkg/apiserver/server.go	Updated to use global config instead of loading from file
pkg/apiserver/route_system_prompt.go	Updated to work with decisions instead of categories
pkg/apis/vllm.ai/v1alpha1/zz_generated.deepcopy.go	Generated deepcopy methods for new CRD types
pkg/apis/vllm.ai/v1alpha1/types_route.go	New file defining IntelligentRoute CRD structure
pkg/apis/vllm.ai/v1alpha1/types.go	Replaced SemanticRoute with IntelligentPool and IntelligentRoute
pkg/apis/vllm.ai/v1alpha1/register.go	Updated to register new CRD types
pkg/apis/vllm.ai/v1alpha1/filter_types.go	Removed deprecated filter types file
cmd/main.go	Added Kubernetes controller initialization for CRD-based config
deploy/kubernetes/observability/dashboard/config.yaml	Removed deprecated pii_policy configuration


rootfs · 2025-11-18T16:02:53Z

thank you! this is really impressive!

…lm-project#681 merge

After PR vllm-project#681 merged, Categories no longer have a ModelScores field. The reasoning config moved to Decisions.ModelRefs, but there is no direct mapping from category names to decision names. Set useReasoning=false as a safe default until a proper category-to-decision mapping is implemented.

Related: PR vllm-project#648, PR vllm-project#681

Signed-off-by: Yossi Ovadia <yovadia@redhat.com>

After PR vllm-project#681 introduced decision-based routing, PII detection requires a decision to be selected. Using model="MoM" triggers domain classification, but PII test data is domain-agnostic, so no domain matches, no decision is selected, and PII detection gets disabled.

Solution: Use model="base-model" directly, which matches all decisions in the CRD. This ensures a decision is selected and PII detection is enabled. This still tests LoRA PII auto-detection as configured in the classifier settings, but ensures the decision-based PII plugin is activated.

Signed-off-by: Yossi Ovadia <yovadia@redhat.com>

…rchitecture (vllm-project#681)

* feat: implement decision-based routing with plugin architecture

  Signed-off-by: bitliu <bitliu@tencent.com>

* fix ci

  Signed-off-by: bitliu <bitliu@tencent.com>

---------

Signed-off-by: bitliu <bitliu@tencent.com>


github-actions bot assigned rootfs, wangchen615 and Xunzhuo Nov 17, 2025

Xunzhuo force-pushed the feat/decision-based-routing-with-plugins branch from 51c60a3 to eb0d095 Compare November 17, 2025 07:45

Xunzhuo changed the title ~~feat: implement decision-based routing with plugin architecture~~ [Feat]: Implement signal/decision-based routing with dynamic plugin architecture Nov 17, 2025

Xunzhuo force-pushed the feat/decision-based-routing-with-plugins branch from 25f435e to 51cefd8 Compare November 17, 2025 10:52

Xunzhuo changed the title ~~[Feat]: Implement signal/decision-based routing with dynamic plugin architecture~~ [Feat]: Implement Signal Decision-based routing with dynamic plugin architecture Nov 17, 2025

Xunzhuo force-pushed the feat/decision-based-routing-with-plugins branch from fa33e47 to d7b2c50 Compare November 17, 2025 12:16

Xunzhuo changed the title ~~[Feat]: Implement Signal Decision-based routing with dynamic plugin architecture~~ [Feat]: Implement Signal-Decision Driven Routing with Dynamic Plugin Architecture Nov 18, 2025

Xunzhuo force-pushed the feat/decision-based-routing-with-plugins branch 2 times, most recently from f70cd77 to 316ccb7 Compare November 18, 2025 09:31

Xunzhuo marked this pull request as ready for review November 18, 2025 09:45

Xunzhuo requested review from rootfs and wangchen615 as code owners November 18, 2025 09:45

Xunzhuo changed the title ~~[Feat]: Implement Signal-Decision Driven Routing with Dynamic Plugin Architecture~~ [Feat]: Signal-Decision Driven Semantic Routing with Dynamic Plugin Architecture Nov 18, 2025

Xunzhuo force-pushed the feat/decision-based-routing-with-plugins branch 3 times, most recently from 7e1d896 to 57046e1 Compare November 18, 2025 11:09

feat: implement decision-based routing with plugin architecture

91171f6

Signed-off-by: bitliu <bitliu@tencent.com>

Xunzhuo force-pushed the feat/decision-based-routing-with-plugins branch from 57046e1 to 91171f6 Compare November 18, 2025 11:53

rootfs requested a review from Copilot November 18, 2025 13:56

Copilot AI reviewed Nov 18, 2025


Xunzhuo added 2 commits November 18, 2025 22:00

Merge branch 'main' into feat/decision-based-routing-with-plugins

3abbf0f

fix ci

f9a0096

Signed-off-by: bitliu <bitliu@tencent.com>

Xunzhuo force-pushed the feat/decision-based-routing-with-plugins branch from 2672c15 to f9a0096 Compare November 18, 2025 15:14

rootfs merged commit fa74d0e into main Nov 18, 2025
31 of 32 checks passed

yehudit1987 mentioned this pull request Nov 19, 2025

[E2E] Add Signal-Decision Engine Test Coverage for New Plugin Architecture #692

Closed

5 tasks


