|
1 | 1 | --- |
2 | 2 | description: >- |
3 | | - Learn how to implement distributed tracing in Seldon Core 2 using |
4 | | - OpenTelemetry and Jaeger for monitoring and debugging ML model serving |
5 | | - pipelines. |
| 3 | + This guide walks you through setting up Jaeger Tracing for Seldon Core 2 on Kubernetes. By the end of this guide, you will be able to visualize inference traces as they pass through your Core 2 components. |
6 | 4 | --- |
7 | 5 |
|
8 | | -# Tracing |
| 6 | +## Prerequisites |
9 | 7 |
|
10 | | -We support Open Telemetry tracing. By default all components will attempt to send OLTP events to`seldon-collector.seldon-mesh:4317` which will export to Jaeger at `simplest-collector.seldon-mesh:4317`. |
| 8 | +* Set up and connect to a Kubernetes cluster running version 1.27 or later. For instructions on connecting to your Kubernetes cluster, refer to the documentation provided by your cloud provider. |
| 9 | +* Install [kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl), the Kubernetes command-line tool. |
| 10 | +* Install [Helm](https://helm.sh/docs/intro/install/), the package manager for Kubernetes. |
| 11 | +* Install [Seldon Core 2](../installation/README.md). |
| 12 | +* Install [cert-manager](https://cert-manager.io/docs/installation/kubectl/) in the namespace `cert-manager`. |
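
Before continuing, it can help to confirm the command-line prerequisites. The following is a minimal sketch rather than part of the official installation steps: `check_tools` is a hypothetical helper name introduced here, and it only checks that the named CLIs are on your `PATH`.

```bash
# Hypothetical helper (not part of Seldon Core 2 or Jaeger):
# prints each named tool that is not on PATH and returns non-zero
# if at least one is missing.
check_tools() {
  local tool
  local missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, run `check_tools kubectl helm`, then `kubectl version` to check the cluster version and `kubectl get pods -n cert-manager` to check that cert-manager is running.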
11 | 13 |
|
12 | | -The components can be installed from the `tracing/k8s` folder. In future an Ansible playbook will\ |
13 | | -be created. This installs a Open Telemetry collector and a simple Jaeger install with a service that\ |
14 | | -can be port forwarded to at `simplest.seldon-mesh:16686`. |
| 14 | +To set up Jaeger Tracing for Seldon Core 2 on Kubernetes and visualize inference traces of the Seldon Core 2 components, complete the following steps: |
| 15 | +1. [Create a namespace](#create-a-namespace) |
| 16 | +2. [Install Jaeger Operator](#install-jaeger-operator) |
| 17 | +3. [Deploy a Jaeger instance](#deploy-a-minimal-jaeger-instance) |
| 18 | +4. [Configure Core 2](#configure-seldon-core-2) |
| 19 | +5. [Generate traffic](#generate-traffic) |
| 20 | +6. [Visualize the traces](#access-the-jaeger-ui) |
15 | 21 |
|
16 | | -An example Jaeger trace is show below: |
| 22 | +## Create a namespace |
| 23 | +Create a dedicated namespace to install the Jaeger Operator and tracing resources: |
| 24 | +```bash |
| 25 | +kubectl create namespace tracing |
| 26 | +``` |
| 27 | +## Install Jaeger Operator |
| 28 | + |
| 29 | +The Jaeger Operator manages Jaeger instances in the Kubernetes cluster. Use the [Helm chart](https://github.com/jaegertracing/helm-charts/tree/v2) for Jaeger v2. |
| 30 | + |
| 31 | +1. Add the Jaeger Helm repository: |
| 32 | +```bash |
| 33 | +helm repo add jaegertracing https://jaegertracing.github.io/helm-charts |
| 34 | +helm repo update |
| 35 | +``` |
| 36 | +2. Create a minimal `tracing-values.yaml`: |
| 37 | +```yaml |
| 38 | +rbac: |
| 39 | +  clusterRole: true |
| 40 | +  create: true |
| 41 | +  pspEnabled: false |
| 42 | +``` |
| 43 | +3. Install or upgrade the Jaeger Operator in the tracing namespace: |
| 44 | +```bash |
| 45 | +helm upgrade tracing jaegertracing/jaeger-operator \ |
| 46 | + --version 2.57.0 \ |
| 47 | + -f tracing-values.yaml \ |
| 48 | + -n tracing \ |
| 49 | + --install |
| 50 | +``` |
| 51 | +4. Validate that the Jaeger Operator Pod is running: |
| 52 | +```bash |
| 53 | +kubectl get pods -n tracing |
| 54 | +``` |
| 55 | +Output is similar to: |
| 56 | +```bash |
| 57 | +NAME READY STATUS RESTARTS AGE |
| 58 | +tracing-jaeger-operator-549b79b848-h4p4d 1/1 Running 0 96s |
| 59 | +``` |
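
Instead of polling `kubectl get pods`, you can block until the operator has finished rolling out. This is a sketch under one assumption: the Deployment is named `tracing-jaeger-operator`, which is inferred from the Pod name in the output above and is not confirmed by the chart documentation. `wait_for_operator` is a hypothetical helper name.

```bash
# Assumption: the operator Deployment is called "tracing-jaeger-operator"
# (Helm release name "tracing" plus the chart name), as the Pod name suggests.
# $1 is an optional timeout, defaulting to 120s.
wait_for_operator() {
  kubectl rollout status deployment/tracing-jaeger-operator \
    -n tracing --timeout="${1:-120s}"
}
```

Calling `wait_for_operator 60s` returns once the rollout completes, or fails after 60 seconds.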
| 60 | +## Deploy a minimal Jaeger instance |
| 61 | +Install a simple Jaeger custom resource in the namespace `seldon-mesh`, where Seldon Core 2 is running. |
| 62 | +{% hint style="info" %} |
| 63 | +This CR is suitable for local development, demos, and quick-start scenarios. It is not recommended for production because all components and trace data are ephemeral. |
| 64 | +{% endhint %} |
| 65 | + |
| 66 | +1. Create a manifest file named `jaeger-simplest.yaml` with these contents: |
| 67 | +```yaml |
| 68 | +apiVersion: jaegertracing.io/v1 |
| 69 | +kind: Jaeger |
| 70 | +metadata: |
| 71 | +  name: simplest |
| 72 | +  namespace: seldon-mesh |
| 73 | +``` |
| 74 | +2. Apply the manifest: |
| 75 | +```bash |
| 76 | +kubectl apply -f jaeger-simplest.yaml |
| 77 | +``` |
| 78 | +3. Verify that the Jaeger all-in-one pod is running: |
| 79 | +```bash |
| 80 | +kubectl get pods -n seldon-mesh | grep simplest |
| 81 | +``` |
| 82 | +Output is similar to: |
| 83 | +```bash |
| 84 | +NAME READY STATUS RESTARTS AGE |
| 85 | +simplest-8686f5d96-4ptb4 1/1 Running 0 45s |
| 86 | +``` |
| 87 | +This `simplest` Jaeger CR does the following: |
| 88 | + |
| 89 | +- **All-in-one pod**: Deploys a single pod running the collector, agent, query service, and UI, using in-memory storage. |
| 90 | + |
| 91 | +- **Core 2 integration**: Receives spans from Seldon Core 2 components and exposes a UI for viewing traces. |
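
Later steps in this guide refer to two Services by name, `simplest-collector` (the exporter endpoint) and `simplest-query` (the UI). The sketch below checks that both exist before you continue. `check_jaeger_services` is a hypothetical helper name, and the namespace defaults to `seldon-mesh` as used in this guide.

```bash
# Hypothetical helper: confirms that the two Services this guide relies on
# exist in the given namespace (default: seldon-mesh).
check_jaeger_services() {
  local ns="${1:-seldon-mesh}"
  local svc
  for svc in simplest-collector simplest-query; do
    if kubectl get service "$svc" -n "$ns" >/dev/null 2>&1; then
      echo "found: $svc"
    else
      echo "not found: $svc"
      return 1
    fi
  done
}
```

If either Service is reported as not found, check the operator logs in the `tracing` namespace before moving on.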
| 92 | + |
| 93 | +## Configure Seldon Core 2 |
| 94 | + |
| 95 | +To enable tracing, configure the OpenTelemetry exporter endpoint in the [SeldonRuntime](../kubernetes/resources/seldonruntime.md) resource so that traces are sent to the Jaeger collector service created by the `simplest` Jaeger custom resource. The default values for the Seldon Runtime Helm chart are in [values.yaml](https://github.com/SeldonIO/seldon-core/blob/v2/k8s/helm-charts/seldon-core-v2-runtime/values.yaml). |
| 96 | + |
| 97 | +1. Find the `seldonruntime` Custom Resource that needs to be updated using: `kubectl get seldonruntimes -n seldon-mesh` |
| 98 | +2. Patch your Custom Resource to include `tracingConfig` under `spec.config` using: |
| 99 | + |
| 100 | +```bash |
| 101 | +kubectl patch seldonruntime seldon -n seldon-mesh \ |
| 102 | + --type merge \ |
| 103 | + -p '{"spec":{"config":{"kafkaConfig":{"bootstrap.servers":"seldon-kafka-bootstrap.seldon-mesh:9092","consumer":{"auto.offset.reset":"earliest"},"topics":{"numPartitions":4}},"scalingConfig":{"servers":{}},"tracingConfig":{"otelExporterEndpoint":"simplest-collector.seldon-mesh:4317"}}}}' |
| 104 | +``` |
| 105 | +Output is similar to: |
| 106 | + |
| 107 | +```bash |
| 108 | +seldonruntime.mlops.seldon.io/seldon patched |
| 109 | +``` |
| 110 | +3. Inspect the updated resource using: `kubectl get seldonruntime seldon -n seldon-mesh -o yaml` |
| 111 | + |
| 112 | +Output is similar to: |
| 113 | + |
| 114 | +```yaml |
| 115 | +spec: |
| 116 | +  config: |
| 117 | +    agentConfig: |
| 118 | +      rclone: {} |
| 119 | +    kafkaConfig: |
| 120 | +      bootstrap.servers: seldon-kafka-bootstrap.seldon-mesh:9092 |
| 121 | +      consumer: |
| 122 | +        auto.offset.reset: earliest |
| 123 | +      topics: |
| 124 | +        numPartitions: 4 |
| 125 | +    scalingConfig: |
| 126 | +      servers: {} |
| 127 | +    serviceConfig: {} |
| 128 | +    tracingConfig: |
| 129 | +      otelExporterEndpoint: simplest-collector.seldon-mesh:4317 |
| 130 | +``` |
| 131 | + |
| 132 | +4. Restart the following Core 2 component Pods so they pick up the new tracing configuration from the `seldon-tracing` ConfigMap in the `seldon-mesh` namespace: |
| 133 | + |
| 134 | +- seldon-dataflow-engine |
| 135 | + |
| 136 | +- seldon-pipeline-gateway |
| 137 | + |
| 138 | +- seldon-model-gateway |
| 139 | + |
| 140 | +- seldon-scheduler |
| 141 | + |
| 142 | +- Servers |
| 143 | + |
| 144 | +After the restart, these components read the updated tracing configuration and start emitting traces to Jaeger. |
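
The check and restart steps above can be sketched as two small shell helpers. Both function names are hypothetical. The guide does not state whether each component runs as a Deployment or a StatefulSet, so `restart_workload` takes the `kind/name` pair as an argument rather than assuming one.

```bash
# Print only the exporter endpoint configured on the SeldonRuntime
# named "seldon" in the seldon-mesh namespace.
tracing_endpoint() {
  kubectl get seldonruntime seldon -n seldon-mesh \
    -o jsonpath='{.spec.config.tracingConfig.otelExporterEndpoint}'
}

# Restart one workload and wait for it to become ready again.
# $1 is a kind/name pair, for example deployment/<name> or statefulset/<name>.
restart_workload() {
  kubectl rollout restart "$1" -n seldon-mesh &&
    kubectl rollout status "$1" -n seldon-mesh --timeout=180s
}
```

List the workloads first with `kubectl get deployments,statefulsets -n seldon-mesh`, then call the helper once per component, for example `restart_workload deployment/seldon-dataflow-engine` if that component is a Deployment in your installation. `tracing_endpoint` should print `simplest-collector.seldon-mesh:4317` after the patch.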
| 145 | + |
| 146 | +## Generate traffic |
| 147 | + |
| 148 | +To visualize traces, send requests to your models or pipelines deployed in Seldon Core 2. Each [inference request](../installation/test-installation.md) should produce a trace in the Jaeger UI that shows its path through Core 2 components such as the gateways, the dataflow engine, and the server agents. |
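
As an illustration, the sketch below sends a few identical inference requests so that several traces appear in Jaeger. It rests on assumptions that are not part of this guide: the host you pass is the address of your Seldon mesh ingress, a model with the name you pass is already loaded, and that model accepts a single 1x4 `FP32` input named `predict` (as an iris-style classifier would). `send_requests` is a hypothetical helper name; adjust the payload to match your own model.

```bash
# Hypothetical helper: sends COUNT inference requests and prints the HTTP
# status code of each one.
#   $1 = host[:port] of the Seldon mesh ingress (assumption)
#   $2 = model name (assumption: already deployed)
#   $3 = number of requests, default 5
send_requests() {
  local host="$1"
  local model="$2"
  local count="${3:-5}"
  local i=0
  while [ "$i" -lt "$count" ]; do
    curl -s -o /dev/null -w '%{http_code}\n' \
      "http://${host}/v2/models/${model}/infer" \
      -H "Content-Type: application/json" \
      -H "Seldon-Model: ${model}" \
      -d '{"inputs":[{"name":"predict","shape":[1,4],"datatype":"FP32","data":[[1,2,3,4]]}]}'
    i=$((i + 1))
  done
}
```

For example, `send_requests "$MESH_IP" iris 10` sends ten requests, where `MESH_IP` and `iris` are placeholders for your own ingress address and model name.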
| 149 | + |
| 150 | +## Access the Jaeger UI |
| 151 | + |
| 152 | +1. Port-forward the Jaeger query service to your local machine: |
| 153 | +```bash |
| 154 | +kubectl port-forward svc/simplest-query -n seldon-mesh 16686:16686 |
| 155 | +``` |
| 156 | +2. Open the Jaeger UI in your browser: |
| 157 | +```bash |
| 158 | +http://localhost:16686 |
| 159 | +``` |
| 160 | +You can now explore traces emitted by Seldon Core 2 components. |
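
If the UI shows no services, a quick command-line check can help separate a port-forward problem from a tracing problem. The sketch below calls the `/api/services` endpoint that the Jaeger UI itself uses. Treat this as an assumption: it is an internal endpoint rather than a stable public API, and `list_traced_services` is a hypothetical helper name.

```bash
# Hypothetical helper: asks the Jaeger query service which services have
# reported spans. $1 is host:port, defaulting to the port-forward above.
# Assumption: /api/services is the UI's internal endpoint and may change.
list_traced_services() {
  curl -s "http://${1:-localhost:16686}/api/services"
}
```

With the port-forward running, `list_traced_services` should return a JSON document. Once traffic has been sent, the Core 2 components that emitted spans should be listed in it.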
| 161 | + |
| 162 | +An example Jaeger trace is shown below: |
17 | 163 |
|
18 | 164 | .png>) |