43 changes: 43 additions & 0 deletions cmd/metric-app/main.go
package main

import (
    "log"
    "net/http"
    "os"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
    // Get cluster and workload information from environment variables
    clusterName := os.Getenv("CLUSTER_NAME")
    if clusterName == "" {
        clusterName = "unknown"
    }

    workloadName := os.Getenv("WORKLOAD_NAME")
    if workloadName == "" {
        workloadName = "unknown"
    }

    // Define a simple gauge metric for health with labels
    workloadHealth := prometheus.NewGaugeVec(
        prometheus.GaugeOpts{
            Name: "workload_health",
            Help: "Indicates if the workload is healthy (1=healthy, 0=unhealthy)",
        },
        []string{"cluster_name", "workload_name"},
    )

    // Set it to 1 (healthy) with labels
    workloadHealth.WithLabelValues(clusterName, workloadName).Set(1)

    // Register the metric with the Prometheus default registry
    prometheus.MustRegister(workloadHealth)

    // Expose the metrics endpoint
    http.Handle("/metrics", promhttp.Handler())

    // Start the HTTP server and surface any startup error
    if err := http.ListenAndServe(":8080", nil); err != nil {
        log.Fatalf("metrics server failed: %v", err)
    }
}
17 changes: 17 additions & 0 deletions docker/metric-app.Dockerfile
# Build stage
FROM golang:1.24-alpine AS builder
WORKDIR /workspace
# Copy go mod files
COPY go.mod go.sum ./
RUN go mod download
# Copy source code
COPY cmd/metric-app/ ./cmd/metric-app/
# Build the application
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o metric-app ./cmd/metric-app/main.go

# Run stage
FROM alpine:3.18
WORKDIR /app
COPY --from=builder /workspace/metric-app .
EXPOSE 8080
CMD ["./metric-app"]
117 changes: 117 additions & 0 deletions examples/prometheus/README.md
# Prometheus Setup for MetricCollector Testing

This directory contains manifests to deploy Prometheus for testing the MetricCollector controller with the sample-metric-app.

## Prerequisites

- Kind cluster running (e.g., cluster-1, cluster-2, or cluster-3)
- `test-ns` namespace exists
- `ghcr.io/metric-app:6d6cd69` image loaded into the cluster
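
If the image is not yet available in the cluster, one way to satisfy the last prerequisite is to build it from the Dockerfile in this PR and load it into kind (a sketch; the tag and cluster name are taken from this README and should be adjusted to your setup):

```bash
# Build the sample metric app image from the repo root
docker build -f docker/metric-app.Dockerfile -t ghcr.io/metric-app:6d6cd69 .

# Load it into the kind cluster so the Deployment can use it without a registry pull
kind load docker-image ghcr.io/metric-app:6d6cd69 --name cluster-1
```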

## Quick Start

```bash
# Switch to target cluster
kubectl config use-context kind-cluster-1

# Create namespace if needed
kubectl create namespace test-ns --dry-run=client -o yaml | kubectl apply -f -

# Deploy Prometheus
kubectl apply -f rbac.yaml
kubectl apply -f configmap.yaml
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml

# Deploy sample-metric-app (from parent examples directory)
kubectl apply -f ../sample-metric-app.yaml

# Wait for pods to be ready
kubectl wait --for=condition=ready pod -l app=prometheus -n test-ns --timeout=60s
kubectl wait --for=condition=ready pod -l app=sample-metric-app -n test-ns --timeout=60s
```

## Verify Setup

### Check Prometheus is scraping metrics

```bash
# Port-forward Prometheus
kubectl port-forward -n test-ns svc/prometheus 9090:9090
```

Open http://localhost:9090 in your browser and:
1. Go to Status > Targets: the `sample-metric-app` pod should appear as a target
2. Go to Graph and run the query `workload_health`: the metric should be reported with a value of 1
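
The same check can also be done from the command line against the Prometheus HTTP API while the port-forward is running:

```bash
# Query the current value of workload_health via the HTTP API
curl -s 'http://localhost:9090/api/v1/query?query=workload_health'
```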

### Test with prometheus-test tool

```bash
# In another terminal (while port-forward is running)
cd tools/prometheus-test
go build -o prometheus-test main.go
./prometheus-test http://localhost:9090 workload_health

# Or with namespace filter
./prometheus-test http://localhost:9090 workload_health test-ns
```

Expected output:
```
Querying Prometheus at: http://localhost:9090
Query: workload_health{namespace="test-ns"}

Status: success
Result Type: vector
Number of results: 1

Result 1:
  Labels:
    app: sample-metric-app
    namespace: test-ns
    pod: sample-metric-app-xxxxx
  Timestamp: 1732032000.0
  Value: 1
```

## Configuration Details

### Prometheus ConfigMap
The Prometheus configuration discovers pods with these annotations:
- `prometheus.io/scrape: "true"` - Enable scraping
- `prometheus.io/port: "8080"` - Port to scrape
- `prometheus.io/path: "/metrics"` - Metrics endpoint path

The sample-metric-app already has these annotations configured.
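
To confirm the annotations made it onto the running pod, a quick check (assuming the pod carries the `app=sample-metric-app` label used elsewhere in this README):

```bash
kubectl get pod -n test-ns -l app=sample-metric-app \
  -o jsonpath='{.items[0].metadata.annotations}'
```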

### RBAC
Prometheus needs permissions to discover and scrape pods. The `rbac.yaml` creates:
- ServiceAccount for Prometheus
- ClusterRole with pod discovery permissions
- ClusterRoleBinding to grant permissions
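
A quick way to sanity-check that the binding took effect (a sketch using `kubectl auth can-i` on behalf of the ServiceAccount):

```bash
kubectl auth can-i list pods --as=system:serviceaccount:test-ns:prometheus
# Expected output: yes
```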

## Testing MetricCollector

Once Prometheus is running, create a MetricCollector CR:

```bash
kubectl apply -f - <<EOF
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: MetricCollector
metadata:
  name: sample-collector
  namespace: test-ns
spec:
  workloadSelector:
    labelSelector:
      matchLabels:
        app: sample-metric-app
    namespaces:
    - test-ns
  metricsEndpoint:
    sourceType: prometheus
    prometheusEndpoint:
      url: http://prometheus.test-ns.svc.cluster.local:9090
  collectionInterval: "30s"
  metricsToCollect:
  - workload_health
EOF
```
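
After applying it, the collector can be inspected with `kubectl`; the resource names below assume the CRD registers the lowercase `metriccollector`/`metriccollectors` names:

```bash
kubectl get metriccollectors -n test-ns
kubectl describe metriccollector sample-collector -n test-ns
```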
44 changes: 44 additions & 0 deletions examples/prometheus/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: test-ns
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      evaluation_interval: 15s

    scrape_configs:
    - job_name: 'kubernetes-pods'
      kubernetes_sd_configs:
      - role: pod
        namespaces:
          names:
          - test-ns
      relabel_configs:
      # Only scrape pods with the prometheus.io/scrape annotation
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      # Use the port from the prometheus.io/port annotation, keeping the pod IP
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      # Use the path from the prometheus.io/path annotation (defaults to /metrics)
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      # Add pod metadata as labels
      - source_labels: [__meta_kubernetes_namespace]
        target_label: namespace
      - source_labels: [__meta_kubernetes_pod_name]
        target_label: pod
      - source_labels: [__meta_kubernetes_pod_label_app]
        target_label: app
      # Map all remaining pod labels onto the scraped series
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
48 changes: 48 additions & 0 deletions examples/prometheus/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  namespace: test-ns
  labels:
    app: prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      serviceAccountName: prometheus
      containers:
      - name: prometheus
        image: prom/prometheus:v2.47.0
        args:
        - '--config.file=/etc/prometheus/prometheus.yml'
        - '--storage.tsdb.path=/prometheus'
        - '--web.console.libraries=/usr/share/prometheus/console_libraries'
        - '--web.console.templates=/usr/share/prometheus/consoles'
        - '--web.enable-lifecycle'
        ports:
        - name: web
          containerPort: 9090
        volumeMounts:
        - name: prometheus-config
          mountPath: /etc/prometheus
        - name: prometheus-storage
          mountPath: /prometheus
        resources:
          requests:
            cpu: 100m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
      volumes:
      - name: prometheus-config
        configMap:
          name: prometheus-config
      - name: prometheus-storage
        emptyDir: {}
39 changes: 39 additions & 0 deletions examples/prometheus/rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: test-ns
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/proxy
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups:
  - extensions
  resources:
  - ingresses
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: test-ns
16 changes: 16 additions & 0 deletions examples/prometheus/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: test-ns
  labels:
    app: prometheus
spec:
  type: ClusterIP
  ports:
  - name: web
    port: 9090
    targetPort: 9090
    protocol: TCP
  selector:
    app: prometheus
22 changes: 22 additions & 0 deletions examples/sample-metric-app-crp.yaml
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterResourcePlacement
metadata:
  name: sample-metric-app
spec:
  resourceSelectors:
  - group: ""
    version: v1
    kind: Namespace
    name: test-ns
  - group: "rbac.authorization.k8s.io"
    version: v1
    kind: ClusterRole
    name: prometheus
  - group: "rbac.authorization.k8s.io"
    version: v1
    kind: ClusterRoleBinding
    name: prometheus
  policy:
    placementType: PickAll
  strategy:
    type: RollingUpdate