Skip to content

Commit 15785d5

Browse files
author
Arvind Thirumurugan
committed
update README
Signed-off-by: Arvind Thirumurugan <arvindth@microsoft.com>
1 parent bb5b1a3 commit 15785d5

File tree

1 file changed

+158
-29
lines changed
  • approval-controller-metric-collector

1 file changed

+158
-29
lines changed

approval-controller-metric-collector/README.md

Lines changed: 158 additions & 29 deletions
Original file line numberDiff line numberDiff line change
@@ -35,45 +35,67 @@ make setup-clusters
3535
```
3636

3737
This will create:
38-
- 1 hub cluster
39-
- 3 member clusters (or the number specified in MEMBER_CLUSTER_COUNT)
38+
- 1 hub cluster (context: `kind-hub`)
39+
- 3 member clusters (contexts: `kind-cluster-1`, `kind-cluster-2`, `kind-cluster-3`)
4040

41-
### 2. Deploy Prometheus on Member Clusters
41+
### 2. Register Member Clusters with Hub
4242

43-
On each member cluster, deploy Prometheus to collect metrics:
43+
Switch to hub cluster context and register the member clusters:
4444

4545
```bash
46-
# Create namespace for test workloads
47-
kubectl create ns test-ns
46+
cd /path/to/approval-controller-metric-collector/approval-request-controller
47+
48+
# Switch to hub cluster
49+
kubectl config use-context kind-hub
50+
51+
# Register member clusters with the hub
52+
kubectl apply -f ./examples/membercluster/
4853

49-
# Deploy Prometheus
50-
kubectl apply -f ./approval-request-controller/examples/prometheus/
54+
# Verify clusters are registered
55+
kubectl get cluster -A
5156
```
5257

53-
This deploys:
54-
- Prometheus server
55-
- ServiceMonitor configuration
56-
- RBAC permissions
58+
the output should look something like this,
5759

58-
### 3. Deploy Sample Metric Application
60+
```bash
61+
NAME JOINED AGE MEMBER-AGENT-LAST-SEEN NODE-COUNT AVAILABLE-CPU AVAILABLE-MEMORY
62+
kind-cluster-1 True 40s 29s 0 0 0
63+
kind-cluster-2 True 40s 3s 0 0 0
64+
kind-cluster-3 True 40s 37s 0 0 0
65+
```
66+
Wait until all member clusters show as joined.
5967

60-
The sample metric application image will be automatically built and loaded to all member clusters when you run the `install-on-member.sh` script in Step 5 below. This ensures the `metric-app:local` image is available on each cluster before propagating the deployment from hub to member clusters.
68+
### 3. Deploy Prometheus
6169

62-
After the metric collector is installed (Step 5), deploy the sample application that exposes health metrics:
70+
Create the prometheus namespace and deploy Prometheus for metrics collection:
6371

6472
```bash
65-
kubectl apply -f ./approval-request-controller/examples/sample-metric-app/sample-metric-app.yaml
73+
# Create prometheus namespace
74+
kubectl create ns prometheus
75+
76+
# Deploy Prometheus (ConfigMap, Deployment, Service, RBAC)
77+
kubectl apply -f ./examples/prometheus/
6678
```
6779

68-
This creates a deployment in the `test-ns` namespace with Prometheus annotations for metric scraping. When this deployment is propagated to member clusters via ClusterResourcePlacement, the image will already be available locally.
80+
This deploys Prometheus configured to scrape pods from all namespaces with the proper annotations.
6981

70-
### 4. Install Approval Request Controller (Hub Cluster)
82+
### 4. Deploy Sample Metric Application
7183

72-
On the hub cluster, install the approval request controller using the provided installation script:
84+
Create the test namespace and deploy the sample application:
7385

7486
```bash
75-
cd approval-request-controller
87+
# Create test namespace
88+
kubectl create ns test-ns
89+
90+
# Deploy sample metric app (this will be propagated to member clusters)
91+
kubectl apply -f ./examples/sample-metric-app/
92+
```
7693

94+
### 5. Install Approval Request Controller (Hub Cluster)
95+
96+
Install the approval request controller on the hub cluster:
97+
98+
```bash
7799
# Run the installation script
78100
./install-on-hub.sh
79101
```
@@ -85,12 +107,22 @@ The script performs the following:
85107
4. Installs the controller via Helm with the custom CRDs (MetricCollector, MetricCollectorReport, WorkloadTracker)
86108
5. Verifies the installation
87109

88-
### 5. Install Metric Collector (Member Clusters)
110+
### 6. Configure Workload Tracker
111+
112+
Apply the WorkloadTracker to define which workloads to monitor:
113+
114+
```bash
115+
kubectl apply -f ./examples/workloadtracker/
116+
```
117+
118+
This tells the approval controller which workloads to track and what health thresholds to use.
119+
120+
### 7. Install Metric Collector (Member Clusters)
89121

90-
On each member cluster, install the metric collector. The installation script will automatically build and load both the metric-collector and metric-app images to all specified member clusters:
122+
Install the metric collector on all member clusters:
91123

92124
```bash
93-
cd standalone-metric-collector
125+
cd ../standalone-metric-collector
94126

95127
# Run the installation script for all member clusters
96128
# This builds both metric-collector and metric-app images and loads them into each cluster
@@ -106,32 +138,129 @@ The script performs the following for each member cluster:
106138

107139
The `metric-app:local` image is pre-loaded so it's available when you propagate the sample-metric-app deployment from hub to member clusters.
108140

141+
### 8. Create Staged Update
142+
143+
Switch back to hub cluster and create a staged update run:
144+
145+
```bash
146+
# Switch to hub cluster
147+
kubectl config use-context kind-hub
148+
149+
cd ../approval-request-controller
150+
151+
# Apply ClusterStagedUpdateStrategy
152+
kubectl apply -f ./examples/updateRun/example-csus.yaml
153+
154+
# Apply ClusterResourcePlacement
155+
kubectl apply -f ./examples/updateRun/example-crp.yaml
156+
```
157+
158+
```bash
159+
# Verify CRP is created
160+
kubectl get crp -A
161+
```
162+
163+
```bash
164+
NAME GEN SCHEDULED SCHEDULED-GEN AVAILABLE AVAILABLE-GEN AGE
165+
example-crp 1 True 1 4s
166+
proemetheus-crp 1 True 1 True 1 3m1s
167+
```
168+
169+
```bash
170+
# Apply ClusterStagedUpdateRun to start the staged rollout
171+
kubectl apply -f ./examples/updateRun/example-csur.yaml
172+
```
173+
174+
```bash
175+
NAME PLACEMENT RESOURCE-SNAPSHOT-INDEX POLICY-SNAPSHOT-INDEX INITIALIZED PROGRESSING SUCCEEDED AGE
176+
example-cluster-staged-run example-crp 0 0 True True 5s
177+
```
178+
179+
### 9. Monitor the Staged Rollout
180+
181+
Watch the staged update progress:
182+
183+
```bash
184+
# Check the staged update run status
185+
kubectl get csur -A
186+
```
187+
188+
```bash
189+
# Check approval requests (should be auto-approved based on metrics)
190+
kubectl get clusterapprovalrequest -A
191+
```
192+
193+
```bash
194+
NAME UPDATE-RUN STAGE APPROVED AGE
195+
example-cluster-staged-run-after-prod example-cluster-staged-run prod True 2m9s
196+
```
197+
198+
```bash
199+
# Check metric collector reports
200+
kubectl get metriccollectorreport -A
201+
```
202+
203+
```bash
204+
NAMESPACE NAME WORKLOADS LAST-COLLECTION AGE
205+
fleet-member-kind-cluster-1 mc-example-cluster-staged-run-staging 1 27s 2m57s
206+
```
207+
208+
Eventually all the approval requests are approved for the updateRun.
209+
210+
The approval controller will automatically approve stages when the metric collectors report that workloads are healthy.
211+
1. Builds the `metric-collector:latest` image
212+
2. Builds the `metric-app:local` image
213+
3. Loads both images into each kind cluster
214+
4. Creates hub token secret with proper RBAC
215+
5. Installs the metric-collector via Helm
216+
217+
The `metric-app:local` image is pre-loaded so it's available when you propagate the sample-metric-app deployment from hub to member clusters.
218+
109219
## Verification
110220

111221
### Check Controller Status
112222

113223
On the hub cluster:
114224
```bash
225+
kubectl config use-context kind-hub
115226
kubectl get pods -n fleet-system
116-
kubectl logs -n fleet-system deployment/approval-request-controller
227+
kubectl logs -n fleet-system deployment/approval-request-controller -f
117228
```
118229

119230
On member clusters:
120231
```bash
121-
kubectl get pods -n fleet-system
122-
kubectl logs -n fleet-system deployment/metric-collector
232+
kubectl config use-context kind-cluster-1
233+
kubectl get pods -n default
234+
kubectl logs -n default deployment/metric-collector -f
123235
```
124236

125237
### Check Metrics Collection
126238

239+
Verify that MetricCollector resources exist on member clusters:
240+
```bash
241+
kubectl config use-context kind-cluster-1
242+
kubectl get metriccollector -A
243+
```
244+
127245
Verify that MetricCollectorReports are being created on the hub:
128246
```bash
129-
kubectl get metriccollectorreports -A
247+
kubectl config use-context kind-hub
248+
kubectl get metriccollectorreport -A
130249
```
131250

132-
### Test Staged Rollout
251+
### Monitor Staged Update Progress
133252

134-
Create a staged rollout with approval requirements and watch the controller automatically approve based on workload health metrics.
253+
Watch the approval process:
254+
```bash
255+
# Watch staged update run status
256+
kubectl get csur -A -w
257+
258+
# Check cluster approval requests
259+
kubectl get clusterapprovalrequest -A
260+
261+
# View approval request details
262+
kubectl describe clusterapprovalrequest <approval-request-name>
263+
```
135264

136265
## How It Works
137266

0 commit comments

Comments
 (0)