-
Notifications
You must be signed in to change notification settings - Fork 0
System Monitoring
Garot Conklin edited this page Feb 6, 2025
·
1 revision
Example configuration for a comprehensive system monitoring dashboard.
version: "1.0"
dashboards:
- name: "System Monitoring"
description: "System performance and health metrics"
layout_type: "ordered"
widgets:
- title: "CPU Usage"
type: "timeseries"
query: "avg:system.cpu.user{*} by {host}"
- title: "Memory Usage"
type: "timeseries"
query: "avg:system.mem.used{*} by {host}"
- title: "Disk Usage"
type: "query_value"
query: "avg:system.disk.used{*} by {host}"
- title: "Network Traffic"
type: "timeseries"
query: "avg:system.net.bytes_rcvd{*} by {host}"version: "1.0"
defaults:
refresh_interval: 300
tags:
- "env:production"
- "team:infrastructure"
dashboards:
- name: "Advanced System Monitoring"
description: "Detailed system metrics with alerts"
layout_type: "ordered"
template_variables:
- name: "env"
prefix: "env"
default: "prod"
widgets:
- title: "CPU Utilization"
type: "timeseries"
query: "avg:system.cpu.user{env:$env} by {host}"
conditional_formats:
- comparator: ">"
value: 80
palette: "red"
- comparator: ">"
value: 60
palette: "yellow"
- title: "Memory Usage %"
type: "query_value"
query: "avg:system.mem.used{env:$env} by {host}"
precision: 2-
Timeseries
- Historical data visualization
- Trend analysis
- Multiple metrics comparison
-
Query Value
- Current value display
- Threshold indicators
- Status monitoring
-
Heat Map
- Resource utilization patterns
- Performance bottlenecks
- Anomaly detection
- Use appropriate time windows
- Set meaningful thresholds
- Group related metrics
- Include host-level granularity
- Implement proper tagging