Configuration Guide
Garot Conklin edited this page Feb 12, 2025 · 1 revision
This guide provides detailed information about configuring monitors using the DataDog Monitor Deployer.
Monitor configurations can be defined in either YAML or JSON format. YAML is recommended for better readability.
```yaml
monitors:
  - name: "Monitor Name"
    type: "monitor_type"
    query: "monitor_query"
    message: "alert_message"
    tags: []
    options: {}
```

| Field | Type | Description |
|---|---|---|
| `name` | string | Display name of the monitor |
| `type` | string | Type of monitor (e.g., "metric alert", "log alert") |
| `query` | string | Monitor query/condition |
| `message` | string | Alert notification message |

| Field | Type | Description |
|---|---|---|
| `tags` | array | List of tags for categorization |
| `priority` | integer | Alert priority (1-5) |
| `restricted_roles` | array | Roles with access to the monitor |
| `options` | object | Additional monitor options |
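
Because JSON is accepted as well as YAML, the basic structure shown earlier can also be written as a JSON document. The sketch below builds the same placeholder monitor as a Python dictionary and serializes it with the standard `json` module. It only illustrates the shape of the data; how the tool itself loads a JSON file is not shown here.

```python
import json

# Same placeholder monitor as the YAML example, expressed as plain Python data.
config = {
    "monitors": [
        {
            "name": "Monitor Name",
            "type": "monitor_type",
            "query": "monitor_query",
            "message": "alert_message",
            "tags": [],
            "options": {},
        }
    ]
}

# Serialize to JSON text; indent=2 keeps the output readable.
config_json = json.dumps(config, indent=2)

if __name__ == "__main__":
    print(config_json)
```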
- `metric alert` - Threshold alerts on metrics
- `service check` - Status-based monitoring
- `event alert` - Event-based monitoring
- `query alert` - Complex query monitoring
- `composite` - Combined monitor conditions
- `log alert` - Log-based monitoring
- `process alert` - Process monitoring
- `trace-analytics alert` - APM monitoring
- `slo alert` - SLO monitoring
- `event-v2 alert` - Enhanced event monitoring
- `audit alert` - Audit log monitoring
- `rum alert` - Real user monitoring
- `ci-pipelines alert` - CI pipeline monitoring
- `error-tracking alert` - Error tracking

```yaml
options:
  notify_no_data: true
  no_data_timeframe: 10
  notify_audit: false
  timeout_h: 0
  evaluation_delay: 900
  new_host_delay: 300
  include_tags: true
  require_full_window: false
  renotify_interval: 60
```

```yaml
options:
  thresholds:
    critical: 90
    warning: 80
    ok: 70
    critical_recovery: 85
    warning_recovery: 75
```

```yaml
options:
  notification_preset_name: "custom"
  notification_targets:
    - type: "slack"
      channel: "#alerts"
    - type: "email"
      address: "team@example.com"
    - type: "pagerduty"
      service_key: "key123"
```

```yaml
template:
  defaults:
    tags:
      - "team:platform"
      - "env:production"
    options:
      notify_no_data: true
      evaluation_delay: 900
monitors:
  - template: base
    name: "CPU Alert"
    type: "metric alert"
    query: "avg(last_5m):avg:system.cpu.user{*} > 80"
```

```yaml
template:
  variables:
    threshold: 80
    service: "web"
    team: "platform"
monitors:
  - name: "{{ service }} CPU Usage"
    type: "metric alert"
    query: "avg(last_5m):avg:system.cpu.user{service:{{ service }}} > {{ threshold }}"
    tags:
      - "team:{{ team }}"
```

```yaml
monitors:
  - name: "Service Health"
    type: "composite"
    query: "12345 && 67890"
    message: "Multiple conditions met"
    options:
      notify_no_data: false
```

```yaml
downtime:
  scope: "env:production"
  start: "2024-03-01T00:00:00Z"
  end: "2024-03-02T00:00:00Z"
  message: "Scheduled maintenance"
```

```yaml
groups:
  infrastructure:
    monitors:
      - name: "CPU Alert"
        type: "metric alert"
        query: "avg(last_5m):avg:system.cpu.user{*} > 80"
      - name: "Memory Alert"
        type: "metric alert"
        query: "avg(last_5m):avg:system.mem.used{*} > 90"
```

```yaml
monitors:
  - name: "${SERVICE_NAME} Alert"
    type: "metric alert"
    query: "avg(last_5m):avg:system.cpu.user{service:${SERVICE_NAME}} > ${THRESHOLD}"
```

```yaml
environments:
  production:
    threshold: 90
    notification_channel: "#prod-alerts"
  staging:
    threshold: 80
    notification_channel: "#staging-alerts"
```

The tool validates configurations against a JSON schema that ensures:
- Required fields are present
- Field types are correct
- Values are within allowed ranges
- Enum values are valid
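
The four checks above can be sketched in plain Python. This is an illustration, not the tool's actual JSON schema: it assumes the fields in the first field table (`name`, `type`, `query`, `message`) are the required ones, treats the fields in the second table as optional, takes the allowed `type` values from the monitor type list on this page, and uses the documented 1-5 range for `priority`. The function name `validate_monitor` is hypothetical.

```python
# Hypothetical rules derived from the tables and lists in this guide,
# not from the tool's real schema file.
REQUIRED_FIELDS = {"name": str, "type": str, "query": str, "message": str}
OPTIONAL_FIELDS = {
    "tags": list,
    "priority": int,
    "restricted_roles": list,
    "options": dict,
}
MONITOR_TYPES = {
    "metric alert", "service check", "event alert", "query alert",
    "composite", "log alert", "process alert", "trace-analytics alert",
    "slo alert", "event-v2 alert", "audit alert", "rum alert",
    "ci-pipelines alert", "error-tracking alert",
}


def validate_monitor(monitor):
    """Return a list of problems; an empty list means the monitor passed."""
    errors = []
    # Required fields must be present and have the expected type.
    for field, expected in REQUIRED_FIELDS.items():
        if field not in monitor:
            errors.append(f"missing required field: {field}")
        elif not isinstance(monitor[field], expected):
            errors.append(f"{field} must be {expected.__name__}")
    # Optional fields are type-checked only when present.
    for field, expected in OPTIONAL_FIELDS.items():
        if field in monitor and not isinstance(monitor[field], expected):
            errors.append(f"{field} must be {expected.__name__}")
    # Enum check: the type must be one of the listed monitor types.
    mtype = monitor.get("type")
    if isinstance(mtype, str) and mtype not in MONITOR_TYPES:
        errors.append(f"unknown monitor type: {mtype}")
    # Range check: priority must fall within 1-5.
    priority = monitor.get("priority")
    if isinstance(priority, int) and not 1 <= priority <= 5:
        errors.append("priority must be between 1 and 5")
    return errors


if __name__ == "__main__":
    monitor = {
        "name": "CPU Alert",
        "type": "metric alert",
        "query": "avg(last_5m):avg:system.cpu.user{*} > 80",
        "message": "High CPU",
    }
    print(validate_monitor(monitor))
```

Collecting every problem in a list, rather than stopping at the first one, lets a single run report everything that needs fixing in a configuration.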
Queries are validated for:
- Syntax correctness
- Metric existence
- Tag validity
- Function support
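
As a rough illustration of the syntax check only, the sketch below parses simple threshold queries of the form used in this guide, such as `avg(last_5m):avg:system.cpu.user{*} > 80`. The pattern and the name `parse_metric_query` are assumptions made for this example; the real query grammar is richer (functions, grouping, arithmetic), and checking metric existence or tag validity would require a lookup against Datadog, which is not attempted here.

```python
import re

# Hypothetical pattern for simple threshold queries:
#   <time_agg>(<window>):<space_agg>:<metric>{<scope>} <operator> <number>
QUERY_PATTERN = re.compile(
    r"^(?P<time_agg>avg|sum|min|max)\((?P<window>last_\d+[mhd])\):"
    r"(?P<space_agg>avg|sum|min|max):"
    r"(?P<metric>[a-zA-Z][\w.]*)"
    r"\{(?P<scope>[^{}]*)\}\s*"
    r"(?P<op>>=|<=|>|<)\s*"
    r"(?P<threshold>-?\d+(?:\.\d+)?)$"
)


def parse_metric_query(query):
    """Return the query's parts as a dict, or None if it does not match."""
    match = QUERY_PATTERN.match(query)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse_metric_query("avg(last_5m):avg:system.cpu.user{*} > 80"))
    print(parse_metric_query("system.cpu.user > 80"))
```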
- Naming Conventions
  - Use descriptive names
  - Include environment/service
  - Be consistent
- Organization
  - Group related monitors
  - Use templates for common patterns
  - Maintain clear structure
- Version Control
  - Commit configurations
  - Use meaningful commits
  - Review changes
- Documentation
  - Comment complex queries
  - Include runbooks
  - Document variables
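
One way to keep variable documentation complete is to list every placeholder a configuration actually references. The sketch below scans configuration text for the two placeholder styles that appear in this guide, `{{ name }}` and `${NAME}`. The helper name `find_placeholders` is hypothetical, and the script is a standalone aid rather than part of the tool.

```python
import re

# Placeholder styles taken from the examples in this guide.
TEMPLATE_VAR = re.compile(r"\{\{\s*(\w+)\s*\}\}")  # e.g. {{ service }}
ENV_VAR = re.compile(r"\$\{(\w+)\}")               # e.g. ${SERVICE_NAME}


def find_placeholders(text):
    """Return the sorted, de-duplicated placeholder names found in text."""
    return {
        "template": sorted(set(TEMPLATE_VAR.findall(text))),
        "environment": sorted(set(ENV_VAR.findall(text))),
    }


if __name__ == "__main__":
    sample = (
        'query: "avg(last_5m):avg:system.cpu.user'
        '{service:{{ service }}} > {{ threshold }}"\n'
        'name: "${SERVICE_NAME} Alert"\n'
    )
    print(find_placeholders(sample))
```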
- Monitor Types - Examples of different monitor types
- Templating Guide - Advanced templating usage
- Best Practices - Configuration best practices
- DataDog API Documentation