Commit 7bfee9d

add windows_service service monitor
1 parent df7b7b4 commit 7bfee9d

5 files changed

+153
-0
lines changed

host/windows/README.md

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
<!-- BEGIN_TF_DOCS -->
## Requirements

| Name | Version |
|------|---------|
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | ~> 1.5 |
| <a name="requirement_datadog"></a> [datadog](#requirement\_datadog) | >= 3.37 |
| <a name="requirement_null"></a> [null](#requirement\_null) | >= 3.1.0 |

## Providers

| Name | Version |
|------|---------|
| <a name="provider_datadog"></a> [datadog](#provider\_datadog) | >= 3.37 |

## Modules

No modules.

## Resources

| Name | Type |
|------|------|
| [datadog_monitor.windows_service](https://registry.terraform.io/providers/datadog/datadog/latest/docs/resources/monitor) | resource |

## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_additional_tags"></a> [additional\_tags](#input\_additional\_tags) | Additional tags to apply to all monitors | `list(string)` | `[]` | no |
| <a name="input_alert_critical_priority"></a> [alert\_critical\_priority](#input\_alert\_critical\_priority) | Priority for alerts within critical threshold (P1-P5, uses monitor defaults if not specified) | `string` | `null` | no |
| <a name="input_alert_message"></a> [alert\_message](#input\_alert\_message) | Message to prepend to alert notifications | `string` | `"Alert"` | no |
| <a name="input_alert_nodata_priority"></a> [alert\_nodata\_priority](#input\_alert\_nodata\_priority) | Priority for alerts with no data (P1-P5, uses monitor defaults if not specified) | `string` | `null` | no |
| <a name="input_base_tags"></a> [base\_tags](#input\_base\_tags) | Base tags to apply to all monitors | `list(string)` | `[]` | no |
| <a name="input_cost_center"></a> [cost\_center](#input\_cost\_center) | Cost Center of the monitored resource (leave blank to omit tag) | `string` | `null` | no |
| <a name="input_dashboard_link"></a> [dashboard\_link](#input\_dashboard\_link) | Dashboard link to include in message | `string` | `null` | no |
| <a name="input_env"></a> [env](#input\_env) | Environment the monitored resource is in (leave blank to omit tag) | `string` | `null` | no |
| <a name="input_evaluation_delay"></a> [evaluation\_delay](#input\_evaluation\_delay) | Monitor evaluation delay in seconds (see [Datadog Docs](https://docs.datadoghq.com/monitors/configuration/?tab=thresholdalert#set-alert-conditions)) | `number` | `900` | no |
| <a name="input_monitor_exclude_tags"></a> [monitor\_exclude\_tags](#input\_monitor\_exclude\_tags) | Tags to be excluded in the monitoring query. Specify in key:value format | `list(string)` | `[]` | no |
| <a name="input_monitor_include_tags"></a> [monitor\_include\_tags](#input\_monitor\_include\_tags) | Tags to be included in the monitoring query. Specify in key:value format | `list(string)` | `[]` | no |
| <a name="input_new_group_delay"></a> [new\_group\_delay](#input\_new\_group\_delay) | Delay in seconds before generating alerts for a new resource | `number` | `300` | no |
| <a name="input_notify_alert_override"></a> [notify\_alert\_override](#input\_notify\_alert\_override) | List of notifications for alerts in critical threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
| <a name="input_notify_crit_override"></a> [notify\_crit\_override](#input\_notify\_crit\_override) | List of notifications for 24x7 alerts in critical threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
| <a name="input_notify_default"></a> [notify\_default](#input\_notify\_default) | List of alert notifications (can be overridden based on alert type) | `list(string)` | n/a | yes |
| <a name="input_notify_no_data"></a> [notify\_no\_data](#input\_notify\_no\_data) | Alert if no matching data is found | `bool` | `false` | no |
| <a name="input_notify_nodata_override"></a> [notify\_nodata\_override](#input\_notify\_nodata\_override) | List of notifications for no data (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
| <a name="input_notify_nonprod_override"></a> [notify\_nonprod\_override](#input\_notify\_nonprod\_override) | List of notifications for non-prod alerts in critical threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
| <a name="input_notify_prod_override"></a> [notify\_prod\_override](#input\_notify\_prod\_override) | List of notifications for 12x5 prod alerts in critical threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
| <a name="input_notify_recovery_override"></a> [notify\_recovery\_override](#input\_notify\_recovery\_override) | List of notifications for alert recovery (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
| <a name="input_notify_warn_override"></a> [notify\_warn\_override](#input\_notify\_warn\_override) | List of notifications for alerts in warning threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
| <a name="input_renotify_interval"></a> [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `60` | no |
| <a name="input_runbook_link"></a> [runbook\_link](#input\_runbook\_link) | Runbook link to include in message | `string` | `null` | no |
| <a name="input_service"></a> [service](#input\_service) | Service associated with the monitored resource (leave blank to omit tag) | `string` | `null` | no |
| <a name="input_team"></a> [team](#input\_team) | Team supporting the monitored resource (leave blank to omit tag) | `string` | `null` | no |
| <a name="input_timeout_h"></a> [timeout\_h](#input\_timeout\_h) | Auto-resolve alert in specified hours if condition no longer matches | `number` | `0` | no |
| <a name="input_title_prefix"></a> [title\_prefix](#input\_title\_prefix) | Prefix all alerts with specified value in brackets | `string` | `null` | no |
| <a name="input_title_suffix"></a> [title\_suffix](#input\_title\_suffix) | Suffix all alerts with specified value in parentheses | `string` | `null` | no |
| <a name="input_warn_priority"></a> [warn\_priority](#input\_warn\_priority) | Priority for alerts within warning threshold (P1-P5, uses monitor defaults if not specified) | `string` | `null` | no |
| <a name="input_windows_service_alert_enabled"></a> [windows\_service\_alert\_enabled](#input\_windows\_service\_alert\_enabled) | Enable or disable the Windows service alert monitor | `bool` | `true` | no |
| <a name="input_windows_service_alert_operator"></a> [windows\_service\_alert\_operator](#input\_windows\_service\_alert\_operator) | Operator for the Windows service alert threshold comparison | `string` | `"<"` | no |
| <a name="input_windows_service_alert_threshold_critical"></a> [windows\_service\_alert\_threshold\_critical](#input\_windows\_service\_alert\_threshold\_critical) | Critical threshold for the Windows service alert | `number` | `1` | no |
| <a name="input_windows_service_alert_threshold_warning"></a> [windows\_service\_alert\_threshold\_warning](#input\_windows\_service\_alert\_threshold\_warning) | Warning threshold for the Windows service alert | `number` | `2` | no |
| <a name="input_windows_service_alert_timeframe"></a> [windows\_service\_alert\_timeframe](#input\_windows\_service\_alert\_timeframe) | Timeframe for the Windows service alert evaluation | `string` | `"5m"` | no |
| <a name="input_windows_service_alert_use_message"></a> [windows\_service\_alert\_use\_message](#input\_windows\_service\_alert\_use\_message) | Whether to use the base message for the Windows service alert | `bool` | `true` | no |

## Outputs

No outputs.
<!-- END_TF_DOCS -->
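
A minimal sketch of how this module might be called, based only on the Inputs table above. The module label, the relative `source` path, the notification handle, and the tag values are hypothetical placeholders; the handle format expected by the shared `common.tf` is not shown in this commit. `notify_default` is the only required input.

```hcl
# Hypothetical caller: label, source path, handle and tag values are placeholders.
module "windows_service_monitor" {
  source = "../../host/windows" # assumed relative path to this module

  # Required input (list of notification targets)
  notify_default = ["@ops-oncall@example.com"]

  # Optional inputs from the Inputs table
  env          = "prod"
  team         = "platform"
  title_prefix = "PROD"    # wrapped in brackets by main.tf
  title_suffix = "windows" # wrapped in parentheses by main.tf

  # Tags are given in key:value format, per the input description
  monitor_include_tags = ["os:windows"]

  # Set to false to skip creating the monitor (count = 0)
  windows_service_alert_enabled = true
}
```

All other inputs fall back to the defaults listed in the table.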

host/windows/common.tf

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
../../common/common.tf

host/windows/main.tf

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
locals {
  # these must be defined but do not need to be overridden
  monitor_alert_default_priority  = null
  monitor_warn_default_priority   = null
  monitor_nodata_default_priority = null

  title_prefix = var.title_prefix == null ? "" : "[${var.title_prefix}]"
  title_suffix = var.title_suffix == null ? "" : " (${var.title_suffix})"
}

resource "datadog_monitor" "windows_service" {
  count = var.windows_service_alert_enabled ? 1 : 0

  name    = join("", [local.title_prefix, "Windows Service Alert - {{host.name}}", local.title_suffix])
  message = var.windows_service_alert_use_message ? local.query_alert_base_message : ""
  tags    = concat(local.common_tags, var.base_tags, var.additional_tags)
  type    = "service check"

  evaluation_delay    = var.evaluation_delay
  notify_no_data      = false
  renotify_interval   = 0
  notify_audit        = false
  timeout_h           = var.timeout_h
  include_tags        = false
  require_full_window = true

  query = <<EOQ
    service_check("windows_service.state")${local.service_filter}.last("${var.windows_service_alert_timeframe}") ${var.windows_service_alert_operator} ${var.windows_service_alert_threshold_critical}
  EOQ

  monitor_thresholds {
    warning  = var.windows_service_alert_threshold_warning
    critical = var.windows_service_alert_threshold_critical
  }
}
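
For comparison, Datadog's API documentation describes service check monitor queries in the form `"check".over(tags).last(count).by(group).count_by_status()`, with thresholds expressed as numbers of consecutive check results. The fragment below is a sketch of a monitor written in that form, not the author's implementation. The resource label, the message text, the grouping tags, and the counts are assumptions, and `over("*")` stands in for whatever `local.service_filter` resolves to in the shared `common.tf`, which this commit does not show.

```hcl
# Sketch only: label, message, scope, grouping and counts are illustrative assumptions.
resource "datadog_monitor" "windows_service_sketch" {
  name    = "Windows Service Alert - {{host.name}}"
  type    = "service check"
  message = "A monitored Windows service is not reporting OK on {{host.name}}"

  # Documented form: "check".over(tags).last(count).by(group).count_by_status()
  # Grouping by the windows_service tag is an assumption about how the check is tagged.
  query = "\"windows_service.state\".over(\"*\").last(2).by(\"host\",\"windows_service\").count_by_status()"

  # For service checks, each threshold is a number of consecutive results,
  # and last(count) must be at least the largest threshold.
  monitor_thresholds {
    ok       = 1
    warning  = 1
    critical = 2
  }
}
```

In this form there is no comparison operator or time window in the query itself; the alerting condition comes from the status counts in `monitor_thresholds`.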

host/windows/variables.tf

Lines changed: 47 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,47 @@
variable "windows_service_alert_enabled" {
  description = "Enable or disable the Windows service alert monitor"
  type        = bool
  default     = true
}

variable "windows_service_alert_use_message" {
  description = "Whether to use the base message for the Windows service alert"
  type        = bool
  default     = true
}

variable "windows_service_alert_timeframe" {
  description = "Timeframe for the Windows service alert evaluation"
  type        = string
  default     = "5m"
}

variable "windows_service_alert_operator" {
  description = "Operator for the Windows service alert threshold comparison"
  type        = string
  default     = "<"
}

variable "windows_service_alert_threshold_critical" {
  description = "Critical threshold for the Windows service alert"
  type        = number
  default     = 1
}

variable "windows_service_alert_threshold_warning" {
  description = "Warning threshold for the Windows service alert"
  type        = number
  default     = 2
}

variable "base_tags" {
  description = "Base tags to apply to all monitors"
  type        = list(string)
  default     = []
}

variable "additional_tags" {
  description = "Additional tags to apply to all monitors"
  type        = list(string)
  default     = []
}
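
Because `windows_service_alert_operator` is interpolated directly into the monitor query, a typo in the operator would only surface when the Datadog API rejects the query. One possible guard, sketched below, is a Terraform `validation` block. This would replace the existing declaration rather than sit alongside it, and the list of accepted operators is an assumption, not something the commit specifies.

```hcl
# Sketch: replaces the existing declaration; the accepted operator list is an assumption.
variable "windows_service_alert_operator" {
  description = "Operator for the Windows service alert threshold comparison"
  type        = string
  default     = "<"

  validation {
    # Fail at plan time instead of at the Datadog API call.
    condition     = contains(["<", "<=", ">", ">="], var.windows_service_alert_operator)
    error_message = "The operator must be one of: <, <=, >, >=."
  }
}
```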

host/windows/versions.tf

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
../../common/versions.tf
