Commit ca8132d

add elasticache memory monitor
1 parent 926bed0 commit ca8132d

3 files changed, 76 insertions(+), 1 deletion(-)

aws/elasticache/README.md

Lines changed: 8 additions & 1 deletion
@@ -40,6 +40,7 @@ No modules.
 | [datadog_monitor.hit_rate](https://registry.terraform.io/providers/datadog/datadog/latest/docs/resources/monitor) | resource |
 | [datadog_monitor.hit_rate_anomaly](https://registry.terraform.io/providers/datadog/datadog/latest/docs/resources/monitor) | resource |
 | [datadog_monitor.max_connections](https://registry.terraform.io/providers/datadog/datadog/latest/docs/resources/monitor) | resource |
+| [datadog_monitor.memory_utilization](https://registry.terraform.io/providers/datadog/datadog/latest/docs/resources/monitor) | resource |
 | [datadog_monitor.swap_usage](https://registry.terraform.io/providers/datadog/datadog/latest/docs/resources/monitor) | resource |
 
 ## Inputs
@@ -97,6 +98,12 @@ No modules.
 | <a name="input_max_connections_threshold_critical"></a> [max\_connections\_threshold\_critical](#input\_max\_connections\_threshold\_critical) | Critical threshold (connections) | `number` | `64000` | no |
 | <a name="input_max_connections_threshold_warning"></a> [max\_connections\_threshold\_warning](#input\_max\_connections\_threshold\_warning) | Warning threshold (connections) | `number` | `60000` | no |
 | <a name="input_max_connections_use_message"></a> [max\_connections\_use\_message](#input\_max\_connections\_use\_message) | Whether to use the query alert base message for max connections monitor | `bool` | `false` | no |
+| <a name="input_memory_utilization_enabled"></a> [memory\_utilization\_enabled](#input\_memory\_utilization\_enabled) | Enable memory utilization monitor | `bool` | `false` | no |
+| <a name="input_memory_utilization_evaluation_window"></a> [memory\_utilization\_evaluation\_window](#input\_memory\_utilization\_evaluation\_window) | Evaluation window for monitor (`last_?m` (1, 5, 10, 15, or 30), `last_?h` (1, 2, or 4), or `last_1d`) | `string` | `"last_1h"` | no |
+| <a name="input_memory_utilization_no_data_window"></a> [memory\_utilization\_no\_data\_window](#input\_memory\_utilization\_no\_data\_window) | No data threshold (in minutes, 0 to disable) | `number` | `15` | no |
+| <a name="input_memory_utilization_threshold_critical"></a> [memory\_utilization\_threshold\_critical](#input\_memory\_utilization\_threshold\_critical) | Critical threshold (percentage) | `number` | `80` | no |
+| <a name="input_memory_utilization_threshold_warning"></a> [memory\_utilization\_threshold\_warning](#input\_memory\_utilization\_threshold\_warning) | Warning threshold (percentage) | `number` | `70` | no |
+| <a name="input_memory_utilization_use_message"></a> [memory\_utilization\_use\_message](#input\_memory\_utilization\_use\_message) | Whether to use the query alert base message for memory utilization monitor | `bool` | `false` | no |
 | <a name="input_monitor_exclude_tags"></a> [monitor\_exclude\_tags](#input\_monitor\_exclude\_tags) | Tags to be excluded in the monitoring query. Specify in key:value format | `list(string)` | `[]` | no |
 | <a name="input_monitor_include_tags"></a> [monitor\_include\_tags](#input\_monitor\_include\_tags) | Tags to be included in the monitoring query. Specify in key:value format | `list(string)` | `[]` | no |
 | <a name="input_new_group_delay"></a> [new\_group\_delay](#input\_new\_group\_delay) | Delay in seconds before generating alerts for a new resource | `number` | `300` | no |
@@ -109,7 +116,7 @@ No modules.
 | <a name="input_notify_prod_override"></a> [notify\_prod\_override](#input\_notify\_prod\_override) | List of notifications for 12x5 prod alerts in critical threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
 | <a name="input_notify_recovery_override"></a> [notify\_recovery\_override](#input\_notify\_recovery\_override) | List of notifications for alert recovery (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
 | <a name="input_notify_warn_override"></a> [notify\_warn\_override](#input\_notify\_warn\_override) | List of notifications for alerts in warning threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
-| <a name="input_renotify_interval"></a> [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `0` | no |
+| <a name="input_renotify_interval"></a> [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `60` | no |
 | <a name="input_runbook_link"></a> [runbook\_link](#input\_runbook\_link) | Runbook link to include in message | `string` | `null` | no |
 | <a name="input_service"></a> [service](#input\_service) | Service associated with the monitored resource (leave blank to omit tag) | `string` | `null` | no |
 | <a name="input_swap_usage_enabled"></a> [swap\_usage\_enabled](#input\_swap\_usage\_enabled) | Enable swap usage monitor | `bool` | `false` | no |

aws/elasticache/main.tf

Lines changed: 29 additions & 0 deletions
@@ -212,3 +212,32 @@ END
     warning = var.swap_usage_threshold_warning
   }
 }
+
+resource "datadog_monitor" "memory_utilization" {
+  count = var.memory_utilization_enabled ? 1 : 0
+
+  name = join("", [local.title_prefix, "Elasticache Memory Utilization - {{replication_group.name}} - {{value}}%", local.title_suffix])
+  include_tags = false
+  message = var.memory_utilization_use_message ? local.query_alert_base_message : ""
+  tags = concat(local.common_tags, var.base_tags, var.additional_tags)
+  type = "query alert"
+
+  evaluation_delay = var.evaluation_delay
+  new_group_delay = var.new_group_delay
+  notify_no_data = var.notify_no_data
+  no_data_timeframe = var.memory_utilization_no_data_window
+  renotify_interval = var.renotify_interval
+  require_full_window = true
+  timeout_h = var.timeout_h
+
+  query = <<END
+avg(${var.memory_utilization_evaluation_window}):
+avg:aws.elasticache.database_memory_usage_percentage${local.query_filter} by {replication_group,region,aws_account,env,datadog_managed}
+>= ${var.memory_utilization_threshold_critical}
+END
+
+  monitor_thresholds {
+    critical = var.memory_utilization_threshold_critical
+    warning = var.memory_utilization_threshold_warning
+  }
+}

aws/elasticache/variables.tf

Lines changed: 39 additions & 0 deletions
@@ -321,3 +321,42 @@ variable "swap_usage_use_message" {
   type = bool
   default = false
 }
+
+########################################
+# Memory Utilization
+########################################
+variable "memory_utilization_enabled" {
+  default = false
+  description = "Enable memory utilization monitor"
+  type = bool
+}
+
+variable "memory_utilization_evaluation_window" {
+  default = "last_1h"
+  description = "Evaluation window for monitor (`last_?m` (1, 5, 10, 15, or 30), `last_?h` (1, 2, or 4), or `last_1d`)"
+  type = string
+}
+
+variable "memory_utilization_no_data_window" {
+  default = 15
+  description = "No data threshold (in minutes, 0 to disable)"
+  type = number
+}
+
+variable "memory_utilization_threshold_critical" {
+  default = 80
+  description = "Critical threshold (percentage)"
+  type = number
+}
+
+variable "memory_utilization_threshold_warning" {
+  default = 70
+  description = "Warning threshold (percentage)"
+  type = number
+}
+
+variable "memory_utilization_use_message" {
+  description = "Whether to use the query alert base message for memory utilization monitor"
+  type = bool
+  default = false
+}
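
Usage sketch (not part of this commit): the new monitor is off by default, so a caller has to opt in. The block below is a minimal, hedged example of how a calling configuration might enable it. The module label `elasticache_monitors` and the relative `source` path are assumptions made for illustration, and any other inputs the module requires are left out. Only the `memory_utilization_*` and `renotify_interval` input names are taken from the diff above.

```hcl
# Hypothetical caller. The module label and source path are assumptions;
# only the input names below come from this commit.
module "elasticache_monitors" {
  source = "../../aws/elasticache" # assumed relative path to this module

  # Inputs added in this commit (defaults: false, "last_1h", 70, 80, 15, false)
  memory_utilization_enabled            = true
  memory_utilization_evaluation_window  = "last_30m"
  memory_utilization_threshold_warning  = 75
  memory_utilization_threshold_critical = 90
  memory_utilization_no_data_window     = 15
  memory_utilization_use_message        = true

  # Existing input; the README table in this commit lists its default as 60.
  renotify_interval = 60

  # Other inputs the module may require are intentionally omitted here.
}
```

Because the query compares the windowed average with `>=` against the critical threshold, the warning threshold should stay below the critical one.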
