8 changes: 6 additions & 2 deletions aws/rds/README.md
@@ -35,6 +35,7 @@ No modules.
| [datadog_monitor.cpu_utilization](https://registry.terraform.io/providers/datadog/datadog/latest/docs/resources/monitor) | resource |
| [datadog_monitor.cpu_utilization_anomaly](https://registry.terraform.io/providers/datadog/datadog/latest/docs/resources/monitor) | resource |
| [datadog_monitor.used_storage](https://registry.terraform.io/providers/datadog/datadog/latest/docs/resources/monitor) | resource |
| [datadog_monitor.used_storage_aurora](https://registry.terraform.io/providers/datadog/datadog/latest/docs/resources/monitor) | resource |

## Inputs

@@ -44,10 +45,10 @@ No modules.
| <a name="input_alert_critical_priority"></a> [alert\_critical\_priority](#input\_alert\_critical\_priority) | Priority for alerts within critical threshold (P1-P5, uses monitor defaults if not specified) | `string` | `null` | no |
| <a name="input_alert_message"></a> [alert\_message](#input\_alert\_message) | Message to prepend to alert notifications | `string` | `"Alert"` | no |
| <a name="input_alert_nodata_priority"></a> [alert\_nodata\_priority](#input\_alert\_nodata\_priority) | Priority for no-data alerts (P1-P5, uses monitor defaults if not specified) | `string` | `null` | no |
-| <a name="input_base_tags"></a> [base\_tags](#input\_base\_tags) | Base tags (key:value format) to add to this type of check (combined with `local.tags` and `var.additional_tags`, generally you should not change this) | `list(string)` | <pre>[<br> "resource:rds"<br>]</pre> | no |
+| <a name="input_base_tags"></a> [base\_tags](#input\_base\_tags) | Base tags (key:value format) to add to this type of check (combined with `local.tags` and `var.additional_tags`, generally you should not change this) | `list(string)` | <pre>[<br/> "resource:rds"<br/>]</pre> | no |
| <a name="input_connection_count_anomaly_deviations"></a> [connection\_count\_anomaly\_deviations](#input\_connection\_count\_anomaly\_deviations) | Standard deviations | `number` | `3` | no |
| <a name="input_connection_count_anomaly_enabled"></a> [connection\_count\_anomaly\_enabled](#input\_connection\_count\_anomaly\_enabled) | Enable connection count anomaly monitor | `bool` | `true` | no |
-| <a name="input_connection_count_anomaly_evaluation_window"></a> [connection\_count\_anomaly\_evaluation\_window](#input\_connection\_count\_anomaly\_evaluation\_window) | Evaluation window for monitor (`last_?m` (1, 5, 10, 15, or 30), `last_?h` (1, 2, or 4), or `last_1d`] | `string` | `"last_1h"` | no |
+| <a name="input_connection_count_anomaly_evaluation_window"></a> [connection\_count\_anomaly\_evaluation\_window](#input\_connection\_count\_anomaly\_evaluation\_window) | Evaluation window for monitor (`last_?m` (1, 5, 10, 15, or 30), `last_?h` (1, 2, or 4), or `last_1d`) | `string` | `"last_4h"` | no |
| <a name="input_connection_count_anomaly_no_data_window"></a> [connection\_count\_anomaly\_no\_data\_window](#input\_connection\_count\_anomaly\_no\_data\_window) | No data threshold (in minutes, 0 to disable) | `number` | `10` | no |
| <a name="input_connection_count_anomaly_recovery_window"></a> [connection\_count\_anomaly\_recovery\_window](#input\_connection\_count\_anomaly\_recovery\_window) | Recovery window for anomaly monitor (`last_?m` (1, 5, 10, 15, or 30), `last_?h` (1, 2, or 4), or `last_1d`) | `string` | `"last_15m"` | no |
| <a name="input_connection_count_anomaly_rollup"></a> [connection\_count\_anomaly\_rollup](#input\_connection\_count\_anomaly\_rollup) | Rollup interval (must be sized based on evaluation window/span and seasonality) | `number` | `60` | no |
@@ -75,6 +76,7 @@ No modules.
| <a name="input_cpu_utilization_threshold_warning"></a> [cpu\_utilization\_threshold\_warning](#input\_cpu\_utilization\_threshold\_warning) | Warning threshold (percent) | `number` | `80` | no |
| <a name="input_cpu_utilization_use_message"></a> [cpu\_utilization\_use\_message](#input\_cpu\_utilization\_use\_message) | Whether to use the query alert base message for CPU utilization monitor | `bool` | `false` | no |
| <a name="input_dashboard_link"></a> [dashboard\_link](#input\_dashboard\_link) | Dashboard link to include in message | `string` | `null` | no |
| <a name="input_db_type"></a> [db\_type](#input\_db\_type) | Database type (mysql, aurora) | `string` | `"mysql"` | no |
| <a name="input_env"></a> [env](#input\_env) | Environment the monitored resource is in (leave blank to omit tag) | `string` | `null` | no |
| <a name="input_evaluation_delay"></a> [evaluation\_delay](#input\_evaluation\_delay) | Monitor evaluation delay (see [Datadog Docs](https://docs.datadoghq.com/monitors/configuration/?tab=thresholdalert#set-alert-conditions)) | `number` | `900` | no |
| <a name="input_monitor_exclude_tags"></a> [monitor\_exclude\_tags](#input\_monitor\_exclude\_tags) | Tags to be excluded in the monitoring query. Specify in key:value format | `list(string)` | `[]` | no |
@@ -96,6 +98,8 @@ No modules.
| <a name="input_timeout_h"></a> [timeout\_h](#input\_timeout\_h) | Auto-resolve alert in specified hours if condition no longer matches | `number` | `0` | no |
| <a name="input_title_prefix"></a> [title\_prefix](#input\_title\_prefix) | Prefix all alerts with specified value in brackets | `string` | `null` | no |
| <a name="input_title_suffix"></a> [title\_suffix](#input\_title\_suffix) | Suffix all alerts with specified value in parentheses | `string` | `null` | no |
| <a name="input_used_storage_aurora_threshold_critical"></a> [used\_storage\_aurora\_threshold\_critical](#input\_used\_storage\_aurora\_threshold\_critical) | Critical threshold for available storage on Aurora RDS instances (in GB) | `number` | `1` | no |
| <a name="input_used_storage_aurora_threshold_warning"></a> [used\_storage\_aurora\_threshold\_warning](#input\_used\_storage\_aurora\_threshold\_warning) | Warning threshold for available storage on Aurora RDS instances (in GB) | `number` | `5` | no |
| <a name="input_used_storage_enabled"></a> [used\_storage\_enabled](#input\_used\_storage\_enabled) | Enable used storage monitor | `bool` | `true` | no |
| <a name="input_used_storage_evaluation_window"></a> [used\_storage\_evaluation\_window](#input\_used\_storage\_evaluation\_window) | Evaluation window for monitor (`last_?m` (1, 5, 10, 15, or 30), `last_?h` (1, 2, or 4), or `last_1d`) | `string` | `"last_15m"` | no |
| <a name="input_used_storage_no_data_window"></a> [used\_storage\_no\_data\_window](#input\_used\_storage\_no\_data\_window) | No data threshold (in minutes, 0 to disable) | `number` | `10` | no |
31 changes: 30 additions & 1 deletion aws/rds/main.tf
@@ -110,7 +110,7 @@ END
}

resource "datadog_monitor" "used_storage" {
-  count = var.used_storage_enabled ? 1 : 0
+  count = (var.used_storage_enabled && var.db_type != "aurora") ? 1 : 0

  name         = join("", [local.title_prefix, "RDS instance storage - {{dbinstanceidentifier.name}} - {{value}}% used", local.title_suffix])
  include_tags = false
@@ -140,3 +140,32 @@ END
    warning  = var.used_storage_threshold_warning
  }
}

resource "datadog_monitor" "used_storage_aurora" {
  # Created only for Aurora; the percentage-based used_storage monitor covers other types.
  count = (var.used_storage_enabled && var.db_type == "aurora") ? 1 : 0

  name         = join("", [local.title_prefix, "RDS Aurora instance storage - {{dbinstanceidentifier.name}} - {{value}} GB available", local.title_suffix])
  include_tags = false
  message      = var.used_storage_use_message ? local.query_alert_base_message : ""
  tags         = concat(local.common_tags, var.base_tags, var.additional_tags)
  type         = "query alert"

  evaluation_delay    = var.evaluation_delay
  new_group_delay     = var.new_group_delay
  notify_no_data      = var.notify_no_data
  no_data_timeframe   = var.used_storage_no_data_window
  renotify_interval   = var.renotify_interval
  require_full_window = true
  timeout_h           = var.timeout_h

  # aws.rds.free_local_storage is reported in bytes; the divisor converts it to GB
  # so it can be compared with the GB thresholds.
  # Assumption: binary gigabytes (1024^3 = 1073741824 bytes). Use 1000000000 for decimal GB.
  query = <<END
max(${var.used_storage_evaluation_window}):(
  default(avg:aws.rds.free_local_storage${local.query_filter} by {dbinstanceidentifier,region,aws_account,env,datadog_managed}, 0) / 1073741824
) <= ${var.used_storage_aurora_threshold_critical}
END

  monitor_thresholds {
    critical = var.used_storage_aurora_threshold_critical
    warning  = var.used_storage_aurora_threshold_warning
  }
}
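
Review note on units: `aws.rds.free_local_storage` is reported in bytes, while the Aurora thresholds are expressed in GB, so the query has to convert before comparing. Below is a minimal sketch of naming that conversion in a `locals` block. It assumes binary gigabytes, and `bytes_per_gib` and `free_local_storage_gb_query` are hypothetical names that do not exist in this module; `local.query_filter` is the module's existing filter.

```hcl
locals {
  # Assumption: 1 GB is treated as 1024^3 bytes. Use 1000000000 for decimal GB.
  bytes_per_gib = 1024 * 1024 * 1024

  # Hypothetical helper: the Aurora free-storage expression converted from bytes to GB.
  free_local_storage_gb_query = "default(avg:aws.rds.free_local_storage${local.query_filter} by {dbinstanceidentifier,region,aws_account,env,datadog_managed}, 0) / ${local.bytes_per_gib}"
}
```

Under that assumption, the default critical threshold of 1 GB corresponds to 1073741824 bytes of free local storage, and the default warning threshold of 5 GB corresponds to 5368709120 bytes.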
18 changes: 18 additions & 0 deletions aws/rds/variables.tf
@@ -13,6 +13,12 @@ variable "base_tags" {
  type        = list(string)
}

variable "db_type" {
  default     = "mysql"
  description = "Database type (mysql, aurora)"
  type        = string
}
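
Review note: because `db_type` is a free-form string, a typo such as `"Aurora"` would silently fall through to the percentage-based monitor. One possible guard is a `validation` block, sketched below as a variant of the variable above. It assumes that only the two values named in the description are supported, which this diff does not confirm.

```hcl
variable "db_type" {
  default     = "mysql"
  description = "Database type (mysql, aurora)"
  type        = string

  # Assumption: only the two values listed in the description are valid.
  validation {
    condition     = contains(["mysql", "aurora"], var.db_type)
    error_message = "The db_type value must be either \"mysql\" or \"aurora\"."
  }
}
```

With this in place, `terraform plan` fails early on an unsupported value instead of creating the wrong monitor.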

########################################
# Connection Rate (anomaly detection)
########################################
@@ -217,12 +223,24 @@ variable "used_storage_threshold_critical" {
  type        = number
}

variable "used_storage_aurora_threshold_critical" {
  default     = 1
  description = "Critical threshold for available storage on Aurora RDS instances (in GB)"
  type        = number
}

variable "used_storage_threshold_warning" {
  default     = 80
  description = "Warning threshold (percentage, 0-100)"
  type        = number
}

variable "used_storage_aurora_threshold_warning" {
  default     = 5
  description = "Warning threshold for available storage on Aurora RDS instances (in GB)"
  type        = number
}
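
Review note: for reference, a caller might opt into the Aurora monitor roughly as sketched below. The module label and `source` path are hypothetical, and any required inputs not shown in this diff would still need to be set. Because the Aurora monitor alerts on storage *available* (`<=`), the warning threshold has to be higher than the critical one, the reverse of the percentage-used thresholds.

```hcl
module "rds_aurora_monitors" {
  # Hypothetical source path; point this at wherever the aws/rds module lives.
  source = "./aws/rds"

  # Selects used_storage_aurora instead of the percentage-based used_storage monitor.
  db_type = "aurora"

  used_storage_enabled                   = true
  used_storage_aurora_threshold_warning  = 10 # warn at 10 GB available or less
  used_storage_aurora_threshold_critical = 2  # alert at 2 GB available or less
}
```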

variable "used_storage_use_message" {
  description = "Whether to use the query alert base message for used storage monitor"
  type        = bool