Application Metrics

Garot Conklin edited this page Feb 6, 2025 · 1 revision

Application Metrics Dashboard

The following example configurations define dashboards for monitoring application performance and health.

Web Application Dashboard

version: "1.0"
dashboards:
  - name: "Web Application Monitoring"
    description: "Key performance indicators for web applications"
    layout_type: "ordered"
    template_variables:
      - name: "service"
        prefix: "service"
        default: "web-api"
    widgets:
      - title: "Request Rate"
        type: "timeseries"
        query: "sum:http.requests{service:$service} by {endpoint}.as_rate()"

      - title: "Error Rate"
        type: "timeseries"
        query: "sum:http.errors{service:$service} by {endpoint}.as_rate()"

      - title: "Response Time (p95)"
        type: "timeseries"
        query: "p95:http.response.time{service:$service} by {endpoint}"

      - title: "Active Users"
        type: "query_value"
        query: "sum:users.active{service:$service}"
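
The "Error Rate" widget above plots errors per second, which rises with traffic even when the service is healthy. A ratio of errors to requests is often easier to read. The sketch below reuses the two metrics from the example. It assumes that the `query` string is passed unchanged to a backend that supports arithmetic between queries and the `.as_count()` modifier (as Datadog's query language does). The widget title is illustrative.

```yaml
version: "1.0"
dashboards:
  - name: "Web Application Monitoring"
    description: "Key performance indicators for web applications"
    layout_type: "ordered"
    template_variables:
      - name: "service"
        prefix: "service"
        default: "web-api"
    widgets:
      # Errors as a percentage of all requests for the selected service.
      # Assumption: the backend accepts arithmetic between two queries.
      - title: "Error Percentage"
        type: "timeseries"
        query: "sum:http.errors{service:$service}.as_count() / sum:http.requests{service:$service}.as_count() * 100"
```

If arithmetic in the query string is not supported, the same ratio may need to be emitted by the application as its own metric.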

Database Monitoring

version: "1.0"
dashboards:
  - name: "Database Performance"
    description: "Database performance metrics"
    layout_type: "ordered"
    template_variables:
      - name: "db"
        prefix: "database"
        default: "main"
    widgets:
      - title: "Query Response Time"
        type: "timeseries"
        query: "avg:database.query.time{database:$db} by {query_type}"

      - title: "Active Connections"
        type: "query_value"
        query: "avg:database.connections.active{database:$db}"

      - title: "Cache Hit Ratio"
        type: "timeseries"
        query: "avg:database.cache.hit_ratio{database:$db}"

Service Health Dashboard

version: "1.0"
dashboards:
  - name: "Service Health"
    description: "Service health and availability metrics"
    layout_type: "ordered"
    template_variables:
      - name: "service"
        prefix: "service"
        default: "web-api"
    widgets:
      - title: "Service Status"
        type: "check_status"
        check: "service.up"
        grouping: "cluster"

      - title: "Error Logs"
        type: "log_stream"
        query: "status:error service:$service"

      - title: "Dependency Map"
        type: "service_map"
        service: "$service"

Best Practices

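  • Match the aggregation to the metric type: rates for counters such as http.requests, averages for gauges such as active connections, and percentiles for latency
  • Monitor error rates and latencies together, grouped by endpoint where possible
  • Track user experience metrics such as p95 response time and active users
  • Set up alerts on metrics that indicate user-facing problems, such as error rate and response time
  • Use template variables such as $service and $db so one dashboard can serve several services or databases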
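
As a rough illustration of the aggregation and template-variable practices above, the sketch below adds a second template variable and an explicit time rollup to the web application example. The `env` variable, its `production` default, and the presence of an `env` tag on the metrics are assumptions. So is the `.rollup(sum, 60)` modifier, which is Datadog query syntax and only applies if queries are passed to that backend unchanged.

```yaml
version: "1.0"
dashboards:
  - name: "Web Application Monitoring"
    description: "Key performance indicators for web applications"
    layout_type: "ordered"
    template_variables:
      - name: "service"
        prefix: "service"
        default: "web-api"
      # Hypothetical second variable; assumes metrics carry an env tag.
      - name: "env"
        prefix: "env"
        default: "production"
    widgets:
      # Counter: total requests per 60-second bucket.
      - title: "Requests per Minute"
        type: "timeseries"
        query: "sum:http.requests{service:$service,env:$env}.as_count().rollup(sum, 60)"

      # Latency: a percentile rather than an average, grouped by endpoint.
      - title: "Response Time (p95)"
        type: "timeseries"
        query: "p95:http.response.time{service:$service,env:$env} by {endpoint}"
```

Scoping every query with both variables lets the same dashboard switch between services and environments without duplicating widgets.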

Common Metrics

Web Applications

  • Request rates
  • Error rates
  • Response times
  • User sessions

Databases

  • Query performance
  • Connection pools
  • Cache efficiency
  • Storage metrics

Services

  • Availability
  • Error rates
  • Resource usage
  • Dependency health
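
Two of the categories listed above, storage metrics and resource usage, have no widget in the earlier examples. The sketch below shows how they might be added in the same configuration format. The dashboard name and the metric names `database.storage.used` and `service.cpu.usage` are hypothetical placeholders, so substitute whatever your database and services actually report.

```yaml
version: "1.0"
dashboards:
  - name: "Capacity Overview"
    description: "Storage and resource usage metrics"
    layout_type: "ordered"
    template_variables:
      - name: "db"
        prefix: "database"
        default: "main"
      - name: "service"
        prefix: "service"
        default: "web-api"
    widgets:
      # Hypothetical metric name for storage consumed by the database.
      - title: "Storage Used"
        type: "timeseries"
        query: "avg:database.storage.used{database:$db}"

      # Hypothetical metric name for CPU usage of the selected service.
      - title: "CPU Usage"
        type: "timeseries"
        query: "avg:service.cpu.usage{service:$service}"
```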
