⚠️ Alert thresholds depend on the nature of your applications. Some queries may have arbitrary tolerance thresholds. Building an efficient monitoring platform takes time. 😉

Caddy Prometheus Alert Rules

3 Prometheus alerting rules for Caddy, based on the metrics exposed by Caddy's embedded exporter. All three rules listed here carry critical severity. Copy and paste the YAML into your Prometheus configuration.

4.5. Embedded exporter (3 rules)

wget https://raw.githubusercontent.com/samber/awesome-prometheus-alerts/refs/heads/master/dist/rules/caddy/embedded-exporter.yml
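Once downloaded, the file still has to be referenced from the Prometheus configuration before any rule is evaluated. A minimal sketch, assuming the file was moved to a hypothetical /etc/prometheus/rules/caddy/ directory (use whatever path fits your deployment):

```yaml
# prometheus.yml (excerpt)
# The path below is an assumption for illustration; point it at wherever
# embedded-exporter.yml was saved.
rule_files:
  - /etc/prometheus/rules/caddy/embedded-exporter.yml
```

Running `promtool check rules embedded-exporter.yml` before reloading Prometheus is a cheap way to catch syntax errors in the rule file.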

4.5.1. Caddy Reverse Proxy Down

Caddy reverse proxy upstream {{ $labels.upstream }} is unhealthy

- alert: CaddyReverseProxyDown
  expr: caddy_reverse_proxy_upstreams_healthy == 0
  for: 0m
  labels:
    severity: critical
  annotations:
    summary: Caddy Reverse Proxy Down (instance {{ $labels.instance }})
    description: "Caddy reverse proxy upstream {{ $labels.upstream }} is unhealthy\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
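Each snippet on this page is a single entry of a `rules:` list, so when pasted by hand it needs to sit inside a rule group. A sketch of the wrapper, where the group name `caddy` is a placeholder chosen for illustration:

```yaml
groups:
  - name: caddy  # hypothetical group name
    rules:
      - alert: CaddyReverseProxyDown
        expr: caddy_reverse_proxy_upstreams_healthy == 0
        for: 0m
        labels:
          severity: critical
        annotations:
          summary: Caddy Reverse Proxy Down (instance {{ $labels.instance }})
          description: "Caddy reverse proxy upstream {{ $labels.upstream }} is unhealthy\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
```

The other rules on this page can be appended as further entries under the same `rules:` key.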

4.5.2. Caddy high HTTP 4xx error rate service

Caddy service 4xx error rate is above 5%

- alert: CaddyHighHTTP4xxErrorRateService
  expr: sum(rate(caddy_http_request_duration_seconds_count{code=~"4.."}[3m])) by (instance) / sum(rate(caddy_http_request_duration_seconds_count[3m])) by (instance) * 100 > 5 and sum(rate(caddy_http_request_duration_seconds_count[3m])) by (instance) > 0
  for: 1m
  labels:
    severity: critical
  annotations:
    summary: Caddy high HTTP 4xx error rate service (instance {{ $labels.instance }})
    description: "Caddy service 4xx error rate is above 5%\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
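The expression above computes the same per-instance request rate twice: once as the denominator of the ratio and once as the guard that the instance is receiving traffic. One possible way to make that structure easier to read is to split it into recording rules. This is only a sketch: the recorded series names and the group name are hypothetical and are not part of the upstream rule file.

```yaml
groups:
  - name: caddy-recording-example  # hypothetical group name
    rules:
      # Total request rate per instance over 3 minutes.
      - record: instance:caddy_http_requests:rate3m
        expr: sum by (instance) (rate(caddy_http_request_duration_seconds_count[3m]))
      # Request rate per instance, restricted to 4xx response codes.
      - record: instance:caddy_http_requests_4xx:rate3m
        expr: sum by (instance) (rate(caddy_http_request_duration_seconds_count{code=~"4.."}[3m]))
      # Same condition as the original alert, written against the recorded series:
      # 4xx share above 5%, and only while the instance is serving traffic.
      - alert: CaddyHighHTTP4xxErrorRateService
        expr: instance:caddy_http_requests_4xx:rate3m / instance:caddy_http_requests:rate3m * 100 > 5 and instance:caddy_http_requests:rate3m > 0
        for: 1m
        labels:
          severity: critical
```

Rules within a group are evaluated in order, so the alert reads the values recorded earlier in the same evaluation cycle.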

4.5.3. Caddy high HTTP 5xx error rate service

Caddy service 5xx error rate is above 5%

- alert: CaddyHighHTTP5xxErrorRateService
  expr: sum(rate(caddy_http_request_duration_seconds_count{code=~"5.."}[3m])) by (instance) / sum(rate(caddy_http_request_duration_seconds_count[3m])) by (instance) * 100 > 5 and sum(rate(caddy_http_request_duration_seconds_count[3m])) by (instance) > 0
  for: 1m
  labels:
    severity: critical
  annotations:
    summary: Caddy high HTTP 5xx error rate service (instance {{ $labels.instance }})
    description: "Caddy service 5xx error rate is above 5%\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"