⚠️ Alert thresholds depend on the nature of your applications. Some queries may have arbitrary tolerance thresholds. Building an efficient monitoring platform takes time. 😉

Traefik Prometheus Alert Rules

6 Prometheus alerting rules for Traefik, split across two exporters: Embedded exporter v2 and Embedded exporter v1. All six rules listed here carry the critical severity. Copy and paste the YAML into a Prometheus rule file.
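As a rough sketch of where the pasted YAML ends up: alerting rules sit under a `groups:` key in a rule file, and `prometheus.yml` points at that file through `rule_files`. The file name `traefik.rules.yml`, the group name `traefik`, and the path below are assumptions chosen for illustration. The rule is one of the rules from this page with its annotations shortened.

```yaml
# traefik.rules.yml (hypothetical file name)
groups:
  - name: traefik   # hypothetical group name
    rules:
      - alert: TraefikHighHTTP5xxErrorRateService
        expr: sum(rate(traefik_service_requests_total{code=~"5.*"}[3m])) by (service) / sum(rate(traefik_service_requests_total[3m])) by (service) * 100 > 5 and sum(rate(traefik_service_requests_total[3m])) by (service) > 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: Traefik high HTTP 5xx error rate service
---
# prometheus.yml (excerpt)
rule_files:
  - /etc/prometheus/rules/traefik.rules.yml   # hypothetical path
```

Running `promtool check rules traefik.rules.yml` before reloading Prometheus is one way to catch indentation or syntax mistakes introduced while pasting.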

4.4.1. Embedded exporter v2 (3 rules)

wget https://raw.githubusercontent.com/samber/awesome-prometheus-alerts/refs/heads/master/dist/rules/traefik/embedded-exporter-v2.yml

4.4.1.1. Traefik service down

All servers of a Traefik service are down

- alert: TraefikServiceDown
  # sum() is used because count() counts series regardless of their 0/1 value and so never equals 0.
  expr: sum(traefik_service_server_up) by (service) == 0
  for: 0m
  labels:
    severity: critical
  annotations:
    summary: Traefik service down (service {{ $labels.service }})
    description: "All servers of Traefik service {{ $labels.service }} are down\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
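One caveat with a "service down" expression: `count()` counts series regardless of their value, so a comparison of the form `count(x) == 0` returns nothing while the series exist, even if every server reports 0. Summing the 0/1 gauge is one way to detect that every server of a service is down. The promtool unit test below is a minimal sketch of that difference; the file name, the service names (`api@docker`, `web@docker`) and the server URLs are made up for illustration.

```yaml
# traefik_service_down_test.yml (hypothetical file name)
# Run with: promtool test rules traefik_service_down_test.yml
evaluation_interval: 1m

tests:
  - interval: 1m
    input_series:
      # Service "api@docker": both servers report 0 (down).
      - series: 'traefik_service_server_up{service="api@docker", url="http://10.0.0.1:80"}'
        values: '0 0 0'
      - series: 'traefik_service_server_up{service="api@docker", url="http://10.0.0.2:80"}'
        values: '0 0 0'
      # Service "web@docker": one server is still up.
      - series: 'traefik_service_server_up{service="web@docker", url="http://10.0.1.1:80"}'
        values: '1 1 1'
      - series: 'traefik_service_server_up{service="web@docker", url="http://10.0.1.2:80"}'
        values: '0 0 0'
    promql_expr_test:
      # sum() adds the gauge values, so only the fully down service matches.
      - expr: sum(traefik_service_server_up) by (service) == 0
        eval_time: 1m
        exp_samples:
          - labels: '{service="api@docker"}'
            value: 0
      # count() yields 2 for both services, so the comparison matches nothing.
      # No exp_samples are listed: the expectation is an empty result.
      - expr: count(traefik_service_server_up) by (service) == 0
        eval_time: 1m
```

The test exercises the expressions directly through `promql_expr_test`, so it does not depend on any rule file being loaded.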

4.4.1.2. Traefik high HTTP 4xx error rate service

Traefik service 4xx error rate is above 5%

- alert: TraefikHighHTTP4xxErrorRateService
  expr: sum(rate(traefik_service_requests_total{code=~"4.*"}[3m])) by (service) / sum(rate(traefik_service_requests_total[3m])) by (service) * 100 > 5 and sum(rate(traefik_service_requests_total[3m])) by (service) > 0
  for: 1m
  labels:
    severity: critical
  annotations:
    summary: Traefik high HTTP 4xx error rate service (service {{ $labels.service }})
    description: "Traefik service 4xx error rate is above 5%\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

4.4.1.3. Traefik high HTTP 5xx error rate service

Traefik service 5xx error rate is above 5%

- alert: TraefikHighHTTP5xxErrorRateService
  expr: sum(rate(traefik_service_requests_total{code=~"5.*"}[3m])) by (service) / sum(rate(traefik_service_requests_total[3m])) by (service) * 100 > 5 and sum(rate(traefik_service_requests_total[3m])) by (service) > 0
  for: 1m
  labels:
    severity: critical
  annotations:
    summary: Traefik high HTTP 5xx error rate service (service {{ $labels.service }})
    description: "Traefik service 5xx error rate is above 5%\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
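Since thresholds depend on the application, the 5% cut-off is a starting point rather than a fixed value. One possible adaptation, sketched below, is a lower-severity tier that fires earlier and waits longer before alerting. The alert name `TraefikElevatedHTTP5xxErrorRateService`, the 1% threshold and the 5m hold time are all hypothetical placeholders, not values from the rule set.

```yaml
# Hypothetical warning tier: the 1% threshold and 5m hold time are placeholders to tune.
- alert: TraefikElevatedHTTP5xxErrorRateService
  expr: sum(rate(traefik_service_requests_total{code=~"5.*"}[3m])) by (service) / sum(rate(traefik_service_requests_total[3m])) by (service) * 100 > 1 and sum(rate(traefik_service_requests_total[3m])) by (service) > 0
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Traefik elevated HTTP 5xx error rate (service {{ $labels.service }})
    description: "Traefik service 5xx error rate is above 1%\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
```

Above 5% both this rule and the critical rule fire for the same service, so some deduplication on the Alertmanager side is likely needed.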

4.4.2. Embedded exporter v1 (3 rules)

wget https://raw.githubusercontent.com/samber/awesome-prometheus-alerts/refs/heads/master/dist/rules/traefik/embedded-exporter-v1.yml

4.4.2.1. Traefik backend down

All servers of a Traefik backend are down

- alert: TraefikBackendDown
  # sum() is used because count() counts series regardless of their 0/1 value and so never equals 0.
  expr: sum(traefik_backend_server_up) by (backend) == 0
  for: 0m
  labels:
    severity: critical
  annotations:
    summary: Traefik backend down (backend {{ $labels.backend }})
    description: "All servers of Traefik backend {{ $labels.backend }} are down\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

4.4.2.2. Traefik high HTTP 4xx error rate backend

Traefik backend 4xx error rate is above 5%

- alert: TraefikHighHTTP4xxErrorRateBackend
  expr: sum(rate(traefik_backend_requests_total{code=~"4.*"}[3m])) by (backend) / sum(rate(traefik_backend_requests_total[3m])) by (backend) * 100 > 5 and sum(rate(traefik_backend_requests_total[3m])) by (backend) > 0
  for: 1m
  labels:
    severity: critical
  annotations:
    summary: Traefik high HTTP 4xx error rate backend (backend {{ $labels.backend }})
    description: "Traefik backend 4xx error rate is above 5%\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

4.4.2.3. Traefik high HTTP 5xx error rate backend

Traefik backend 5xx error rate is above 5%

- alert: TraefikHighHTTP5xxErrorRateBackend
  expr: sum(rate(traefik_backend_requests_total{code=~"5.*"}[3m])) by (backend) / sum(rate(traefik_backend_requests_total[3m])) by (backend) * 100 > 5 and sum(rate(traefik_backend_requests_total[3m])) by (backend) > 0
  for: 1m
  labels:
    severity: critical
  annotations:
    summary: Traefik high HTTP 5xx error rate backend (backend {{ $labels.backend }})
    description: "Traefik backend 5xx error rate is above 5%\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"