⚠️ Alert thresholds depend on the nature of your applications. Some of the queries below use arbitrary tolerance thresholds, so tune them to your own workload. Building an efficient monitoring platform takes time. 😉

Promtail Prometheus Alert Rules

2 Prometheus alerting rules for Promtail, based on the metrics exposed by its embedded exporter. Both rules are labeled critical. Copy and paste the YAML into a rule file loaded by your Prometheus configuration.

12.3. Embedded exporter (2 rules)

wget https://raw.githubusercontent.com/samber/awesome-prometheus-alerts/refs/heads/master/dist/rules/promtail/embedded-exporter.yml
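Once downloaded, the file still has to be referenced from the Prometheus configuration. A minimal sketch, assuming the file was saved under `/etc/prometheus/rules/` (a hypothetical path) and that it is a complete rule file with its own `groups:` block:

```yaml
# prometheus.yml (fragment)
rule_files:
  # Hypothetical location; point this at wherever embedded-exporter.yml was saved.
  - /etc/prometheus/rules/embedded-exporter.yml
```

Running `promtool check rules` against the downloaded file before reloading Prometheus should catch syntax errors early.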

12.3.1. Promtail request errors

The {{ $labels.job }} {{ $labels.route }} is experiencing {{ printf "%.2f" $value }}% errors.

- alert: PromtailRequestErrors
  expr: 100 * sum(rate(promtail_request_duration_seconds_count{status_code=~"5..|failed"}[1m])) by (namespace, job, route, instance) / sum(rate(promtail_request_duration_seconds_count[1m])) by (namespace, job, route, instance) > 10 and sum(rate(promtail_request_duration_seconds_count[1m])) by (namespace, job, route, instance) > 0
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: Promtail request errors (instance {{ $labels.instance }})
    description: "The {{ $labels.job }} {{ $labels.route }} is experiencing {{ printf \"%.2f\" $value }}% errors.\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
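The `[1m]` window in the rule above only yields a rate when at least two samples fall inside it, so with scrape intervals of 30s or longer the expression may return no data. A hedged variant, assuming a 5m window suits your scrape interval (the window is the only change, and it is an assumption rather than part of the published rule):

```yaml
# Variant of PromtailRequestErrors with a 5m rate window (assumption, not the published rule).
- alert: PromtailRequestErrors
  expr: 100 * sum(rate(promtail_request_duration_seconds_count{status_code=~"5..|failed"}[5m])) by (namespace, job, route, instance) / sum(rate(promtail_request_duration_seconds_count[5m])) by (namespace, job, route, instance) > 10 and sum(rate(promtail_request_duration_seconds_count[5m])) by (namespace, job, route, instance) > 0
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: Promtail request errors (instance {{ $labels.instance }})
```

A wider window smooths short error bursts, so the alert reacts more slowly. Whether that trade-off is acceptable depends on how quickly you need to know about failed pushes.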

12.3.2. Promtail request latency

The {{ $labels.job }} {{ $labels.route }} is experiencing {{ printf "%.2f" $value }}s 99th percentile latency.

- alert: PromtailRequestLatency
  expr: histogram_quantile(0.99, sum(rate(promtail_request_duration_seconds_bucket[5m])) by (namespace, job, route, le)) > 1
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: Promtail request latency (job {{ $labels.job }})
    description: "The {{ $labels.job }} {{ $labels.route }} is experiencing {{ printf \"%.2f\" $value }}s 99th percentile latency.\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
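The rule entries on this page are YAML list items, so when pasted by hand they need to sit under a rule group. A minimal sketch of such a file, assuming a hypothetical file name and group name, with the latency rule shown in shortened form:

```yaml
# promtail-alerts.yml (hypothetical file name)
groups:
  - name: promtail-embedded-exporter   # hypothetical group name
    rules:
      # Each "- alert:" entry from this page goes here, indented under "rules:".
      - alert: PromtailRequestLatency
        expr: histogram_quantile(0.99, sum(rate(promtail_request_duration_seconds_bucket[5m])) by (namespace, job, route, le)) > 1
        for: 5m
        labels:
          severity: critical
        annotations:
          # The expression aggregates the instance label away, so the summary names the job.
          summary: Promtail request latency (job {{ $labels.job }})
```

The file can then be listed under `rule_files:` in the Prometheus configuration like any other rule file.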