⚠️ Alert thresholds depend on the nature of your applications. Some queries may have arbitrary tolerance thresholds. Building an efficient monitoring platform takes time. 😉

Zookeeper Prometheus Alert Rules

4 Prometheus alerting rules for Zookeeper. The metrics are exported via cloudflare/kafka_zookeeper_exporter or dabealu/zookeeper-exporter. The rules cover critical and warning conditions. Copy and paste the YAML into a rule file referenced from your Prometheus configuration.

wget https://raw.githubusercontent.com/samber/awesome-prometheus-alerts/refs/heads/master/dist/rules/zookeeper/dabealu-zookeeper-exporter.yml
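Once downloaded, the file still has to be referenced from the main Prometheus configuration. A minimal sketch, assuming the file was saved under /etc/prometheus/rules/ (a hypothetical path; adjust it to your own layout):

```yaml
# prometheus.yml (fragment)
# The rules directory below is an assumed location, not one mandated by Prometheus.
rule_files:
  - /etc/prometheus/rules/dabealu-zookeeper-exporter.yml
```

Running `promtool check rules` against the file before reloading Prometheus is a cheap way to catch YAML or PromQL mistakes.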
critical

3.2.2.1. Zookeeper Down

Zookeeper down on instance {{ $labels.instance }}

# 1m delay allows a restart without triggering an alert.
- alert: ZookeeperDown
  expr: zk_up == 0
  for: 1m
  labels:
    severity: critical
  annotations:
    summary: Zookeeper Down (instance {{ $labels.instance }})
    description: "Zookeeper down on instance {{ $labels.instance }}\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
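When a single rule is pasted by hand rather than downloaded as a whole file, it needs to sit inside a rule group. The sketch below wraps the ZookeeperDown rule accordingly; the group name `zookeeper` is an arbitrary choice for illustration.

```yaml
groups:
  - name: zookeeper   # hypothetical group name
    rules:
      # 1m delay allows a restart without triggering an alert.
      - alert: ZookeeperDown
        expr: zk_up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: Zookeeper Down (instance {{ $labels.instance }})
          description: "Zookeeper down on instance {{ $labels.instance }}\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
```

The other rules on this page can be appended to the same `rules:` list at the same indentation.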
critical

3.2.2.2. Zookeeper missing leader

Zookeeper cluster has no node marked as leader

- alert: ZookeeperMissingLeader
  expr: sum(zk_server_leader) == 0
  for: 0m
  labels:
    severity: critical
  annotations:
    # sum() aggregates away the instance label, so it is not referenced here.
    summary: Zookeeper missing leader
    description: "Zookeeper cluster has no node marked as leader\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
critical

3.2.2.3. Zookeeper Too Many Leaders

Zookeeper cluster has {{ $value }} nodes marked as leader (expected 1), indicating a split-brain

- alert: ZookeeperTooManyLeaders
  expr: sum(zk_server_leader) > 1
  for: 0m
  labels:
    severity: critical
  annotations:
    # sum() aggregates away the instance label, so it is not referenced here.
    summary: Zookeeper Too Many Leaders
    description: "Zookeeper cluster has {{ $value }} nodes marked as leader (expected 1), indicating a split-brain\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
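Both leader rules use `sum(zk_server_leader)` without a grouping clause, so the count spans every Zookeeper instance that Prometheus scrapes. If several ensembles are monitored by the same Prometheus, a grouped variant may be more useful. A sketch, assuming each target carries a `cluster` label (a hypothetical label that would have to be added through target labels or relabeling):

```yaml
# Variant rules evaluated per ensemble; `cluster` is an assumed label name.
- alert: ZookeeperMissingLeader
  expr: sum by (cluster) (zk_server_leader) == 0
  for: 0m
  labels:
    severity: critical
  annotations:
    summary: Zookeeper missing leader (cluster {{ $labels.cluster }})
    description: "Zookeeper cluster {{ $labels.cluster }} has no node marked as leader\n  VALUE = {{ $value }}"
- alert: ZookeeperTooManyLeaders
  expr: sum by (cluster) (zk_server_leader) > 1
  for: 0m
  labels:
    severity: critical
  annotations:
    summary: Zookeeper Too Many Leaders (cluster {{ $labels.cluster }})
    description: "Zookeeper cluster {{ $labels.cluster }} has {{ $value }} nodes marked as leader (expected 1)\n  VALUE = {{ $value }}"
```

Because `sum by (cluster)` keeps the `cluster` label on the result, it remains available to the annotation templates; a plain `sum()` leaves `$labels` empty.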
warning

3.2.2.4. Zookeeper Not Ok

Zookeeper instance {{ $labels.instance }} is not ok (ruok check failed)

- alert: ZookeeperNotOk
  expr: zk_ruok == 0
  for: 3m
  labels:
    severity: warning
  annotations:
    summary: Zookeeper Not Ok (instance {{ $labels.instance }})
    description: "Zookeeper instance {{ $labels.instance }} is not ok (ruok check failed)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
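
The `severity` labels set by these rules can drive notification routing. A sketch of an Alertmanager configuration fragment, assuming two receivers named `pager` and `chat` (hypothetical names; the actual integrations are left out):

```yaml
# alertmanager.yml (fragment)
route:
  receiver: chat              # fallback for alerts that match no child route
  routes:
    - matchers:
        - severity = "critical"
      receiver: pager
    - matchers:
        - severity = "warning"
      receiver: chat
receivers:
  - name: pager               # attach e.g. pagerduty_configs or webhook_configs here
  - name: chat                # attach e.g. slack_configs here
```

With this layout, ZookeeperDown and the two leader alerts would page, while ZookeeperNotOk would only post to the chat receiver.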