Question 1

What are Prometheus alerting rules?

Accepted Answer

Prometheus alerting rules are PromQL-based conditions that the Prometheus server evaluates at a regular interval. When a condition holds continuously for the rule's "for" duration, the alert fires and Prometheus sends it to AlertManager, which routes it to receivers such as Slack, PagerDuty, or email. Rules are defined in YAML files and typically cover metric thresholds, absence of expected data, and rate-of-change conditions.
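As a rough illustration of those three kinds of condition, a rules file might look like the sketch below. The file name, group name, alert names, thresholds, and the job label "my-service" are all hypothetical, and the metric names assume node_exporter is being scraped.

```yaml
# alerts/example.yml (hypothetical file name)
groups:
  - name: example-alerts              # hypothetical group name
    rules:
      # Threshold: less than 10% of a filesystem has been free for 5 minutes.
      - alert: DiskSpaceLow
        expr: node_filesystem_avail_bytes / node_filesystem_size_bytes < 0.10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Low disk space on {{ $labels.instance }}"

      # Absence: no "up" series exists at all for the expected job.
      - alert: JobMissing
        expr: absent(up{job="my-service"})
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "No targets found for job my-service"

      # Rate of change: receive errors are growing faster than 1 per second.
      - alert: NetworkReceiveErrors
        expr: rate(node_network_receive_errs_total[5m]) > 1
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "Receive errors on {{ $labels.instance }} ({{ $labels.device }})"
```

Each rule stays in the pending state until its expression has been true for the whole "for" duration, and only then fires.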

Question 2

How do I use these Prometheus alert rules?

Accepted Answer

Find the service you want to monitor, copy the YAML snippet for a rule, and paste it into a rule group in your Prometheus rules file (e.g., alerts/my-service.yml). Make sure that file is listed under rule_files in prometheus.yml, then reload Prometheus to apply the rules. Adjust thresholds to match your workload: the values provided are sensible defaults but may need tuning.
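For the rules file to be picked up, Prometheus has to be told where it lives. A minimal sketch of the relevant part of prometheus.yml, assuming the rule files sit in an alerts/ directory next to it (the directory layout and the interval value are assumptions, not requirements):

```yaml
# prometheus.yml (excerpt)
global:
  evaluation_interval: 30s   # how often rules are evaluated; value chosen for illustration

rule_files:
  # Glob patterns are allowed. Each matched file must contain a top-level
  # "groups:" list, with the pasted rule placed under one group's "rules:" key.
  - "alerts/*.yml"
```

Before reloading, the file can be validated with promtool check rules alerts/my-service.yml. Prometheus reloads its configuration on SIGHUP, or on an HTTP POST to /-/reload when it was started with --web.enable-lifecycle.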

Question 3

What exporters and services are covered?

Accepted Answer

Awesome Prometheus Alerts covers 93 services across 13 categories:

Basic resource monitoring: Prometheus self-monitoring, Host and hardware, S.M.A.R.T Device Monitoring, IPMI, Docker containers, Blackbox, Windows Server, VMware, Proxmox VE, Netdata, eBPF, Process Exporter, Systemd

Databases: MySQL, PostgreSQL, SQL Server, Oracle Database, Patroni, PGBouncer, Redis, Memcached, MongoDB, Elasticsearch, OpenSearch, Meilisearch, Cassandra, Clickhouse, CouchDB, Solr

Message brokers: RabbitMQ, Zookeeper, Kafka, Pulsar, Nats

Proxies, load balancers and service meshes: Nginx, Apache, HaProxy, Traefik, Caddy, Envoy, Linkerd, Istio

Runtimes: PHP-FPM, JVM, Golang, Ruby, Python, Sidekiq

Data engineering: Apache Flink, Apache Spark, Hadoop

Orchestrators: Kubernetes, Nomad, Consul, Etcd, OpenStack

CI/CD: Jenkins, ArgoCD, FluxCD, GitLab CI, Spinnaker

Network and security: SpeedTest, SSL/TLS, cert-manager, Juniper, CoreDNS, Freeswitch, Hashicorp Vault, Keycloak, Cloudflare, SNMP, Cilium, WireGuard

Storage: Ceph, ZFS, OpenEBS, Minio

Cloud providers: AWS CloudWatch, Google Cloud Stackdriver, DigitalOcean, Azure

Observability: Thanos, Loki, Promtail, Cortex, Grafana Tempo, Grafana Mimir, Grafana Alloy, OpenTelemetry Collector, Jaeger

Other: APC UPS, Graph Node, LiteLLM

Question 4

What is the difference between warning and critical severity?

Accepted Answer

Critical alerts require immediate human attention: the system is down or severely degraded, and revenue or reliability is directly affected. Warning alerts need attention soon but are not immediately urgent. Info alerts are for awareness only, such as configuration changes or underutilized resources. Configure AlertManager routes so that only critical alerts page on-call engineers.
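One way to page only on critical alerts is to route on the severity label. The sketch below is an assumption-laden example of alertmanager.yml: the receiver names, Slack channel, webhook URL, and PagerDuty key are placeholders, and the matchers syntax assumes AlertManager 0.22 or later.

```yaml
# alertmanager.yml (excerpt)
route:
  receiver: team-chat                 # default: warning and info alerts go to chat
  group_by: ['alertname', 'job']
  routes:
    - matchers:
        - 'severity="critical"'       # only critical alerts reach the pager
      receiver: on-call-pager

receivers:
  - name: team-chat                   # hypothetical receiver name
    slack_configs:
      - api_url: "https://hooks.slack.com/services/REPLACE/ME"   # placeholder webhook URL
        channel: "#alerts"                                       # placeholder channel
  - name: on-call-pager               # hypothetical receiver name
    pagerduty_configs:
      - routing_key: "REPLACE_WITH_INTEGRATION_KEY"              # placeholder key
```

Alerts that match no child route fall through to the default receiver, so warning and info alerts stay out of the pager without needing their own rules.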

Question 5

What is PromQL?

Accepted Answer

PromQL (Prometheus Query Language) is the functional query language used to select, filter, and aggregate time-series data in Prometheus. Alert rules use PromQL expressions. For example, rate(http_requests_total[5m]) > 100 matches every series whose per-second request rate, averaged over the last 5 minutes, exceeds 100, and a rule built on that expression fires once for each matching series.
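Because rate() is computed per series, a threshold on it is checked separately for every label combination. If the intent is a limit on total traffic, the rates are usually summed first. A sketch of the example expression wrapped in a rule, with a hypothetical group name, alert name, and "for" duration:

```yaml
groups:
  - name: promql-example              # hypothetical group name
    rules:
      - alert: HighRequestRate        # hypothetical alert name
        # Sum the per-series rates so the threshold applies to each job's
        # total request rate rather than to every individual series.
        expr: sum by (job) (rate(http_requests_total[5m])) > 100
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "Job {{ $labels.job }} is receiving more than 100 requests per second"
```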

Question 6

Can I contribute new alert rules?

Accepted Answer

Yes! Contributions are welcome. Open a pull request on GitHub at https://github.com/samber/awesome-prometheus-alerts with your new rules added to the _data/rules.yml file. Follow the existing format: provide a clear rule name, a description explaining what the alert means and why it matters, a tested PromQL expression, an appropriate severity, and a sensible "for" duration to avoid false positives.

Question 7

What is AlertManager and how does it relate to these rules?

Accepted Answer

AlertManager is the component that receives firing alerts from Prometheus and handles deduplication, grouping, silencing, and routing to receivers (Slack, PagerDuty, email, webhooks). The alert rules in this collection fire alerts from Prometheus — AlertManager then decides who to notify and when. See the AlertManager Configuration guide on this site for setup examples.
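The link between the two components is configured on the Prometheus side. A minimal sketch, assuming AlertManager runs on the same host and listens on its default port 9093 (the host name is an assumption):

```yaml
# prometheus.yml (excerpt)
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - "localhost:9093"        # AlertManager address; adjust to your deployment

rule_files:
  - "alerts/*.yml"                    # rules evaluated by Prometheus itself
```

Prometheus evaluates the rules and pushes firing alerts to every listed AlertManager instance. Grouping, routing, and notification settings live in AlertManager's own configuration file.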

Question 8

How do I silence or suppress an alert?

Accepted Answer

AlertManager supports silences: time-bounded mutes, applied via its UI or API, that suppress notifications without disabling the rule. For recurring suppression (nights, weekends, deployments), use AlertManager mute time intervals, inhibition rules (which mute an alert while a related alert is firing), or time-based PromQL patterns. See the Sleep Peacefully guide on this site for timezone-aware suppression examples using the day_of_week() and hour() functions.
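A time-based PromQL pattern can be sketched as follows. This is not taken from the Sleep Peacefully guide. It is a simplified illustration with a hypothetical alert name and job label, and it works in UTC only, since day_of_week() and hour() both return UTC values.

```yaml
groups:
  - name: business-hours-example      # hypothetical group name
    rules:
      - alert: ServiceDownBusinessHours
        # Fires only Monday to Friday, 08:00 to 18:00 UTC.
        # day_of_week() returns 0 (Sunday) through 6 (Saturday);
        # hour() returns 0 through 23.
        expr: |
          up{job="my-service"} == 0
            and on() (day_of_week() > 0 and day_of_week() < 6)
            and on() (hour() >= 8 and hour() < 18)
        for: 5m
        labels:
          severity: warning
```

Outside the allowed window the expression returns no series, so the alert resolves and no notification is sent. The underlying condition is not being evaluated during that time, which is a trade-off compared with muting notifications in AlertManager.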

Question 9

What is the license for these alert rules?

Accepted Answer

The alert rules and content are licensed under Creative Commons Attribution 4.0 (CC BY 4.0): you are free to use, adapt, and redistribute them, including commercially, as long as you provide attribution. The site source code is licensed under MIT. See the LICENSE file in the GitHub repository for details.

Awesome Prometheus Alert Rules
