
Awesome Prometheus Alert Rules

951 copy-pasteable Prometheus alerting rules. Find, copy, and deploy alerts in seconds.

951 alert rules · 111 exporters · 13 categories · 7.9k engineers starred (7,944)

Browse by category

Basic resource monitoring

13 services · 153 rules

Databases

16 services · 184 rules

Message brokers

5 services · 52 rules

Runtimes

6 services · 35 rules

Data engineering

3 services · 30 rules

Orchestrators

5 services · 77 rules

CI/CD

5 services · 56 rules

Network and security

12 services · 70 rules

Storage

4 services · 21 rules

Cloud providers

4 services · 33 rules

Observability

9 services · 153 rules

Other

2 services · 12 rules

Frequently asked questions

What are Prometheus alerting rules?

Prometheus alerting rules are PromQL-based conditions evaluated by the Prometheus server. When a condition stays true for the rule's "for" duration, the alert fires and is routed by AlertManager to receivers like Slack, PagerDuty, or email. Rules are defined in YAML files and cover metric thresholds, absence of expected data, and rate-of-change conditions.
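As a minimal sketch of what such a rule file looks like, the example below defines one alert on the built-in up metric. The group name, the 5-minute duration, and the annotation wording are illustrative choices, not values taken from this collection.

```yaml
# Hypothetical rule file, e.g. alerts/my-service.yml
groups:
  - name: example            # illustrative group name
    rules:
      - alert: InstanceDown
        # "up" is set to 0 by Prometheus when a scrape target is unreachable.
        expr: up == 0
        # The condition must hold for 5 minutes before the alert fires.
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been unreachable for more than 5 minutes."
```

While the condition is true but the "for" duration has not yet elapsed, the alert is in the pending state and nothing is sent to AlertManager.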

How do I use these Prometheus alert rules?

Find the service you want to monitor, copy the YAML snippet for any rule, and paste it into your Prometheus rules file (e.g., alerts/my-service.yml). Reload Prometheus to apply the rules. Adjust thresholds to match your workload — the values provided are sensible defaults but may need tuning.
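The wiring for that step might look like the sketch below, assuming the rule files live in an alerts/ directory next to prometheus.yml. The directory name and glob pattern are assumptions; adapt them to your layout.

```yaml
# prometheus.yml (fragment)
rule_files:
  # Load every rule file in the (assumed) alerts/ directory.
  - "alerts/*.yml"

# Validate a file before reloading:
#   promtool check rules alerts/my-service.yml
#
# Reload without restarting, either by sending SIGHUP to the Prometheus
# process or, if Prometheus was started with --web.enable-lifecycle:
#   curl -X POST http://localhost:9090/-/reload
```

Running promtool first catches YAML and PromQL syntax errors before they reach the running server.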

What exporters and services are covered?

Awesome Prometheus Alerts covers 92 services across 13 categories:

Basic resource monitoring: Prometheus self-monitoring, Host and hardware, S.M.A.R.T Device Monitoring, IPMI, Docker containers, Blackbox, Windows Server, VMware, Proxmox VE, Netdata, eBPF, Process Exporter, Systemd
Databases: MySQL, PostgreSQL, SQL Server, Oracle Database, Patroni, PGBouncer, Redis, Memcached, MongoDB, Elasticsearch, OpenSearch, Meilisearch, Cassandra, Clickhouse, CouchDB, Solr
Message brokers: RabbitMQ, Zookeeper, Kafka, Pulsar, Nats
Proxies, load balancers and service meshes: Nginx, Apache, HaProxy, Traefik, Caddy, Envoy, Linkerd, Istio
Runtimes: PHP-FPM, JVM, Golang, Ruby, Python, Sidekiq
Data engineering: Apache Flink, Apache Spark, Hadoop
Orchestrators: Kubernetes, Nomad, Consul, Etcd, OpenStack
CI/CD: Jenkins, ArgoCD, FluxCD, GitLab CI, Spinnaker
Network and security: SpeedTest, SSL/TLS, cert-manager, Juniper, CoreDNS, Freeswitch, Hashicorp Vault, Keycloak, Cloudflare, SNMP, Cilium, WireGuard
Storage: Ceph, ZFS, OpenEBS, Minio
Cloud providers: AWS CloudWatch, Google Cloud Stackdriver, DigitalOcean, Azure
Observability: Thanos, Loki, Promtail, Cortex, Grafana Tempo, Grafana Mimir, Grafana Alloy, OpenTelemetry Collector, Jaeger
Other: APC UPS, Graph Node

What is the difference between warning and critical severity?

Critical alerts require immediate human attention — the system is down or severely degraded and revenue or reliability is directly impacted. Warning alerts need attention soon but are not immediately urgent. Info alerts are awareness-only, such as configuration changes or underutilized resources. Set up AlertManager routes to page on-call engineers only for critical alerts.
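One way to express "page only on critical" is an AlertManager routing tree like the sketch below. The receiver names, Slack channel, webhook URL, and PagerDuty key are placeholders, and the matchers syntax assumes a reasonably recent AlertManager release.

```yaml
# alertmanager.yml (fragment)
route:
  # Default receiver: warning and info alerts land in a chat channel.
  receiver: team-chat
  group_by: ['alertname', 'instance']
  routes:
    # Only alerts labelled severity="critical" page the on-call engineer.
    - matchers:
        - severity="critical"
      receiver: oncall-pager

receivers:
  - name: team-chat                      # hypothetical receiver name
    slack_configs:
      - channel: '#alerts'               # placeholder channel
        api_url: 'https://hooks.slack.com/services/REPLACE_ME'   # placeholder
  - name: oncall-pager                   # hypothetical receiver name
    pagerduty_configs:
      - routing_key: 'REPLACE_ME'        # placeholder integration key
```

Alerts that match no child route fall through to the default receiver, so anything without a critical severity label never reaches the pager.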

What is PromQL?

PromQL (Prometheus Query Language) is the functional query language used to select, filter, and aggregate time-series data in Prometheus. Alert rules use PromQL expressions. For example, rate(http_requests_total[5m]) > 100 is true for every series whose per-second request rate, averaged over the last 5 minutes, exceeds 100.
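Because a bare rate() comparison is evaluated per series, it is often useful to contrast it with an aggregated form. The sketch below shows both as alert rules; the alert names, the "for" durations, and the choice to aggregate by job are illustrative assumptions.

```yaml
groups:
  - name: promql-example     # illustrative group name
    rules:
      # Per-series: one alert for each series whose own rate exceeds 100 req/s.
      - alert: HighRequestRatePerSeries
        expr: rate(http_requests_total[5m]) > 100
        for: 5m
        labels:
          severity: warning
      # Aggregated: sums the per-series rates within each job, then compares.
      - alert: HighRequestRatePerJob
        expr: sum by (job) (rate(http_requests_total[5m])) > 100
        for: 5m
        labels:
          severity: warning
```

The aggregated rule can fire even when no single series crosses the threshold, which is usually what "the service is receiving more than 100 requests per second" means.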

Can I contribute new alert rules?

Yes! Contributions are welcome. Open a pull request on GitHub at https://github.com/samber/awesome-prometheus-alerts with your new rules added to the _data/rules.yml file. Follow the existing format: provide a clear rule name, a description explaining what the alert means and why it matters, a tested PromQL expression, an appropriate severity, and a sensible "for" duration to avoid false positives.

What is AlertManager and how does it relate to these rules?

AlertManager is the component that receives firing alerts from Prometheus and handles deduplication, grouping, silencing, and routing to receivers (Slack, PagerDuty, email, webhooks). The alert rules in this collection fire alerts from Prometheus — AlertManager then decides who to notify and when. See the AlertManager Configuration guide on this site for setup examples.

How do I silence or suppress an alert?

AlertManager supports silences — time-bounded mutes applied via its UI or API that suppress notifications without disabling the rule. For recurring suppression (nights, weekends, deployments), use inhibition rules or time-based PromQL patterns. See the Sleep Peacefully guide on this site for timezone-aware suppression examples using day_of_week() and hour() functions.
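A time-based PromQL pattern of this kind can be sketched as follows. The rule only evaluates to true on weekdays between 08:00 and 18:00; the alert name, the base expression, and the chosen window are assumptions for illustration. Both hour() and day_of_week() work in UTC, so shift the bounds for your timezone.

```yaml
groups:
  - name: business-hours-example   # illustrative group name
    rules:
      - alert: HighRequestRateBusinessHours
        expr: |
          rate(http_requests_total[5m]) > 100
          # Keep results only between 08:00 and 17:59 UTC.
          and on() (hour() >= 8 and hour() < 18)
          # Keep results only Monday (1) through Friday (5); Sunday is 0.
          and on() (day_of_week() >= 1 and day_of_week() <= 5)
        for: 5m
        labels:
          severity: warning
```

The empty on() makes the label-less time vectors match every series on the left-hand side, so outside the window the expression returns nothing and the alert resolves.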

What is the license for these alert rules?

The alert rules and content are licensed under Creative Commons CC BY 4.0 — you are free to use, adapt, and redistribute them, including commercially, as long as you provide attribution. The site source code is licensed under MIT. See the LICENSE file in the GitHub repository for details.