Skip to main content
APA
Sponsored by CAST AI — Kubernetes cost optimization Better Stack — Uptime monitoring and log management
⚠️

Alert thresholds depend on the nature of your applications. Some queries may have arbitrary tolerance thresholds. Building an efficient monitoring platform takes time. 😉

PostgreSQL Prometheus Alert Rules

20 Prometheus alerting rules for PostgreSQL. Exported via prometheus-community/postgres_exporter. These rules cover critical and warning conditions — copy and paste the YAML into your Prometheus configuration.

wget https://raw.githubusercontent.com/samber/awesome-prometheus-alerts/refs/heads/master/dist/rules/postgresql/postgres-exporter.yml
critical

2.2.1. Postgresql down

Postgresql instance is down

  # 1m delay allows a restart without triggering an alert.
- alert: PostgresqlDown
  expr: pg_up == 0
  for: 1m
  labels:
    severity: critical
  annotations:
    summary: Postgresql down (instance {{ $labels.instance }})
    description: "Postgresql instance is down\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
critical

2.2.2. Postgresql restarted

Postgresql restarted

- alert: PostgresqlRestarted
  expr: time() - pg_postmaster_start_time_seconds < 60
  for: 0m
  labels:
    severity: critical
  annotations:
    summary: Postgresql restarted (instance {{ $labels.instance }})
    description: "Postgresql restarted\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
critical

2.2.3. Postgresql exporter error

Postgresql exporter is showing errors. A query may be buggy in query.yaml

- alert: PostgresqlExporterError
  expr: pg_exporter_last_scrape_error > 0
  for: 0m
  labels:
    severity: critical
  annotations:
    summary: Postgresql exporter error (instance {{ $labels.instance }})
    description: "Postgresql exporter is showing errors. A query may be buggy in query.yaml\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
warning

2.2.4. Postgresql table not auto vacuumed

Table {{ $labels.relname }} has not been auto vacuumed for 10 days

- alert: PostgresqlTableNotAutoVacuumed
  expr: ((pg_stat_user_tables_n_tup_del + pg_stat_user_tables_n_tup_upd + pg_stat_user_tables_n_tup_hot_upd) > pg_settings_autovacuum_vacuum_threshold) and (time() - pg_stat_user_tables_last_autovacuum) > 60 * 60 * 24 * 10
  for: 0m
  labels:
    severity: warning
  annotations:
    summary: Postgresql table not auto vacuumed (instance {{ $labels.instance }})
    description: "Table {{ $labels.relname }} has not been auto vacuumed for 10 days\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
warning

2.2.5. Postgresql table not auto analyzed

Table {{ $labels.relname }} has not been auto analyzed for 10 days

- alert: PostgresqlTableNotAutoAnalyzed
  expr: ((pg_stat_user_tables_n_tup_del + pg_stat_user_tables_n_tup_upd + pg_stat_user_tables_n_tup_hot_upd) > pg_settings_autovacuum_analyze_threshold) and (time() - pg_stat_user_tables_last_autoanalyze) > 24 * 60 * 60 * 10
  for: 0m
  labels:
    severity: warning
  annotations:
    summary: Postgresql table not auto analyzed (instance {{ $labels.instance }})
    description: "Table {{ $labels.relname }} has not been auto analyzed for 10 days\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
warning

2.2.6. Postgresql too many connections

PostgreSQL instance has too many connections (> 80%).

- alert: PostgresqlTooManyConnections
  expr: sum by (instance, job, server) (pg_stat_activity_count) > min by (instance, job, server) (pg_settings_max_connections * 0.8)
  for: 2m
  labels:
    severity: warning
  annotations:
    summary: Postgresql too many connections (instance {{ $labels.instance }})
    description: "PostgreSQL instance has too many connections (> 80%).\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
critical

2.2.7. Postgresql not enough connections

PostgreSQL instance should have more connections (> 5)

- alert: PostgresqlNotEnoughConnections
  expr: sum by (datname) (pg_stat_activity_count{datname!~"template.*|postgres"}) < 5
  for: 2m
  labels:
    severity: critical
  annotations:
    summary: Postgresql not enough connections (instance {{ $labels.instance }})
    description: "PostgreSQL instance should have more connections (> 5)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
warning

2.2.8. Postgresql dead locks

PostgreSQL has dead-locks ({{ $value }} in the last minute)

- alert: PostgresqlDeadLocks
  expr: increase(pg_stat_database_deadlocks{datname!~"template.*|postgres",datid!="0"}[1m]) > 5
  for: 0m
  labels:
    severity: warning
  annotations:
    summary: Postgresql dead locks (instance {{ $labels.instance }})
    description: "PostgreSQL has dead-locks ({{ $value }} in the last minute)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
warning

2.2.9. Postgresql high rollback rate

Ratio of transactions being aborted compared to committed is > 2 %

- alert: PostgresqlHighRollbackRate
  expr: sum by (namespace,datname,instance) (rate(pg_stat_database_xact_rollback{datname!~"template.*|postgres",datid!="0"}[3m])) / (sum by (namespace,datname,instance) (rate(pg_stat_database_xact_rollback{datname!~"template.*|postgres",datid!="0"}[3m])) + sum by (namespace,datname,instance) (rate(pg_stat_database_xact_commit{datname!~"template.*|postgres",datid!="0"}[3m]))) > 0.02 and (sum by (namespace,datname,instance) (rate(pg_stat_database_xact_rollback{datname!~"template.*|postgres",datid!="0"}[3m])) + sum by (namespace,datname,instance) (rate(pg_stat_database_xact_commit{datname!~"template.*|postgres",datid!="0"}[3m]))) > 0
  for: 0m
  labels:
    severity: warning
  annotations:
    summary: Postgresql high rollback rate (instance {{ $labels.instance }})
    description: "Ratio of transactions being aborted compared to committed is > 2 %\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
critical

2.2.10. Postgresql commit rate low

Postgresql seems to be processing very few transactions

- alert: PostgresqlCommitRateLow
  expr: increase(pg_stat_database_xact_commit{datname!~"template.*|postgres",datid!="0"}[5m]) < 5
  for: 2m
  labels:
    severity: critical
  annotations:
    summary: Postgresql commit rate low (instance {{ $labels.instance }})
    description: "Postgresql seems to be processing very few transactions\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
warning

2.2.11. Postgresql low XID consumption

Postgresql seems to be consuming transaction IDs very slowly

  # pg_txid_current is not a default postgres_exporter metric. You need to define a custom query. See https://github.com/samber/awesome-prometheus-alerts/issues/289#issuecomment-1164842737
- alert: PostgresqlLowXIDConsumption
  expr: rate(pg_txid_current[1m]) < 5
  for: 2m
  labels:
    severity: warning
  annotations:
    summary: Postgresql low XID consumption (instance {{ $labels.instance }})
    description: "Postgresql seems to be consuming transaction IDs very slowly\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
warning

2.2.12. Postgresql unused replication slot

Unused Replication Slots

- alert: PostgresqlUnusedReplicationSlot
  expr: (pg_replication_slots_active == 0) and (pg_replication_is_replica == 0)
  for: 1m
  labels:
    severity: warning
  annotations:
    summary: Postgresql unused replication slot (instance {{ $labels.instance }})
    description: "Unused Replication Slots\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
warning

2.2.13. Postgresql too many dead tuples

PostgreSQL dead tuples is too large

- alert: PostgresqlTooManyDeadTuples
  expr: ((pg_stat_user_tables_n_dead_tup > 10000) / (pg_stat_user_tables_n_live_tup + pg_stat_user_tables_n_dead_tup)) >= 0.1 and (pg_stat_user_tables_n_live_tup + pg_stat_user_tables_n_dead_tup) > 0
  for: 2m
  labels:
    severity: warning
  annotations:
    summary: Postgresql too many dead tuples (instance {{ $labels.instance }})
    description: "PostgreSQL dead tuples is too large\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
info

2.2.14. Postgresql configuration changed

Postgres Database configuration change has occurred

- alert: PostgresqlConfigurationChanged
  expr: {__name__=~"pg_settings_.*",__name__!="pg_settings_transaction_read_only"} != ON(__name__, instance) {__name__=~"pg_settings_.*",__name__!="pg_settings_transaction_read_only"} OFFSET 5m
  for: 0m
  labels:
    severity: info
  annotations:
    summary: Postgresql configuration changed (instance {{ $labels.instance }})
    description: "Postgres Database configuration change has occurred\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
warning

2.2.15. Postgresql SSL compression active

Database allows connections with SSL compression enabled. This may add significant jitter in replication delay. Replicas should turn off SSL compression via `sslcompression=0` in `recovery.conf`.

  # pg_stat_ssl_compression is not a default postgres_exporter metric and is only available on PostgreSQL 9.5-13 (removed in PG 14). See https://github.com/samber/awesome-prometheus-alerts/issues/289#issuecomment-1164842737
- alert: PostgresqlSSLCompressionActive
  expr: sum by (instance) (pg_stat_ssl_compression) > 0
  for: 0m
  labels:
    severity: warning
  annotations:
    summary: Postgresql SSL compression active (instance {{ $labels.instance }})
    description: "Database allows connections with SSL compression enabled. This may add significant jitter in replication delay. Replicas should turn off SSL compression via `sslcompression=0` in `recovery.conf`.\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
critical

2.2.16. Postgresql too many locks acquired

Too many locks acquired on the database. If this alert happens frequently, we may need to increase the postgres setting max_locks_per_transaction.

- alert: PostgresqlTooManyLocksAcquired
  expr: ((sum by (instance) (pg_locks_count)) / (pg_settings_max_locks_per_transaction * pg_settings_max_connections)) > 0.20 and (pg_settings_max_locks_per_transaction * pg_settings_max_connections) > 0
  for: 2m
  labels:
    severity: critical
  annotations:
    summary: Postgresql too many locks acquired (instance {{ $labels.instance }})
    description: "Too many locks acquired on the database. If this alert happens frequently, we may need to increase the postgres setting max_locks_per_transaction.\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
warning

2.2.17. Postgresql bloat index high (> 80%)

The index {{ $labels.idxname }} is bloated. You should execute `REINDEX INDEX CONCURRENTLY {{ $labels.idxname }};`

  # See https://github.com/samber/awesome-prometheus-alerts/issues/289#issuecomment-1164842737
- alert: PostgresqlBloatIndexHigh(>80%)
  expr: pg_bloat_btree_bloat_pct > 80 and on (idxname) (pg_bloat_btree_real_size > 100000000)
  for: 1h
  labels:
    severity: warning
  annotations:
    summary: Postgresql bloat index high (> 80%) (instance {{ $labels.instance }})
    description: "The index {{ $labels.idxname }} is bloated. You should execute `REINDEX INDEX CONCURRENTLY {{ $labels.idxname }};`\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
warning

2.2.18. Postgresql bloat table high (> 80%)

The table {{ $labels.relname }} is bloated. You should execute `VACUUM {{ $labels.relname }};`

  # See https://github.com/samber/awesome-prometheus-alerts/issues/289#issuecomment-1164842737
- alert: PostgresqlBloatTableHigh(>80%)
  expr: pg_bloat_table_bloat_pct > 80 and on (relname) (pg_bloat_table_real_size > 200000000)
  for: 1h
  labels:
    severity: warning
  annotations:
    summary: Postgresql bloat table high (> 80%) (instance {{ $labels.instance }})
    description: "The table {{ $labels.relname }} is bloated. You should execute `VACUUM {{ $labels.relname }};`\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
warning

2.2.19. Postgresql invalid index

The table {{ $labels.relname }} has an invalid index: {{ $labels.indexrelname }}. You should execute `DROP INDEX {{ $labels.indexrelname }};`

  # See https://github.com/samber/awesome-prometheus-alerts/issues/289#issuecomment-1164842737
- alert: PostgresqlInvalidIndex
  expr: pg_general_index_info_pg_relation_size{indexrelname=~".*ccnew.*"}
  for: 6h
  labels:
    severity: warning
  annotations:
    summary: Postgresql invalid index (instance {{ $labels.instance }})
    description: "The table {{ $labels.relname }} has an invalid index: {{ $labels.indexrelname }}. You should execute `DROP INDEX {{ $labels.indexrelname }};`\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
warning

2.2.20. Postgresql replication lag

The PostgreSQL replication lag is high (> 5s)

- alert: PostgresqlReplicationLag
  expr: pg_replication_lag_seconds > 5
  for: 30s
  labels:
    severity: warning
  annotations:
    summary: Postgresql replication lag (instance {{ $labels.instance }})
    description: "The PostgreSQL replication lag is high (> 5s)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"