3.1.1.1. RabbitMQ node down
Fewer than 3 nodes running in RabbitMQ cluster
# 1m delay allows a restart without triggering an alert.
- alert: RabbitMQNodeDown
  expr: sum(rabbitmq_build_info) < 3
  for: 1m
  labels:
    severity: critical
  annotations:
    summary: RabbitMQ node down (instance {{ $labels.instance }})
    description: "Fewer than 3 nodes running in RabbitMQ cluster\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
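One edge case worth covering: `sum()` over an empty result returns nothing, so `sum(rabbitmq_build_info) < 3` stays silent when no node reports the metric at all. A minimal sketch of a companion rule based on `absent()` follows. The alert name and annotation text are assumptions for illustration, not part of the original rule set.

```yaml
# Hypothetical companion rule: fires when no node exposes rabbitmq_build_info.
- alert: RabbitMQNoNodesReporting
  expr: absent(rabbitmq_build_info)
  for: 1m
  labels:
    severity: critical
  annotations:
    summary: No RabbitMQ node is reporting metrics
    description: "rabbitmq_build_info is absent, so no RabbitMQ node is being scraped successfully."
```

Because `sum()` also drops the `instance` label, `{{ $labels.instance }}` in the summary of the rule above renders as an empty string.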
3.1.1.2. RabbitMQ node not distributed
Distribution link to peer {{ $labels.peer }} is not 'up' (state {{ $value }})
# 1m delay allows a restart without triggering an alert.
- alert: RabbitMQNodeNotDistributed
  expr: erlang_vm_dist_node_state < 3
  for: 1m
  labels:
    severity: critical
  annotations:
    summary: RabbitMQ node not distributed (instance {{ $labels.instance }})
    description: "Distribution link to peer {{ $labels.peer }} is not 'up' (state {{ $value }})\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
3.1.1.3. RabbitMQ instances different versions
Running different versions of RabbitMQ in the same cluster can lead to failure.
- alert: RabbitMQInstancesDifferentVersions
  expr: count(count(rabbitmq_build_info) by (rabbitmq_version)) > 1
  for: 1h
  labels:
    severity: warning
  annotations:
    summary: RabbitMQ instances different versions (instance {{ $labels.instance }})
    description: "Running different versions of RabbitMQ in the same cluster can lead to failure.\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
3.1.1.4. RabbitMQ memory high
A node uses more than 90% of its allocated RAM
- alert: RabbitMQMemoryHigh
  expr: rabbitmq_process_resident_memory_bytes / rabbitmq_resident_memory_limit_bytes * 100 > 90 and rabbitmq_resident_memory_limit_bytes > 0
  for: 2m
  labels:
    severity: warning
  annotations:
    summary: RabbitMQ memory high (instance {{ $labels.instance }})
    description: "A node uses more than 90% of its allocated RAM\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
3.1.1.5. RabbitMQ file descriptors usage
A node uses more than 90% of its file descriptors
- alert: RabbitMQFileDescriptorsUsage
  expr: rabbitmq_process_open_fds / rabbitmq_process_max_fds * 100 > 90 and rabbitmq_process_max_fds > 0
  for: 2m
  labels:
    severity: warning
  annotations:
    summary: RabbitMQ file descriptors usage (instance {{ $labels.instance }})
    description: "A node uses more than 90% of its file descriptors\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
3.1.1.6. RabbitMQ too many ready messages
Too many ready messages on queue {{ $labels.queue }} ({{ $value }})
- alert: RabbitMQTooManyReadyMessages
  expr: sum(rabbitmq_queue_messages_ready) BY (queue) > 1000
  for: 1m
  labels:
    severity: warning
  annotations:
    summary: RabbitMQ too many ready messages (instance {{ $labels.instance }})
    description: "Too many ready messages on queue {{ $labels.queue }} ({{ $value }})\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
3.1.1.7. RabbitMQ too many unack messages
Too many unacknowledged messages on queue {{ $labels.queue }} ({{ $value }})
- alert: RabbitMQTooManyUnackMessages
  expr: sum(rabbitmq_queue_messages_unacked) BY (queue) > 1000
  for: 1m
  labels:
    severity: warning
  annotations:
    summary: RabbitMQ too many unack messages (instance {{ $labels.instance }})
    description: "Too many unacknowledged messages on queue {{ $labels.queue }} ({{ $value }})\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
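The 1000-message threshold in the two queue depth rules is generic, and a busy queue may legitimately sit above it. A minimal sketch of a per-queue variant follows, which narrows the expression with a label matcher. The `orders.*` queue name pattern, the 5000 threshold, and the alert name are assumptions chosen only for illustration.

```yaml
# Hypothetical variant: a higher threshold for queues whose name starts with "orders".
- alert: RabbitMQTooManyReadyMessagesOrders
  expr: sum(rabbitmq_queue_messages_ready{queue=~"orders.*"}) by (queue) > 5000
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: RabbitMQ too many ready messages on an orders queue
    description: "Too many ready messages on queue {{ $labels.queue }} ({{ $value }})"
```

If such a variant is used alongside the generic rule, adding `queue!~"orders.*"` to the generic expression would keep the same queue from triggering both.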
3.1.1.8. RabbitMQ too many connections
The total number of connections on a node is too high
- alert: RabbitMQTooManyConnections
  expr: rabbitmq_connections > 1000
  for: 2m
  labels:
    severity: warning
  annotations:
    summary: RabbitMQ too many connections (instance {{ $labels.instance }})
    description: "The total number of connections on a node is too high\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
3.1.1.9. RabbitMQ no queue consumer
A queue has no consumers
- alert: RabbitMQNoQueueConsumer
  expr: rabbitmq_queue_consumers < 1
  for: 1m
  labels:
    severity: warning
  annotations:
    summary: RabbitMQ no queue consumer (instance {{ $labels.instance }})
    description: "A queue has no consumers\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
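The queue-level rules (ready messages, unacknowledged messages, consumers) depend on a `queue` label being present on the scraped series. To the best of my knowledge, the RabbitMQ Prometheus plugin aggregates metrics by default and exposes per-queue series on a separate `/metrics/per-object` path, or when `prometheus.return_per_object_metrics = true` is set on the broker. A minimal Prometheus scrape sketch under that assumption follows. The job name and target host are hypothetical, and 15692 is assumed to be the plugin's port.

```yaml
scrape_configs:
  - job_name: rabbitmq-per-object          # hypothetical job name
    metrics_path: /metrics/per-object      # assumed per-object endpoint of the plugin
    static_configs:
      - targets:
          - rabbitmq-0.example.internal:15692   # hypothetical host, assumed plugin port
```

Per-object metrics can be expensive on brokers with many queues or connections, so it is worth checking scrape duration after enabling them.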
3.1.1.10. RabbitMQ unroutable messages
Unroutable messages were returned or dropped ({{ $value }} in the last 5m)
# Threshold of 3 avoids noise from occasional misroutes. Adjust based on your expected traffic patterns.
- alert: RabbitMQUnroutableMessages
  expr: increase(rabbitmq_channel_messages_unroutable_returned_total[5m]) > 3 or increase(rabbitmq_channel_messages_unroutable_dropped_total[5m]) > 3
  for: 2m
  labels:
    severity: warning
  annotations:
    summary: RabbitMQ unroutable messages (instance {{ $labels.instance }})
    description: "Unroutable messages were returned or dropped ({{ $value }} in the last 5m)\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
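Each rule in this section is a single list item. Prometheus loads rules from a file in which they are nested under a named group. A minimal sketch of that wrapper follows, showing the first rule in place. The group name `rabbitmq` is an assumption, and the file name is up to you.

```yaml
groups:
  - name: rabbitmq   # hypothetical group name
    rules:
      - alert: RabbitMQNodeDown
        expr: sum(rabbitmq_build_info) < 3
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: RabbitMQ node down (instance {{ $labels.instance }})
      # Remaining rules from this section go here at the same indentation.
```

The file can be validated before reloading Prometheus with `promtool check rules <file>`.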