⚠️ Alert thresholds depend on the nature of your applications. Some queries may have arbitrary tolerance thresholds. Building an efficient monitoring platform takes time. 😉

Process Exporter Prometheus Alert Rules

10 Prometheus alerting rules for Process Exporter, based on metrics exported by ncabatoff/process-exporter. These rules cover critical, warning, and info conditions. Copy the YAML into a rule file that your Prometheus configuration loads through rule_files.

wget https://raw.githubusercontent.com/samber/awesome-prometheus-alerts/refs/heads/master/dist/rules/process-exporter/process-exporter.yml
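Prometheus only evaluates rules that live in a file listed under rule_files. The fragment below is a minimal sketch of that wiring; the path is a hypothetical location for the downloaded file, and it assumes the file already contains its own groups: block.

```yaml
# prometheus.yml (fragment)
# The path is a placeholder: point it at wherever you saved the file
# fetched with wget.
rule_files:
  - /etc/prometheus/rules/process-exporter.yml
```

If you paste individual rules from this page instead, they need to sit under a group. A sketch of such a hand-assembled rule file (the group name is arbitrary):

```yaml
# process-exporter.yml, assembled by hand
groups:
  - name: ProcessExporter  # arbitrary group name
    rules:
      - alert: ProcessExporterGroupDown
        expr: namedprocess_namegroup_num_procs == 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: Process exporter group down (instance {{ $labels.instance }})
```

Running promtool check rules on the file before reloading Prometheus catches indentation and syntax mistakes early.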
warning

1.12.1. Process exporter group down

No processes found for group {{ $labels.groupname }}. The service may have stopped. (instance {{ $labels.instance }})

- alert: ProcessExporterGroupDown
  expr: namedprocess_namegroup_num_procs == 0
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Process exporter group down (instance {{ $labels.instance }})
    description: "No processes found for group {{ $labels.groupname }}. The service may have stopped. (instance {{ $labels.instance }})\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
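The == 0 comparison can only fire while a series for the group still exists. If the exporter itself is unreachable, there is no sample to compare and the rule stays silent. One possible companion is an absent() check per expected group. This is a sketch, not part of the original rule set: the alert name and the nginx group are hypothetical.

```yaml
# Hypothetical companion rule: fires when no num_procs series exists at all
# for a group you expect to see. "nginx" is a placeholder groupname.
- alert: ProcessExporterGroupAbsent
  expr: absent(namedprocess_namegroup_num_procs{groupname="nginx"})
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Process exporter group absent (group {{ $labels.groupname }})
    description: "No namedprocess_namegroup_num_procs series found for group {{ $labels.groupname }}. The exporter may be down or not scraped."
```

absent() copies only the labels named in equality matchers onto its result, so the annotations here use groupname and not instance. The approach needs one rule per group you care about.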
warning

1.12.2. Process exporter high memory usage

Process group {{ $labels.groupname }} is using {{ $value | humanize }}B of resident memory. (instance {{ $labels.instance }})

# Threshold of 4GB is arbitrary and depends on the process being monitored. Adjust per group.
- alert: ProcessExporterHighMemoryUsage
  expr: namedprocess_namegroup_memory_bytes{memtype="resident"} > 4e+09
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Process exporter high memory usage (instance {{ $labels.instance }})
    description: "Process group {{ $labels.groupname }} is using {{ $value | humanize }}B of resident memory. (instance {{ $labels.instance }})\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
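Since a single threshold rarely fits every group, one way to adjust per group is to add a dedicated rule with a groupname matcher. The sketch below is illustrative only: the postgres group, the alert name, and the 16GB value are placeholders.

```yaml
# Hypothetical per-group override. "postgres" and 16e+09 are placeholders.
- alert: ProcessExporterHighMemoryUsagePostgres
  expr: namedprocess_namegroup_memory_bytes{memtype="resident", groupname="postgres"} > 16e+09
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Process exporter high memory usage (instance {{ $labels.instance }})
    description: "Process group {{ $labels.groupname }} is using {{ $value | humanize }}B of resident memory. (instance {{ $labels.instance }})"
# To avoid double alerts, the generic rule can exclude that group:
#   expr: namedprocess_namegroup_memory_bytes{memtype="resident", groupname!="postgres"} > 4e+09
```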
warning

1.12.3. Process exporter high CPU usage

Process group {{ $labels.groupname }} is using {{ $value }}% CPU (core-equivalent). (instance {{ $labels.instance }})

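# Value is core-equivalent %: 100% = 1 full core, 200% = 2 cores, etc. The series is split by a mode label (user/system), so the 80% threshold applies to each mode separately rather than to the group's combined CPU. Adjust based on expected workload.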
- alert: ProcessExporterHighCPUUsage
  expr: rate(namedprocess_namegroup_cpu_seconds_total[5m]) * 100 > 80
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Process exporter high CPU usage (instance {{ $labels.instance }})
    description: "Process group {{ $labels.groupname }} is using {{ $value }}% CPU (core-equivalent). (instance {{ $labels.instance }})\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
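The expression above is evaluated per series. Assuming the exporter splits this counter by a mode label (user and system), a group spending 50% in each mode would not cross the 80% threshold. A hedged variant that adds the modes together before comparing:

```yaml
# Sketch: combine user and system CPU per group before applying the threshold.
# Only instance and groupname survive the aggregation; those are the labels
# the annotations reference.
- alert: ProcessExporterHighCPUUsage
  expr: sum by (instance, groupname) (rate(namedprocess_namegroup_cpu_seconds_total[5m])) * 100 > 80
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Process exporter high CPU usage (instance {{ $labels.instance }})
    description: "Process group {{ $labels.groupname }} is using {{ $value | printf \"%.1f\" }}% CPU (core-equivalent, user + system). (instance {{ $labels.instance }})"
```

The trade-off is that other labels, such as job, are dropped by the aggregation. Add them to the by clause if your routing depends on them.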
warning

1.12.4. Process exporter high file descriptor usage

Process group {{ $labels.groupname }} is using more than 80% of its file descriptor limit. (instance {{ $labels.instance }})

- alert: ProcessExporterHighFileDescriptorUsage
  expr: namedprocess_namegroup_worst_fd_ratio > 0.8
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Process exporter high file descriptor usage (instance {{ $labels.instance }})
    description: "Process group {{ $labels.groupname }} is using more than 80% of its file descriptor limit. (instance {{ $labels.instance }})\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
critical

1.12.5. Process exporter file descriptors exhausted

Process group {{ $labels.groupname }} has nearly exhausted its file descriptor limit. (instance {{ $labels.instance }})

- alert: ProcessExporterFileDescriptorsExhausted
  expr: namedprocess_namegroup_worst_fd_ratio > 0.95
  for: 2m
  labels:
    severity: critical
  annotations:
    summary: Process exporter file descriptors exhausted (instance {{ $labels.instance }})
    description: "Process group {{ $labels.groupname }} has nearly exhausted its file descriptor limit. (instance {{ $labels.instance }})\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
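Any ratio above 0.95 is also above 0.8, so the warning and critical file descriptor alerts fire together. If that is too noisy, an Alertmanager inhibition rule can mute the warning while the critical alert is active. A minimal sketch, assuming both alerts carry matching instance and groupname labels:

```yaml
# alertmanager.yml (fragment)
# While the critical alert fires, suppress the warning for the same
# instance and process group.
inhibit_rules:
  - source_matchers:
      - 'alertname = "ProcessExporterFileDescriptorsExhausted"'
    target_matchers:
      - 'alertname = "ProcessExporterHighFileDescriptorUsage"'
    equal:
      - instance
      - groupname
```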
warning

1.12.6. Process exporter high swap usage

Process group {{ $labels.groupname }} is using {{ $value | humanize }}B of swap. (instance {{ $labels.instance }})

# Threshold of 512MB is arbitrary. Adjust per group and environment.
- alert: ProcessExporterHighSwapUsage
  expr: namedprocess_namegroup_memory_bytes{memtype="swapped"} > 512e+06
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Process exporter high swap usage (instance {{ $labels.instance }})
    description: "Process group {{ $labels.groupname }} is using {{ $value | humanize }}B of swap. (instance {{ $labels.instance }})\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
warning

1.12.7. Process exporter zombie processes

Process group {{ $labels.groupname }} has {{ $value }} zombie processes. (instance {{ $labels.instance }})

- alert: ProcessExporterZombieProcesses
  expr: namedprocess_namegroup_states{state="Zombie"} > 5
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Process exporter zombie processes (instance {{ $labels.instance }})
    description: "Process group {{ $labels.groupname }} has {{ $value }} zombie processes. (instance {{ $labels.instance }})\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
warning

1.12.8. Process exporter high context switching

Process group {{ $labels.groupname }} has a high rate of context switches ({{ $value }}/s). (instance {{ $labels.instance }})

# Filters to voluntary switches only; involuntary switches are normal under CPU contention. Threshold of 50000/s is a rough default. Adjust based on workload.
- alert: ProcessExporterHighContextSwitching
  expr: rate(namedprocess_namegroup_context_switches_total{ctxswitchtype="voluntary"}[5m]) > 50000
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Process exporter high context switching (instance {{ $labels.instance }})
    description: "Process group {{ $labels.groupname }} has a high rate of context switches ({{ $value }}/s). (instance {{ $labels.instance }})\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
warning

1.12.9. Process exporter high disk write IO

Process group {{ $labels.groupname }} is performing {{ $value | humanize }}B/s of disk writes. (instance {{ $labels.instance }})

# Threshold of 100MB/s is arbitrary. Adjust per group.
- alert: ProcessExporterHighDiskWriteIO
  expr: rate(namedprocess_namegroup_write_bytes_total[5m]) > 100e+06
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Process exporter high disk write IO (instance {{ $labels.instance }})
    description: "Process group {{ $labels.groupname }} is performing {{ $value | humanize }}B/s of disk writes. (instance {{ $labels.instance }})\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
info

1.12.10. Process exporter process restarting

Process group {{ $labels.groupname }} has restarted (oldest process start time changed). (instance {{ $labels.instance }})

# Detects restarts by watching for changes in the oldest process start time within the group.
- alert: ProcessExporterProcessRestarting
  expr: changes(namedprocess_namegroup_oldest_start_time_seconds[5m]) > 0 and namedprocess_namegroup_num_procs > 0
  for: 0m
  labels:
    severity: info
  annotations:
    summary: Process exporter process restarting (instance {{ $labels.instance }})
    description: "Process group {{ $labels.groupname }} has restarted (oldest process start time changed). (instance {{ $labels.instance }})\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
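Because the metric tracks only the oldest process in a group, its value also changes when that process exits while younger ones keep running (a worker pool recycling its workers, for example), so this alert does not always mean the whole service restarted. To separate a one-off restart from a crash loop, one option is a second rule that counts changes over a longer window. This is a sketch: the alert name, the 1h window, and the threshold of 3 are arbitrary assumptions.

```yaml
# Hypothetical companion rule: more than 3 start-time changes within an hour
# suggests the group is flapping rather than restarting once.
- alert: ProcessExporterProcessFlapping
  expr: changes(namedprocess_namegroup_oldest_start_time_seconds[1h]) > 3 and namedprocess_namegroup_num_procs > 0
  for: 0m
  labels:
    severity: warning
  annotations:
    summary: Process exporter process flapping (instance {{ $labels.instance }})
    description: "Process group {{ $labels.groupname }} changed its oldest start time {{ $value }} times in the last hour. (instance {{ $labels.instance }})"
```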