What is the Prometheus alert rule for "Netdata high cpu usage"?

Netdata high CPU usage (> 80%). PromQL expression: netdata_cpu_cpu_percentage_average{dimension="idle"} < 20. Severity: warning. Duration: 5m.

What is the Prometheus alert rule for "Netdata CPU steal noisy neighbor"?

CPU steal is > 10%. A noisy neighbor is killing VM performance, or a spot instance may be out of credit. PromQL expression: netdata_cpu_cpu_percentage_average{dimension="steal"} > 10. Severity: warning. Duration: 5m.

What is the Prometheus alert rule for "Netdata predicted disk full"?

Netdata predicted disk full in 24 hours. PromQL expression: predict_linear(netdata_disk_space_GB_average{dimension=~"avail|cached"}[3h], 24 * 3600) < 0. Severity: warning.

What is the Prometheus alert rule for "Netdata MD mismatch cnt unsynchronized blocks"?

RAID array has unsynchronized blocks. PromQL expression: netdata_md_mismatch_cnt_unsynchronized_blocks_average > 1024. Severity: warning. Duration: 2m.

What is the Prometheus alert rule for "Netdata disk reallocated sectors"?

Disk reallocated sectors detected ({{ $value }} sectors). PromQL expression: increase(netdata_smartd_log_reallocated_sectors_count_sectors_average[1m]) > 0. Severity: info.

What is the Prometheus alert rule for "Netdata disk current pending sector"?

Disk current pending sector. PromQL expression: netdata_smartd_log_current_pending_sector_count_sectors_average > 0. Severity: warning.

What is the Prometheus alert rule for "Netdata reported uncorrectable disk sectors"?

Reported uncorrectable disk sectors ({{ $value }} sectors). PromQL expression: increase(netdata_smartd_log_offline_uncorrectable_sector_count_sectors_average[2m]) > 0. Severity: warning.

Netdata Prometheus Alert Rules

Q: What is the Prometheus alert rule for "Netdata high memory usage"?

Netdata high memory usage (> 80%). PromQL expression: 100 / netdata_system_ram_MiB_average * netdata_system_ram_MiB_average{dimension=~"free|cached"} < 20 and netdata_system_ram_MiB_average > 0. Severity: warning. Duration: 5m.

Q: What is the Prometheus alert rule for "Netdata low disk space"?

Netdata low disk space (> 80%). PromQL expression: 100 / netdata_disk_space_GB_average * netdata_disk_space_GB_average{dimension=~"avail|cached"} < 20 and netdata_disk_space_GB_average > 0. Severity: warning. Duration: 5m.

9 Prometheus alerting rules for Netdata, exported via the embedded exporter. These rules cover warning and info conditions. Copy and paste the YAML into a rule file loaded by your Prometheus configuration.

⚠️ Alert thresholds depend on the nature of your applications. Some queries may have arbitrary tolerance thresholds. Building an efficient monitoring platform takes time. 😉

groups:
- name: EmbeddedExporter
  rules:
    # This is a gauge metric (not a counter). Checking idle < 20% means CPU usage > 80%.
    - alert: NetdataHighCpuUsage
      expr: netdata_cpu_cpu_percentage_average{dimension="idle"} < 20
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: Netdata high cpu usage (instance {{ $labels.instance }})
        description: "Netdata high CPU usage (> 80%)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
    
    - alert: NetdataCPUStealNoisyNeighbor
      expr: netdata_cpu_cpu_percentage_average{dimension="steal"} > 10
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: Netdata CPU steal noisy neighbor (instance {{ $labels.instance }})
        description: "CPU steal is > 10%. A noisy neighbor is killing VM performance, or a spot instance may be out of credit.\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
    
    - alert: NetdataHighMemoryUsage
      expr: 100 / netdata_system_ram_MiB_average * netdata_system_ram_MiB_average{dimension=~"free|cached"} < 20 and netdata_system_ram_MiB_average > 0
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: Netdata high memory usage (instance {{ $labels.instance }})
        description: "Netdata high memory usage (> 80%)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
    
    - alert: NetdataLowDiskSpace
      expr: 100 / netdata_disk_space_GB_average * netdata_disk_space_GB_average{dimension=~"avail|cached"} < 20 and netdata_disk_space_GB_average > 0
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: Netdata low disk space (instance {{ $labels.instance }})
        description: "Netdata low disk space (> 80%)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
    
    - alert: NetdataPredictedDiskFull
      expr: predict_linear(netdata_disk_space_GB_average{dimension=~"avail|cached"}[3h], 24 * 3600) < 0
      for: 0m
      labels:
        severity: warning
      annotations:
        summary: Netdata predicted disk full (instance {{ $labels.instance }})
        description: "Netdata predicted disk full in 24 hours\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
    
    - alert: NetdataMDMismatchCntUnsynchronizedBlocks
      expr: netdata_md_mismatch_cnt_unsynchronized_blocks_average > 1024
      for: 2m
      labels:
        severity: warning
      annotations:
        summary: Netdata MD mismatch cnt unsynchronized blocks (instance {{ $labels.instance }})
        description: "RAID array has unsynchronized blocks\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
    
    - alert: NetdataDiskReallocatedSectors
      expr: increase(netdata_smartd_log_reallocated_sectors_count_sectors_average[1m]) > 0
      for: 0m
      labels:
        severity: info
      annotations:
        summary: Netdata disk reallocated sectors (instance {{ $labels.instance }})
        description: "Disk reallocated sectors detected ({{ $value }} sectors)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
    
    - alert: NetdataDiskCurrentPendingSector
      expr: netdata_smartd_log_current_pending_sector_count_sectors_average > 0
      for: 0m
      labels:
        severity: warning
      annotations:
        summary: Netdata disk current pending sector (instance {{ $labels.instance }})
        description: "Disk current pending sector\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
    
    - alert: NetdataReportedUncorrectableDiskSectors
      expr: increase(netdata_smartd_log_offline_uncorrectable_sector_count_sectors_average[2m]) > 0
      for: 0m
      labels:
        severity: warning
      annotations:
        summary: Netdata reported uncorrectable disk sectors (instance {{ $labels.instance }})
        description: "Reported uncorrectable disk sectors ({{ $value }} sectors)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

1.10. Embedded exporter (9 rules)

wget https://raw.githubusercontent.com/samber/awesome-prometheus-alerts/refs/heads/master/dist/rules/netdata/embedded-exporter.yml
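Once downloaded, the rule file has to be referenced from the Prometheus configuration, and Prometheus has to scrape Netdata so that the `netdata_*_average` series exist. The following is a minimal sketch, not a definitive setup: the file path, job name, and target address are assumptions, and the endpoint and parameters reflect Netdata's usual Prometheus export on its default port.

```yaml
# prometheus.yml (sketch; path, job name, and target are assumptions)
rule_files:
  - /etc/prometheus/rules/embedded-exporter.yml  # wherever the downloaded file was saved

scrape_configs:
  - job_name: netdata                 # hypothetical job name
    metrics_path: /api/v1/allmetrics  # Netdata's metrics endpoint
    params:
      format: [prometheus]
      source: [average]               # yields the _average suffix these rules expect
    honor_labels: true
    static_configs:
      - targets: ["localhost:19999"]  # Netdata's default port; adjust per host
```

Running `promtool check rules embedded-exporter.yml` before reloading Prometheus is a cheap way to catch YAML or PromQL syntax errors.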

warning

1.10.1. Netdata high cpu usage

Netdata high CPU usage (> 80%)

# This is a gauge metric (not a counter). Checking idle < 20% means CPU usage > 80%.
- alert: NetdataHighCpuUsage
  expr: netdata_cpu_cpu_percentage_average{dimension="idle"} < 20
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Netdata high cpu usage (instance {{ $labels.instance }})
    description: "Netdata high CPU usage (> 80%)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
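Because thresholds depend on the workload, one common pattern is to keep the warning rule as is and add a stricter tier next to it. The rule below is an illustrative sketch only: the alert name, the 10% idle threshold, the 10m duration, and the `critical` severity are assumptions, not part of the published rule set.

```yaml
# Hypothetical second tier for the CPU rule; every value here is an assumption to tune.
- alert: NetdataCriticalCpuUsage
  # idle < 10% means CPU usage > 90%
  expr: netdata_cpu_cpu_percentage_average{dimension="idle"} < 10
  for: 10m
  labels:
    severity: critical
  annotations:
    summary: Netdata critical cpu usage (instance {{ $labels.instance }})
    description: "Netdata CPU usage (> 90%)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
```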

warning

1.10.2. Netdata CPU steal noisy neighbor

CPU steal is > 10%. A noisy neighbor is killing VM performance, or a spot instance may be out of credit.

- alert: NetdataCPUStealNoisyNeighbor
  expr: netdata_cpu_cpu_percentage_average{dimension="steal"} > 10
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Netdata CPU steal noisy neighbor (instance {{ $labels.instance }})
    description: "CPU steal is > 10%. A noisy neighbor is killing VM performance, or a spot instance may be out of credit.\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

warning

1.10.3. Netdata high memory usage

Netdata high memory usage (> 80%)

- alert: NetdataHighMemoryUsage
  expr: 100 / netdata_system_ram_MiB_average * netdata_system_ram_MiB_average{dimension=~"free|cached"} < 20 and netdata_system_ram_MiB_average > 0
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Netdata high memory usage (instance {{ $labels.instance }})
    description: "Netdata high memory usage (> 80%)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
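One caveat worth checking before relying on this rule: both operands of the multiplication carry the same label sets, so PromQL's default one-to-one matching appears to pair each dimension with itself, which would make the ratio evaluate to 100 rather than to a share of total RAM. A possible alternative, sketched under the assumptions that the dimensions of `netdata_system_ram_MiB_average` add up to total RAM and that `instance` identifies a host, aggregates each side first:

```yaml
# Sketch of an aggregated variant; the grouping label and dimension names are assumptions.
- alert: NetdataHighMemoryUsage
  # Free plus cached RAM as a percentage of all RAM dimensions, per instance.
  expr: |
    100 * sum by (instance) (netdata_system_ram_MiB_average{dimension=~"free|cached"})
      / sum by (instance) (netdata_system_ram_MiB_average) < 20
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Netdata high memory usage (instance {{ $labels.instance }})
    description: "Netdata high memory usage (> 80%)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
```

Evaluating both expressions in the Prometheus expression browser against real data is the quickest way to confirm which one matches your export.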

warning

1.10.4. Netdata low disk space

Netdata low disk space (> 80%)

- alert: NetdataLowDiskSpace
  expr: 100 / netdata_disk_space_GB_average * netdata_disk_space_GB_average{dimension=~"avail|cached"} < 20 and netdata_disk_space_GB_average > 0
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Netdata low disk space (instance {{ $labels.instance }})
    description: "Netdata low disk space (> 80%)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
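For disk space, the percentage has to be computed per filesystem rather than per host. The sketch below assumes that Netdata's `chart` label distinguishes mount points, that `avail` is the dimension holding free space, and that the dimensions of one chart add up to the filesystem size; verify those assumptions against your own series before using it.

```yaml
# Sketch of a per-filesystem variant; label and dimension names are assumptions.
- alert: NetdataLowDiskSpace
  # Available space as a percentage of all dimensions of the same disk chart.
  expr: |
    100 * sum by (instance, chart) (netdata_disk_space_GB_average{dimension="avail"})
      / sum by (instance, chart) (netdata_disk_space_GB_average) < 20
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Netdata low disk space (instance {{ $labels.instance }})
    description: "Netdata low disk space (> 80% used)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
```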

warning

1.10.5. Netdata predicted disk full

Netdata predicted disk full in 24 hours

- alert: NetdataPredictedDiskFull
  expr: predict_linear(netdata_disk_space_GB_average{dimension=~"avail|cached"}[3h], 24 * 3600) < 0
  for: 0m
  labels:
    severity: warning
  annotations:
    summary: Netdata predicted disk full (instance {{ $labels.instance }})
    description: "Netdata predicted disk full in 24 hours\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
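`predict_linear` fits a straight line through the samples in the range (here the last 3 hours) and extrapolates it the given number of seconds ahead (24 * 3600, i.e. 24 hours). With `for: 0m`, a short burst of writes can trigger the alert immediately. The variant below is a sketch that waits for the trend to persist; the 15m duration and the restriction to the `avail` dimension are assumptions.

```yaml
# Sketch of a less noisy variant; the for duration and dimension filter are assumptions.
- alert: NetdataPredictedDiskFull
  # Linear extrapolation of available space 24 hours ahead, based on the last 3 hours.
  expr: predict_linear(netdata_disk_space_GB_average{dimension="avail"}[3h], 24 * 3600) < 0
  for: 15m
  labels:
    severity: warning
  annotations:
    summary: Netdata predicted disk full (instance {{ $labels.instance }})
    description: "Netdata predicted disk full in 24 hours\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
```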

warning

1.10.6. Netdata MD mismatch cnt unsynchronized blocks

RAID array has unsynchronized blocks

- alert: NetdataMDMismatchCntUnsynchronizedBlocks
  expr: netdata_md_mismatch_cnt_unsynchronized_blocks_average > 1024
  for: 2m
  labels:
    severity: warning
  annotations:
    summary: Netdata MD mismatch cnt unsynchronized blocks (instance {{ $labels.instance }})
    description: "RAID array has unsynchronized blocks\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

info

1.10.7. Netdata disk reallocated sectors

Disk reallocated sectors detected ({{ $value }} sectors)

- alert: NetdataDiskReallocatedSectors
  expr: increase(netdata_smartd_log_reallocated_sectors_count_sectors_average[1m]) > 0
  for: 0m
  labels:
    severity: info
  annotations:
    summary: Netdata disk reallocated sectors (instance {{ $labels.instance }})
    description: "Disk reallocated sectors detected ({{ $value }} sectors)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
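`increase()` needs at least two samples inside its range to return a value, so a `[1m]` window can come back empty when the scrape interval is 30 seconds or longer. The sketch below widens the window; the 10m value is an assumption and should be several times your scrape interval. The trade-off is that the alert keeps firing until the change has left the window.

```yaml
# Sketch with a wider range; the 10m window is an assumption tied to the scrape interval.
- alert: NetdataDiskReallocatedSectors
  expr: increase(netdata_smartd_log_reallocated_sectors_count_sectors_average[10m]) > 0
  for: 0m
  labels:
    severity: info
  annotations:
    summary: Netdata disk reallocated sectors (instance {{ $labels.instance }})
    description: "Disk reallocated sectors detected ({{ $value }} sectors)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
```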

warning

1.10.8. Netdata disk current pending sector

Disk current pending sector

- alert: NetdataDiskCurrentPendingSector
  expr: netdata_smartd_log_current_pending_sector_count_sectors_average > 0
  for: 0m
  labels:
    severity: warning
  annotations:
    summary: Netdata disk current pending sector (instance {{ $labels.instance }})
    description: "Disk current pending sector\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

warning

1.10.9. Netdata reported uncorrectable disk sectors

Reported uncorrectable disk sectors ({{ $value }} sectors)

- alert: NetdataReportedUncorrectableDiskSectors
  expr: increase(netdata_smartd_log_offline_uncorrectable_sector_count_sectors_average[2m]) > 0
  for: 0m
  labels:
    severity: warning
  annotations:
    summary: Netdata reported uncorrectable disk sectors (instance {{ $labels.instance }})
    description: "Reported uncorrectable disk sectors ({{ $value }} sectors)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
