⚠️ Alert thresholds depend on the nature of your applications. Some queries may have arbitrary tolerance thresholds. Building an efficient monitoring platform takes time. 😉

Netdata Prometheus Alert Rules

9 Prometheus alerting rules for Netdata, using metrics exported by the embedded exporter. These rules cover critical and warning conditions. Copy and paste the YAML into your Prometheus configuration.
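Prometheus reads alerting rules from rule files in which every rule sits under a `groups` list, so a pasted rule needs that wrapper. A minimal sketch, assuming a group name of `netdata-embedded-exporter` (hypothetical; any unique name should work):

```yaml
groups:
  - name: netdata-embedded-exporter  # hypothetical group name
    rules:
      # One rule from this page, indented under the group's "rules" key.
      - alert: NetdataHighCpuUsage
        expr: netdata_cpu_cpu_percentage_average{dimension="idle"} < 20
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: Netdata high cpu usage (instance {{ $labels.instance }})
```

Running `promtool check rules` against the file is a quick way to catch indentation or syntax mistakes before Prometheus loads it.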

1.10. Embedded exporter (9 rules)

wget https://raw.githubusercontent.com/samber/awesome-prometheus-alerts/refs/heads/master/dist/rules/netdata/embedded-exporter.yml
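Once downloaded, the file still has to be referenced from the Prometheus configuration, and Prometheus has to scrape Netdata. The fragment below is a sketch only: the rule file path and the target host name are hypothetical, and it assumes Netdata listens on its default port 19999 and that the downloaded file is a complete rule file.

```yaml
# prometheus.yml (fragment)
rule_files:
  - /etc/prometheus/rules/embedded-exporter.yml  # hypothetical path to the downloaded file

scrape_configs:
  - job_name: netdata
    metrics_path: /api/v1/allmetrics   # Netdata's metrics endpoint
    params:
      format: [prometheus]             # ask Netdata for the Prometheus text format
    honor_labels: true
    static_configs:
      - targets: ['netdata-host:19999']  # hypothetical host, default Netdata port
```

The `_average` suffix in the metric names on this page corresponds to Netdata exporting averaged values, so the rules assume that export mode is in use.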

1.10.1. Netdata high cpu usage

Netdata high CPU usage (> 80%)

# This is a gauge metric (not a counter). Checking idle < 20% means CPU usage > 80%.
- alert: NetdataHighCpuUsage
  expr: netdata_cpu_cpu_percentage_average{dimension="idle"} < 20
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Netdata high cpu usage (instance {{ $labels.instance }})
    description: "Netdata high CPU usage (> 80%)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
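Because the rule filters on the idle dimension rather than on a usage figure, it can help to check the expression against synthetic data. The following is a sketch of a `promtool` unit test; the host names and sample values are invented for illustration.

```yaml
# cpu_test.yml (hypothetical file name), run with: promtool test rules cpu_test.yml
tests:
  - interval: 1m
    input_series:
      # Busy host: only 10% idle, i.e. 90% CPU usage.
      - series: 'netdata_cpu_cpu_percentage_average{dimension="idle", instance="host1"}'
        values: '10+0x10'
      # Quiet host: 90% idle, should be filtered out by the comparison.
      - series: 'netdata_cpu_cpu_percentage_average{dimension="idle", instance="host2"}'
        values: '90+0x10'
    promql_expr_test:
      - expr: netdata_cpu_cpu_percentage_average{dimension="idle"} < 20
        eval_time: 5m
        exp_samples:
          # Only the busy host survives the "< 20" filter.
          - labels: 'netdata_cpu_cpu_percentage_average{dimension="idle", instance="host1"}'
            value: 10
```

The test exercises the expression only, not the `for: 5m` hold period or the annotations.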

1.10.2. Netdata CPU steal noisy neighbor

CPU steal is > 10%. A noisy neighbor is degrading VM performance, or a spot instance may be out of credit.

- alert: NetdataCPUStealNoisyNeighbor
  expr: netdata_cpu_cpu_percentage_average{dimension="steal"} > 10
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Netdata CPU steal noisy neighbor (instance {{ $labels.instance }})
    description: "CPU steal is > 10%. A noisy neighbor is degrading VM performance, or a spot instance may be out of credit.\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

1.10.3. Netdata high memory usage

Netdata high memory usage (> 80%)

- alert: NetdataHighMemoryUsage
  # Free + cached RAM as a percentage of total RAM (all dimensions summed).
  expr: 100 * sum without (dimension) (netdata_system_ram_MiB_average{dimension=~"free|cached"}) / sum without (dimension) (netdata_system_ram_MiB_average) < 20
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Netdata high memory usage (instance {{ $labels.instance }})
    description: "Netdata high memory usage (> 80%)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
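The threshold compares free plus cached RAM against total RAM. Since each dimension is exported as its own series, one way to express that ratio is to sum across the `dimension` label before dividing. The sketch below checks the arithmetic with a `promtool` unit test; the host name and the MiB figures are made up, and the aggregated expression is an illustration rather than the only possible form.

```yaml
# memory_test.yml (hypothetical file name), run with: promtool test rules memory_test.yml
tests:
  - interval: 1m
    input_series:
      # Total RAM = 1000 + 500 + 6000 + 500 = 8000 MiB.
      - series: 'netdata_system_ram_MiB_average{dimension="free", instance="host1"}'
        values: '1000+0x10'
      - series: 'netdata_system_ram_MiB_average{dimension="cached", instance="host1"}'
        values: '500+0x10'
      - series: 'netdata_system_ram_MiB_average{dimension="used", instance="host1"}'
        values: '6000+0x10'
      - series: 'netdata_system_ram_MiB_average{dimension="buffers", instance="host1"}'
        values: '500+0x10'
    promql_expr_test:
      - expr: >-
          100 * sum without (dimension) (netdata_system_ram_MiB_average{dimension=~"free|cached"})
          / sum without (dimension) (netdata_system_ram_MiB_average) < 20
        eval_time: 5m
        exp_samples:
          # 100 * (1000 + 500) / 8000 = 18.75, below the 20% threshold.
          - labels: '{instance="host1"}'
            value: 18.75
```

Summing with `without (dimension)` keeps every other label, so the result stays one series per host without naming the remaining labels explicitly.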

1.10.4. Netdata low disk space

Netdata low disk space (> 80% used)

- alert: NetdataLowDiskSpace
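  # Available space as a percentage of total space, per filesystem (all dimensions summed).
  expr: 100 * sum without (dimension) (netdata_disk_space_GB_average{dimension=~"avail|cached"}) / sum without (dimension) (netdata_disk_space_GB_average) < 20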
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Netdata low disk space (instance {{ $labels.instance }})
    description: "Netdata low disk space (> 80% used)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
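For disks the ratio has to be evaluated per filesystem, not per host. The sketch below illustrates this with a `promtool` unit test over two mount points. It assumes the mount point is carried in a `family` label and that the dimensions are `avail`, `used` and `reserved_for_root`; both are assumptions about the exported series, and the figures are invented.

```yaml
# disk_test.yml (hypothetical file name), run with: promtool test rules disk_test.yml
tests:
  - interval: 1m
    input_series:
      # Root filesystem: 15 GB available out of 100 GB.
      - series: 'netdata_disk_space_GB_average{dimension="avail", family="/", instance="host1"}'
        values: '15+0x10'
      - series: 'netdata_disk_space_GB_average{dimension="used", family="/", instance="host1"}'
        values: '80+0x10'
      - series: 'netdata_disk_space_GB_average{dimension="reserved_for_root", family="/", instance="host1"}'
        values: '5+0x10'
      # Data filesystem: 60 GB available out of 100 GB, should not match.
      - series: 'netdata_disk_space_GB_average{dimension="avail", family="/data", instance="host1"}'
        values: '60+0x10'
      - series: 'netdata_disk_space_GB_average{dimension="used", family="/data", instance="host1"}'
        values: '35+0x10'
      - series: 'netdata_disk_space_GB_average{dimension="reserved_for_root", family="/data", instance="host1"}'
        values: '5+0x10'
    promql_expr_test:
      - expr: >-
          100 * sum without (dimension) (netdata_disk_space_GB_average{dimension="avail"})
          / sum without (dimension) (netdata_disk_space_GB_average) < 20
        eval_time: 5m
        exp_samples:
          # Only "/" is below 20% available: 100 * 15 / 100 = 15.
          - labels: '{family="/", instance="host1"}'
            value: 15
```

Because only `dimension` is aggregated away, each mount point keeps its own result series and a nearly full root filesystem is not masked by a mostly empty data volume.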

1.10.5. Netdata predicted disk full

Netdata predicted disk full in 24 hours

- alert: NetdataPredictedDiskFull
  expr: predict_linear(netdata_disk_space_GB_average{dimension=~"avail|cached"}[3h], 24 * 3600) < 0
  for: 0m
  labels:
    severity: warning
  annotations:
    summary: Netdata predicted disk full (instance {{ $labels.instance }})
    description: "Netdata predicted disk full in 24 hours\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

1.10.6. Netdata MD mismatch cnt unsynchronized blocks

RAID array has unsynchronized blocks

- alert: NetdataMDMismatchCntUnsynchronizedBlocks
  expr: netdata_md_mismatch_cnt_unsynchronized_blocks_average > 1024
  for: 2m
  labels:
    severity: warning
  annotations:
    summary: Netdata MD mismatch cnt unsynchronized blocks (instance {{ $labels.instance }})
    description: "RAID array has unsynchronized blocks\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

1.10.7. Netdata disk reallocated sectors

Disk reallocated sectors detected ({{ $value }} sectors)

- alert: NetdataDiskReallocatedSectors
  expr: increase(netdata_smartd_log_reallocated_sectors_count_sectors_average[1m]) > 0
  for: 0m
  labels:
    severity: info
  annotations:
    summary: Netdata disk reallocated sectors (instance {{ $labels.instance }})
    description: "Disk reallocated sectors detected ({{ $value }} sectors)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

1.10.8. Netdata disk current pending sector

Disk current pending sector

- alert: NetdataDiskCurrentPendingSector
  expr: netdata_smartd_log_current_pending_sector_count_sectors_average > 0
  for: 0m
  labels:
    severity: warning
  annotations:
    summary: Netdata disk current pending sector (instance {{ $labels.instance }})
    description: "Disk current pending sector\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

1.10.9. Netdata reported uncorrectable disk sectors

Reported uncorrectable disk sectors ({{ $value }} sectors)

- alert: NetdataReportedUncorrectableDiskSectors
  expr: increase(netdata_smartd_log_offline_uncorrectable_sector_count_sectors_average[2m]) > 0
  for: 0m
  labels:
    severity: warning
  annotations:
    summary: Netdata reported uncorrectable disk sectors (instance {{ $labels.instance }})
    description: "Reported uncorrectable disk sectors ({{ $value }} sectors)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"