⚠️ Alert thresholds depend on the nature of your applications. Some queries may have arbitrary tolerance thresholds. Building an efficient monitoring platform takes time. 😉

Keycloak Prometheus Alert Rules

Six Prometheus alerting rules for Keycloak, built on the metrics exported by aerogear/keycloak-metrics-spi. The rules cover critical and warning conditions. Copy the YAML into a rules file that your Prometheus configuration loads, or download the complete file:

wget https://raw.githubusercontent.com/samber/awesome-prometheus-alerts/refs/heads/master/dist/rules/keycloak/aerogear-keycloak-metrics-spi.yml
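
The snippets on this page start at `- alert:`, so when copying a single rule rather than downloading the whole file, it needs to sit inside a rule group. A minimal sketch of that wrapper follows; the group name `keycloak` is a placeholder chosen for illustration, and the rule shown is the slow-request rule from this page without its annotations.

```yaml
groups:
  - name: keycloak   # hypothetical group name, pick your own
    rules:
      # Paste the "- alert:" entries from this page at this indentation level.
      - alert: KeycloakSlowRequestResponseTime
        expr: sum by (method) (rate(keycloak_request_duration_sum[5m])) / sum by (method) (rate(keycloak_request_duration_count[5m])) > 2000 and sum by (method) (rate(keycloak_request_duration_count[5m])) > 0
        for: 5m
        labels:
          severity: warning
```

Prometheus picks the file up once its path is listed under `rule_files` in `prometheus.yml`. Running `promtool check rules <file>` before reloading catches syntax errors in the YAML or the expressions.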
warning

9.8.1. Keycloak high login failure rate

More than 5% of login attempts are failing in realm {{ $labels.realm }} (current value: {{ $value | printf "%.1f" }}%).

  # Threshold of 5% is a rough default. Adjust based on your user base and expected error rates.
  # A spike in failed logins may indicate a brute-force attack or misconfigured client.
- alert: KeycloakHighLoginFailureRate
  expr: (sum by (realm) (rate(keycloak_failed_login_attempts_total[5m])) / (sum by (realm) (rate(keycloak_logins_total[5m])) + sum by (realm) (rate(keycloak_failed_login_attempts_total[5m])))) * 100 > 5 and (sum by (realm) (rate(keycloak_logins_total[5m])) + sum by (realm) (rate(keycloak_failed_login_attempts_total[5m]))) > 0
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Keycloak high login failure rate (realm {{ $labels.realm }})
    description: "More than 5% of login attempts are failing in realm {{ $labels.realm }} (current value: {{ $value | printf \"%.1f\" }}%).\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
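
The failure-rate expression repeats the same sums several times. One way to shorten it, sketched here under assumptions, is to precompute the ratio with a recording rule and alert on the recorded series. The record name `realm:keycloak_login_failures:ratio_rate5m` and the group name are hypothetical. The explicit `> 0` guard is dropped because a division by zero yields NaN, and NaN never satisfies `> 5`.

```yaml
groups:
  - name: keycloak-login-ratio   # hypothetical group name
    rules:
      # Per-realm share of failed logins over 5 minutes, as a 0-1 ratio.
      - record: realm:keycloak_login_failures:ratio_rate5m   # hypothetical record name
        expr: |
          sum by (realm) (rate(keycloak_failed_login_attempts_total[5m]))
          /
          (
            sum by (realm) (rate(keycloak_logins_total[5m]))
            + sum by (realm) (rate(keycloak_failed_login_attempts_total[5m]))
          )
      # Rules in a group are evaluated in order, so the alert reads the fresh ratio.
      - alert: KeycloakHighLoginFailureRate
        expr: realm:keycloak_login_failures:ratio_rate5m * 100 > 5
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: Keycloak high login failure rate (realm {{ $labels.realm }})
```

The recorded series can also be graphed directly, which helps when tuning the 5% threshold against historical data.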
critical

9.8.2. Keycloak no successful logins

No successful logins in realm {{ $labels.realm }} for the last 15 minutes.

  # Only fires when login attempts exist but none succeed — may indicate an authentication outage.
- alert: KeycloakNoSuccessfulLogins
  expr: sum by (realm) (rate(keycloak_logins_total[15m])) == 0 and (sum by (realm) (rate(keycloak_logins_total[15m])) + sum by (realm) (rate(keycloak_failed_login_attempts_total[15m]))) > 0
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: Keycloak no successful logins (realm {{ $labels.realm }})
    description: "No successful logins in realm {{ $labels.realm }} for the last 15 minutes.\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
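
The `== 0` comparison can only match realms for which a `keycloak_logins_total` series exists. If the exporter creates that counter only after the first successful login (an assumption worth checking against your own `/metrics` output), a realm that has never had a successful login would go unnoticed. A possible variant, with a hypothetical alert name, uses `unless` so that a missing series counts as "no successes":

```yaml
- alert: KeycloakOnlyFailedLogins   # hypothetical name
  # Realms with failed logins, minus realms that also had successful logins.
  expr: |
    sum by (realm) (rate(keycloak_failed_login_attempts_total[15m])) > 0
    unless
    sum by (realm) (rate(keycloak_logins_total[15m])) > 0
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: Keycloak only failed logins (realm {{ $labels.realm }})
    description: "Realm {{ $labels.realm }} saw failed logins but no successful ones in the last 15 minutes.\n  VALUE = {{ $value }}"
```

In this variant `$value` is the per-second rate of failed logins, not zero, since the left-hand side of `unless` supplies the sample.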
warning

9.8.3. Keycloak high token refresh error rate

More than 10% of token refresh attempts are failing in realm {{ $labels.realm }} (current value: {{ $value | printf "%.1f" }}%).

  # Threshold of 10% is a rough default. High refresh token errors may indicate expired sessions or token store issues.
- alert: KeycloakHighTokenRefreshErrorRate
  expr: (sum by (realm) (rate(keycloak_refresh_tokens_errors_total[5m])) / sum by (realm) (rate(keycloak_refresh_tokens_total[5m]))) * 100 > 10 and sum by (realm) (rate(keycloak_refresh_tokens_total[5m])) > 0
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Keycloak high token refresh error rate (realm {{ $labels.realm }})
    description: "More than 10% of token refresh attempts are failing in realm {{ $labels.realm }} (current value: {{ $value | printf \"%.1f\" }}%).\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
warning

9.8.4. Keycloak high code-to-token exchange error rate

More than 10% of code-to-token exchanges are failing in realm {{ $labels.realm }} (current value: {{ $value | printf "%.1f" }}%).

  # Threshold of 10% is a rough default. Code-to-token failures may indicate misconfigured OAuth clients or replay attacks.
- alert: KeycloakHighCodeToTokenExchangeErrorRate
  expr: (sum by (realm) (rate(keycloak_code_to_tokens_errors_total[5m])) / sum by (realm) (rate(keycloak_code_to_tokens_total[5m]))) * 100 > 10 and sum by (realm) (rate(keycloak_code_to_tokens_total[5m])) > 0
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Keycloak high code-to-token exchange error rate (realm {{ $labels.realm }})
    description: "More than 10% of code-to-token exchanges are failing in realm {{ $labels.realm }} (current value: {{ $value | printf \"%.1f\" }}%).\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
warning

9.8.5. Keycloak high registration failure rate

More than 10% of registration attempts are failing in realm {{ $labels.realm }} (current value: {{ $value | printf "%.1f" }}%).

  # Threshold of 10% is a rough default.
- alert: KeycloakHighRegistrationFailureRate
  expr: (sum by (realm) (rate(keycloak_registrations_errors_total[5m])) / sum by (realm) (rate(keycloak_registrations_total[5m]))) * 100 > 10 and sum by (realm) (rate(keycloak_registrations_total[5m])) > 0
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Keycloak high registration failure rate (realm {{ $labels.realm }})
    description: "More than 10% of registration attempts are failing in realm {{ $labels.realm }} (current value: {{ $value | printf \"%.1f\" }}%).\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
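
Registrations are often low-volume, so a percentage threshold alone can fire on a single failed attempt. A sketch of a volume guard is shown below: the `> 0` check is replaced by a minimum increase of the `keycloak_registrations_total` counter over the window. The value 20 is an arbitrary placeholder, not a recommendation.

```yaml
- alert: KeycloakHighRegistrationFailureRate
  expr: |
    (
      sum by (realm) (rate(keycloak_registrations_errors_total[5m]))
      / sum by (realm) (rate(keycloak_registrations_total[5m]))
    ) * 100 > 10
    and
    sum by (realm) (increase(keycloak_registrations_total[5m])) >= 20   # arbitrary minimum volume
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Keycloak high registration failure rate (realm {{ $labels.realm }})
```

The same guard pattern could be applied to the token refresh and code-to-token rules if they turn out to be noisy during quiet hours.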
warning

9.8.6. Keycloak slow request response time

Keycloak {{ $labels.method }} requests are taking more than 2 seconds on average.

  # keycloak_request_duration is in milliseconds. Threshold of 2000ms (2 seconds) is a rough default.
- alert: KeycloakSlowRequestResponseTime
  expr: sum by (method) (rate(keycloak_request_duration_sum[5m])) / sum by (method) (rate(keycloak_request_duration_count[5m])) > 2000 and sum by (method) (rate(keycloak_request_duration_count[5m])) > 0
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Keycloak slow request response time (method {{ $labels.method }})
    description: "Keycloak {{ $labels.method }} requests are taking more than 2 seconds on average.\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
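
An average can hide a slow tail. If the exporter also exposes `keycloak_request_duration_bucket` series (an assumption; the rule above only relies on `_sum` and `_count`), a percentile-based variant is possible. The alert name below is hypothetical and the 2000 ms threshold is carried over unchanged.

```yaml
- alert: KeycloakSlowRequestP95   # hypothetical name
  # 95th percentile of request duration per HTTP method, in milliseconds.
  expr: |
    histogram_quantile(
      0.95,
      sum by (method, le) (rate(keycloak_request_duration_bucket[5m]))
    ) > 2000
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Keycloak slow p95 response time (method {{ $labels.method }})
    description: "95th percentile of {{ $labels.method }} request duration is above 2000 ms.\n  VALUE = {{ $value }}"
```

The accuracy of the estimate depends on the bucket boundaries the exporter defines, so a threshold that falls between two wide buckets will be approximate.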