Health checks for multi-site deployments

Validating the health of a multi-site deployment

When running the Multi-site deployments in a Kubernetes environment, you should automate checks to see if everything is up and running as expected.

This page provides an overview of URLs, Kubernetes resources, and Healthcheck endpoints available to verify a multi-site setup of Keycloak.

Overview

A proactive monitoring strategy aims to detect and alert about issues before they impact users. This strategy is the key for a highly resilient and highly available Keycloak application.

Health checks across various architectural components (such as application health, load balancing, caching, and overall system status) are critical for:

Ensuring high availability

Verifying that all sites and the load balancer are operational is a key to ensure that a system can handle requests even if one site goes down.

Maintaining performance

Checking the health and distribution of the Infinispan cache ensures that Keycloak can maintain optimal performance by efficiently handling sessions and other temporary data.

Operational resilience

By continuously monitoring the health of both Keycloak and its dependencies within the Kubernetes environment, the system can quickly identify and possibly auto-remediate issues, reducing downtime.

Prerequisites

  1. Kubectl CLI is installed and configured.

  2. Install jq if it is not already installed on your operating system.

Specific health checks

Keycloak load balancer and sites

Verifies the health of the Keycloak application through its load balancer and both primary and backup sites. This ensures that Keycloak is accessible and that the load balancing mechanism is functioning correctly across different geographical or network locations.

This command returns the health status of the Keycloak application’s connection to its configured database, thus confirming the reliability of database connections. This command is available only on the management port and not from the external URL. In a Kubernetes setup, the sub-status health/ready is checked periodically to make the Pod as ready.

curl -s https://keycloak:managementport/health

This command verifies the lb-check endpoint of the load balancer and ensures the Keycloak application cluster is up and running.

curl -s https://keycloak-load-balancer-url/lb-check

These commands will return the running status of the Site A and Site B of the Keycloak in a multi-site setup.

curl -s https://keycloak_site_a_url/lb-check
curl -s https://keycloak_site_b_url/lb-check

Infinispan Cache health

Check the health of the default cache manager and individual caches in an external Infinispan cluster. This check is vital for Keycloak performance and reliability, as Infinispan is often used for distributed caching and session clustering in Keycloak deployments.

This command returns the overall health of the Infinispan cache manager, which is useful as the Admin user does not need to provide user credentials to get the health status.

curl -s https://infinispan_rest_url/rest/v2/cache-managers/default/health/status

In contrast to the preceding health checks, the following health checks require the Admin user to provide the Infinispan user credentials as part of the request to peek into the overall health of the external Infinispan cluster caches.

curl -u <infinispan_user>:<infinispan_pwd> -s https://infinispan_rest_url/rest/v2/cache-managers/default/health \
 | jq 'if .cluster_health.health_status == "HEALTHY" and (all(.cache_health[].status; . == "HEALTHY")) then "HEALTHY" else "UNHEALTHY" end'

The jq filter is a convenience to compute the overall health based on the individual cache health. You can also choose to run the above command without the jq filter to see the full details.

Infinispan Cluster distribution

Assesses the distribution health of the Infinispan cluster, ensuring that the cluster’s nodes are correctly distributing data. This step is essential for the scalability and fault tolerance of the caching layer.

You can modify the expectedCount 3 argument to match the total nodes in the cluster and validate if they are healthy or not.

curl <infinispan_user>:<infinispan_pwd> -s https://infinispan_rest_url/rest/v2/cluster\?action\=distribution \
 | jq --argjson expectedCount 3 'if map(select(.node_addresses | length > 0)) | length == $expectedCount then "HEALTHY" else "UNHEALTHY" end'

Overall, Infinispan system health

Uses the kubectl CLI tool to query the health status of Infinispan clusters and the Keycloak service in the specified namespace. This comprehensive check ensures that all components of the Keycloak deployment are operational and correctly configured within the Kubernetes environment.

kubectl get infinispan -n <NAMESPACE> -o json  \
| jq '.items[].status.conditions' \
| jq 'map({(.type): .status})' \
| jq 'reduce .[] as $item ([]; . + [keys[] | select($item[.] != "True")]) | if length == 0 then "HEALTHY" else "UNHEALTHY: " + (join(", ")) end'

Keycloak readiness in Kubernetes

Specifically, checks for the readiness and rolling update conditions of Keycloak deployments in Kubernetes, ensuring that the Keycloak instances are fully operational and not undergoing updates that could impact availability.

kubectl wait --for=condition=Ready --timeout=10s keycloaks.k8s.keycloak.org/keycloak -n <NAMESPACE>
kubectl wait --for=condition=RollingUpdate=False --timeout=10s keycloaks.k8s.keycloak.org/keycloak -n <NAMESPACE>
On this page