Simulate Keycloak Site Failover

Prerequisites:

  • A Keycloak instance replicated across two OpenShift clusters with Infinispan cross-site replication (xsite) and an Aurora DB

  • Realm, user and client exist with the values required by the benchmark CLI command

Running the failure test from the CLI

Preparations

  • Extract the keycloak-benchmark-${version}.[zip|tar.gz] file

  • Follow the Preparing Keycloak for testing guide

  • Make sure your local KUBECONFIG is set to the OpenShift cluster that you want to fail

Parameters

The failover script requires the following environment variables to be set: FAILOVER_MODE and DOMAIN.

The FAILOVER_MODE determines the type of failover that is initiated by the script and can be one of the following values:

FAILOVER_MODE Description

HEALTH_PROBE

Deletes the Keycloak aws-health-route Route so that Route53 will eventually fail over.

KEYCLOAK_ROUTES

Deletes all Keycloak routes so that Route53 will eventually fail over, but requests to the old DNS IP addresses will fail. The Keycloak Operator is scaled down to 0 pods to prevent the Keycloak Ingress from being recreated.

CLUSTER_FAIL

Deletes all Keycloak and Infinispan pods with no grace period and removes the associated StatefulSet. Both operators are scaled down to prevent the removed resources from being recreated.

GOSSIP_ROUTER_FAIL

Deletes the Infinispan Gossip Router pod with no grace period and removes the associated Deployment. The Infinispan operator is scaled down to prevent the removed resources from being recreated.

See below for a description of the other environment variables that can be configured.

DOMAIN

Required. The Route53 domain hosting the client., primary. and backup. subdomains.

FAILOVER_DELAY

Optional. The delay in seconds to wait before initiating cluster failover. Defaults to 60 seconds.
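
The parameter rules above can be checked before the script is started. The following is a minimal sketch of such a pre-flight check. The function name validate_failover_env is hypothetical and is not part of kc-failover.sh; it only encodes the values documented in this section.

```shell
# Hypothetical pre-flight check; NOT part of kc-failover.sh.
# It only mirrors the parameters documented in this guide.
validate_failover_env() {
  # DOMAIN is required.
  if [ -z "${DOMAIN:-}" ]; then
    echo "DOMAIN must be set" >&2
    return 1
  fi

  # FAILOVER_MODE must be one of the documented values.
  case "${FAILOVER_MODE:-}" in
    HEALTH_PROBE|KEYCLOAK_ROUTES|CLUSTER_FAIL|GOSSIP_ROUTER_FAIL) ;;
    *)
      echo "FAILOVER_MODE must be one of: HEALTH_PROBE, KEYCLOAK_ROUTES, CLUSTER_FAIL, GOSSIP_ROUTER_FAIL" >&2
      return 1
      ;;
  esac

  # FAILOVER_DELAY is optional; the documented default is 60 seconds.
  _delay="${FAILOVER_DELAY:-60}"
  case "$_delay" in
    ''|*[!0-9]*)
      echo "FAILOVER_DELAY must be a non-negative integer number of seconds" >&2
      return 1
      ;;
  esac

  echo "mode=${FAILOVER_MODE} domain=${DOMAIN} delay=${_delay}s"
}
```

With the variables exported, the check could be chained in front of the script, for example validate_failover_env && ./kc-failover.sh.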

Execution

Use the Running benchmarks from the CLI guide to simulate load against a specific Kubernetes environment.

In parallel, execute the following command to initiate the failover:

FAILOVER_MODE="KEYCLOAK_ROUTES" DOMAIN=... ./kc-failover.sh

For the kc-failover.sh script to accurately record the time taken for the Route53 failover to occur, it's recommended to execute the script in the same environment as the Keycloak benchmark scenario.
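
To illustrate why the execution environment matters, the sketch below times a DNS failover by polling a hostname until its resolved addresses change. This is an assumption about one way to take such a measurement, not a description of how kc-failover.sh is implemented. The helper names resolve_ips and measure_dns_failover and the RESOLVER override are hypothetical, and the default resolver assumes that dig is installed. Because the answer depends on the resolver and caches seen by the machine running the loop, a measurement taken elsewhere may not match what the benchmark clients observe.

```shell
# Hypothetical helpers; an illustration only, not the logic of kc-failover.sh.

# Resolve a hostname to a sorted, space-separated list of answers.
# Assumes `dig` is available on the machine running the check.
resolve_ips() {
  dig +short "$1" | sort | tr '\n' ' '
}

# Poll HOST every INTERVAL seconds until its resolved addresses differ from
# the ones seen at the start, then print the elapsed time in seconds.
# RESOLVER can be set to another command or function name; it defaults to
# resolve_ips.
measure_dns_failover() {
  _host="$1"
  _interval="${2:-1}"
  _resolver="${RESOLVER:-resolve_ips}"
  _start=$(date +%s)
  _initial=$("$_resolver" "$_host")
  while :; do
    _current=$("$_resolver" "$_host")
    # Ignore empty answers so a transient lookup failure is not counted.
    if [ -n "$_current" ] && [ "$_current" != "$_initial" ]; then
      break
    fi
    sleep "$_interval"
  done
  echo $(( $(date +%s) - _start ))
}
```

A call such as measure_dns_failover "client.${DOMAIN}" would be started shortly before the failover is triggered and blocks until the DNS answer changes.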

Restoring clusters after failover tests

Once a failover benchmark has been executed, the original cluster state can be restored by executing the script with the RECOVERY_MODE environment variable set. The value of RECOVERY_MODE determines the subdomain that is used to recreate the aws-health-route Route.

Parameters

RECOVERY_MODE Description

ACTIVE

Recreates the aws-health-route Route with the primary.${DOMAIN} URL and scales up the Infinispan and Keycloak operators.

PASSIVE

Recreates the aws-health-route Route with the backup.${DOMAIN} URL and scales up the Infinispan and Keycloak operators.

DOMAIN

Required. The Route53 domain hosting the client., primary. and backup. subdomains.
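
The relationship between RECOVERY_MODE and the subdomain used for the aws-health-route Route can be summarised as a small lookup. The sketch below only restates the table above. The function name health_route_host is hypothetical and does not appear in kc-failover.sh.

```shell
# Hypothetical helper that restates the RECOVERY_MODE table;
# NOT part of kc-failover.sh.
health_route_host() {
  _mode="$1"
  _domain="$2"
  case "$_mode" in
    ACTIVE)  echo "primary.${_domain}" ;;  # route recreated on the primary subdomain
    PASSIVE) echo "backup.${_domain}" ;;   # route recreated on the backup subdomain
    *)
      echo "unknown RECOVERY_MODE: ${_mode}" >&2
      return 1
      ;;
  esac
}
```

For example, health_route_host ACTIVE example.com prints primary.example.com, where example.com is a placeholder domain.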

Execution

RECOVERY_MODE=ACTIVE DOMAIN=... ./kc-failover.sh