Multi-site deployments

Connect multiple Keycloak deployments in different sites to increase the overall availability

Keycloak supports deployments consisting of multiple Keycloak instances that connect to each other through their Infinispan caches; load balancers can distribute the load evenly across those instances. Those setups are intended for a transparent network on a single site.

The Keycloak high-availability guide goes one step further to describe setups across multiple sites. While such a setup adds complexity, the additional availability may be needed for some environments.

When to use a multi-site setup

The multi-site deployment capabilities of Keycloak are targeted at use cases that:

  • Are constrained to a single AWS Region or an equivalent low-latency setup.

  • Permit planned outages for maintenance.

  • Fit within a defined user and request count.

  • Can accept the impact of periodic outages.

Tested configuration

We regularly test Keycloak with the following configuration:

  • Two OpenShift single-AZ clusters in the same AWS Region

    • Provisioned with Red Hat OpenShift Service on AWS (ROSA), using ROSA HCP.

    • Each OpenShift cluster has all its workers in a single Availability Zone.

    • OpenShift version 4.16.

  • Amazon Aurora PostgreSQL database

    • High availability with a primary DB instance in one Availability Zone, and a synchronously replicated reader in the second Availability Zone

    • Version 16.1

  • AWS Global Accelerator, sending traffic to both ROSA clusters

  • AWS Lambda triggered by ROSA’s Prometheus and Alert Manager to automate failover

While equivalent setups should work, you will need to verify the performance and failure behavior of your environment. We provide functional tests, failure tests and load tests in the Keycloak Benchmark Project.

Read more on each item in the Building blocks multi-site deployments guide.

Tested load

We regularly test Keycloak with the following load:

  • 100,000 users

  • 300 requests per second

While we did not see a hard limit in our tests with these values, we ask you to test for higher volumes with horizontally and vertically scaled Keycloak instances and databases.

See the Concepts for sizing CPU and memory resources guide for more information.
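As a rough illustration of the tested load above, the following Python sketch compares a planned workload against the tested values. The function name and the planned figures are hypothetical; only the 100,000 users and 300 requests per second come from this guide. It is a planning aid, not a substitute for running your own load tests.

```python
# Tested values taken from this guide; everything else is illustrative.
TESTED_USERS = 100_000
TESTED_REQUESTS_PER_SECOND = 300


def exceeds_tested_load(users: int, requests_per_second: float) -> bool:
    """Return True if the planned workload is above either tested value."""
    return (
        users > TESTED_USERS
        or requests_per_second > TESTED_REQUESTS_PER_SECOND
    )


if __name__ == "__main__":
    # Hypothetical planned workload, not a recommendation.
    planned_users = 250_000
    planned_rps = 120

    if exceeds_tested_load(planned_users, planned_rps):
        print("Above the tested load: run your own load tests first.")
    else:
        print("Within the tested load.")
```

A workload that exceeds either value is not necessarily unsupported; it only means the regularly tested configuration does not cover it.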

Limitations

Even with the additional redundancy of the two sites, downtimes can still occur:

  • During upgrades of Keycloak or Infinispan, both sites need to be taken offline for the duration of the upgrade.

  • During certain failure scenarios, there may be downtime of up to 5 minutes.

  • After certain failure scenarios, manual intervention may be required to restore redundancy by bringing the failed site back online.

  • During certain switchover scenarios, there may be downtime of up to 5 minutes.

For more details on limitations, see the Concepts for multi-site deployments guide.
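To put the downtime figures in perspective, a back-of-the-envelope calculation can convert a number of incidents into a yearly availability percentage. In the Python sketch below, the incident count is a made-up assumption; only the 5-minute upper bound comes from the list above. Downtime for upgrades is not included, because this guide does not state its duration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def availability_percent(incidents_per_year: int,
                         minutes_per_incident: float = 5.0) -> float:
    """Yearly availability if every incident causes the given downtime.

    The 5-minute default is the upper bound named in this guide; the
    number of incidents per year is an assumption supplied by the caller.
    """
    downtime_minutes = incidents_per_year * minutes_per_incident
    return 100.0 * (1 - downtime_minutes / MINUTES_PER_YEAR)


if __name__ == "__main__":
    # Hypothetical: four failure or switchover events in one year.
    print(f"{availability_percent(4):.4f} %")
```

Because 5 minutes is an upper bound for certain scenarios only, treat the result as an estimate for planning discussions rather than a guaranteed service level.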

Next steps

The different guides introduce the necessary concepts and building blocks. For each building block, a blueprint shows how to set up a fully functional example. Additional performance tuning and security hardening are still recommended when preparing a production setup.
