Error messages and remedies for Kubernetes

This page describes common error messages and their remedies when running the Keycloak Benchmark suite on Kubernetes.

Keycloak message ERROR: Failed to obtain JDBC connection

Context

This error message can appear when running Keycloak with a relational database such as PostgreSQL or CockroachDB.

Similar error messages:

  • Unable to acquire JDBC Connection

  • Sorry, acquisition timeout!

This can happen during startup, in which case the startup fails. It can also happen during a load test when Keycloak opens new database connections.

Cause

The database is either not started, or the number of database connections is exhausted in the current setup.

Remedy
  • Ensure that the database is running.

  • Ensure that the database did not restart, for example, due to an out-of-memory problem.

  • Ensure that the total number of database connections does not exceed the maximum number of connections that the database accepts. See the Keycloak deployment configuration options KC_DB_* for details.

  • Ensure that Keycloak does not try to use more connections than the configured maximum number of connections.
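
The two connection checks above come down to simple arithmetic. The following is a minimal sketch in Python, assuming that every Keycloak Pod has its own connection pool with a fixed maximum size and that a few connections may be reserved for other clients. The function name, the parameters, and the numbers are hypothetical and are not taken from the benchmark setup.

```python
def connection_budget(pods: int, pool_max_per_pod: int,
                      db_max_connections: int, reserved: int = 0) -> int:
    """Return the remaining connection headroom of the database.

    A negative result means the Keycloak Pods together could request more
    connections than the database accepts.
    """
    if pods < 0 or pool_max_per_pod < 0 or reserved < 0:
        raise ValueError("counts must not be negative")
    return db_max_connections - reserved - pods * pool_max_per_pod


# Illustrative numbers only, not defaults of Keycloak or of any database.
headroom = connection_budget(pods=3, pool_max_per_pod=30,
                             db_max_connections=100, reserved=5)
print(headroom)  # 5
```

A negative headroom suggests either lowering the pool size per Pod or raising the connection limit of the database.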

Caution
  • Under high load, the number of database connections is usually the limiting factor of the system. Having Keycloak run into a “Sorry, acquisition timeout” and return an HTTP 5xx code to the caller is a sensible load shedding mechanism. See Running in production for details.

Keycloak message prepared transactions are disabled

Full message

org.postgresql.util.PSQLException: ERROR: prepared transactions are disabled.
Hint: Set max_prepared_transactions to a nonzero value.

Context

This happens when the transaction manager of Quarkus handles more than one transaction in a request, and one of the transactions belongs to a PostgreSQL database.

Cause

Once the transaction manager of Quarkus bundles two transactions, it first sends a PREPARE TRANSACTION command to the database and sends the commit only after all transactions have been prepared. For this to work, the database needs to have the max_prepared_transactions parameter set to a nonzero value.
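
The prepare-then-commit sequence can be illustrated with a toy model. The sketch below is not the Quarkus transaction manager and not the PostgreSQL driver. All class and function names are hypothetical, and the single-resource shortcut is an assumption of the model that mirrors the statement that the error appears only once more than one transaction is involved.

```python
class PreparedTransactionsDisabled(Exception):
    """Stands in for the PostgreSQL error quoted in this section."""


class ToyResource:
    """Hypothetical stand-in for one store taking part in a transaction."""

    def __init__(self, name: str, max_prepared_transactions: int):
        self.name = name
        self.max_prepared_transactions = max_prepared_transactions
        self.log = []

    def prepare(self) -> None:
        # A value of zero means prepared transactions are disabled.
        if self.max_prepared_transactions == 0:
            raise PreparedTransactionsDisabled(
                f"{self.name}: prepared transactions are disabled")
        self.log.append("prepare")

    def commit(self) -> None:
        self.log.append("commit")

    def rollback(self) -> None:
        self.log.append("rollback")


def commit_all(resources: list) -> str:
    """Commit directly for one resource, otherwise prepare all, then commit all."""
    if len(resources) == 1:
        # Assumption of this model: no prepare phase for a single resource.
        resources[0].commit()
        return "committed"
    try:
        for resource in resources:
            resource.prepare()
    except PreparedTransactionsDisabled:
        # One failed prepare aborts the whole bundle.
        for resource in resources:
            resource.rollback()
        return "rolled back"
    for resource in resources:
        resource.commit()
    return "committed"


database = ToyResource("postgres", max_prepared_transactions=0)
other = ToyResource("other-store", max_prepared_transactions=1)
print(commit_all([database]))         # committed
print(commit_all([database, other]))  # rolled back
```

In the model, the same database setting is harmless for a single transaction and fails the whole bundle once a second participant joins.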

Effect

No transactions against the PostgreSQL database succeed.

Remedy

Pass the parameter -c max_prepared_transactions=xxx to the database, where xxx is a nonzero value. For the containerized database in the Kubernetes setup, this has been configured in postgres-deployment.yaml.

Keycloak message ARJUNA012225: cannot access root of object store

Full message

ARJUNA012225: FileSystemStore::setupStore - cannot access root of object store: ObjectStore/ShadowNoFileLockStore/defaultStore/

Context

This happens when the transaction manager of Quarkus handles more than one transaction in a request and attempts to locally persist the state of the transaction.

Cause

The working directory of Keycloak is not writable in the Keycloak container, so writing the transaction state fails.

Effect

No transaction against any store participating in the JTA transaction completes, so Keycloak does not start.

Remedy

Set the environment variable QUARKUS_TRANSACTION_MANAGER_OBJECT_STORE_DIRECTORY to a folder that is writable. This has been fixed upstream for Keycloak 21.1 in keycloak#19384.

Keycloak message org.jgroups.util.ThreadPool: thread pool is full

Full message

org.jgroups.util.ThreadPool: thread pool is full (max=xx, active=xx); thread dump (dumped once, until thread_dump is reset)

Context

This happens when the thread pool in JGroups runs out of threads. For this message to appear with the default JGroups configuration, the system property jgroups.thread_dumps_threshold needs to be set to 1, as otherwise the message will appear only after 10000 threads have been rejected.

Cause

Each Keycloak Pod has executor threads that run the RESTEasy requests. While those can request data from the local caches directly (either the primary or the backup ones), they need to call remote caches through JGroups. While some of these requests might get bundled, each Keycloak request that calls out remotely might need one JGroups thread.

Effect

When JGroups runs out of threads in its thread pool, it rejects and discards those remote calls it is about to enqueue. This will lead to longer processing times and eventually to timeouts on Infinispan where JGroups is the underlying transport.

Remedy

The number of executor threads in all Keycloak Pods in a cluster combined should not exceed the maximum size of the JGroups thread pool.

With ISPN-14780 being fixed in Keycloak 22.0.2, the number of JGroups threads is 200 by default and can be configured using the Java system property jgroups.thread_pool.max_threads. As shown in experiments, assuming a Keycloak cluster with 4 Pods, each Pod shouldn’t have more than 50 worker threads so that it doesn’t run out of threads in the JGroups thread pool of 200. Use the Quarkus configuration option quarkus.thread-pool.max-threads to configure the maximum number of worker threads.
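
The sizing rule from this section can be written down as a small calculation. This is a minimal sketch in Python, assuming the worker threads are spread evenly across the Pods. The function name is hypothetical, and the numbers repeat the example of 200 JGroups threads and 4 Pods.

```python
def max_worker_threads_per_pod(jgroups_max_threads: int, pods: int) -> int:
    """Largest worker thread count per Pod that keeps the cluster-wide
    total within the size of the JGroups thread pool."""
    if pods < 1:
        raise ValueError("a cluster needs at least one Pod")
    if jgroups_max_threads < 0:
        raise ValueError("thread pool size must not be negative")
    # Integer division rounds down, so the total never exceeds the pool size.
    return jgroups_max_threads // pods


print(max_worker_threads_per_pod(jgroups_max_threads=200, pods=4))  # 50
```

The result is the kind of value that would then be passed to quarkus.thread-pool.max-threads. Scaling the cluster to more Pods lowers the limit per Pod unless the JGroups thread pool is enlarged as well.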

See Running in production for more details.