Running Keycloak in Docker for development is straightforward. Running it in production requires careful configuration of database pooling, reverse proxy headers, JVM tuning, health checks, and security hardening. This guide provides copy-paste Docker Compose configurations for Keycloak 26.x that are production-ready.
Clone the companion repo: All configurations from this guide are available as a ready-to-run project at IAMDevBox/keycloak-docker-production. Clone it, copy .env.example to .env, set your passwords, and run docker compose up -d.
Single-Node Production Setup
This is the recommended starting point. A single Keycloak instance with PostgreSQL handles thousands of concurrent users.
Docker Compose
services:
postgres:
image: postgres:16-alpine
container_name: keycloak-postgres
restart: unless-stopped
volumes:
- pgdata:/var/lib/postgresql/data
environment:
POSTGRES_DB: keycloak
POSTGRES_USER: keycloak
POSTGRES_PASSWORD: ${DB_PASSWORD}
networks:
- internal
healthcheck:
test: ["CMD-SHELL", "pg_isready -U keycloak -d keycloak"]
interval: 10s
timeout: 5s
retries: 5
start_period: 30s
deploy:
resources:
limits:
memory: 1G
cpus: "1.0"
keycloak:
image: quay.io/keycloak/keycloak:26.1
container_name: keycloak
command: start  # auto-builds on start; pre-build (see "Optimized Build" below) before using --optimized
restart: unless-stopped
depends_on:
postgres:
condition: service_healthy
environment:
# Database
KC_DB: postgres
KC_DB_URL: jdbc:postgresql://postgres:5432/keycloak
KC_DB_USERNAME: keycloak
KC_DB_PASSWORD: ${DB_PASSWORD}
KC_DB_POOL_INITIAL_SIZE: 25
KC_DB_POOL_MIN_SIZE: 25
KC_DB_POOL_MAX_SIZE: 25
# Hostname (TLS terminates at reverse proxy)
KC_HOSTNAME: https://auth.example.com
KC_HTTP_ENABLED: "true"
KC_PROXY_HEADERS: xforwarded
# Observability
KC_HEALTH_ENABLED: "true"
KC_METRICS_ENABLED: "true"
# Admin bootstrap (first run only)
KC_BOOTSTRAP_ADMIN_USERNAME: admin
KC_BOOTSTRAP_ADMIN_PASSWORD: ${ADMIN_PASSWORD}
# Single-node cache
KC_CACHE: local
# Logging
KC_LOG_LEVEL: info
KC_LOG_CONSOLE_OUTPUT: json
# JVM
JAVA_OPTS_KC_HEAP: >-
-XX:MaxRAMPercentage=70
-XX:InitialRAMPercentage=50
-XX:MaxHeapFreeRatio=30
ports:
- "8080:8080"
networks:
- frontend
- internal
healthcheck:
test: ["CMD-SHELL", "exec 3<>/dev/tcp/localhost/9000 && echo -e 'GET /health/ready HTTP/1.1\\r\\nHost: localhost\\r\\nConnection: close\\r\\n\\r\\n' >&3 && cat <&3 | grep -q '\"status\": \"UP\"'"]
interval: 30s
timeout: 10s
retries: 3
start_period: 60s
deploy:
resources:
limits:
memory: 2G
cpus: "2.0"
reservations:
memory: 1G
volumes:
pgdata:
networks:
frontend:
driver: bridge
internal:
driver: bridge
internal: true
Environment File
Create a .env file (gitignored) alongside your docker-compose.yml:
DB_PASSWORD=change-this-to-a-strong-password
ADMIN_PASSWORD=change-this-to-a-strong-password
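A quick way to fill in strong values, a sketch assuming openssl is installed:

```shell
# Generate random credentials and write the .env file (overwrites any existing .env)
DB_PASSWORD=$(openssl rand -base64 32)
ADMIN_PASSWORD=$(openssl rand -base64 32)
printf 'DB_PASSWORD=%s\nADMIN_PASSWORD=%s\n' "$DB_PASSWORD" "$ADMIN_PASSWORD" > .env
chmod 600 .env
```

The 600 permissions matter: the file holds both the database and admin bootstrap credentials in plaintext.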
Key Configuration Decisions
Database connection pool: The Keycloak docs recommend setting initial, min, and max pool sizes to the same value. This avoids the overhead of creating new connections under load. 25 connections handles most single-node deployments.
KC_CACHE: local: Disables distributed Infinispan caching. For a single node, this eliminates unnecessary cluster discovery overhead.
KC_BOOTSTRAP_ADMIN_USERNAME: Replaces the deprecated KEYCLOAK_ADMIN in Keycloak 26.x. The admin user is created only on first boot — these variables are ignored on subsequent starts.
Health check on port 9000: Keycloak’s management interface runs on port 9000 (separate from the application port 8080). Health endpoints (/health/ready, /health/live) and metrics (/metrics) are served here. Never expose port 9000 publicly.
Network isolation: PostgreSQL sits on an internal network with no external access. Keycloak bridges both frontend (for client traffic) and internal (for database).
Optimized Build for Faster Startup
The stock Keycloak image runs a build phase on every container start, adding 15-30 seconds to startup. Use a multi-stage Dockerfile to pre-build:
FROM quay.io/keycloak/keycloak:26.1 AS builder
ENV KC_DB=postgres
ENV KC_HEALTH_ENABLED=true
ENV KC_METRICS_ENABLED=true
ENV KC_CACHE=ispn
RUN /opt/keycloak/bin/kc.sh build
FROM quay.io/keycloak/keycloak:26.1
COPY --from=builder /opt/keycloak/ /opt/keycloak/
ENTRYPOINT ["/opt/keycloak/bin/kc.sh"]
CMD ["start", "--optimized"]
Replace image: quay.io/keycloak/keycloak:26.1 in your Compose file with:
keycloak:
build: .
command: start --optimized
Startup drops from ~30 seconds to under 10 seconds.
Reverse Proxy Configuration
Keycloak in production runs behind a reverse proxy that terminates TLS. Three options:
Nginx
upstream keycloak {
server keycloak:8080;
}
server {
listen 443 ssl http2;
server_name auth.example.com;
ssl_certificate /etc/nginx/ssl/fullchain.pem;
ssl_certificate_key /etc/nginx/ssl/privkey.pem;
ssl_protocols TLSv1.2 TLSv1.3;
# OIDC/SAML endpoints
location /realms/ {
proxy_pass http://keycloak;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header X-Forwarded-Host $host;
proxy_set_header X-Forwarded-Port $server_port;
proxy_buffer_size 128k;
proxy_buffers 4 256k;
proxy_busy_buffers_size 256k;
}
# Static resources (cache aggressively)
location /resources/ {
proxy_pass http://keycloak;
proxy_set_header Host $host;
expires 1y;
add_header Cache-Control "public, immutable";
}
# OIDC discovery
location /.well-known/ {
proxy_pass http://keycloak;
proxy_set_header Host $host;
proxy_set_header X-Forwarded-Proto $scheme;
}
# Admin console — restrict to internal IPs
location /admin/ {
allow 10.0.0.0/8;
allow 172.16.0.0/12;
allow 192.168.0.0/16;
deny all;
proxy_pass http://keycloak;
proxy_set_header Host $host;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
# Block management endpoints
location /health { deny all; }
location /metrics { deny all; }
}
Key points:
- Only expose /realms/, /resources/, and /.well-known/ publicly
- Restrict /admin/ to internal IPs
- Never proxy port 9000 (management interface)
- A large proxy_buffer_size prevents failures on oversized SAML responses
Traefik (Docker Labels)
keycloak:
labels:
- "traefik.enable=true"
- "traefik.http.routers.keycloak.rule=Host(`auth.example.com`)"
- "traefik.http.routers.keycloak.entrypoints=websecure"
- "traefik.http.routers.keycloak.tls.certresolver=letsencrypt"
- "traefik.http.services.keycloak.loadbalancer.server.port=8080"
- "traefik.http.services.keycloak.loadbalancer.sticky.cookie=true"
- "traefik.http.services.keycloak.loadbalancer.sticky.cookie.name=AUTH_SESSION_ID"
Caddy (Automatic TLS)
Caddy automatically provisions Let’s Encrypt TLS certificates — zero manual certificate management.
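A minimal Caddyfile sketch under the same assumptions as the Nginx config above (hostname and internal IP ranges are assumptions). Caddy adds the X-Forwarded-* headers on reverse_proxy by itself, which matches KC_PROXY_HEADERS: xforwarded:

```
# Caddyfile: Caddy obtains and renews the TLS certificate for this host automatically
auth.example.com {
    # Block the admin console from non-internal clients
    @admin {
        path /admin/*
        not remote_ip 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16
    }
    respond @admin 403

    reverse_proxy keycloak:8080
}
```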
PostgreSQL Tuning
Default PostgreSQL settings are conservative. Tune for Keycloak’s workload pattern (many small CRUD queries):
postgres:
image: postgres:16-alpine
command:
- "postgres"
- "-c"
- "shared_buffers=256MB"
- "-c"
- "effective_cache_size=768MB"
- "-c"
- "work_mem=4MB"
- "-c"
- "maintenance_work_mem=64MB"
- "-c"
- "max_connections=50"
- "-c"
- "random_page_cost=1.1"
- "-c"
- "effective_io_concurrency=200"
- "-c"
- "log_min_duration_statement=500"
Sizing rules:
- shared_buffers: 25% of container memory (256 MB for a 1 GB container)
- effective_cache_size: 75% of container memory
- work_mem: 4 MB is sufficient; Keycloak runs simple queries, not complex joins
- max_connections: must be at least the Keycloak pool size plus monitoring/admin overhead; for a single node with pool size 25, set it to 50
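The two memory rules can be computed mechanically from the container limit; a sketch:

```shell
# Derive shared_buffers (25%) and effective_cache_size (75%) from container memory in MB
CONTAINER_MB=1024
echo "shared_buffers=$((CONTAINER_MB / 4))MB"            # 256MB for a 1 GB container
echo "effective_cache_size=$((CONTAINER_MB * 3 / 4))MB"  # 768MB
```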
Automated Backups
Add a backup sidecar to your Compose file:
backup:
image: postgres:16-alpine
entrypoint: /bin/sh
command: >
-c 'while true; do
pg_dump -h postgres -U keycloak -Fc keycloak > /backups/keycloak_$$(date +%Y%m%d_%H%M).dump;
find /backups -name "keycloak_*.dump" -mtime +7 -delete;
sleep 86400;
done'
environment:
PGPASSWORD: ${DB_PASSWORD}
volumes:
- ./backups:/backups
networks:
- internal
depends_on:
postgres:
condition: service_healthy
This creates daily backups and retains 7 days of history.
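Restoring from one of these dumps might look like the following sketch; paths and service names assume the Compose files above, and Keycloak should be stopped so nothing writes mid-restore:

```shell
# List the newest dump the sidecar produced (named keycloak_YYYYmmdd_HHMM.dump)
LATEST=$(ls -t ./backups/keycloak_*.dump 2>/dev/null | head -n1)
echo "latest backup: ${LATEST:-none found}"
# To restore it:
#   docker compose stop keycloak
#   docker compose exec -T -e PGPASSWORD="$DB_PASSWORD" postgres \
#     pg_restore -U keycloak -d keycloak --clean --if-exists "/backups/$(basename "$LATEST")"
#   docker compose start keycloak
```

Test restores regularly; an unverified backup is not a backup.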
Multi-Node Clustering
For high availability, run multiple Keycloak instances behind a load balancer. Keycloak 26.x uses JDBC_PING2 by default — nodes discover each other through the shared PostgreSQL database. No multicast or external discovery service required.
Clustered Docker Compose
services:
postgres:
image: postgres:16-alpine
restart: unless-stopped
volumes:
- pgdata:/var/lib/postgresql/data
environment:
POSTGRES_DB: keycloak
POSTGRES_USER: keycloak
POSTGRES_PASSWORD: ${DB_PASSWORD}
command:
- "postgres"
- "-c"
- "shared_buffers=256MB"
- "-c"
- "max_connections=80"
networks:
- internal
healthcheck:
test: ["CMD-SHELL", "pg_isready -U keycloak -d keycloak"]
interval: 10s
timeout: 5s
retries: 5
start_period: 30s
keycloak-1:
image: quay.io/keycloak/keycloak:26.1
command: start  # auto-builds on start; pre-build the image before using --optimized
restart: unless-stopped
depends_on:
postgres:
condition: service_healthy
environment:
KC_DB: postgres
KC_DB_URL: jdbc:postgresql://postgres:5432/keycloak
KC_DB_USERNAME: keycloak
KC_DB_PASSWORD: ${DB_PASSWORD}
KC_DB_POOL_INITIAL_SIZE: 20
KC_DB_POOL_MIN_SIZE: 20
KC_DB_POOL_MAX_SIZE: 20
KC_HOSTNAME: https://auth.example.com
KC_HTTP_ENABLED: "true"
KC_PROXY_HEADERS: xforwarded
KC_HEALTH_ENABLED: "true"
KC_METRICS_ENABLED: "true"
KC_BOOTSTRAP_ADMIN_USERNAME: admin
KC_BOOTSTRAP_ADMIN_PASSWORD: ${ADMIN_PASSWORD}
KC_CACHE: ispn
KC_CACHE_STACK: jdbc-ping
KC_LOG_LEVEL: info
KC_LOG_CONSOLE_OUTPUT: json
JAVA_OPTS_APPEND: "-Djava.net.preferIPv4Stack=true"
networks:
- frontend
- internal
healthcheck:
test: ["CMD-SHELL", "exec 3<>/dev/tcp/localhost/9000 && echo -e 'GET /health/ready HTTP/1.1\\r\\nHost: localhost\\r\\nConnection: close\\r\\n\\r\\n' >&3 && cat <&3 | grep -q '\"status\": \"UP\"'"]
interval: 30s
timeout: 10s
retries: 3
start_period: 90s
keycloak-2:
image: quay.io/keycloak/keycloak:26.1
command: start  # auto-builds on start; pre-build the image before using --optimized
restart: unless-stopped
depends_on:
postgres:
condition: service_healthy
keycloak-1:
condition: service_healthy
environment:
KC_DB: postgres
KC_DB_URL: jdbc:postgresql://postgres:5432/keycloak
KC_DB_USERNAME: keycloak
KC_DB_PASSWORD: ${DB_PASSWORD}
KC_DB_POOL_INITIAL_SIZE: 20
KC_DB_POOL_MIN_SIZE: 20
KC_DB_POOL_MAX_SIZE: 20
KC_HOSTNAME: https://auth.example.com
KC_HTTP_ENABLED: "true"
KC_PROXY_HEADERS: xforwarded
KC_HEALTH_ENABLED: "true"
KC_METRICS_ENABLED: "true"
KC_CACHE: ispn
KC_CACHE_STACK: jdbc-ping
KC_LOG_LEVEL: info
KC_LOG_CONSOLE_OUTPUT: json
JAVA_OPTS_APPEND: "-Djava.net.preferIPv4Stack=true"
networks:
- frontend
- internal
healthcheck:
test: ["CMD-SHELL", "exec 3<>/dev/tcp/localhost/9000 && echo -e 'GET /health/ready HTTP/1.1\\r\\nHost: localhost\\r\\nConnection: close\\r\\n\\r\\n' >&3 && cat <&3 | grep -q '\"status\": \"UP\"'"]
interval: 30s
timeout: 10s
retries: 3
start_period: 90s
nginx:
image: nginx:alpine
ports:
- "443:443"
volumes:
- ./nginx.conf:/etc/nginx/nginx.conf:ro
- ./ssl:/etc/nginx/ssl:ro
depends_on:
keycloak-1:
condition: service_healthy
keycloak-2:
condition: service_healthy
networks:
- frontend
volumes:
pgdata:
networks:
frontend:
internal:
internal: true
Clustering Key Points
JDBC_PING2 uses the shared database for node discovery. Nodes register themselves in a JGROUPSPING table. No multicast, no external etcd/consul, no cloud-specific discovery.
Sticky sessions are required. Keycloak's AUTH_SESSION_ID cookie keeps each user's authentication flow on one node; without stickiness, multi-step login flows fail. Open-source Nginx only supports IP-hash stickiness (cookie-based stickiness on AUTH_SESSION_ID needs Nginx Plus, or Traefik as shown in the labels above). Configure in Nginx:
upstream keycloak_cluster {
ip_hash;
server keycloak-1:8080;
server keycloak-2:8080;
}
Inter-node communication uses ports 7800 (data) and 57800 (failure detection). mTLS between nodes is enabled by default in Keycloak 26.x with auto-generated certificates.
Verify cluster formation:
# Check for cluster join message
docker compose logs keycloak-1 | grep "ISPN000094"
# Check cluster size via metrics
curl -s http://localhost:9000/metrics | grep vendor_cluster_size
Security Hardening
Disable Unused Features
# Add to Keycloak environment
KC_FEATURES_DISABLED: impersonation,kerberos,device-flow,ciba
Disable features you don’t use to reduce attack surface. Common candidates: impersonation (admin impersonating users), device-flow (IoT), ciba (client-initiated backchannel auth).
Separate Admin Hostname
KC_HOSTNAME: https://auth.example.com
KC_HOSTNAME_ADMIN: https://admin.internal.example.com
Serve the admin console on a separate internal hostname, inaccessible from the public internet.
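With Nginx, this can be a separate internal-only server block; a sketch in which the hostname, certificate paths, and internal ranges are assumptions:

```nginx
server {
    listen 443 ssl;
    server_name admin.internal.example.com;
    ssl_certificate     /etc/nginx/ssl/internal-fullchain.pem;
    ssl_certificate_key /etc/nginx/ssl/internal-privkey.pem;
    # Reachable only from internal ranges
    allow 10.0.0.0/8;
    allow 172.16.0.0/12;
    allow 192.168.0.0/16;
    deny all;
    location / {
        proxy_pass http://keycloak;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```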
Credential Management
Keycloak does not support Docker secrets _FILE pattern natively. Options from most to least secure:
- PKCS12 keystore (Keycloak-native):
  keytool -importpass -alias kc.db-password -keystore conf/keystore.p12 \
    -storepass keystorepass -storetype PKCS12
- Docker Swarm secrets (via a custom entrypoint reading /run/secrets/)
- .env file (simplest; restrict file permissions to 600):
  chmod 600 .env
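To point Keycloak at a config keystore, the config-keystore options can be supplied as environment variables; a sketch whose option names come from Keycloak's config-keystore feature, so verify them against your version:

```yaml
keycloak:
  environment:
    KC_CONFIG_KEYSTORE: /opt/keycloak/conf/keystore.p12
    KC_CONFIG_KEYSTORE_PASSWORD: ${KEYSTORE_PASSWORD}
  volumes:
    - ./keystore.p12:/opt/keycloak/conf/keystore.p12:ro
```

Note this still requires a keystore password from somewhere; it narrows the exposure rather than eliminating it.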
Load Shedding
Prevent cascade failures under extreme load:
KC_HTTP_MAX_QUEUED_REQUESTS: 1000
Requests exceeding the queue receive an immediate 503. At ~200 requests/second throughput, a queue of 1000 means ~5 second maximum wait.
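The wait-time estimate is just queue depth divided by sustained throughput; a sketch:

```shell
# Worst-case wait: queued requests divided by requests handled per second
QUEUE=1000
RPS=200
echo "max wait: ~$((QUEUE / RPS))s"   # prints: max wait: ~5s
```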
Monitoring
Prometheus Integration
Enable metrics and health endpoints:
KC_HEALTH_ENABLED: "true"
KC_METRICS_ENABLED: "true"
Scrape configuration:
# prometheus.yml
scrape_configs:
- job_name: keycloak
metrics_path: /metrics
static_configs:
- targets: ['keycloak:9000']
Key Metrics to Monitor
| Metric | Type | What It Tells You |
|---|---|---|
| `keycloak_user_events_total` | Counter | Login success/failure rates per realm |
| `http_server_requests_seconds_bucket` | Histogram | Request latency distribution |
| `agroal_active_count` | Gauge | Active DB connections (should match pool size under load) |
| `vendor_cluster_size` | Gauge | Number of nodes in the cluster (should match expected count) |
| `jvm_memory_usage_after_gc_percent` | Gauge | Heap pressure (alert if consistently above 85%) |
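These metrics feed Prometheus alerting rules directly; a sketch in which thresholds are assumptions, and jvm_memory_usage_after_gc_percent is treated as a 0-1 fraction as Micrometer typically exposes it, so check your own /metrics output before relying on the 0.85 threshold:

```yaml
# alert-rules.yml: a sketch; adjust thresholds to your deployment
groups:
  - name: keycloak
    rules:
      - alert: KeycloakClusterDegraded
        expr: vendor_cluster_size < 2
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Keycloak cluster has fewer nodes than expected"
      - alert: KeycloakHeapPressure
        expr: jvm_memory_usage_after_gc_percent > 0.85
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Heap still above 85% after GC"
```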
Grafana Dashboards
Keycloak provides official dashboards at keycloak/keycloak-grafana-dashboard:
- Troubleshooting Dashboard: SLI monitoring and deployment issues
- Capacity Planning Dashboard: Password validations, login counts, load indicators
Community dashboards on Grafana.com: ID 10441 and 17878.
Troubleshooting
“Failed to initialize Liquibase”
Cause: Keycloak starts before PostgreSQL accepts connections.
Fix: Use depends_on with health check condition:
keycloak:
depends_on:
postgres:
condition: service_healthy
Token Issuer Mismatch / Hostname Errors
Cause: KC_HOSTNAME doesn’t match the URL clients see, or proxy headers aren’t being forwarded.
Fix: Set KC_HOSTNAME to the full external URL including scheme:
KC_HOSTNAME=https://auth.example.com
KC_PROXY_HEADERS=xforwarded
Debug with KC_HOSTNAME_DEBUG=true, then check /realms/master/hostname-debug.
OOM Kills / High Memory Usage
Cause: Default MaxRAMPercentage=70 leaves insufficient room for non-heap memory.
Fix: Set container memory to minimum 2 GB and tune heap:
JAVA_OPTS_KC_HEAP: "-XX:MaxRAMPercentage=65 -XX:InitialRAMPercentage=50"
Slow Startup (30+ seconds)
Cause: Build phase runs on every container start.
Fix: Use the optimized Dockerfile from the Build Optimization section. Startup drops to under 10 seconds.
Cluster Nodes Not Discovering Each Other
Cause: Docker network blocks JGroups ports or IPv6 interferes with discovery.
Fix:
- Ensure nodes share the same Docker network
- Add
JAVA_OPTS_APPEND: "-Djava.net.preferIPv4Stack=true" - Verify with
docker compose logs | grep "ISPN000094"
For more Keycloak troubleshooting, see our Keycloak Session Expired Errors and Keycloak LDAP Connection Troubleshooting guides.
