[BUG] #917

@Gradlon

Describe the bug

Hello, I use a KeyDB multimaster cluster behind an HAProxy load balancer.
This works most of the time, but in certain situations I get the following error:

2025/09/30 15:02:24.000 [E] write tcp 10.34.5.49:38184->10.34.5.81:6379: write: broken pipe
2025/09/30 15:02:24.001 [D] | 185.73.121.250| 503 | 2.115088ms| nomatch| GET /api/get-account

This mostly happens when a user hits the login page.

My docker-compose file for Casdoor looks like this:

services:
  casdoor:
    image: registry.integral-systems.ch/cache_docker/casbin/casdoor:v2.55.0
    environment:
      TZ: "Europe/Zurich"
      dbName: ${DB_NAME}
      driverName: postgres
      dataSourceName: "user=${DB_USER} password=${DB_PASSWORD} host=${DB_HOST} port=${DB_PORT} sslmode=${DB_SSL} dbname=${DB_NAME}"
      appname: ${APP_NAME}
      httpport: 8000
      runmode: prod
      redisEndpoint: ${REDIS_HOST}:6379,${REDIS_DB},${REDIS_PASSWORD}
      radiusServerPort: 1812
      radiusSecret: ${RADIUS_SECRET}
      origin: https://auth.sunnysideup.so
      originFrontend: https://auth.sunnysideup.so
    volumes:
      - type: cluster
        source: casdoor_data
        target: /files
    ports:
      - target: 8000
        published: 32009
    networks:
      - loadbalancer
    cap_drop:
      - ALL
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000"]
      interval: 30s
      timeout: 5s
      retries: 15
      start_period: 30s

The HAProxy config looks like this:

global
    maxconn 50000
    log stdout format raw local0 info
    nbthread 4

defaults
    mode tcp
    log global
    # Timeout values should be configured for your specific use.
    # See: https://cbonte.github.io/haproxy-dconv/1.8/configuration.html#4-timeout%20connect
    timeout connect 5s
    timeout client 5m
    timeout server 5m
    timeout tunnel 1h
    # TCP keep-alive on client side. Server already enables them.
    option clitcpka
    option srvtcpka
    retries 3 # Retry up to 3 times before marking a node as failed
    option redispatch # Redispatch to another node if one fails during a session
    option log-health-checks

listen KeyDB
    bind *:6379
    maxconn 40000
    mode tcp
    timeout client 15m
    timeout server 15m
    hash-type consistent
    balance source
    option tcplog
    option tcp-check
    #uncomment these lines if you have basic auth
    tcp-check send AUTH\ PASSWORD\r\n
    tcp-check expect string "+OK"
    tcp-check send "PING\r\n" comment "Ping phase"
    tcp-check expect string "+PONG"
    tcp-check send "info replication\r\n" comment "Role (active-replica)phase"
    tcp-check expect string "role:active-replica"
    tcp-check send "QUIT\r\n" comment "Disconnect phase"
    tcp-check expect string "+OK"
    default-server inter 2s fall 3 rise 2 slowstart 60s
    server KeyDB-01 kv-01.cluster:6379 maxconn 20000 check
    server KeyDB-02 kv-02.cluster:6379 maxconn 20000 check
    server KeyDB-03 kv-03.cluster:6379 maxconn 20000 check
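
To make the health-check sequence in the `listen KeyDB` block easier to reason about, here is a minimal Go sketch that models the four send/expect pairs and reports which step would fail for a given set of replies. It does not talk to HAProxy or KeyDB. The names `step`, `steps` and `firstFailure` are hypothetical, and the reply strings in `main` are illustrative samples, not captured output.

```go
package main

import (
	"fmt"
	"strings"
)

// step mirrors one "tcp-check send" / "tcp-check expect string" pair.
type step struct {
	send   string
	expect string
}

// steps copies the check sequence from the listen block above.
var steps = []step{
	{"AUTH PASSWORD\r\n", "+OK"},
	{"PING\r\n", "+PONG"},
	{"info replication\r\n", "role:active-replica"},
	{"QUIT\r\n", "+OK"},
}

// firstFailure returns the index of the first step whose reply does not
// contain the expected substring, or -1 if every step passes.
func firstFailure(replies []string) int {
	for i, s := range steps {
		if i >= len(replies) || !strings.Contains(replies[i], s.expect) {
			return i
		}
	}
	return -1
}

func main() {
	// Illustrative replies only (assumed shapes, not real captures).
	healthy := []string{"+OK\r\n", "+PONG\r\n", "# Replication\r\nrole:active-replica\r\n", "+OK\r\n"}
	demoted := []string{"+OK\r\n", "+PONG\r\n", "# Replication\r\nrole:master\r\n", "+OK\r\n"}
	fmt.Println(firstFailure(healthy), firstFailure(demoted))
}
```

Under these assumptions, a node that answers `PING` but no longer reports `role:active-replica` fails at step index 2, so HAProxy would treat it as down even though it is reachable.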

To reproduce

Use HAProxy between a multimaster KeyDB cluster and Casdoor.

Expected behavior

If one of those nodes fails, HAProxy should handle this so that Casdoor (Beego) can continue working without an error.
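
For illustration, this is roughly what client-side tolerance of a dead connection could look like in Go: detect a broken-pipe style error and retry once on a fresh connection. It is a sketch only, not Casdoor's or Beego's actual code. `isStaleConn`, `withReconnect` and `brokenPipeErr` are hypothetical helpers, and the sketch assumes the operation passed in obtains a new connection on each call.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
	"os"
	"syscall"
)

// isStaleConn reports whether err looks like I/O on a connection that
// the peer (or a proxy in between) has already closed.
func isStaleConn(err error) bool {
	return errors.Is(err, syscall.EPIPE) ||
		errors.Is(err, syscall.ECONNRESET) ||
		errors.Is(err, io.EOF)
}

// withReconnect runs op up to attempts times and retries only on
// stale-connection errors. Hypothetical helper: op is assumed to
// obtain a fresh connection on every call.
func withReconnect(attempts int, op func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = op(); err == nil || !isStaleConn(err) {
			return err
		}
	}
	return err
}

// brokenPipeErr builds an error shaped like the one in the log above
// ("write tcp ...: write: broken pipe"), for demonstration only.
func brokenPipeErr() error {
	return &net.OpError{Op: "write", Net: "tcp", Err: os.NewSyscallError("write", syscall.EPIPE)}
}

func main() {
	calls := 0
	err := withReconnect(2, func() error {
		calls++
		if calls == 1 {
			return brokenPipeErr() // first attempt hits a dead connection
		}
		return nil // second attempt succeeds on a fresh connection
	})
	fmt.Println(calls, err) // prints "2 <nil>"
}
```

Retrying only on stale-connection errors keeps genuine failures (for example authentication errors) visible instead of masking them.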

services:
  keydb:
    image: registry.integral-systems.ch/cache_docker/eqalpha/keydb:alpine_x86_64_v6.3.4
    container_name: keydb
    labels:
      ch.integral-systems.group: "database"
      ch.integral-systems.deployment: "redis"
      ch.integral-systems.health_monitor: "true"
      ch.integral-systems.customer: "false"
      ch.integral-systems.infrastructure: "false"
      ch.integral-systems.services: "true"
    extra_hosts:
       - "KeyDB-01:172.16.65.1"
       - "KeyDB-02:172.16.65.2"
       - "KeyDB-03:172.16.65.3"
    env_file:
      - .env
    command: keydb-server /etc/redis.conf --requirepass $REDIS_PASSWORD --masterauth $REDIS_PASSWORD --port 6379 --replicaof KeyDB-02 6379 --replicaof KeyDB-03 6379
    volumes:
      - ./config.conf:/etc/redis.conf:Z
      - /data/key-db:/data:Z
    network_mode: "host"
    mem_limit: 2G
    mem_reservation: 2G
    restart: unless-stopped

Additional information

Original Issue
casdoor/casdoor#4218 (comment)
