DB connection is interrupted with multiple management servers

### problem

Hello, I am using an environment with 2 management servers, using MariaDB Galera as the database server.

A few minutes after starting the management server service, the connection to the database is dropped, and the management server logs show the following record:

`ERROR [c.c.u.d.T.Transaction] (AsyncJobMgr-Heartbeat-1:[ctx-c66c8126]) (logid:cfb46db0) Unexpected exception: java.sql.SQLTransientConnectionException: cloud - Connection is not available, request timed out after 80000ms (total=1000, active=1000, idle=0, waiting=22)`
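The counters at the end of that message (`total=1000, active=1000, idle=0, waiting=22`) suggest that every connection in the pool was checked out when the request timed out. To track these numbers across many log lines, a small parser along the following lines may help. This is only a sketch: the function names `parse_pool_stats` and `is_exhausted` are made up for illustration, and the regular expression assumes the message keeps the exact format shown above.

```python
import re

# Matches the pool counters that appear at the end of the timeout message.
POOL_STATS = re.compile(
    r"request timed out after (?P<timeout_ms>\d+)ms "
    r"\(total=(?P<total>\d+), active=(?P<active>\d+), "
    r"idle=(?P<idle>\d+), waiting=(?P<waiting>\d+)\)"
)


def parse_pool_stats(line):
    """Return the pool counters from a log line as ints, or None if absent."""
    match = POOL_STATS.search(line)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


def is_exhausted(stats):
    """Treat the pool as exhausted when nothing is idle and all connections are active."""
    return stats["idle"] == 0 and stats["active"] >= stats["total"]


# Shortened copy of the log record from this report.
line = (
    "Unexpected exception: java.sql.SQLTransientConnectionException: cloud - "
    "Connection is not available, request timed out after 80000ms "
    "(total=1000, active=1000, idle=0, waiting=22)"
)
stats = parse_pool_stats(line)
print(stats, is_exhausted(stats))
```

Running this over the whole management server log would show whether `active` climbs steadily until it reaches `total`, which would point to connections not being returned to the pool rather than to a network interruption.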

My current MariaDB server configuration is as follows:

```
[mysqld]
binlog_format=ROW
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
query_cache_size=0
query_cache_type=0
bind-address=0.0.0.0
innodb_rollback_on_timeout=1
innodb_lock_wait_timeout=600
max_connections=1000
log-bin=mysql-bin
max_allowed_packet=1024M
net_read_timeout=120
net_write_timeout=120
```

The database access configuration on the management servers (db.properties) is as follows:

```
# CloudStack database tuning parameters
db.cloud.connectionPoolLib=hikaricp
db.cloud.maxActive=1000
db.cloud.maxIdle=100
db.cloud.maxWait=900000
db.cloud.minIdleConnections=20
db.cloud.connectionTimeout=80000
db.cloud.keepAliveTime=800000
db.cloud.validationQuery=/* ping */ SELECT 1
db.cloud.testOnBorrow=true
db.cloud.testWhileIdle=true
db.cloud.timeBetweenEvictionRunsMillis=40000
db.cloud.minEvictableIdleTimeMillis=240000
db.cloud.poolPreparedStatements=false
db.cloud.url.params=prepStmtCacheSize=517&cachePrepStmts=true&sessionVariables=sql_mode='STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION'&serverTimezone=UTC
```
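One thing worth checking with these values is whether the pools can ask for more connections than the server allows. The sketch below compares the sum of the pool maximums with `max_connections`. It rests on assumptions that are not confirmed in this report: that both management servers use the same `db.cloud.maxActive=1000`, and that they end up on the same MariaDB node. The pool labels and the `connection_budget` helper are hypothetical, and any pool used by cloudstack-usage is left out because its size is not shown here.

```python
def connection_budget(max_connections, pools):
    """Compare worst-case pool demand with the server-side connection limit."""
    demand = sum(pools.values())
    return {
        "demand": demand,
        "limit": max_connections,
        "headroom": max_connections - demand,  # negative means over-subscribed
    }


# Assumed: one "cloud" pool per management server, each capped at db.cloud.maxActive.
pools = {
    "mgmt-1 cloud pool": 1000,
    "mgmt-2 cloud pool": 1000,
}

budget = connection_budget(max_connections=1000, pools=pools)
print(budget)
```

If the headroom is negative under these assumptions, the two pools together could request more connections than a single node accepts, so raising `db.cloud.maxActive` alone would not be expected to help.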

PS: I changed some parameters on both the database server (max_connections, max_allowed_packet, and the timeout parameters) and the connection pool in db.properties (db.cloud.maxActive, db.cloud.connectionTimeout, db.cloud.minIdleConnections). None of these changes solved the problem; the connection just takes a little longer to drop.

When I check the database server logs, the only related records I find are these warnings:

```
Feb 26 10:19:14 acs-stg-mngt-01 mariadbd[20527]: 2025-02-26 10:19:14 3320 [Warning] Aborted connection 3320 to db: 'cloud_usage' user: 'cloud' host: 'IP_MNGT_SERVER' (Got an error reading communication packets)
Feb 26 10:18:34 acs-stg-mngt-01 mariadbd[20527]: 2025-02-26 10:18:34 2282 [Warning] Aborted connection 2282 to db: 'cloud' user: 'cloud' host: 'IP_MNGT_SERVER' (Got an error reading communication packets)
```

PS: I noticed that this started happening after installing cloudstack-usage. The same problem occurred earlier in a standalone (non-HA) CloudStack installation, also right after installing cloudstack-usage. I am still debugging to confirm whether it is really related to the current problem.


### versions

CloudStack: 4.20
MariaDB: 10.6

### The steps to reproduce the bug

1. Configure CloudStack in HA mode (MariaDB, HAProxy)
2. Add 3 KVM hosts
3. Enable HA hosts


### What to do about it?

_No response_
