
db_mysql: recover from ER_UNKNOWN_STMT_HANDLER (1243) #3865

Open
dondetir wants to merge 1 commit into OpenSIPS:master from dondetir:fix/db_mysql-aurora-stmt-recovery


dondetir commented Apr 7, 2026

Summary

When MySQL returns error 1243 (ER_UNKNOWN_STMT_HANDLER) from mysql_stmt_execute(), the switch in wrapper_single_mysql_stmt_execute() falls through to the default: branch, logs the failure with LM_CRIT, and returns a hard error to the caller without triggering the existing reconnect + re-prepare recovery path. This patch adds a case for 1243 so that it funnels into the same recovery flow that already handles CR_SERVER_GONE_ERROR, CR_SERVER_LOST, CR_COMMANDS_OUT_OF_SYNC, and 4031.

Why this matters — real production impact

This was reported by Sasmita Panda (3CLogic) on the opensips-users list (thread: "Need some help on mysql error on opensips", 2025-10-21). Their setup runs OpenSIPS on EKS against AWS RDS Aurora MySQL. When Aurora performs a minor-version zero-downtime upgrade, the backing database instance is replaced while the client TCP connection is preserved. The server-side prepared-statement cache is lost on the new instance, so the next mysql_stmt_execute using an existing handle returns 1243. libmysqlclient's auto-reprepare (which handles the related ER_NEED_REPREPARE case for DDL invalidation) does not kick in here because the server has no record of the handle at all. With the current code OpenSIPS logs:

CRITICAL:db_mysql:wrapper_single_mysql_stmt_execute: driver error (1243): Unknown prepared statement handler
ERROR:usrloc:db_insert_ucontact: inserting contact in db failed

and the query fails with no recovery attempt. Every later execute on the same stale handle fails the same way, so the operator has to restart OpenSIPS to recover.

What the fix does

The recovery path already exists and is proven. When the wrapper returns -1, callers like db_do_prepared_query() run:

switch_state_to_disconnected(conn);
connect_with_retry(conn, max_db_retries);
re_init_statement(conn, pq_ptr, ctx, 1);

This closes all cached statements, opens a fresh MYSQL connection (which correctly routes to the new Aurora backend), re-prepares the statement, and retries the caller's loop up to max_db_queries times. All we need to do is tell the error classifier that 1243 is a reconnect-worthy error.
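The caller-side flow above can be illustrated with a small stand-alone simulation. This is only a sketch: every name in it (sim_conn, sim_stmt_execute, sim_recover, sim_prepared_query, SIM_MAX_ATTEMPTS) is hypothetical and none of it is the actual db_do_prepared_query() code. A "generation" counter stands in for the Aurora backend being replaced, and the fix_applied flag toggles whether a stale handle is reported as reconnect-worthy (-1) or as a hard error (1).

```c
/* Simulation only: every name below is hypothetical, not OpenSIPS code. */
#define SIM_MAX_ATTEMPTS 3  /* assumed retry budget, in the spirit of max_db_queries */

struct sim_conn {
    int server_generation;  /* bumped when the backend instance is replaced */
    int stmt_generation;    /* backend generation the statement was prepared on */
    int reconnects;         /* number of recovery rounds performed */
    int fix_applied;        /* 1: a stale handle (1243) is reconnect-worthy */
};

/* Wrapper contract as described in this PR:
 * 0 = success, -1 = reconnect-worthy error, 1 = hard error. */
static int sim_stmt_execute(const struct sim_conn *c)
{
    if (c->stmt_generation == c->server_generation)
        return 0;                       /* handle is known to the server */
    return c->fix_applied ? -1 : 1;     /* stale handle: the 1243 case */
}

/* Stands in for disconnect + reconnect + re-prepare: the statement is
 * prepared again on whichever backend is current. */
static void sim_recover(struct sim_conn *c)
{
    c->stmt_generation = c->server_generation;
    c->reconnects++;
}

/* Retry loop: returns 0 on success, -1 when the query ultimately fails. */
static int sim_prepared_query(struct sim_conn *c)
{
    int attempt;

    for (attempt = 0; attempt < SIM_MAX_ATTEMPTS; attempt++) {
        int rc = sim_stmt_execute(c);

        if (rc == 0)
            return 0;
        if (rc > 0)
            return -1;      /* hard error: no recovery is attempted */
        sim_recover(c);     /* reconnect-worthy: recover, then retry */
    }
    return -1;
}
```

With fix_applied set, bumping server_generation makes the next sim_prepared_query() call recover once and succeed. With it cleared, the same call fails immediately and never reaches sim_recover(), which mirrors the behaviour reported on the list.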

The fix is a one-case addition to the switch. ER_UNKNOWN_STMT_HANDLER is defined identically in both MySQL (mysqld_error.h) and MariaDB (mariadb_error.h) as 1243, so no magic-number comment is needed (unlike the 4031 case above it).
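For readers without the source open, the shape of the classification after the change is roughly as follows. This is a sketch, not the patched function: classify_stmt_error is a hypothetical helper name, the real wrapper also calls mysql_stmt_execute() and logs, and the #ifndef fallbacks only exist so that the snippet builds without the MySQL client headers.

```c
/* Fallback definitions so the sketch builds without errmsg.h /
 * mysqld_error.h; the values match the MySQL headers. */
#ifndef CR_SERVER_GONE_ERROR
#define CR_SERVER_GONE_ERROR    2006
#endif
#ifndef CR_SERVER_LOST
#define CR_SERVER_LOST          2013
#endif
#ifndef CR_COMMANDS_OUT_OF_SYNC
#define CR_COMMANDS_OUT_OF_SYNC 2014
#endif
#ifndef ER_UNKNOWN_STMT_HANDLER
#define ER_UNKNOWN_STMT_HANDLER 1243
#endif

/* Hypothetical helper: maps a statement errno to the wrapper's return
 * convention (0 = ok, -1 = reconnect + re-prepare, 1 = hard error). */
static int classify_stmt_error(unsigned int err)
{
    switch (err) {
    case 0:
        return 0;
    case CR_SERVER_GONE_ERROR:
    case CR_SERVER_LOST:
    case CR_COMMANDS_OUT_OF_SYNC:
    case 4031:                      /* bare number, as in the existing code */
    case ER_UNKNOWN_STMT_HANDLER:   /* the case this PR adds */
        return -1;
    default:
        return 1;
    }
}
```

Anything not listed, including ER_NEED_REPREPARE (1615), still lands in default: and is reported as a hard error.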

Scope discipline

  • Only wrapper_single_mysql_stmt_execute is modified. The companion wrapper_single_mysql_stmt_prepare is intentionally left alone: it starts from a fresh mysql_stmt_init() handle that carries no server-side ID, so the server cannot return 1243 in response to a COM_STMT_PREPARE. Adding the case there would be unreachable defensive code.
  • wrapper_single_mysql_real_query / wrapper_single_mysql_send_query do not need the case — they don't use prepared statement handles.
  • ER_NEED_REPREPARE (1615) is intentionally not added — libmysqlclient auto-reprepare already handles that scenario.
  • +9 / -0 lines, one file, zero drive-by changes.

Verification

  • Clean build, zero warnings.
  • Disassembly diff of db_mysql.so before/after: exactly 1 new cmp $0x4db, %eax instruction inside the wrapper_single_mysql_stmt_execute inlined call site. All 5 existing error-code comparisons (0x7d6, 0x7de, 0xfbf) preserved bit-for-bit — zero regression on any existing error classification.
  • Deterministic test harness drives both the pre-fix and post-fix switch logic against a live MySQL 8.4 happy-path query plus every relevant error code (CR_SERVER_GONE_ERROR, CR_SERVER_LOST, CR_COMMANDS_OUT_OF_SYNC, 4031, ER_UNKNOWN_STMT_HANDLER, ER_NEED_REPREPARE, syntax error, dup entry). Before the fix, 1243 classifies as +1 HARD ERROR; after the fix, 1243 classifies as -1 RECONNECT. Every other error code is classified identically in both versions.
  • Happy-path runtime: opensips loads the patched db_mysql.so, registers the E_MYSQL_CONNECTION event, initializes usrloc in sql-only mode against a live MySQL container, and runs the reactor cleanly.
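The before/after comparison in the third bullet can be approximated with a table-driven check like the one below. It is a reduced sketch rather than the harness used for this PR: classify and count_changed are hypothetical names, the with_fix flag selects pre-fix or post-fix behaviour, and no live server is involved. 1064 and 1062 are the standard MySQL codes for a syntax error and a duplicate entry.

```c
#include <stddef.h>

/* Hypothetical classifier: with_fix = 0 models the pre-fix switch,
 * with_fix = 1 models the post-fix switch.
 * Returns -1 for reconnect-worthy errors, 1 for hard errors. */
static int classify(unsigned int err, int with_fix)
{
    switch (err) {
    case 2006:  /* CR_SERVER_GONE_ERROR */
    case 2013:  /* CR_SERVER_LOST */
    case 2014:  /* CR_COMMANDS_OUT_OF_SYNC */
    case 4031:
        return -1;
    case 1243:  /* ER_UNKNOWN_STMT_HANDLER */
        return with_fix ? -1 : 1;
    default:
        return 1;
    }
}

/* Error codes exercised by the comparison. */
static const unsigned int codes[] = {
    2006, 2013, 2014, 4031,
    1243,   /* ER_UNKNOWN_STMT_HANDLER */
    1615,   /* ER_NEED_REPREPARE */
    1064,   /* syntax error */
    1062    /* duplicate entry */
};

/* Counts the codes whose classification differs between the two
 * versions; *last_changed receives the last such code, if any. */
static size_t count_changed(unsigned int *last_changed)
{
    size_t i, n = 0;

    for (i = 0; i < sizeof codes / sizeof codes[0]; i++) {
        if (classify(codes[i], 0) != classify(codes[i], 1)) {
            n++;
            if (last_changed)
                *last_changed = codes[i];
        }
    }
    return n;
}
```

count_changed() reports exactly one differing code, 1243, which is 0x4db in hex and matches the single new comparison seen in the disassembly diff.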

Credit

Reported-by: Sasmita Panda <spanda@3clogic.com>

The prepared-statement execute wrapper treats MySQL error 1243
(ER_UNKNOWN_STMT_HANDLER) as a hard failure and skips the existing
reconnect + re-prepare path. This surfaces in production when the
backing database is replaced underneath a live client connection -
for example during an AWS Aurora zero-downtime minor-version upgrade,
which preserves the TCP connection but drops the server-side
prepared-statement cache. The next stmt_execute returns 1243, OpenSIPS
logs CRITICAL and the query fails instead of transparently recovering.

Add the case to wrapper_single_mysql_stmt_execute so it funnels into
the same switch_state_to_disconnected -> connect_with_retry ->
re_init_statement recovery path that already handles CR_SERVER_GONE_ERROR
and friends.

ER_NEED_REPREPARE (1615) is intentionally not added: libmysqlclient
auto-reprepare already handles that case. 1243 bypasses auto-reprepare
because the server has no record of the handle at all.

The companion prepare wrapper is not modified: it starts from a fresh
mysql_stmt_init() handle that carries no server-side ID, so the server
cannot return 1243 in response to a prepare request.

Reported-by: Sasmita Panda <spanda@3clogic.com>