Add kernel kTLS support for zero-copy TLS data path #31

@EdmondDantes

Goal

Bring kernel TLS (setsockopt(SOL_TLS, TLS_TX/TLS_RX)) into the server so AES-GCM encryption / decryption happens in-kernel during sendmsg / recvmsg, and SSL_sendfile can splice static file content straight from page cache to the encrypted socket. Targeted gain: 20-40% on TLS static / large body responses (eliminates one body memcpy and the userspace cipher BIO ring).

Pairs with #30 Phase 1 (hybrid DRAIN/GATHER emit, merged separately) — Phase 1 is the best we can do on the existing memory-BIO async transport; Phase 2 here is the structural step beyond it.

Approach

Path 1 from the research notes — socket-BIO + zend_async_poll_event_t for readiness. Matches what nginx and HAProxy do. Path 2 (custom-BIO with kTLS controls) was rejected — no production server uses it; it lives only in two open OpenSSL issues (#18176, #31138). Path 3 (manual key extraction) rejected as unmaintainable.

For each new TLS connection, at spawn time:

  1. Probe (Linux + OpenSSL kTLS build + tls module loaded) → already done by tls_kernel_ktls_supported().
  2. If yes: tls_session_new_socket(ctx, fd) wraps the connection's raw fd in BIO_new_socket(fd, BIO_NOCLOSE); SSL_OP_ENABLE_KTLS is already set on the SSL_CTX. After handshake, OpenSSL itself probes the kernel and promotes the BIO to kTLS — checked via BIO_get_ktls_send.
  3. If no: stay on the existing memory-BIO transport. Both paths coexist in the code.
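
A minimal sketch of the per-connection decision above, under stated assumptions. `kernel_tls_ulp_present()`, `pick_transport()` and `transport_name()` are hypothetical names for illustration, not the project's `tls_kernel_ktls_supported()`. The probe covers only the "tls module loaded" part: it requests the `tls` ULP on an unconnected TCP socket, where recent Linux kernels are expected to answer `ENOTCONN` when the module is available and `ENOENT` when it is not. The OpenSSL build check is not shown.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

#ifndef TCP_ULP
#define TCP_ULP 31 /* value from linux/tcp.h; older libc headers lack it */
#endif

typedef enum { TRANSPORT_MEMORY_BIO, TRANSPORT_SOCKET_BIO } transport_kind_t;

/* Hypothetical probe: is the kernel "tls" ULP available?
 * Conservative: any answer other than success or ENOTCONN counts as "no". */
static bool kernel_tls_ulp_present(void)
{
#ifdef __linux__
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return false;
    int rc = setsockopt(fd, IPPROTO_TCP, TCP_ULP, "tls", sizeof("tls"));
    int err = (rc == 0) ? 0 : errno;
    close(fd);
    /* ENOTCONN: module found, socket simply not established yet. */
    return rc == 0 || err == ENOTCONN;
#else
    return false; /* kTLS as discussed here is Linux-only */
#endif
}

/* Probe-gated choice: both transports coexist, the probe selects one. */
static transport_kind_t pick_transport(bool ktls_supported)
{
    return ktls_supported ? TRANSPORT_SOCKET_BIO : TRANSPORT_MEMORY_BIO;
}

/* Label for the once-per-process log line. */
static const char *transport_name(transport_kind_t kind)
{
    assert(kind == TRANSPORT_MEMORY_BIO || kind == TRANSPORT_SOCKET_BIO);
    return kind == TRANSPORT_SOCKET_BIO ? "socket-BIO" : "memory-BIO";
}
```

Choosing the socket-BIO transport only makes a connection a kTLS candidate; whether OpenSSL actually promoted the BIO is still checked after the handshake via `BIO_get_ktls_send`.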

I/O readiness comes from ZEND_ASYNC_NEW_SOCKET_EVENT(fd, ASYNC_READABLE | ASYNC_WRITABLE), not libuv uv_read_start. A single poll callback drives both directions: READABLE re-enters the parse FSM, and WRITABLE drains any pending plaintext.
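
To illustrate the single-callback shape, here is a rough sketch that uses plain `poll(2)` on a socketpair as a stand-in for `zend_async_poll_event_t`. Everything in it (`conn_t`, `conn_on_poll`, `conn_poll_once`, `conn_demo`, the `EV_*` flags) is hypothetical; the counters only mark where the parse FSM and the plaintext drain would run.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/socket.h>

#define EV_READABLE 0x1u
#define EV_WRITABLE 0x2u

typedef struct {
    int      fd;
    size_t   pending_len;  /* plaintext bytes waiting to be written */
    unsigned parse_calls;  /* times the parse FSM would have been re-entered */
    unsigned drain_calls;  /* times the pending buffer would have been drained */
} conn_t;

/* One callback for both directions. */
static void conn_on_poll(conn_t *c, unsigned events)
{
    assert(c != NULL);
    if (events & EV_READABLE)
        c->parse_calls++;                 /* stand-in: read + feed parser */
    if ((events & EV_WRITABLE) && c->pending_len > 0) {
        c->drain_calls++;                 /* stand-in: write pending plaintext */
        c->pending_len = 0;
    }
}

/* Wait once and dispatch. POLLOUT is requested only while data is pending,
 * otherwise a level-triggered poll would wake up continuously. */
static int conn_poll_once(conn_t *c, int timeout_ms)
{
    struct pollfd p = { .fd = c->fd, .events = POLLIN, .revents = 0 };
    if (c->pending_len > 0)
        p.events |= POLLOUT;
    int n = poll(&p, 1, timeout_ms);
    if (n <= 0)
        return n;
    unsigned events = 0;
    if (p.revents & (POLLIN | POLLHUP | POLLERR))
        events |= EV_READABLE;
    if (p.revents & POLLOUT)
        events |= EV_WRITABLE;
    conn_on_poll(c, events);
    return n;
}

/* Demo: one wakeup reports both directions. Returns parse_calls * 10 + drain_calls. */
static int conn_demo(void)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    conn_t c = { .fd = sv[0], .pending_len = 5 };
    int ok = write(sv[1], "x", 1) == 1 && conn_poll_once(&c, 1000) == 1;
    close(sv[0]);
    close(sv[1]);
    return ok ? (int)(c.parse_calls * 10 + c.drain_calls) : -1;
}
```

Requesting write readiness only while data is pending is a choice made for this sketch; how the real event object is armed and disarmed depends on the async API's own semantics.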

Constraints

  • The TLS layer's send-side backpressure must use pure poll events + callbacks, not ZEND_ASYNC_SUSPEND inside tls_layer.c / http_connection_tls.c. Coroutine suspension belongs at the producer API boundary (where user code calls `$res->send()` etc.), not inside TLS internals.
  • The memory-BIO path stays. Probe-gated; on hosts without kTLS the server behaves exactly as today.
  • WSL2 MS-shipped kernel is built without `CONFIG_TLS`, so kTLS cannot be validated locally on the maintainer's box — all kTLS validation goes through a dedicated CI workflow (`ktls-smoke.yml`) on `ubuntu-22.04` runners with `modprobe tls`.
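
The first constraint can be sketched as a pending buffer that is filled by the producer and emptied from the WRITABLE callback, with no blocking or suspension in between. This is an illustration only: `tx_queue_t`, `tx_enqueue`, `tx_on_writable` and `tx_demo` are hypothetical names, and a non-blocking `send()` stands in for `SSL_write` on the socket BIO.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>

#define TX_CAP 4096

typedef struct {
    int           fd;
    unsigned char buf[TX_CAP];
    size_t        len;            /* bytes still pending */
    bool          want_writable;  /* should the poll event watch WRITABLE? */
} tx_queue_t;

/* Producer side: never blocks. A false return (queue full) is the signal
 * the producer API boundary would turn into its own backpressure. */
static bool tx_enqueue(tx_queue_t *q, const void *data, size_t n)
{
    assert(q != NULL && data != NULL);
    if (n > TX_CAP - q->len)
        return false;
    memcpy(q->buf + q->len, data, n);
    q->len += n;
    q->want_writable = true;
    return true;
}

/* WRITABLE callback: write what the socket accepts, keep the rest queued. */
static int tx_on_writable(tx_queue_t *q)
{
    while (q->len > 0) {
        ssize_t w = send(q->fd, q->buf, q->len, MSG_DONTWAIT);
        if (w < 0) {
            if (errno == EINTR)
                continue;
            if (errno == EAGAIN || errno == EWOULDBLOCK) {
                q->want_writable = true;  /* stay armed, return to the loop */
                return 0;
            }
            return -1;                    /* hard error */
        }
        memmove(q->buf, q->buf + w, q->len - (size_t)w);
        q->len -= (size_t)w;
    }
    q->want_writable = false;             /* fully drained: disarm WRITABLE */
    return 0;
}

/* Demo: returns the number of bytes the peer received, or -1 on any mismatch. */
static int tx_demo(void)
{
    int sv[2];
    char got[8] = {0};
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    tx_queue_t q = { .fd = sv[0] };
    bool ok = tx_enqueue(&q, "hello", 5) && tx_on_writable(&q) == 0;
    ssize_t n = ok ? read(sv[1], got, sizeof got) : -1;
    ok = ok && n == 5 && memcmp(got, "hello", 5) == 0
            && q.len == 0 && !q.want_writable;
    close(sv[0]);
    close(sv[1]);
    return ok ? (int)n : -1;
}
```

When the socket would block, the callback returns to the event loop instead of waiting, so the only place a coroutine could suspend is the caller of `tx_enqueue`.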

Concrete steps (planned commits)

  1. Probe + log — runtime `tls_kernel_ktls_supported()`, gate `ctx->ktls_enabled` on it, log once per process.
  2. socket-BIO session constructor — `tls_session_new_socket(ctx, fd)` parallel to `tls_session_new`.
  3. Handshake driver via poll events — `http_connection_ktls_arm_handshake` + FSM that re-arms READABLE / WRITABLE on `WANT_READ` / `WANT_WRITE`.
  4. Data path — SSL_read into `read_buffer` + feed parser; SSL_write with pending-buffer drained on WRITABLE. No coroutine suspension inside TLS internals.
  5. `SSL_sendfile` for static — splice path for HTTP/1 and HTTP/2 static responses when `BIO_get_ktls_send` reports active.
  6. Bench + CI — h2load vs Phase 1 hybrid on TLS static 1B/16K/64K, dynamic 3B/16K/64K. `ktls-smoke.yml` validates handshake + telemetry on every push.
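
Step 3 can be pictured as a small state machine that re-arms one direction per wakeup. The sketch below is hypothetical throughout: `handshake_t`, `hs_on_poll` and the `HS_WANT_*` codes are local stand-ins for the result of `SSL_do_handshake` plus `SSL_get_error`, and a scripted step function replaces OpenSSL so the flow can be followed without a TLS peer.

```c
#include <assert.h>
#include <stddef.h>

#define EV_READABLE 0x1u
#define EV_WRITABLE 0x2u

/* Stand-ins for the handshake outcomes the driver cares about. */
enum { HS_WANT_NONE, HS_WANT_READ, HS_WANT_WRITE, HS_FATAL };

typedef enum { HS_IN_PROGRESS, HS_DONE, HS_FAILED } hs_state_t;

typedef int (*hs_step_fn)(void *arg);

typedef struct {
    hs_step_fn step;      /* performs one handshake attempt */
    void      *arg;
    hs_state_t state;
    unsigned   interest;  /* events to arm for the next wakeup */
} handshake_t;

/* Poll callback while handshaking: try once, then re-arm accordingly. */
static void hs_on_poll(handshake_t *h)
{
    assert(h != NULL && h->step != NULL);
    if (h->state != HS_IN_PROGRESS)
        return;
    switch (h->step(h->arg)) {
    case HS_WANT_NONE:   /* finished: hand over to the data path */
        h->state = HS_DONE;
        h->interest = EV_READABLE;
        break;
    case HS_WANT_READ:
        h->interest = EV_READABLE;
        break;
    case HS_WANT_WRITE:
        h->interest = EV_WRITABLE;
        break;
    default:
        h->state = HS_FAILED;
        h->interest = 0;
        break;
    }
}

/* Scripted step function replaying a fixed sequence of outcomes. */
typedef struct { const int *script; size_t pos; } scripted_t;

static int scripted_step(void *arg)
{
    scripted_t *s = arg;
    return s->script[s->pos++];
}

/* Demo: returns the armed interest after each wakeup as decimal digits. */
static int hs_demo(void)
{
    static const int script[] = {
        HS_WANT_READ, HS_WANT_WRITE, HS_WANT_READ, HS_WANT_NONE
    };
    scripted_t s = { script, 0 };
    handshake_t h = { scripted_step, &s, HS_IN_PROGRESS, 0 };
    int trace = 0;
    while (h.state == HS_IN_PROGRESS) {
        hs_on_poll(&h);
        trace = trace * 10 + (int)h.interest;
    }
    return h.state == HS_DONE ? trace : -1;
}
```

With the scripted sequence read, write, read, done, `hs_demo()` yields 1211: READABLE, WRITABLE, READABLE, then READABLE again once the connection moves to the data path.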

Out of scope

  • TLS 1.3 KeyUpdate re-keying — addressed when CI surfaces `EKEYEXPIRED` in practice.
  • Windows kTLS (Schannel) — kernel TLS API there is Schannel-specific, not a portable extension to this work.
  • Mellanox / NIC TLS offload — orthogonal; works automatically once kTLS is engaged.
