isql reads stdin one byte per read(2) syscall (8–15s init on Docker bind mounts / NFS / sshfs) #9016

@fdcastel

A reproducible Docker harness plus captured evidence is attached as
firebird-isql-stdin-bug.zip. After unzipping, bash tests/run_all.sh
rebuilds and re-runs everything below against the unmodified
firebirdsql/firebird:5.0.4 image.

While investigating FirebirdSQL/firebird-docker#40
— a Docker‑image init script went from "instant" to 8–9 seconds after switching from
cat "$f" | isql to isql < "$f" — we found that isql reads SQL input from stdin
one byte per read(2) syscall, and that the cause is the bundled editline
library, not glibc/stdio.

This issue is the result of running the official firebirdsql/firebird:5.0.4
image through that test harness; all numbers below come from runs against
the unmodified official binary. Each section cites the evidence file inside
the attached zip.

TL;DR

  1. isql reads stdin through the bundled editline library's read_char(),
    which calls read(fd, buf, 1) once per character, whether stdin is a pipe
    or a redirected file. With cat | in front, those reads hit a kernel pipe;
    with isql < file they hit the file itself. On any filesystem with a
    round trip per syscall (Docker Desktop bind mounts, NFS, FUSE‑overlay,
    sshfs, WSL2 9P) this turns a 100 ms script into many seconds.
  2. The fix is one line: gate the editline branch in
    readNextInputLine() on Interactive as well as readingStdin(). When stdin
    is redirected (!Interactive), fall through to the existing fgets() path,
    which uses normal stdio buffering.
  3. setvbuf(stdin, …) does not fix this — editline reads stdin via raw
    read(2) and bypasses stdio entirely. We measured this and confirmed.

The bug, mechanically

isql is statically linked against the bundled editline in
extern/editline/. We see the EditLine wrapper literal in .rodata and the
/usr/share/terminfo/d/dumb and /root/.editrc opens at startup
(see evidence/T1_editline_linkage.txt).

In src/isql/isql.epp, readNextInputLine() chooses between editline and
fgets():

#ifdef HAVE_EDITLINE_H
    if (Filelist->readingStdin())                      // <-- only check
    {
        const char* new_prompt = Interactive ? prompt : "";
        lastInputLine = readline(new_prompt);
        ...
        return;
    }
#endif
    // ... fgets() fallback ...
    if (fgets(buffer, charBuffer->getCapacity(),
              Filelist->Ifp().indev_fpointer) != NULL)

The editline branch fires whenever input comes from stdin, including when
stdin is redirected from a file or piped from another process. Interactive
has already been set to false for redirected stdin a few hundred lines
earlier:

if (stdin_redirected())
    Interactive = false;

…but it's never used to gate the editline branch. So redirected stdin still
goes through readline("").

Inside readline(), the per‑character read in extern/editline/src/read.c
is:

while ((num_read = read(el->el_infd, cbuf + cbp, (size_t)1)) == -1) {

Every byte of input → one read(0, _, 1) syscall. We confirmed this two ways:

  • Caller classification. An LD_PRELOAD shim records the immediate
    return address of every read(0, …) and looks up the containing mapping
    in /proc/self/maps:
    318894 calls (318893 bytes)  caller=/opt/firebird/bin/isql
      total: 318894 read(0,...) calls
    
    All 318 894 reads come from inside the isql executable's text segment —
    not libc, not libfbclient. (evidence/T3_caller_classification.txt)
  • Disassembly. The return address is 0x4be159 in the official
    isql binary. Just above it (evidence/T3_disasm.txt):
    4be148:  mov    0x20(%r14),%edi         ; arg1 = el->el_infd
    4be14c:  mov    $0x1,%edx               ; arg3 = count = 1
    4be151:  mov    %rbx,%rsi               ; arg2 = cbuf + cbp
    4be154:  call   406420 <read@plt>       ; read(infd, cbuf+cbp, 1)
    4be159:  mov    %rax,%r8                ; ← return address
    4be15c:  cmp    $0xffffffffffffffff,%rax
    4be160:  jne    4be250                  ; matches `while (… == -1)`
    
    This is a literal compile of read_char() from extern/editline/src/read.c.
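
The mapping lookup used for caller classification can be sketched as follows. This is not the harness's shim, only a minimal illustration of the lookup step, assuming the return address has already been captured. The function name find_mapping and the sample addresses in the usage note are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Find the pathname of the mapping that contains addr.
 * maps_text holds lines in /proc/<pid>/maps format:
 *   start-end perms offset dev inode   pathname
 * Returns 1 and copies the pathname into out on a hit, 0 otherwise.
 * find_mapping is a hypothetical helper, not part of the harness. */
int find_mapping(const char *maps_text, unsigned long addr,
                 char *out, size_t outsz)
{
    const char *line = maps_text;
    while (*line) {
        /* Copy one line so sscanf cannot run into the next one. */
        size_t len = strcspn(line, "\n");
        char buf[1024];
        size_t n = len < sizeof buf - 1 ? len : sizeof buf - 1;
        memcpy(buf, line, n);
        buf[n] = '\0';

        unsigned long start = 0, end = 0;
        char path[512] = "";
        /* Anonymous mappings have no pathname; path then stays empty. */
        int fields = sscanf(buf, "%lx-%lx %*s %*s %*s %*s %511[^\n]",
                            &start, &end, path);
        if (fields >= 2 && addr >= start && addr < end) {
            snprintf(out, outsz, "%s", path[0] ? path : "[anonymous]");
            return 1;
        }
        line += len;
        if (*line == '\n')
            line++;
    }
    return 0;
}
```

Given a maps line such as "00400000-004c0000 r-xp 00000000 08:01 123 /opt/firebird/bin/isql" (an invented example), looking up 0x4be159 yields /opt/firebird/bin/isql, which is the kind of answer the classification output above reports.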

The fact that the binary is dynamically linked against libc but statically
linked against editline means ldd isql gives no hint that editline is
involved — that's why this took some time to find.

Reproduction

Prereqs: Docker, internet (to pull firebirdsql/firebird:5.0.4).

unzip firebird-isql-stdin-bug.zip
cd firebird-isql-stdin-bug
bash tests/run_all.sh

run_all.sh builds a small image on top of firebirdsql/firebird:5.0.4
adding strace + gcc + libedit-dev, then runs each test against the
unmodified /opt/firebird/bin/isql from that image. Each test prints
its result to stdout and writes a copy under evidence/.

The pre-captured evidence/*.txt files in the archive are from a run on
Linux 6.17 / Debian 13 inside the official Firebird 5.0.4 image. A re-run
should produce the same syscall counts; the wall-clock numbers in T7 will
vary with nanosleep() granularity on the host.

What we measured

Test 2 — read() syscall counts on a 318 893-byte SQL script

invocation        reads on FD 0   size histogram
isql < file       318 894         318 893 × 1 byte, 1 × 0 (EOF)
cat file | isql   318 894         318 892 × 1 byte, 1 × 82, 1 × 0
isql -i file      0 on FD 0       43 × 8192 + 8 × 832 on FD 3

-i opens the file via os_utils::fopen(), which doesn't go through
Filelist->readingStdin(), so the editline branch never fires; reads happen
through stdio with the default 8 KB buffer.
(evidence/T2_syscall_pattern.txt)

Test 4 — setvbuf(stdin, …) does not change anything

We ran the official isql with an LD_PRELOAD constructor that calls
setvbuf(stdin, big_buf, _IOFBF, 65536) before any application code:

baseline:                                        318894 read(0,...) calls
with setvbuf(stdin, big_buf, _IOFBF, 65536):    318894 read(0,...) calls
with setvbuf(stdin, NULL,    _IOFBF, 65536):    318894 read(0,...) calls

The setvbuf call returns 0 (success) — it really takes effect on glibc's
stdio — but editline doesn't use stdio, so the syscall pattern is unchanged.
(evidence/T4_setvbuf_irrelevance.txt)

Test 5 + 6 — control: editline alone vs fgets alone

Two minimal C programs reading the same 318 KB redirected file:

editline readline()  : 318894 read(0,...)  (318893 × 1 byte)
plain   fgets()      :     40 read(0,...)  (38 × 8192 + 1 × 7597 + 1 EOF)

The editline reproducer (src/repro_editline.c) links libedit.so.2 and
produces exactly the same pattern as the full isql. The fgets reproducer
(src/repro_fgets.c) shows what the fallback path in readNextInputLine()
does — proper 8 KB chunks. (evidence/T5_T6_repro.txt)
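
The two counts follow directly from the request size. A minimal sketch, separate from src/repro_editline.c and src/repro_fgets.c, with the hypothetical helper count_reads: it drains a descriptor with fixed-size read(2) requests and counts the calls. It assumes a regular file, where each read() returns a full chunk until the tail, and does not retry on EINTR.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Drain fd with read(2) requests of `chunk` bytes. Returns the number of
 * read() calls issued, including the final one that returns 0 (EOF),
 * or -1 on error. count_reads is a hypothetical helper. */
long count_reads(int fd, size_t chunk)
{
    char *buf = malloc(chunk);
    if (!buf)
        return -1;
    long calls = 0;
    for (;;) {
        ssize_t n = read(fd, buf, chunk);
        calls++;
        if (n < 0) {
            free(buf);
            return -1;
        }
        if (n == 0)
            break;                      /* EOF */
    }
    free(buf);
    return calls;
}
```

On a 318 893-byte regular file, count_reads(fd, 1) gives 318 894 (one call per byte plus EOF) and count_reads(fd, 8192) gives 40 (38 full chunks, one 7597-byte tail, plus EOF), the same figures as the two reproducers.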

Test 7 — wall-clock under simulated FUSE latency

We injected a per-read() delay (regular files only — pipes/sockets unaffected)
via LD_PRELOAD (src/inject_latency.c), to simulate Docker Desktop bind-mount /
FUSE / 9P / sshfs round-trip cost, on the same 318 KB script:

injected µs/regular-file read   isql < file   cat | isql   isql -i file
0                               211 ms        36 ms        3 ms
25                              24 643 ms     70 ms        3 ms
50                              32 645 ms     72 ms        3 ms
100                             48 583 ms     70 ms        3 ms

isql < file scales linearly with injected latency (≈318 k regular‑file
reads × delay). The other two are flat: with cat |, isql's 318 k reads
come from a kernel pipe, not the underlying filesystem; with -i, the file is
opened with stdio buffering and there are only ~50 reads total.
(evidence/T7_wallclock.txt)

(nanosleep() overhead inflates the absolute numbers — the kernel's minimum
sleep granularity is well above the 25 µs we asked for — but the shape of
the table is the point: redirect grows linearly with per-syscall cost; pipe
and -i don't.)
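
One way to check the linear shape despite the nanosleep() overhead is to look at the slope between rows rather than the absolute values: the extra wall-clock time divided by the extra injected delay should come out near the number of delayed reads. A small sketch of that arithmetic, with the hypothetical helper implied_reads:

```c
/* Number of delayed reads implied by two rows of the table:
 * extra wall-clock time (ms, converted to µs) divided by the extra
 * per-read delay (µs). implied_reads is a hypothetical helper. */
long implied_reads(long t1_ms, long t2_ms, long d1_us, long d2_us)
{
    return (t2_ms - t1_ms) * 1000L / (d2_us - d1_us);
}
```

Applied to the isql < file column, the 25 → 50 µs rows give 320 080, the 50 → 100 µs rows give 318 760, and the 25 → 100 µs rows give 319 200, all close to the 318 894 reads counted in Test 2. The 0 µs row is left out of this check because it is the only one with no sleep call at all.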

Suggested fix

Gate the editline branch on Interactive, which is already computed for
exactly this purpose:

 #ifdef HAVE_EDITLINE_H
-    if (Filelist->readingStdin())
+    if (Interactive && Filelist->readingStdin())
     {
         const char* new_prompt = Interactive ? prompt : "";
         lastInputLine = readline(new_prompt);
         ...
         return;
     }
 #endif

When stdin is redirected, Interactive is already false (set in the
stdin_redirected() check near main() in isql.epp). The change makes
readNextInputLine() skip editline's char‑at‑a‑time path and fall through
to the existing fgets() block, which should behave like the fgets
reproducer in Test 6 (~40 reads of 8 KB on the same input). The
Interactive ? prompt : "" ternary that follows can be simplified to prompt
at the same time, since Interactive is now an invariant of the branch.
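
The decision the patch encodes can be sketched in isolation. This is not isql code: is_interactive and pick_input_path are hypothetical names, and using isatty() as the redirect check is an assumption, since the body of stdin_redirected() is not quoted here.

```c
#include <unistd.h>

enum input_path { PATH_LINE_EDITOR, PATH_FGETS };

/* Hypothetical stand-in for isql's redirect check: treat input as
 * interactive only when the descriptor is a terminal. */
int is_interactive(int fd)
{
    return isatty(fd) ? 1 : 0;
}

/* Mirrors the proposed condition: take the line-editor path only when
 * input is interactive AND comes from stdin; otherwise use fgets(). */
enum input_path pick_input_path(int interactive, int reading_stdin)
{
    return (interactive && reading_stdin) ? PATH_LINE_EDITOR : PATH_FGETS;
}
```

With a pipe or a redirected file on descriptor 0, is_interactive(0) is 0 and the fgets() path is selected; at a terminal it is 1 and the line editor stays in use.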

We didn't patch and rebuild Firebird ourselves (the build is non-trivial),
but the evidence chain is:

  1. fgets(stdin) reads in 8 KB chunks on this glibc/Linux for redirected
    stdin (Test 6, plain C program — same image).
  2. Filelist->Ifp().indev_fpointer is initialised to libc's stdin for the
    redirected‑stdin case (isql.epp:7269), so the existing
    fgets(buffer, charBuffer->getCapacity(), Filelist->Ifp().indev_fpointer) call uses the same FILE* the test
    above does.
  3. There is no other code path between the #ifdef HAVE_EDITLINE_H block and
    the fgets() call that would alter buffering behaviour.

Why setvbuf(stdin, …) is not the answer

An earlier draft of this writeup suggested adding
setvbuf(stdin, NULL, _IOFBF, 65536) near the top of main(). That
suggestion was wrong: editline uses raw read(2) on el->el_infd, never
fgets/getc/fread on stdin. Test 4 confirms: forcing a 64 KB
fully-buffered stdio buffer on stdin leaves the syscall count at exactly
318 894. If you want to keep editline available for interactive use (you do:
it's the line editor end users rely on at the SQL> prompt), the fix on the
isql side is to not call editline when stdin isn't a terminal.

A related improvement on the editline side would be to make read_char()
batch‑read when input is non‑seekable and not a TTY. But that's an editline
upstream change; gating the branch in isql is strictly simpler and entirely
local to Firebird.

Why this matters beyond Docker bind mounts

The 1‑byte‑per‑syscall pattern is invisible on a native ext4/XFS — each
syscall costs <1 µs. It becomes a multi‑second regression on:

  • Docker Desktop bind mounts. macOS / Windows hosts use gRPC FUSE / virtiofs
    / OSXFS; each read() is a userland round trip.
  • WSL2 with \\wsl$\ paths (9P protocol).
  • NFS home directories where DBAs run isql < script.sql.
  • sshfs / s3fs / cloud-mounted filesystems.
  • Encrypted filesystems with per-block crypto on the read path.

For us, the trigger was a Docker entrypoint: changing
cat "$f" | isql to isql < "$f" made firebird-docker initdb scripts go
from 100 ms to 8 s on macOS. The fix above closes the regression class
entirely on every backend, without any per‑user / per‑filesystem tuning.

Workaround in firebird-docker

We reverted our entrypoint to pipe SQL into isql via cat. PR:
FirebirdSQL/firebird-docker#41.
That works around the issue without depending on any isql change, and
matches the existing shape of the .sql.gz / .sql.xz / .sql.zst cases
(which use a decompressor pipeline, hence avoid the slow stdin path "by
accident").


Tested against: firebirdsql/firebird:5.0.4 (Debian 13 trixie, glibc 2.40, kernel 6.17 on host).
Source ref: Firebird 5.0.4 tag v5.0.4 (commit f6d83a2).
Attachments: firebird-isql-stdin-bug.zip
