Summary
When multiple threads concurrently call sf_open_virtual() on unsupported formats (e.g. MP4 bytes), the resulting LibsndfileError sometimes has code=0 instead of the correct error code (e.g. 1 for "Format not recognised"). This is because _open() at soundfile.py:L1264 calls sf_error(NULL) to retrieve the error code after a failed open, and sf_error(NULL) reads a global (non-thread-safe) error variable in libsndfile.
When thread A fails sf_open_virtual and thread B also fails concurrently, thread B's call to sf_error(NULL) may return 0 because thread A's subsequent operations (or libsndfile's internal cleanup) have already cleared the global error state.
This results in LibsndfileError with:
code = 0
error_string = "(Garbled error message from libsndfile)"
Reproduction
import soundfile
from io import BytesIO
from concurrent.futures import ThreadPoolExecutor, as_completed
# Any bytes that soundfile can't handle (e.g. an MP4 file)
with open("some_video.mp4", "rb") as f:
mp4_data = f.read()
def try_open(i):
try:
soundfile.SoundFile(BytesIO(mp4_data))
return i, "ok", None
except soundfile.LibsndfileError as e:
return i, e.code, e.error_string
with ThreadPoolExecutor(max_workers=8) as pool:
futs = [pool.submit(try_open, i) for i in range(8)]
for f in as_completed(futs):
idx, code, msg = f.result()
if code == 0:
print(f"thread {idx}: code={code}, msg={msg!r}")
# Expected code=1 ("Format not recognised"), got code=0
Running this repeatedly will produce code=0 on some threads.
Impact
Callers that branch on LibsndfileError.code to distinguish error types (e.g. "unrecognised format" vs "corrupt file") get incorrect codes under concurrency, leading to wrong error-handling paths.
Environment
soundfile==0.13.1
libsndfile 1.2.2
- Python 3.12, Linux x86_64
Possible fix
soundfile.py:_open() could use sf_error() on the (failed) file handle instead of sf_error(NULL), but since the handle is NULL on open failure, libsndfile doesn't support that. An alternative is to add a module-level lock around the sf_open_virtual + sf_error(NULL) sequence so the global error state can't be clobbered between the two calls.
Summary
When multiple threads concurrently call
sf_open_virtual()on unsupported formats (e.g. MP4 bytes), the resultingLibsndfileErrorsometimes hascode=0instead of the correct error code (e.g.1for "Format not recognised"). This is because_open()at soundfile.py:L1264 callssf_error(NULL)to retrieve the error code after a failed open, andsf_error(NULL)reads a global (non-thread-safe) error variable in libsndfile.When thread A fails
sf_open_virtualand thread B also fails concurrently, thread B's call tosf_error(NULL)may return0because thread A's subsequent operations (or libsndfile's internal cleanup) have already cleared the global error state.This results in
LibsndfileErrorwith:code = 0error_string = "(Garbled error message from libsndfile)"Reproduction
Running this repeatedly will produce
code=0on some threads.Impact
Callers that branch on
LibsndfileError.codeto distinguish error types (e.g. "unrecognised format" vs "corrupt file") get incorrect codes under concurrency, leading to wrong error-handling paths.Environment
soundfile==0.13.1libsndfile 1.2.2Possible fix
soundfile.py:_open()could usesf_error()on the (failed) file handle instead ofsf_error(NULL), but since the handle isNULLon open failure, libsndfile doesn't support that. An alternative is to add a module-level lock around thesf_open_virtual+sf_error(NULL)sequence so the global error state can't be clobbered between the two calls.