This issue tracks free-threading (FT) thread-safety findings from ft-review-toolkit, complementing the ongoing FT work in #234. The analysis combines static analysis (shared state, lock discipline, unsafe APIs, atomic candidates) with dynamic ThreadSanitizer stress testing (18 concurrent scenarios, 4 threads × 200 iterations).
Full report: bitarray_ft_report.md
Migration plan: bitarray_migration_plan.md
TSan report: bitarray_tsan_report.md
TSan stress script: tsan_stress_bitarray.py
Note: Our analysis was initially run against a stale clone that still had PyDict_GetItem in _bitarray.c:2847. The upstream 3.8.1 has already replaced this with PyDict_GetItemRef — thank you Ilan for catching this! All other findings below are confirmed against the current upstream code.
TSan results: 70 raw warnings, 41 unique races, 6 SIGABRTs across 18 stress scenarios. All races are in extension code (0 CPython-internal).
CRITICAL: Lazy-Init Singleton Races (3 locations)
Three static PyObject* variables are lazily initialized with a check-then-write pattern that races under free-threading. Two threads calling the function simultaneously can both see NULL, both import/lookup, and both write — one result leaks, or a thread reads a partially-written pointer.
1. info (BufferInfo class) — _bitarray.c:1021-1025
static PyObject *info = NULL; /* BufferInfo object */
// ...
if (info == NULL)
    info = bitarray_module_attr("BufferInfo"); // RACE: two threads both see NULL
Called from bitarray_buffer_info(). Fix: Initialize eagerly in PyInit__bitarray(), or remove the cache (bitarray_module_attr does PyImport_ImportModule which already caches):
static PyObject *
bitarray_buffer_info(bitarrayobject *self)
{
    PyObject *info = bitarray_module_attr("BufferInfo"); // no static cache
    if (info == NULL)
        return NULL;
    // ... use info ...
    Py_DECREF(info);
    // ...
}
2. frozen (frozenbitarray class) — _bitarray.c:1097-1093
Same pattern in freeze_if_frozen(). Called during bitarray.__init__() for subclass instances — hot path.
3. reconstructor (pickle helper) — _bitarray.c:1365-1359
Same pattern in bitarray_reduce(). Called during pickling.
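For the PyMutex variant of this fix, the shape of a race-free lazy init can be sketched in plain C. This is a minimal sketch under stated assumptions, not bitarray's code: pthread_mutex_t stands in for PyMutex, C11 atomics stand in for an atomic pointer load/store, and cached_obj, slow_lookup() and get_cached() are hypothetical names (slow_lookup() plays the role of bitarray_module_attr()).

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the cached object (e.g. the BufferInfo class). */
typedef struct { int id; } cached_obj;

static cached_obj the_obj = { 42 };
static int lookup_calls = 0;   /* how often the slow lookup actually ran */

/* Hypothetical slow lookup, standing in for bitarray_module_attr(). */
static cached_obj *slow_lookup(void)
{
    lookup_calls++;            /* only ever called with cache_mutex held */
    return &the_obj;
}

static _Atomic(cached_obj *) cache = NULL;
static pthread_mutex_t cache_mutex = PTHREAD_MUTEX_INITIALIZER;

static cached_obj *get_cached(void)
{
    /* Fast path: an acquire load pairs with the release store below. */
    cached_obj *p = atomic_load_explicit(&cache, memory_order_acquire);
    if (p != NULL)
        return p;

    /* Slow path: re-check under the lock so only one thread initializes. */
    pthread_mutex_lock(&cache_mutex);
    p = atomic_load_explicit(&cache, memory_order_relaxed);
    if (p == NULL) {
        p = slow_lookup();
        if (p != NULL)
            atomic_store_explicit(&cache, p, memory_order_release);
    }
    pthread_mutex_unlock(&cache_mutex);
    return p;
}
```

Once the cache is populated, callers only pay for one atomic load. Eager initialization at module init avoids even that, which is why it is listed first among the fixes.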
CRITICAL: Lazy-Init Lookup Tables with Endianness-Dependent Re-Init (2+1 locations)
4. ssqi() tables — _util.c:277-293
Three static char[256] tables (count_table, sum_table, sum_sqr_table) guarded by static int setup = -1. The guard tracks endianness — tables are re-initialized when a different-endianness bitarray is encountered:
static int setup = -1; /* endianness of tables */
// ...
if (setup != a->endian) { // RACE: non-atomic read
    setup_table(count_table, 'c');
    setup_table(sum_table, IS_LE(a) ? 'a' : 'A');
    setup_table(sum_sqr_table, IS_LE(a) ? 's' : 'S');
    setup = a->endian; // RACE: non-atomic write
}
Two threads calling ssqi() with different-endianness bitarrays will race on both the flag AND the table contents, producing silently wrong computation results.
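Note: Simple atomics are NOT sufficient here, because the flag guards compound state (three tables). Fix: Pre-compute both endianness variants at module init (the tables are small: 6 × 256 bytes = 1.5 KB total), or protect with PyMutex. The lock has to cover the table reads as well as the re-init; otherwise another thread can rebuild the tables for the other endianness in the middle of a computation:
static PyMutex ssqi_mutex = {0};
// ...
PyMutex_Lock(&ssqi_mutex);
if (setup != a->endian) {
    setup_table(count_table, 'c');
    // ...
    setup = a->endian;
}
// ... read the tables here, still holding the lock ...
PyMutex_Unlock(&ssqi_mutex);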
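The pre-compute alternative can be sketched as follows. This is an illustrative sketch only: the table contents (bit count, sum of set-bit positions, sum of squared positions per byte) are a guess at what setup_table() builds, the tables are unsigned char here so the largest value (140) fits, and init_ssqi_tables() and the ENDIAN_* constants are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

enum { ENDIAN_LITTLE = 0, ENDIAN_BIG = 1 };

/* One table per endianness, filled once and read-only afterwards. */
static unsigned char count_table[256];       /* set bits; endianness-independent */
static unsigned char sum_table[2][256];      /* sum of set-bit positions */
static unsigned char sum_sqr_table[2][256];  /* sum of squared positions */

/* Call once from module init, before any other thread can reach the tables. */
static void init_ssqi_tables(void)
{
    for (int c = 0; c < 256; c++) {
        unsigned cnt = 0, sum[2] = {0, 0}, sqr[2] = {0, 0};
        for (unsigned j = 0; j < 8; j++) {
            /* Position j is mask 1<<j in little-endian bit order ... */
            if (c & (1 << j)) {
                cnt++;
                sum[ENDIAN_LITTLE] += j;
                sqr[ENDIAN_LITTLE] += j * j;
            }
            /* ... and mask 0x80>>j in big-endian bit order. */
            if (c & (0x80 >> j)) {
                sum[ENDIAN_BIG] += j;
                sqr[ENDIAN_BIG] += j * j;
            }
        }
        count_table[c] = (unsigned char)cnt;
        for (int e = 0; e < 2; e++) {
            sum_table[e][c] = (unsigned char)sum[e];      /* at most 28 */
            sum_sqr_table[e][c] = (unsigned char)sqr[e];  /* at most 140 */
        }
    }
}

/* Readers select a table by endianness and never write. */
static unsigned byte_index_sum(unsigned char c, int endian)
{
    assert(endian == ENDIAN_LITTLE || endian == ENDIAN_BIG);
    return sum_table[endian][c];
}
```

Because readers index by endianness instead of rebuilding shared tables, there is no flag to race on and no lock on the hot path.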
5. xor_indices() tables — _util.c:321-334
Same pattern with parity_table and xor_table. Same endianness-dependent re-init race.
6. digit_to_int() table — _util.c:848-864
Similar but endianness-independent (setup is boolean, not endianness). Uses memset to clear the table before populating — a reader during this window gets invalid data. Fix: Initialize at module init in PyInit__util() (table never changes).
HIGH: Zero Per-Object Synchronization (91 functions)
Every function that accesses self->ob_item, self->nbits, self->allocated, or self->ob_exports does so without any lock or critical section. Under free-threading, concurrent access to a shared bitarray object is a data race.
TSan confirmed this across 5 scenarios (concurrent_mutation, read_write_contention, slice_operations, fill_padbits, frombytes_tobytes) — all racing on the array buffer via getbit/setbit and resize().
Key architectural constraint: getbit()/setbit() in bitarray.h are inline leaf functions — they cannot hold locks. All synchronization must be at the Python-facing method level.
Fix: Add Py_BEGIN_CRITICAL_SECTION(self) / Py_END_CRITICAL_SECTION() to every bitarray_* method registered in the method table and to the tp_* slots (the closing macro takes no argument). The pythoncapi_compat.h header (already included) provides these macros with backward compatibility.
Priority order:
- Mutation methods (append, extend, insert, pop, remove, clear, sort, reverse, invert, setall, fill, frombytes): these call resize(), which does PyMem_Realloc on ob_item
- Buffer protocol (getbuffer/releasebuffer): ob_exports counter race
- Iterators (bitarrayiter_next, searchiter_next, decodeiter_next): use Py_BEGIN_CRITICAL_SECTION(it->self)
- Two-object operations (bitwise, copy_n, extend_bitarray): use Py_BEGIN_CRITICAL_SECTION2(self, other)
- Read-only accessors (count, all, any, len, repr, tobytes, tolist): still need protection because a concurrent resize() can invalidate ob_item
This is a large but mechanical change. The migration plan linked above (bitarray_migration_plan.md) has the full breakdown.
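The layering described above (lock-free inline leaf helpers, locking only in the method-level entry points) can be illustrated with a plain C analog. This is a sketch under stated assumptions: the bits struct is a hypothetical stand-in for bitarrayobject, and a pthread_mutex_t stands in for the per-object critical section that Py_BEGIN_CRITICAL_SECTION(self) would provide in the extension.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for bitarrayobject; the mutex plays the role of
   the per-object critical section. */
typedef struct {
    pthread_mutex_t lock;
    unsigned char *ob_item;   /* bit buffer */
    size_t nbits;             /* number of bits in use */
    size_t allocated;         /* buffer size in bytes */
} bits;

static void bits_init(bits *a)
{
    pthread_mutex_init(&a->lock, NULL);
    a->ob_item = NULL;
    a->nbits = 0;
    a->allocated = 0;
}

/* Inline leaf helpers: no locking here; the caller must hold a->lock. */
static inline int getbit(const bits *a, size_t i)
{
    return (a->ob_item[i >> 3] >> (i & 7)) & 1;
}

static inline void setbit(bits *a, size_t i, int v)
{
    unsigned char mask = (unsigned char)(1u << (i & 7));
    if (v)
        a->ob_item[i >> 3] |= mask;
    else
        a->ob_item[i >> 3] &= (unsigned char)~mask;
}

/* Mutating method: resize and write happen under one lock acquisition. */
static int bits_append(bits *a, int v)
{
    int res = 0;
    pthread_mutex_lock(&a->lock);
    size_t need = (a->nbits + 8) / 8;        /* bytes for nbits + 1 bits */
    if (need > a->allocated) {
        size_t cap = a->allocated ? 2 * a->allocated : 8;
        unsigned char *p = realloc(a->ob_item, cap);
        if (p == NULL) {
            res = -1;
            goto done;                       /* single exit: always unlock */
        }
        memset(p + a->allocated, 0, cap - a->allocated);
        a->ob_item = p;
        a->allocated = cap;
    }
    setbit(a, a->nbits, v);
    a->nbits++;
done:
    pthread_mutex_unlock(&a->lock);
    return res;
}

/* Read-only method: still locks, since a concurrent append may realloc. */
static size_t bits_count(bits *a)
{
    size_t n = 0;
    pthread_mutex_lock(&a->lock);
    for (size_t i = 0; i < a->nbits; i++)
        n += (size_t)getbit(a, i);
    pthread_mutex_unlock(&a->lock);
    return n;
}
```

The single exit label in bits_append() mirrors a constraint that also applies to the critical-section macros: the begin and end must pair up on every path, so error paths should jump to the end of the section instead of returning from inside it.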
MEDIUM: Structural Migration Items
7. Static type objects (5 types)
Bitarray_Type, DecodeTree_Type, DecodeIter_Type, SearchIter_Type, BitarrayIter_Type are all static PyTypeObject initialized with PyType_Ready(). Under free-threading, CPython may internally mutate tp_dict, tp_subclasses. Convert to heap types via PyType_FromSpec.
Bitarray_Type has Py_TPFLAGS_BASETYPE (subclassable) — highest priority since subclassing triggers tp_subclasses mutations at runtime.
8. CHDI_Type static type — _util.c:2064
Same issue in _util.c.
9. Module state — _bitarray.c:4222, _util.c:2239
Both modules use m_size = -1 (no per-module state). The lazy-init singletons (findings 1-3), interned strings, and bitarray_type pointer should move to a module state struct with multi-phase init (Py_mod_exec).
10. bitarray_type pointer — _util.c:16
static PyTypeObject *bitarray_type written once during PyInit__util, read everywhere. Safe today (import lock serializes init), but should move to module state for subinterpreter correctness.
FIXED: PyDict_GetItem in encode loop ✅
_bitarray.c:2847 — PyDict_GetItem(codedict, symbol) returns borrowed ref.
Already fixed in 3.8.1 — replaced with PyDict_GetItemRef. Thank you!
LOW: PyTuple_GET_ITEM in index error path
_bitarray.c:1256-1257 — borrowed ref from args tuple consumed immediately by PyErr_Format. Low risk since args is call-stack-local.
TSan Stress Test
A stress test script is available that exercises 18 concurrent scenarios:
# Run under TSan-enabled free-threaded Python:
PYTHON_GIL=0 /path/to/tsan-python tsan_stress_bitarray.py 2> tsan_report.txt
Scenarios include: concurrent mutation, read-write contention, bitwise operations, iteration during mutation, slice operations, buffer export, bytereverse, encode/decode, and more.
Suggested Fix Order
- Lazy-init singletons (1-3): remove the caches or use PyMutex. Trivial fixes.
- Lazy-init tables (4-6): pre-compute at module init. Small change, eliminates the endianness re-init race.
- Per-object critical sections (the HIGH finding above): the bulk of the work. Mechanical but touches ~80 functions.
- Structural migration (7-10): longer-term, for subinterpreter support.
Analysis by ft-review-toolkit. TSan stress testing via labeille. Report reviewed by a human before submission.