Description
Crash report
What happened?
It's possible to segfault the interpreter from a double-free in xml.etree.ElementTree.TreeBuilder. Please let me know if the automated diagnosis is incorrect and whether you prefer me not to post it in new issues.
Automated diagnosis:
Missing Py_NewRef causes double-free in treebuilder_handle_end (line 2851): Py_XSETREF(self->last_for_tail, self->last) stores into last_for_tail without taking a new reference. Both fields point to the same object with only one reference. treebuilder_gc_clear decrements twice — a double-free triggered on every XML end tag. Lines 2882 and 2922 demonstrate the correct pattern with Py_NewRef.
Fix: Py_XSETREF(self->last_for_tail, Py_NewRef(self->last));
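The ownership problem described in the diagnosis can be illustrated with a toy model in pure Python. This is only a sketch: `Obj`, `incref`, `decref`, and `ToyBuilder` are hypothetical names invented for illustration, not CPython code, and the model assumes (as the diagnosis states) that `last` owns one reference and that the clear routine releases both fields.

```python
class Obj:
    """Toy stand-in for a PyObject with a manual reference count (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.refcnt = 1      # the creator owns one reference
        self.freed = False


def incref(obj):
    """Rough analogue of Py_NewRef: add a reference and return the object."""
    obj.refcnt += 1
    return obj


def decref(obj):
    """Rough analogue of Py_DECREF: drop a reference, 'free' at zero."""
    if obj.freed:
        raise RuntimeError(f"double free of {obj.name!r}")
    obj.refcnt -= 1
    if obj.refcnt == 0:
        obj.freed = True


class ToyBuilder:
    """Holds two fields that may point at the same object."""

    def __init__(self):
        self.last = None
        self.last_for_tail = None

    def handle_end(self, elem, take_new_ref):
        # `last` takes over the single reference the caller held.
        self.last = elem
        # The pattern from the proposed fix takes a second reference;
        # the pattern from the diagnosis only copies the pointer.
        self.last_for_tail = incref(elem) if take_new_ref else elem

    def clear(self):
        # Like a gc clear routine, release every field that is set.
        for field in (self.last, self.last_for_tail):
            if field is not None:
                decref(field)
        self.last = self.last_for_tail = None


# With a new reference per field, the object is released exactly once.
good = Obj("a")
builder = ToyBuilder()
builder.handle_end(good, take_new_ref=True)
builder.clear()
print(good.refcnt, good.freed)  # 0 True

# Without it, the second release hits an already freed object.
bad = Obj("a")
builder = ToyBuilder()
builder.handle_end(bad, take_new_ref=False)
try:
    builder.clear()
except RuntimeError as exc:
    print(exc)
```

In the toy model the second release raises an exception; in C there is no such guard, so the equivalent mistake would corrupt memory instead.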
MRE:
import xml.etree.ElementTree
import gc
builder = xml.etree.ElementTree.TreeBuilder()
for x in range(10):
    builder.start("a", {})
for x in range(10):
    builder.end("a")
root = builder.close()
print(gc.get_referrers(root[0]))
Backtrace:
Program received signal SIGSEGV, Segmentation fault.
0x0000555555b5ca97 in Py_INCREF (op=0x0) at ./Include/refcount.h:281
281 PY_UINT32_T cur_refcnt = op->ob_refcnt;
(gdb) bt
#0 0x0000555555b5ca97 in Py_INCREF (op=0x0) at ./Include/refcount.h:281
#1 _Py_NewRef (obj=0x0) at ./Include/refcount.h:529
#2 list_repr_impl (v=0x7c6ff70471a0) at Objects/listobject.c:604
#3 list_repr (self=0x7c6ff70471a0) at Objects/listobject.c:644
#4 0x0000555555bf05dc in PyObject_Repr (v=v@entry=0x7c6ff70471a0) at Objects/object.c:781
#5 0x0000555555cdc514 in PyUnicodeWriter_WriteRepr (writer=writer@entry=0x7c6ff71b2390, obj=obj@entry=0x7c6ff70471a0) at Objects/unicode_writer.c:390
#6 0x0000555555b5cb0b in list_repr_impl (v=0x7c6ff70749a0) at Objects/listobject.c:615
#7 list_repr (self=0x7c6ff70749a0) at Objects/listobject.c:644
#8 0x0000555555bf028b in PyObject_Str (v=0x7c6ff70749a0) at Objects/object.c:823
#9 0x0000555555b26f23 in PyFile_WriteObject (v=0x7c6ff70749a0, f=0x7d0ff70126a0, flags=flags@entry=1) at Objects/fileobject.c:119
#10 0x0000555555e4de11 in builtin_print_impl (module=0x7ccff6fe6450, objects=0x7bfff5db6b28, objects_length=1, sep=0x0, end=0x0, file=0x7d0ff70126a0, flush=0) at Python/bltinmodule.c:2356
#11 builtin_print (module=<optimized out>, args=<optimized out>, nargs=<optimized out>, kwnames=0x0) at Python/clinic/bltinmodule.c.h:1122
#12 0x0000555555bde65c in cfunction_vectorcall_FASTCALL_KEYWORDS (func=func@entry=0x7c7ff70374c0, args=args@entry=0x7bfff5db6b28, nargsf=nargsf@entry=9223372036854775809,
kwnames=kwnames@entry=0x0) at Objects/methodobject.c:465
#13 0x0000555555ab9e00 in _PyObject_VectorcallTstate (tstate=0x5555568f7b18 <_PyRuntime+360664>, callable=0x7c7ff70374c0, args=0x7bfff5db6b28, nargsf=9223372036854775809, kwnames=0x0)
at ./Include/internal/pycore_call.h:136
#14 0x0000555555e588dd in _Py_VectorCallInstrumentation_StackRefSteal (callable=..., arguments=<optimized out>, total_args=1, kwnames=..., call_instrumentation=<optimized out>,
frame=<optimized out>, this_instr=<optimized out>, tstate=<optimized out>) at Python/ceval.c:770
#15 0x0000555555e94263 in _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>, throwflag=<optimized out>) at Python/generated_cases.c.h:1838
#16 0x0000555555e57778 in _PyEval_EvalFrame (tstate=0x5555568f7b18 <_PyRuntime+360664>, frame=0x7e8ff6fe5220, throwflag=0) at ./Include/internal/pycore_ceval.h:118
#17 _PyEval_Vector (tstate=<optimized out>, func=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=0x0) at Python/ceval.c:2134
#18 0x0000555555e57195 in PyEval_EvalCode (co=<optimized out>, globals=<optimized out>, locals=0x7c7ff70884c0) at Python/ceval.c:681
#19 0x0000555556061fb0 in run_eval_code_obj (tstate=tstate@entry=0x5555568f7b18 <_PyRuntime+360664>, co=co@entry=0x7d5ff6ffe990, globals=globals@entry=0x7c7ff70884c0,
locals=locals@entry=0x7c7ff70884c0) at Python/pythonrun.c:1368
Claude explanation of the MRE and crash:
- The loop calls builder.end("a") 10 times, each time executing line 2851: Py_XSETREF(self->last_for_tail, self->last), which is missing Py_NewRef. Each end tag over-decrements the element it touches.
- After builder.close(), root[0] (the innermost <a>) has had its refcount corrupted: it is lower than it should be.
- gc.get_referrers(root[0]) returns a list containing objects that reference that element. One of those referrers is the TreeBuilder's internal stack list, which itself contains elements with corrupted refcounts.
- print() calls list_repr on that referrers list, which calls Py_NewRef on each item. When it hits an element whose refcount was decremented to 0 (already freed), op is NULL or points to freed memory → segfault at Py_INCREF(op=0x0).
The backtrace confirms it: list_repr_impl at listobject.c:604 tries to Py_NewRef a NULL pointer — an object that was prematurely freed due to the cumulative over-decrements from the 10 end() calls.
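To check whether a particular interpreter build is affected without taking down the current session, the MRE can be run in a child process and its exit status inspected. The following is a minimal sketch, not part of the original report: `run_mre` and `describe` are hypothetical helper names, and the interpretation of a negative return code as "killed by a signal" assumes a POSIX system such as the Linux machine used above.

```python
import signal
import subprocess
import sys

# The MRE from this report, passed to a child interpreter via -c.
MRE = '''
import xml.etree.ElementTree
import gc
builder = xml.etree.ElementTree.TreeBuilder()
for x in range(10):
    builder.start("a", {})
for x in range(10):
    builder.end("a")
root = builder.close()
print(gc.get_referrers(root[0]))
'''


def run_mre(python=sys.executable, timeout=60):
    """Run the MRE in a separate process and return its exit status."""
    proc = subprocess.run(
        [python, "-c", MRE],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode


def describe(returncode):
    """Turn a subprocess return code into a short description."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as -N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with status {returncode}"


if __name__ == "__main__":
    print(describe(run_mre()))
```

Passing the path of a different build as `python` (for example, a locally compiled debug build) allows comparing interpreters side by side; a build without the bug would be expected to exit normally.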
Found using cpython-review-toolkit.
CPython versions tested on:
CPython main branch
Operating systems tested on:
Linux
Output from running 'python -VV' on the command line:
Python 3.15.0a7+ (heads/main:99e2c5eccd2, Mar 17 2026, 08:26:50) [Clang 21.1.2 (2ubuntu6)]