fix: add depth limit to EventSerializer to prevent hangs on complex objects#1577
fix: add depth limit to EventSerializer to prevent hangs on complex objects#1577BKDDFS wants to merge 6 commits into
Conversation
a895aed to
087052e
Compare
…bjects EventSerializer.default() recursively traverses __dict__ and __slots__ of arbitrary objects without a depth limit. When @observe() captures function arguments containing objects like google.genai.Client (which hold aiohttp sessions, connection pools, and threading locks), json.dumps blocks indefinitely on the second invocation. Add a _MAX_DEPTH=20 counter that returns a <TypeName> placeholder when exceeded, preventing infinite recursion into complex object graphs while preserving all existing serialization behavior.
087052e to
43114a6
Compare
|
Hi @hassiebp, could you check my PR, please? |
|
friendly bump @hassiebp |
|
Another friendly bump @hassiebp / @marcklingen. This is open for ~50 days now with no review. The hang causes real impact in our prod: we ship Happy to:
if either is blocking the review. |
|
@claude review |
|
@BKDDFS Thanks a lot for your contribution! This looks good to me - could you kindly resolve the merge conflicts for this to get merged? |
| self._depth = 0 | ||
|
|
||
| def default(self, obj: Any) -> Any: | ||
| if self._depth >= self._MAX_DEPTH: | ||
| return f"<{type(obj).__name__}>" | ||
|
|
||
| self._depth += 1 | ||
| try: | ||
| return self._default_inner(obj) | ||
| finally: | ||
| self._depth -= 1 |
There was a problem hiding this comment.
🔴 The new depth guard in default() (lines 44-48) returns f"<{type(obj).__name__}>" for any object — including primitive strings — before any type analysis. Because the dict branch in _default_inner routes both keys and values through self.default() ({self.default(k): self.default(v) for k, v in obj.items()}), at the truncation boundary a normal string key like 'a' is replaced with the literal '<str>', silently corrupting user trace data. Fix: either move the depth check inside _default_inner after the primitive type checks, or change the dict branch to {k: self.default(v) for k, v in obj.items()} (matching the __dict__ branch convention on line 165).
Extended reasoning...
What the bug is
The depth guard added at the top of default() (lines 46-48) is unconditional — it runs before any type dispatch:
def default(self, obj: Any) -> Any:
if self._depth >= self._MAX_DEPTH:
return f"<{type(obj).__name__}>"
...This means at the boundary, any value passed to default() — including plain str and int primitives — gets replaced by its type-name placeholder. Combined with the pre-existing dict branch in _default_inner (line 142-143):
if isinstance(obj, dict):
return {self.default(k): self.default(v) for k, v in obj.items()}which routes both keys and values through self.default(), the result is that dict keys at the truncation boundary are silently rewritten to "<str>" (or "<int>", etc.) instead of being preserved.
Step-by-step proof
Consider a chain of 22 nested {"a": {...}} dicts encoded by EventSerializer:
encode(D_0)resets_depth=0, callsdefault(D_0).default(D_k)enters with_depth=k, checksk >= 20(false for k<20), increments tok+1, calls_default_inner(D_k).- The dict branch returns
{self.default("a"): self.default(D_{k+1})}— both at_depth=k+1. - At
k=19:_default_innercallsself.default("a")while_depth=20. The guard_depth >= _MAX_DEPTH(20 >= 20) fires first, returning"<str>"instead of dispatching to theisinstance(obj, str)branch which would have returned"a".
Verified live output:
{"a": {"a": {"a": ... (19 levels) ... {"<str>": "<dict>"}}}}The last surviving key — the most informative one in the trace — is the one that gets clobbered.
Why existing tests miss it
test_max_depth_returns_type_name uses a Level class with __dict__. That path goes through the __dict__ branch at line 165:
result = {k: self.default(v) for k, v in vars(obj).items()}Note: keys are not routed through self.default() here. So the test never exercises the corrupted-key path that the raw-dict branch creates.
Impact
This is a regression introduced by this PR. Before the change, default("a") always hit the isinstance(obj, str) branch in _default_inner and returned "a" unchanged. Now, for any traced input/output containing a deeply nested dict (a common shape: nested JSON payloads, message threads, agent state), the key name at the truncation boundary is silently replaced with "<str>". Worse, "<str>" is indistinguishable from a legitimate string a user might have placed there. The __slots__ branch is also affected, because line 153-156 wraps slot names in a synthetic dict that gets routed through default() — so slot names get the same treatment.
How to fix
Two simple options:
- Move the depth check into
_default_innerafter the primitive type checks. Primitives (str,int,float,None,bytes, etc.) are leaves and cannot recurse, so they should never be subject to the depth guard. - Match the
__dict__branch convention: change line 142-143 to{k: self.default(v) for k, v in obj.items()}. Keys are already required to be JSON-encodable (strings/ints), so routing them throughdefault()was never necessary.
| self._depth = 0 | ||
|
|
||
| def default(self, obj: Any) -> Any: | ||
| if self._depth >= self._MAX_DEPTH: | ||
| return f"<{type(obj).__name__}>" | ||
|
|
||
| self._depth += 1 | ||
| try: | ||
| return self._default_inner(obj) | ||
| finally: | ||
| self._depth -= 1 |
There was a problem hiding this comment.
🔴 The __slots__ branch at lines 153-156 calls self.default({...}) on the synthesized dict, which re-enters the new depth-tracking wrapper and increments _depth a second time. As a result, every __slots__ object consumes 2 depth units per nesting level instead of 1, so the effective _MAX_DEPTH for __slots__-based objects (e.g. pydantic v1 BaseModels, many attrs/slotted dataclasses) is silently halved from 20 to ~10. One-line fix: call self._default_inner({...}) in the slots branch so it consumes depth equivalently to the __dict__ branch (which already builds its result dict directly without re-routing through default).
Extended reasoning...
What the bug is
The PR adds depth tracking by wrapping the existing serialization logic in a new default() method that increments _depth, calls _default_inner, and decrements in a finally. The pre-existing __slots__ branch at langfuse/_utils/serializer.py:153-156 was not updated to match the new wrapper-vs-inner split, so it still does:
if hasattr(obj, "__slots__"):
return self.default(
{slot: getattr(obj, slot, None) for slot in obj.__slots__}
)That self.default(...) call re-enters the wrapper and bumps _depth a second time for the same logical nesting level. The __dict__ branch a few lines below avoids this by building its result directly: {k: self.default(v) for k, v in vars(obj).items()} — only the child traversal increments depth, not the level itself.
Step-by-step proof
Consider a chain SL_0 -> SL_1 -> ... -> SL_N where each SlotLevel has __slots__ = ['child']:
| Step | Call | _depth on entry |
After increment |
|---|---|---|---|
| 1 | default(SL_0) |
0 | 1 |
| 2 | inner sees __slots__ → default({'child': SL_1}) |
1 | 2 |
| 3 | inner sees dict → default(SL_1) (via value) |
2 | 3 |
| 4 | inner sees __slots__ → default({'child': SL_2}) |
3 | 4 |
| ... | ... | ... | ... |
So default(SL_i) is entered with _depth = 2i. The truncation check _depth >= _MAX_DEPTH (20) fires at i = 10 — i.e. truncation happens at ~10 levels for __slots__ objects, vs. 20 for __dict__ objects. This is exactly what the PR's own new test bakes in (depth <= _MAX_DEPTH // 2 + 3 = 13), and Greptile flagged the same issue independently.
Why existing code doesn't prevent it
Before this PR, the redundant self.default(...) call in the __slots__ branch was harmless — default() had no per-call side effect. The PR introduces the per-call _depth increment but only modifies the __dict__ branch's data flow (which already bypassed default() for its own dict construction). The __slots__ branch was overlooked.
Impact
- The documented
_MAX_DEPTH = 20is effectively 10 for any object using__slots__(pydantic v1 BaseModels, many attrs classes, slotted dataclasses, etc.). - When truncation does trigger in a
__slots__chain, the placeholder becomes"<dict>"(the synthesized wrapper dict's type) instead of"<SlotLevel>"(the actual object type) — less informative for debugging. - Severity is moderate: the primary fix (preventing hangs) still works, but the depth budget is silently halved for a common class of objects, and the truncation marker is misleading.
How to fix
One-line change in the slots branch — call the inner helper directly so it matches the __dict__ branch's accounting:
if hasattr(obj, "__slots__"):
return self._default_inner(
{slot: getattr(obj, slot, None) for slot in obj.__slots__}
)The PR author has already offered to make this change ("address Greptile __slots__ double-depth nit — one-line fix: route the slots dict through _default_inner instead of default"). Once applied, the companion test bound (tests/test_serializer.py:267, currently _MAX_DEPTH // 2 + 3 = 13) should be tightened to _MAX_DEPTH + 2 (or restructured like test_max_depth_returns_type_name) so it doesn't encode the buggy behavior.
Summary
EventSerializer.default()recursively traverses__dict__and__slots__of arbitrary objects without a depth limit. When@observe()captures function arguments containing objects likegoogle.genai.Client(which hold aiohttp sessions, connection pools, and threading locks),json.dumpscan block indefinitely.Root cause
_serialize()increate_span_attributescallsjson.dumps(obj, cls=EventSerializer). The serializer's__dict__fallback recurses into every attribute of every object. For objects likeaiohttp.ClientSession→TCPConnector→ internal locks, this either hangs (lock contention / blocking attribute access) or enters extremely deep recursion.Fix
Add a
_MAX_DEPTH = 20depth counter toEventSerializer:default()increments/decrements_depthon each call_depth >= _MAX_DEPTH, returns<TypeName>placeholder instead of recursing_default_inner()— no behavior changesencode()resets_depthat the start of each serializationWorkaround (current)
@observe(capture_input=False)— disables automatic input capture entirely.Tests
3 new tests:
test_deeply_nested_object_does_not_hang— objects with connection pools/lockstest_max_depth_returns_type_name—__dict__deep nesting truncationtest_deeply_nested_slots_object_is_truncated—__slots__deep nesting truncationAll 21 tests pass (18 existing + 3 new).
Disclaimer: Experimental PR review
Greptile Summary
This PR fixes a real and reproducible hang in
EventSerializerby introducing a_MAX_DEPTH = 20counter that short-circuits recursion when serializing deeply nested objects (e.g.google.genai.Clientholding aiohttp session internals). The core mechanism — incrementing_depthindefault()and decrementing it in afinallyblock — is correct and well-structured.\n\nKey changes:\n-default()is now a thin depth-guard that delegates to_default_inner(), preserving all existing serialization behavior unchanged.\n-encode()resets_depth = 0before each top-level call, which is a good defensive measure.\n- Three new tests cover the main scenarios: hang prevention,__dict__depth truncation, and__slots__depth truncation.\n\nIssues found:\n- The__slots__branch at line 153–156 callsself.default({...})for the intermediate dict it constructs. Becausedefault()also increments_depth, each__slots__object consumes 2 depth units per level instead of 1, cutting the effective limit from 20 to ~10 for__slots__-based objects. Changing that single call to_default_inner({...})would make behavior consistent.\n- Thetest_deeply_nested_slots_object_is_truncatedassertion bound (_MAX_DEPTH + 5 = 25) is too loose to catch regressions in truncation depth for the slots case.Confidence Score: 4/5
Safe to merge — the hang is fixed and no existing behavior is regressed; the remaining issues are a minor depth inconsistency for
__slots__objects and a loose test assertion.The fix correctly solves the stated problem (infinite recursion / hang on complex objects) and all 21 tests pass. The only issues are a double-depth-counting side-effect in the
__slots__path (effective limit ~10 instead of 20) and an overly permissive test assertion — neither affects the primary fix or causes correctness failures in normal usage.langfuse/_utils/serializer.pylines 153–156 (__slots__handling) and the corresponding test assertion attests/test_serializer.py:267.Important Files Changed
_MAX_DEPTH = 20depth limiter to prevent infinite recursion/hangs; logic is correct for__dict__-based objects but__slots__objects incur double depth consumption (consuming 2 units per level instead of 1), halving their effective limit to ~10.__dict__truncation, and__slots__truncation; the slots test has an overly loose assertion (_MAX_DEPTH + 5) that doesn't tightly verify the truncation depth.Flowchart
%%{init: {'theme': 'neutral'}}%% flowchart TD A["encode(obj)"] --> B["_depth = 0\nseen.clear()"] B --> C["default(obj)"] C --> D{{"_depth >= _MAX_DEPTH?"}} D -- Yes --> E["return '<TypeName>'"] D -- No --> F["_depth += 1\n_default_inner(obj)"] F --> G{{"Type of obj?"}} G -- "datetime / UUID / bytes / etc." --> H["Return primitive"] G -- "dict" --> I["self.default(k)\nself.default(v)\nfor each item"] G -- "list / Sequence" --> J["self.default(item)\nfor each item"] G -- "__slots__" --> K["self.default(dict of slots)\n⚠️ extra depth unit"] G -- "__dict__" --> L{{"id(obj) in seen?"}} L -- Yes --> M["return TypeName (circular ref)"] L -- No --> N["seen.add(id)\nself.default(v)\nfor each attr\nseen.remove(id)"] G -- "other" --> O["return '<TypeName>'"] I --> P["_depth -= 1\n(finally)"] J --> P K --> P N --> P H --> P M --> P O --> P P --> Q["return result to encode()"] Q --> R["super().encode(result)"]Comments Outside Diff (1)
langfuse/_utils/serializer.py, line 153-156 (link)__slots__path consumes double the depth budget per object levelBecause the
__slots__branch callsself.default({...})for the intermediate dict, each__slots__object consumes 2 depth units (one for the object itself, one for the constructed dict), whereas a__dict__-based object consumes only 1. In practice this halves the effective depth limit for__slots__objects: they truncate at approximately 10 levels of nesting rather than the intended 20.Tracing for a chain of
SlotLevelobjects:default(SlotLevel_N)→_depth= 2*(N-1)default({"child": SlotLevel_{N+1}})→_depth= 2*(N-1)+1A lightweight fix is to call
_default_innerdirectly for the constructed dict rather than routing it back throughdefault():This avoids the redundant depth check and gives
__slots__objects the same effective depth limit as__dict__objects.Reviews (1): Last reviewed commit: "fix: add depth limit to EventSerializer ..." | Re-trigger Greptile
(2/5) Greptile learns from your feedback when you react with thumbs up/down!