fix(pipecat): retain strong references to background storage tasks to prevent GC#1008

Open
devteamaegis wants to merge 1 commit into
supermemoryai:mainfrom
devteamaegis:fix/pipecat-background-task-gc

@devteamaegis
Contributor

Problem

In SupermemoryPipecatService.process_frame, conversation messages are stored
via a fire-and-forget coroutine:

asyncio.create_task(self._store_messages(unsent_messages))

The Task object returned by asyncio.create_task is immediately discarded.
Python's asyncio event loop holds only weak references to scheduled
tasks, so a task with no other references may be garbage collected at any
time, even while it is still awaiting an I/O call inside _store_messages.
A GC cycle can therefore destroy an in-flight supermemory.memories.add()
request, dropping messages without raising any error in the calling code.

The SupermemoryCartesiaAgent in the same repo (cartesia-sdk-python) already
uses the correct pattern: it saves each task in a _background_tasks set and
removes it via add_done_callback(discard) once the coroutine finishes.

Fix

Apply the same pattern to SupermemoryPipecatService:

  1. Add self._background_tasks: set = set() in __init__.
  2. Save every storage task in the set; remove via add_done_callback.
  3. Clear the set in reset_memory_tracking().

# before
asyncio.create_task(self._store_messages(unsent_messages))

# after
task = asyncio.create_task(self._store_messages(unsent_messages))
self._background_tasks.add(task)
task.add_done_callback(self._background_tasks.discard)
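
For readers who want to try the pattern in isolation, here is a minimal, self-contained sketch. The class name `BackgroundTaskTracker`, the `schedule_store` method, and the `stored` list are hypothetical stand-ins invented for illustration; they are not taken from SupermemoryPipecatService, and the `asyncio.sleep` call merely simulates the network request.

```python
import asyncio


class BackgroundTaskTracker:
    """Hypothetical stand-in for the service; illustrates the pattern only."""

    def __init__(self) -> None:
        # Strong references keep pending tasks alive until they finish.
        self._background_tasks: set = set()
        self.stored: list = []

    async def _store_messages(self, messages: list) -> None:
        await asyncio.sleep(0.01)  # simulates the storage request
        self.stored.extend(messages)

    def schedule_store(self, messages: list) -> asyncio.Task:
        task = asyncio.create_task(self._store_messages(messages))
        self._background_tasks.add(task)
        # Drop the reference once the task is done, so the set does not grow.
        task.add_done_callback(self._background_tasks.discard)
        return task

    def reset_memory_tracking(self) -> None:
        # Only drops the references; it does not cancel pending tasks.
        self._background_tasks.clear()


async def demo() -> tuple:
    svc = BackgroundTaskTracker()
    task = svc.schedule_store(["hello", "world"])
    pending = len(svc._background_tasks)  # task is tracked while running
    await task
    await asyncio.sleep(0)  # give the done callback a chance to run
    return pending, len(svc._background_tasks), svc.stored
```

Running `asyncio.run(demo())` shows one tracked task while the coroutine is in flight and an empty set afterwards. In this sketch, clearing the set only releases the references; whether pending tasks should also be cancelled on reset is a separate design decision.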

Tests

tests/test_background_task_tracking.py (all 5 pass):

  • test_background_tasks_set_exists – attribute present after init
  • test_task_held_during_execution – task is in the set while running, removed after
  • test_task_removed_after_completion – set is empty once coroutine finishes
  • test_gc_cannot_collect_tracked_task – forced gc.collect() mid-execution can't kill a tracked task
  • test_reset_clears_background_tasks – reset_memory_tracking() empties the set
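
The test file itself is not shown here. As a rough sketch of what a check along the lines of test_gc_cannot_collect_tracked_task might look like, the following forces a GC cycle while a tracked task is suspended and confirms it still completes. The function name `run_with_gc` and the inner `store` coroutine are assumptions made for illustration, not the PR's actual test code.

```python
import asyncio
import gc


async def run_with_gc() -> bool:
    background_tasks = set()
    finished = asyncio.Event()

    async def store() -> None:
        await asyncio.sleep(0.01)  # simulates an in-flight storage call
        finished.set()

    task = asyncio.create_task(store())
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)
    del task  # the set now holds the only strong reference we control

    await asyncio.sleep(0)  # let the task start and suspend on its await
    gc.collect()            # forced GC cycle while the task is pending

    # The task must still run to completion after the collection.
    await asyncio.wait_for(finished.wait(), timeout=1.0)
    await asyncio.sleep(0)  # let the done callback remove the task
    return len(background_tasks) == 0
```

Note that a passing run only shows the tracked task survives a collection; it does not prove that an untracked task would have been collected, since that depends on interpreter and event-loop internals.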

…nt GC

The Task returned by asyncio.create_task() is only weakly referenced by
the event loop.  If the caller discards it, the GC can destroy the task
before the coroutine finishes, silently dropping any messages that were
queued for storage.

The Cartesia SDK in this same repo already uses the correct pattern
(_background_tasks set + add_done_callback(discard)).  Apply the same fix
to SupermemoryPipecatService:

* Add `_background_tasks: set` in __init__
* Save every storage task in the set; remove it via done-callback once
  complete
* Clear the set in reset_memory_tracking()

Adds tests/test_background_task_tracking.py with five test cases:
- presence of _background_tasks attribute
- task is held in the set while running
- task is removed from the set after completion
- a forced GC cycle cannot collect a tracked task mid-execution
- reset_memory_tracking clears the set