Skip to content

Conversation

@ddkohler
Copy link
Contributor

@ddkohler ddkohler commented Nov 24, 2025

@ddkohler ddkohler changed the title Update _protocol.py release tcp messages from memory Nov 24, 2025
@ddkohler ddkohler marked this pull request as ready for review November 24, 2025 21:28
@ddkohler ddkohler mentioned this pull request Dec 1, 2025
1 task
@ksunden
Copy link
Member

ksunden commented Dec 4, 2025

I would tend to prefer handling this inside of the Unpacker class, rather than reaching into private state from elsewhere.

It is not so bad considering it is all in the same package, but Unpacker is intended to be relatively self contained (and potentially spin-offable if interest was ever to accrue, though no plans to do so)

@ksunden
Copy link
Member

ksunden commented Dec 4, 2025

Further, I would have to think harder on this to be sure that this particular clearing location would not risk missing messages

@ddkohler
Copy link
Contributor Author

ddkohler commented Dec 5, 2025

could push the job to Unpacker.feed:

    def feed(self, data: bytes):
        # Must support random access, if it does not, must be fed externally (e.g. TCP)
        pos = self._file.tell()
        self._file.seek(0, 2)
        if pos == self._file.tell():
            # read reached EOF and we are safe to clear
            self._file = io.BytesIO()
            pos = 0
        self._file.write(data)
        self._file.seek(pos)
        self.new_data.set()

This puts the rewrite inside of Unpacker and should ensure we don't miss messages. The con is that clearing the file is less scheduled and we might run into an issue where _file never gets cleared, but to me that seems unlikely in real world cases...

@ddkohler
Copy link
Contributor Author

ddkohler commented Dec 5, 2025

@ksunden lmk what you think of Unpacker's file_like argument...these changes do not respect the possibility of anything other than a BytesIO object for the Unpacker._file....

@ddkohler
Copy link
Contributor Author

okay, using file-like methods to clear _file, rather than explicitly creating a new BytesIO object. No conflict with file_like arg for Unpacker

@untzag
Copy link
Member

untzag commented Dec 22, 2025

Thanks for finding this memory leak @ddkohler

@ksunden and I spent quite a while thinking about how the whole unpacker process could be better, see the story in #97

In the interest of getting this leak fixed right now, I like your solution @ddkohler. Going to merge and release this and hopefully we can come around and make a better solution e.g. #97 at a later date

@untzag
Copy link
Member

untzag commented Dec 22, 2025

(we did think about this implementation and decided that it's not going to cause any problems, it's purely better than what we had before)

@untzag untzag merged commit 3354de3 into main Dec 22, 2025
25 checks passed
@untzag untzag deleted the unpacker-clear-file branch December 22, 2025 16:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

protocol unpacker: old messages kept in memory python 3.12 yields huge memory accumulation for yaq daemons

4 participants