fix: persist moving summary buffer to db and limit message fetch by date #5880
RenzoMXD wants to merge 4 commits into FlowiseAI:main
Conversation
…h by date

- Add `summaryMessage` to `MessageType` in both server and components interface
- Load persisted summary from db on each request via `summaryMessage` row
- Filter message query to only fetch messages after last summary date
- Save updated summary back to db after pruning
- Fix double-prepend of summary when no pruning is needed
Summary of Changes

This pull request enhances the efficiency and performance of conversation summary buffer memory by implementing database persistence for summaries and optimizing message retrieval. Previously, summaries were regenerated from scratch on every request, leading to high token usage and slow performance in long conversations. The changes ensure that summaries are stored, loaded, and reused effectively, bounding the amount of text processed per turn and keeping token usage within limits.
Code Review
This pull request effectively addresses a key issue with conversation summary memory by persisting the summary to the database, which improves performance, reduces token consumption in long conversations, and fixes the double-prepending of summaries.

However, it introduces two significant security concerns: an Insecure Direct Object Reference (IDOR) vulnerability due to unvalidated `sessionId` usage in database queries, which violates the authorization rule, and a persistent indirect prompt injection risk where malicious instructions can be stored in the conversation summary and affect all future interactions in the session. Additionally, consider wrapping the database operations for updating the summary in a transaction to prevent potential race conditions and ensure data integrity.
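The IDOR concern above can be mitigated by verifying session ownership before the summary query runs. A minimal sketch, assuming hypothetical names (`SessionRow`, `assertSessionOwnedByChatflow`); Flowise's actual entities and query layer differ:

```typescript
// Hypothetical ownership record; illustrative only, not the real ChatMessage entity.
interface SessionRow {
    sessionId: string
    chatflowId: string
}

// Reject a sessionId that has no messages under the caller's chatflow, so one
// tenant cannot read another tenant's persisted summary via a guessed sessionId.
function assertSessionOwnedByChatflow(rows: SessionRow[], sessionId: string, chatflowId: string): void {
    const owned = rows.some((r) => r.sessionId === sessionId && r.chatflowId === chatflowId)
    if (!owned) {
        throw new Error('Session is not associated with the requesting chatflow')
    }
}
```

In a real fix this check would be a `WHERE` clause on the query itself rather than a post-hoc scan, but the invariant is the same: every row read or written must be scoped to the caller's chatflow.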
...s/components/nodes/memory/ConversationSummaryBufferMemory/ConversationSummaryBufferMemory.ts
Hello @rvasilero, @HenryHengZJ, could you please review my PR?
Root cause
The `getChatMessages` method fetched all messages from the db on every request with no limit. Additionally, `movingSummaryBuffer` was an in-memory instance variable that was lost between requests (the class is re-instantiated per request), so the summary was never actually reused: every turn re-summarized the full history from scratch.
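The loss of in-memory state can be reproduced in miniature. A sketch with illustrative names (the real node's construction and LLM-based summarization differ), showing why a fresh instance per request discards the previous summary:

```typescript
// Minimal stand-in for the memory node: the summary lives only on the instance.
class SummaryBufferMemoryStub {
    movingSummaryBuffer = ''

    // Stand-in for LLM-based summarization of pruned history.
    summarize(history: string[]): string {
        this.movingSummaryBuffer = history.join(' | ')
        return this.movingSummaryBuffer
    }
}

// Request 1 builds a summary...
const turn1 = new SummaryBufferMemoryStub()
turn1.summarize(['user: hello', 'ai: hi'])

// ...but request 2 constructs a fresh instance, so the buffer starts empty
// and the full history must be re-summarized from scratch.
const turn2 = new SummaryBufferMemoryStub()
```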
Fix
- **Persist the summary to db**: after pruning generates a summary, it is saved as a `summaryMessage` row in the `ChatMessage` table for the session.
- **Load the persisted summary on startup**: at the start of each request, the latest `summaryMessage` row is fetched and restored into `movingSummaryBuffer`.
- **Date-filter the message query**: the main message query now only fetches messages created after the last summary's timestamp, so the number of messages processed stays bounded regardless of conversation length.
- **Fix double-prepend**: a pre-existing issue where the summary was prepended twice when no pruning was needed is also resolved.
Result
Token usage now stays within `maxTokenLimit` across long conversations. The amount of text re-summarized per turn is bounded to only the messages since the last summary, not the full history.

Closes #5873