Sanitize flowData in getSinglePublicChatflow#5938
Sanitize flowData in getSinglePublicChatflow#5938christopherholland-workday wants to merge 1 commit intomainfrom
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request focuses on improving the security and privacy of public chatflows by ensuring that sensitive Highlights
Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request aims to improve security by sanitizing flowData in the getSinglePublicChatflow public endpoint, using sanitizeFlowDataForPublicEndpoint to prevent sensitive information leakage. However, the current sanitizeFlowDataForPublicEndpoint implementation has a bypass when the inputParams property is missing from a node's data, potentially exposing credentials. Additionally, the controller returns the entire chatflow object, which exposes internal metadata and unsanitized configuration. It is recommended to refine the sanitizer logic and limit the fields returned by the public endpoint. There is also a minor code style suggestion to improve readability and maintainability, aligning with repository rules.
| if (chatflow.isPublic) | ||
| return res.status(StatusCodes.OK).json({ ...chatflow, flowData: sanitizeFlowDataForPublicEndpoint(chatflow.flowData) }) |
There was a problem hiding this comment.
The sanitizeFlowDataForPublicEndpoint function used here has a potential bypass. If the flowData is missing the inputParams property for a node, the sanitizer skips the node entirely, leaving sensitive inputs (including passwords and authentication headers) exposed in the public response. The sanitizer implementation in sanitizeFlowData.ts should be updated to handle cases where inputParams is missing, ensuring that at least header sanitization is always performed. Additionally, for improved readability and to prevent potential future bugs (like the dangling else problem), it's a good practice to always use curly braces {} with if statements, especially when the enclosed logic is long enough to be broken into multiple lines, as shown in the suggestion.
| if (chatflow.isPublic) | |
| return res.status(StatusCodes.OK).json({ ...chatflow, flowData: sanitizeFlowDataForPublicEndpoint(chatflow.flowData) }) | |
| if (chatflow.isPublic) { | |
| return res.status(StatusCodes.OK).json({ ...chatflow, flowData: sanitizeFlowDataForPublicEndpoint(chatflow.flowData) }) | |
| } |
References
- The comment highlights a potential bypass in a public endpoint's sanitizer that could expose sensitive data, directly contradicting the rule's requirement that public access should not expose sensitive data.
- The suggestion to always use curly braces with
ifstatements aligns with prioritizing readability and preventing potential bugs, which improves understandability and reduces future errors.
| if (!chatflow) return res.status(StatusCodes.NOT_FOUND).json({ message: 'Chatflow not found' }) | ||
| if (chatflow.isPublic) return res.status(StatusCodes.OK).json(chatflow) | ||
| if (chatflow.isPublic) | ||
| return res.status(StatusCodes.OK).json({ ...chatflow, flowData: sanitizeFlowDataForPublicEndpoint(chatflow.flowData) }) |
There was a problem hiding this comment.
Returning the entire chatflow object via { ...chatflow } exposes internal metadata such as apikeyid and workspaceId to the public. Additionally, the chatbotConfig field is returned without any sanitization, which could leak sensitive configuration settings (e.g., allowedOrigins). It is recommended to return only the necessary fields and ensure all returned data is properly sanitized.
| return res.status(StatusCodes.OK).json({ ...chatflow, flowData: sanitizeFlowDataForPublicEndpoint(chatflow.flowData) }) | |
| return res.status(StatusCodes.OK).json({ id: chatflow.id, name: chatflow.name, type: chatflow.type, flowData: sanitizeFlowDataForPublicEndpoint(chatflow.flowData) }) |
References
- The comment points out that returning the entire
chatflowobject exposes internal metadata and unsanitized configuration to the public, which violates the rule's condition that public endpoints should not expose sensitive data.
Sanitize flowData using sanitizeFlowDataForPublicEndpoint. FLOWISE-293