Skip to content

Fix FR-2: preserve Personal data during Pro app saves#63

Open
patrickkidd-hurin wants to merge 1 commit intomasterfrom
feat/tl-47
Open

Fix FR-2: preserve Personal data during Pro app saves#63
patrickkidd-hurin wants to merge 1 commit intomasterfrom
feat/tl-47

Conversation

@patrickkidd-hurin
Copy link
Copy Markdown
Collaborator

@patrickkidd-hurin patrickkidd-hurin commented Mar 3, 2026

Summary

  • Root cause: Diagram.update_with_version_check() replaced the entire pickle blob when new_data was provided. Pro app saves omit Personal-owned fields (pdp, clusters, clusterCacheKey), so those were silently destroyed.
  • Fix: Added _merge_personal_fields() in diagram.py that deserializes both existing and incoming blobs, and preserves Personal-owned fields from existing data when they're absent in the incoming data. When all fields are present (e.g. Personal app writes), the incoming blob passes through unchanged — zero overhead.
  • No API or schema changes — the fix is entirely within the model layer.

Test plan

  • 5 unit tests for _merge_personal_fields() covering: PDP preservation, cluster preservation, incoming data respected when present, empty existing data, and all-fields-present pass-through
  • 3 integration tests: Pro save preserves PDP, Pro save preserves clusters, Personal save can update PDP
  • 1 HTTP integration test: full concurrent write scenario through the Personal PUT endpoint
  • All 73 existing personal tests pass (11 e2e deselected)

Closes patrickkidd/theapp#47

🤖 Generated with Claude Code

Closes #47

When the Pro app saves a full DiagramData blob via update_with_version_check,
it omits Personal-owned fields (pdp, clusters, clusterCacheKey). Previously
this replaced the entire data column, silently destroying those fields.

The fix adds _merge_personal_fields() which deserializes both blobs and
copies Personal-owned fields from existing data when absent in incoming
data. When the Personal app writes (including all fields), no merge is
needed and the incoming blob passes through unchanged.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical data loss issue where Personal app-specific data fields were being inadvertently deleted when the Pro app saved diagram data. The solution introduces a robust merging mechanism that ensures these sensitive fields are preserved during updates, thereby maintaining data integrity across different application contexts without altering the external interface or data structure.

Highlights

  • Root Cause Identified: The Diagram.update_with_version_check() method was found to replace the entire pickle blob when new_data was provided, leading to the silent destruction of Personal-owned fields (like pdp, clusters, clusterCacheKey) during Pro app saves.
  • Data Preservation Logic Implemented: A new function, _merge_personal_fields(), was added to diagram.py. This function deserializes both existing and incoming data blobs and intelligently preserves Personal-owned fields from the existing data if they are absent in the incoming data.
  • Zero Overhead for Full Data Saves: The new merge logic is designed to have zero overhead when all Personal-owned fields are present in the incoming data (e.g., during Personal app writes), as it returns the incoming blob unchanged in such cases.
  • No API or Schema Changes: The fix is entirely contained within the model layer, ensuring no changes to existing APIs or database schemas.
Changelog
  • btcopilot/pro/models/diagram.py
    • Imported the logging module.
    • Defined _log for logging and PERSONAL_OWNED_FIELDS tuple to list fields owned by the Personal app.
    • Implemented _merge_personal_fields function to handle merging existing and incoming data blobs, specifically preserving Personal-owned fields if they are missing from the incoming data.
    • Modified the update_with_version_check method to call _merge_personal_fields when new_data is provided, ensuring personal data is not overwritten.
  • btcopilot/tests/personal/test_fr2_concurrent_writes.py
    • Created a new test file to address the FR-2 bug.
    • Included unit tests for _merge_personal_fields to verify PDP and cluster preservation, correct handling of incoming data, and no-op scenarios.
    • Added integration tests to simulate Pro app saves and confirm that Personal PDP and clusters are preserved.
    • Implemented an HTTP integration test to validate the fix in a full concurrent write scenario through the Personal PUT endpoint.
Activity
  • Five unit tests were added for _merge_personal_fields covering PDP preservation, cluster preservation, incoming data respect, empty existing data, and all-fields-present pass-through.
  • Three integration tests were implemented to verify Pro save preserves PDP, Pro save preserves clusters, and Personal save can update PDP.
  • One HTTP integration test was added to simulate a full concurrent write scenario through the Personal PUT endpoint.
  • All 73 existing personal tests passed, with 11 e2e tests deselected.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request aims to resolve an issue where personal data was being lost during saves from the Pro app by introducing a _merge_personal_fields function to intelligently merge incoming and existing data. However, a critical security vulnerability has been identified: the use of pickle.loads() on untrusted user input within the _merge_personal_fields function. This allows for Remote Code Execution (RCE) via malicious pickle payloads in diagram data, including potential persistent RCE from database-stored payloads. It is strongly recommended to replace pickle with a safer serialization format like JSON for all user-supplied and database-stored diagram data. While the core logic for fixing data loss is sound, this security vulnerability must be addressed before merging.


import PyQt5.sip # Required for unpickling QtCore objects

incoming = pickle.loads(incoming_blob)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-critical critical

The _merge_personal_fields function uses pickle.loads() on incoming_blob without validation, which is a critical security vulnerability. This incoming_blob originates from user-supplied data via the update route in btcopilot/personal/routes/diagrams.py, allowing an attacker to provide a malicious pickle payload for Remote Code Execution. The pickle module is not secure for untrusted data. It is highly recommended to switch to a safer serialization format like JSON, potentially implementing custom serialization/deserialization logic for complex objects.

# Incoming data has all Personal-owned fields — nothing to merge.
return incoming_blob

existing = pickle.loads(existing_blob)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-critical critical

The function _merge_personal_fields also calls pickle.loads() on existing_blob, which is the current diagram data stored in the database (self.data). Since self.data can be updated by users via the update route with arbitrary data (as seen in the call to _merge_personal_fields on line 194), an attacker can store a malicious pickle payload in the database. Subsequent calls to this function will then execute the payload, leading to persistent Remote Code Execution (RCE).

@patrickkidd-hurin patrickkidd-hurin self-assigned this Mar 4, 2026
@patrickkidd
Copy link
Copy Markdown
Owner

This needs to happen but is not a priority right now.

@patrickkidd
Copy link
Copy Markdown
Owner

Ready for Patrick's review — CI passing, no conflicts. Tracked in patrickkidd/theapp#101.

@patrickkidd-hurin
Copy link
Copy Markdown
Collaborator Author

\ud83d\udce9 Task spawned \u2014 addressing your comment.

@patrickkidd-hurin
Copy link
Copy Markdown
Collaborator Author

Acknowledged — no action needed. PR is ready for your review. CI green, no conflicts.

@patrickkidd-hurin
Copy link
Copy Markdown
Collaborator Author

\ud83d\udce9 Follow-up queued \u2014 resuming agent session to address 3 comment(s).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants