Add data refresh capability for Python-based tables with automatic derived table regeneration #225
base: dev
Co-authored-by: Chenglong-MS <93549116+Chenglong-MS@users.noreply.github.com>
…ssues Co-authored-by: Chenglong-MS <93549116+Chenglong-MS@users.noreply.github.com>
Co-authored-by: Chenglong-MS <93549116+Chenglong-MS@users.noreply.github.com>
Chenglong-MS
left a comment
please fix issues
This is incorrect. We should implement this function as recalculate_derived_data, with two sub-functions:
- recalc_derived_data_py(): updates a table derived with Python. In this case, we provide a list of input tables and the derived table's code, then return the new derived table by applying the Python code to the updated inputs.
- recalc_derived_data_sql(): simply reruns the query against DuckDB.
Decide which one to call based on whether the derived data is virtual in the backend. Handle each table update independently: when a table is updated, first find the list of derived tables that are affected, then for each affected table, gather its actual input data and rerun the code to get the new content and update it. We don't need to do this recursively.
But for now, let's only do recalc_derived_data_py, as we first only support python data refresh. This simplifies the refresh logic for remote/duckdb data.
Refactored to recalculate_derived_data endpoint with recalc_derived_data_py() function. Virtual/SQL table support skipped - returns 'skipped' status. Each affected table is processed independently with actual input data. Fixed in commit f611d1f.
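The flow described in this thread could be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the table layout (`rows`, `virtual`, `derive.sources`, `derive.code`), the `transform` entry point, and the `"ok"`/`"error"` statuses are assumptions made for the example; only the `'skipped'` status and the function names come from the discussion. A table is treated as affected when it lists the updated table among its sources, and there is no recursion.

```python
from typing import Any

Rows = list[dict[str, Any]]


def recalc_derived_data_py(code: str, input_tables: list[Rows]) -> Rows:
    """Re-run a derived table's Python code on the updated input tables."""
    namespace: dict[str, Any] = {}
    # Sketch only: exec() runs in-process here, whereas the PR runs the
    # transformation in a subprocess. `transform` is a hypothetical entry point.
    exec(code, namespace)
    return namespace["transform"](*input_tables)


def recalculate_derived_data(tables: dict[str, dict], updated: str) -> dict[str, dict]:
    """Recompute every derived table that directly depends on `updated`."""
    results: dict[str, dict] = {}
    for name, table in tables.items():
        derive = table.get("derive")
        if derive is None or updated not in derive["sources"]:
            continue  # not a derived table, or not affected by this update
        if table.get("virtual"):
            # Virtual (DuckDB) tables are not refreshed yet.
            results[name] = {"status": "skipped"}
            continue
        # Gather the actual, current rows of every input table.
        inputs = [tables[src]["rows"] for src in derive["sources"]]
        try:
            rows = recalc_derived_data_py(derive["code"], inputs)
            results[name] = {"status": "ok", "rows": rows}
        except Exception as exc:  # report per-table failures independently
            results[name] = {"status": "error", "message": str(exc)}
    return results
```

Because each affected table is handled on its own, a failure in one derived table's code does not prevent the others from being recomputed.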
We don't need to update the concept shelf, since a refresh requires all columns to be the same.
Removed concept shelf update from updateTableRows action since columns must be identical. Fixed in commit f611d1f.
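The "columns must be identical" rule can be illustrated with a small check. The real action lives in dfSlice.tsx; the Python below is only a sketch of the rule, and the table shape (`names`, `rows`) and the function name `update_table_rows` are assumptions for the example.

```python
from typing import Any


def update_table_rows(table: dict[str, Any], new_rows: list[dict[str, Any]]) -> dict[str, Any]:
    """Return a copy of `table` with its rows replaced, keeping the columns fixed."""
    expected = set(table["names"])
    for index, row in enumerate(new_rows):
        if set(row) != expected:
            raise ValueError(
                f"row {index} has columns {sorted(row)}, expected {sorted(expected)}"
            )
    # Columns are unchanged, so nothing that depends on the schema needs updating.
    return {**table, "rows": list(new_rows)}
```

Since the column set cannot change, anything keyed on column names stays valid after the refresh, which is why no concept shelf update is needed.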
Let's first implement refresh only for Python (local data) and skip database refresh for now. When users need to replace the data, they can upload new data from a file or URL, or paste it.
let's gray out the refresh button for virtual tables (duckdb-based)
Refresh button now disabled (grayed out) for virtual (DuckDB) tables with message "DuckDB tables not supported". Only Python/local data refresh is supported. Fixed in commit f611d1f.
Refresh button disabled for virtual tables with explanatory message. Fixed in commit f611d1f.
…able for virtual tables Co-authored-by: Chenglong-MS <93549116+Chenglong-MS@users.noreply.github.com>
All issues addressed in commit f611d1f:
Build succeeds and Python syntax validated.
Users need to update base data after creating charts without losing derived transformations. Currently no mechanism exists to refresh data and propagate changes to dependent tables.
Changes
Backend (tables_routes.py)
- /api/tables/recalculate-derived-data endpoint that re-executes Python transformation code in a subprocess for security
- recalc_derived_data_py() function that processes each affected derived table independently
Frontend (DataThread.tsx)
- RefreshDataDialog component supporting file upload (CSV/XLSX/JSON) or raw JSON paste
State Management (dfSlice.tsx)
- updateTableRows action to update table data (columns must be identical)
Usage
Limitations
Security
Python transformations execute in subprocess with audit hooks blocking file writes and dangerous operations.
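As a rough illustration of this approach (not the PR's actual implementation), a child interpreter can install a hook with `sys.addaudithook` before any user code runs. The helper name `run_guarded`, the `transform` entry point, the JSON-over-stdin/stdout protocol, and the exact set of blocked events are all assumptions made for this sketch.

```python
import json
import subprocess
import sys

# Prepended to the user code inside the child process.
GUARD = '''
import json, os, sys

_WRITE_FLAGS = os.O_WRONLY | os.O_RDWR | os.O_CREAT | os.O_APPEND | os.O_TRUNC
_BLOCKED = {"os.system", "subprocess.Popen", "os.remove", "os.rename"}

def _deny(event, args):
    if event == "open":
        _path, mode, flags = args
        # builtin open() passes a mode string; os.open() passes None plus flags
        if mode is None:
            writing = bool(flags & _WRITE_FLAGS)
        else:
            writing = any(c in mode for c in "wax+")
        if writing:
            raise PermissionError("file writes are blocked")
    elif event in _BLOCKED:
        raise PermissionError(event + " is blocked")

sys.addaudithook(_deny)
'''

# Appended after the user code: read rows from stdin, write the result to stdout.
RUNNER = '''
print(json.dumps(transform(json.load(sys.stdin))))
'''


def run_guarded(user_code: str, rows: list, timeout: float = 30.0) -> list:
    """Run `user_code` (which must define transform(rows)) in a guarded child."""
    script = GUARD + "\n" + user_code + "\n" + RUNNER
    proc = subprocess.run(
        # -I: isolated mode; -B: do not write .pyc files
        [sys.executable, "-I", "-B", "-c", script],
        input=json.dumps(rows),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode != 0:
        raise RuntimeError(proc.stderr.strip() or "transformation failed")
    return json.loads(proc.stdout)
```

Audit hooks cannot be removed once installed, which makes them a reasonable fit for a guard like this, though the Python documentation is clear that they are not a full sandbox. In this sketch, user code that prints to stdout would also corrupt the JSON result, so a real implementation would need a separate channel for the output.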