filer: switch workspace upload from import-file to /workspace/import #5165
Draft
shreyas-goenka wants to merge 1 commit into main
Replace POST /api/2.0/workspace-files/import-file/{path}?overwrite=… with
the multipart variant of POST /api/2.0/workspace/import (via the SDK's
Workspace.Upload + format=AUTO). The legacy endpoint is being deprecated
and the new endpoint is the strategic replacement.
Why multipart and not the JSON body of /workspace/import: the JSON body
is server-capped at 10 MiB. Multipart accepts the same sizes import-file
did (verified up to 250 MiB against a real workspace), so DAB users
shipping wheels/jars/large files keep working.
Error mapping moves from URL-status + message-regex to errorCode dispatch:
- 409 / 400 "Path (X) already exists" → 400 RESOURCE_ALREADY_EXISTS
- new: 400 INVALID_PARAMETER_VALUE "Requested node type [X] is
different from the existing node type [Y]" — also surfaced as
fileAlreadyExistsError, since from the caller's perspective there
*is* something at that path.
The mkdir + retry on 404 (CreateParentDirectories mode) and the 403
permissionError path are unchanged.
format=AUTO is end-to-end-verified for every workspace-filesystem object
type DABs care about, against bundle-dev:
- .py with magic header → NOTEBOOK (PYTHON), extension stripped
- .sql with magic header → NOTEBOOK (SQL), extension stripped
- .ipynb → NOTEBOOK (PYTHON), extension stripped
- .py without header → FILE
- .lvdash.json → DASHBOARD, extension preserved
- regular files → FILE
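The outcomes listed above can be written down as a small lookup, which may be handy as a table for test expectations. This is only a client-side restatement of the observed results, not the server's detection logic; the expected type and the expectedImport function are hypothetical names, and the header strings are the ones quoted in the PR description.

```go
package main

import (
	"bytes"
	"fmt"
	"path"
	"strings"
)

// expected describes what the list above reports for an upload with format=AUTO.
type expected struct {
	ObjectType string
	Language   string
	RemoteName string
}

func (e expected) String() string {
	return fmt.Sprintf("%s %s %s", e.ObjectType, e.Language, e.RemoteName)
}

// expectedImport restates the verified outcomes for a given file name and content.
// It mirrors the observations only and is an assumption about nothing else.
func expectedImport(name string, content []byte) expected {
	ext := path.Ext(name)
	stripped := strings.TrimSuffix(name, ext)
	switch {
	case strings.HasSuffix(name, ".lvdash.json"):
		// Dashboards keep their extension.
		return expected{"DASHBOARD", "", name}
	case ext == ".ipynb":
		return expected{"NOTEBOOK", "PYTHON", stripped}
	case ext == ".py" && bytes.HasPrefix(content, []byte("# Databricks notebook source")):
		return expected{"NOTEBOOK", "PYTHON", stripped}
	case ext == ".sql" && bytes.HasPrefix(content, []byte("-- Databricks notebook source")):
		return expected{"NOTEBOOK", "SQL", stripped}
	}
	// Everything else, including .py without the header, stays a plain file.
	return expected{"FILE", "", name}
}

func main() {
	fmt.Println(expectedImport("etl.py", []byte("# Databricks notebook source\nprint(1)")))
	fmt.Println(expectedImport("etl.py", []byte("print(1)")))
	fmt.Println(expectedImport("sales.lvdash.json", []byte("{}")))
}
```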
Tests: extends libs/filer/workspace_files_client_test.go to cover the new
errorCode-based branches; libs/testserver/handlers.go gains a multipart
handler at /workspace/import; acceptance/internal/prepare_server.go
normalizes multipart bodies (sorted form fields, file parts recorded as
{filename,size}) so recorded-request fixtures stay deterministic. ~70
acceptance fixtures regenerated to reflect the new endpoint and request
shape; one (bundle/upload/internal_server_error) shows the deployment
locker now hits the stub before the user files do — same coverage, just
a different error site.
Co-authored-by: Isaac
Summary
Replace POST /api/2.0/workspace-files/import-file/{path}?overwrite=… with the multipart variant of POST /api/2.0/workspace/import (via the SDK's Workspace.Upload + format=AUTO). The legacy endpoint is being deprecated and the new endpoint is the strategic replacement.
Why multipart and not the JSON body of /workspace/import: the JSON body is server-capped at 10 MiB. Multipart accepts the same sizes import-file did (verified up to 250 MiB against a real workspace), so DAB users shipping wheels/jars/large files keep working.
Error mapping
Moves from URL-status + message-regex to errorCode dispatch:
- 409 / 400 "Path (X) already exists" → 400 RESOURCE_ALREADY_EXISTS
- new: 400 INVALID_PARAMETER_VALUE "Requested node type [X] is different from the existing node type [Y]", also surfaced as fileAlreadyExistsError, since from the caller's perspective there is something at that path.
The mkdir + retry on 404 (CreateParentDirectories mode) and the 403 permissionError path are unchanged.
End-to-end verification
format=AUTO is verified for every workspace-filesystem object type DABs care about, against a real workspace. Input and resulting object_type:
- .py with # Databricks notebook source → NOTEBOOK (PYTHON), extension stripped
- .sql with -- Databricks notebook source → NOTEBOOK (SQL), extension stripped
- .ipynb → NOTEBOOK (PYTHON), extension stripped
- .py without header → FILE
- .lvdash.json → DASHBOARD, extension preserved
- regular files → FILE
- large file (over the 10 MiB JSON-body cap) → FILE (uploaded successfully; would have failed with the JSON body)
Alerts / jobs / pipelines / schemas / etc. are not files in the workspace; they're created via dedicated REST APIs and don't go through the filer.
Test plan
- libs/filer/workspace_files_client_test.go updated for new error branches.
- libs/testserver/handlers.go extended with multipart handler at /workspace/import.
- acceptance/internal/prepare_server.go normalizes multipart bodies (sorted form fields, file parts recorded as {filename, size}) so request fixtures stay deterministic.
This pull request and its description were written by Isaac.