filer: switch workspace upload from import-file to /workspace/import #5165

Draft
shreyas-goenka wants to merge 1 commit into main from shreyas-goenka/import-api

@shreyas-goenka

Summary

Replace POST /api/2.0/workspace-files/import-file/{path}?overwrite=… with the multipart variant of POST /api/2.0/workspace/import (via the SDK's Workspace.Upload + format=AUTO). The legacy endpoint is being deprecated and the new endpoint is the strategic replacement.

Why multipart and not the JSON body of /workspace/import: the JSON body is server-capped at 10 MiB. Multipart accepts the same sizes import-file did (verified up to 250 MiB against a real workspace), so DAB users shipping wheels/jars/large files keep working.

Error mapping

Moves from URL-status + message-regex to errorCode dispatch:

  • 409 / 400 "Path (X) already exists" → 400 RESOURCE_ALREADY_EXISTS
  • new: 400 INVALID_PARAMETER_VALUE "Requested node type [X] is different from the existing node type [Y]" — also surfaced as fileAlreadyExistsError, since from the caller's perspective there is something at that path.

The mkdir + retry on 404 (CreateParentDirectories mode) and the 403 permissionError path are unchanged.

End-to-end verification

format=AUTO is verified for every workspace-filesystem object type DABs care about, against a real workspace:

| Local file | Workspace object_type |
| --- | --- |
| .py with # Databricks notebook source | NOTEBOOK (PYTHON), extension stripped |
| .sql with -- Databricks notebook source | NOTEBOOK (SQL), extension stripped |
| .ipynb | NOTEBOOK (PYTHON), extension stripped |
| .py without header | FILE |
| .lvdash.json | DASHBOARD, extension preserved |
| regular files | FILE |
| 60 MB binary | FILE (uploaded successfully; would have failed with JSON body) |

Alerts / jobs / pipelines / schemas / etc. are not files in the workspace; they're created via dedicated REST APIs and don't go through the filer.

Test plan

  • Unit tests in libs/filer/workspace_files_client_test.go updated for new error branches.
  • libs/testserver/handlers.go extended with multipart handler at /workspace/import.
  • acceptance/internal/prepare_server.go normalizes multipart bodies (sorted form fields, file parts recorded as {filename, size}) so request fixtures stay deterministic.
  • ~70 acceptance fixtures regenerated.
  • End-to-end verification against a real workspace for files, all notebook types, dashboards, and a 60 MB binary.
  • CI green on this PR.

This pull request and its description were written by Isaac.

Replace POST /api/2.0/workspace-files/import-file/{path}?overwrite=… with
the multipart variant of POST /api/2.0/workspace/import (via the SDK's
Workspace.Upload + format=AUTO). The legacy endpoint is being deprecated
and the new endpoint is the strategic replacement.

Why multipart and not the JSON body of /workspace/import: the JSON body
is server-capped at 10 MiB. Multipart accepts the same sizes import-file
did (verified up to 250 MiB against a real workspace), so DAB users
shipping wheels/jars/large files keep working.

Error mapping moves from URL-status + message-regex to errorCode dispatch:
  - 409 / 400 "Path (X) already exists" → 400 RESOURCE_ALREADY_EXISTS
  - new: 400 INVALID_PARAMETER_VALUE "Requested node type [X] is
    different from the existing node type [Y]" — also surfaced as
    fileAlreadyExistsError, since from the caller's perspective there
    *is* something at that path.
The mkdir + retry on 404 (CreateParentDirectories mode) and the 403
permissionError path are unchanged.

format=AUTO is end-to-end-verified for every workspace-filesystem object
type DABs care about, against bundle-dev:
  - .py with magic header → NOTEBOOK (PYTHON), extension stripped
  - .sql with magic header → NOTEBOOK (SQL), extension stripped
  - .ipynb → NOTEBOOK (PYTHON), extension stripped
  - .py without header → FILE
  - .lvdash.json → DASHBOARD, extension preserved
  - regular files → FILE

Tests: extends libs/filer/workspace_files_client_test.go to cover the new
errorCode-based branches; libs/testserver/handlers.go gains a multipart
handler at /workspace/import; acceptance/internal/prepare_server.go
normalizes multipart bodies (sorted form fields, file parts recorded as
{filename,size}) so recorded-request fixtures stay deterministic. ~70
acceptance fixtures regenerated to reflect the new endpoint and request
shape; one (bundle/upload/internal_server_error) shows the deployment
locker now hits the stub before the user files do — same coverage, just
a different error site.

Co-authored-by: Isaac