feat: add example datasets and workflows to Docker Compose#4247

Open
bobbai00 wants to merge 3 commits into apache:main from bobbai00:feat/add-example-dataset-workflow

bobbai00 (Contributor) commented Feb 28, 2026

What changes were proposed in this PR?

This PR adds example datasets and workflows that are automatically loaded into Texera when using Docker Compose, so new users have sample data to explore immediately after startup.

What gets loaded:

  • 2 datasets: Iris Species (5KB CSV), TMDb Popular Movies (327KB CSV)
  • 2 workflows: ML on Iris Dataset, Data Exploration on Movies Dataset

Key design choices:

  • Uses Docker Compose profiles: the loader runs only when explicitly opted in via docker compose --profile examples up, not on every startup
  • Uses a stock alpine:latest image with runtime apk add — no custom image build required
  • Idempotent — skips datasets/workflows that already exist
  • Admin credentials (USER_SYS_ADMIN_USERNAME/USER_SYS_ADMIN_PASSWORD) are defined in .env and shared between the Texera backend and the example loader
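
The idempotency choice above can be sketched roughly as follows. This is a minimal illustration, not the PR's load-examples.sh: the function names (list_existing, already_loaded, load_if_missing) and the EXISTING variable are hypothetical, and list_existing is a local stub standing in for whatever call the real script makes to ask Texera which datasets and workflows already exist.

```shell
#!/bin/sh
# Hypothetical sketch of an idempotent loader; not the script from this PR.

# Stub: in a real loader this would query the Texera backend.
# Here it prints one known name per line from the EXISTING variable.
list_existing() {
  printf '%s\n' "$EXISTING"
}

# Succeeds (exit 0) only when the exact name is already present.
# -F: fixed string, -x: whole-line match, -q: quiet.
already_loaded() {
  list_existing | grep -Fxq -- "$1"
}

# Create the item only if it is missing, so a re-run changes nothing.
load_if_missing() {
  name="$1"
  if already_loaded "$name"; then
    echo "skip: $name"
  else
    echo "create: $name"
    # Record the new name; a real loader would upload the item here.
    EXISTING=$(printf '%s\n%s' "$EXISTING" "$name")
  fi
}

# Example run: the first name is pretended to exist already.
EXISTING="Iris Species"
for name in "Iris Species" "TMDb Popular Movies"; do
  load_if_missing "$name"
done
```

Running the loop a second time in the same shell would report both names as skipped, which is the property that makes repeated docker compose --profile examples up invocations safe.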

Files added:

  • bin/single-node/examples/ — datasets (CSV + descriptions), workflow JSONs, and load-examples.sh loader script
  • bin/single-node/docker-compose.yml — added example-data-loader service (Part 4)
  • bin/single-node/.env — added USER_SYS_ADMIN_USERNAME/USER_SYS_ADMIN_PASSWORD env vars

Any related issues, documentation, discussions?

No

How was this PR tested?

  1. Ran docker compose --profile examples up and verified:
    • The loader waits for services to be healthy
    • Datasets and workflows appear in the Texera UI
    • The loader container exits after loading
  2. Ran docker compose up (without --profile examples) and verified the loader does NOT start
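
The "waits for services to be healthy" behavior checked in step 1 could be implemented with a bounded polling loop along these lines. This is a sketch under assumptions: wait_until is a hypothetical helper, and the probe command and URL shown in the comment are placeholders rather than the endpoint the PR's script actually uses.

```shell
#!/bin/sh
# Hypothetical helper; not taken from the PR's load-examples.sh.

# wait_until MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds or MAX_TRIES attempts are used up.
# Returns 0 on success, 1 if the command never succeeded.
wait_until() {
  tries="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$tries" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Possible usage inside a loader (placeholder URL, assumed curl is installed):
#   wait_until 30 2 curl -fsS "http://backend.example/health" || exit 1
```

Bounding the number of attempts means the loader container exits with an error when the backend never becomes ready, instead of hanging indefinitely.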

Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Claude Opus 4.6)

chenlica (Contributor) commented:

@kunwp1 Can you review it?

chenlica requested a review from kunwp1 on February 28, 2026, 07:57
bobbai00 force-pushed the feat/add-example-dataset-workflow branch from ae4a92a to afcadca on February 28, 2026, 18:08
bobbai00 force-pushed the feat/add-example-dataset-workflow branch from afcadca to 47e2b5d on February 28, 2026, 18:09