10 changes: 8 additions & 2 deletions .gitignore
@@ -54,11 +54,13 @@ coverage.xml
 *.mo
 *.pot

-# Django stuff:
+# Django
+staticfiles/
 *.log
 local_settings.py
 db.sqlite3
 db.sqlite3-journal
+*.sqlite3

 # Flask stuff:
 instance/
@@ -102,6 +104,9 @@ celerybeat.pid

 # Environments
 .env
+.env.*
+!.env.example
+!.env.sample
 .venv
 env/
 venv/
@@ -129,6 +134,7 @@ dmypy.json

 # uv
 **/uv.lock
+!django-postgres-app/uv.lock

 # Visual Studio Code
 .vscode/
@@ -140,7 +146,7 @@ ml-models.iml
 ml-models.ipr

 .DS_Store
-.databricks
+.databricks/
 .gradio

 # Node.js / Next.js / Web development
136 changes: 136 additions & 0 deletions django-postgres-app/README.md
@@ -0,0 +1,136 @@
# Django on Databricks Apps

This template deploys a Django application on [Databricks Apps](https://www.databricks.com/product/databricks-apps) with [Lakebase Autoscaling](https://docs.databricks.com/aws/en/oltp/projects/) as the PostgreSQL backend. It stays close to standard Django, adding code only where necessary to interface with Lakebase.

## Project structure

```
databricks.yml        # Databricks Asset Bundle: Lakebase project, branch, and app
app.yaml              # Databricks Apps runtime configuration
entrypoint.sh         # Startup script: collectstatic, ensure_schema, migrate, gunicorn
config/settings.py    # Django settings (database, CSRF, static files)
lakebase/base.py      # Custom DB backend: injects OAuth tokens via Databricks SDK
todos/                # Sample Django app (CRUD todo list)
```

## Note on SECRET_KEY

The template ships with an insecure fallback `SECRET_KEY` for local development. Set a real key via the `SECRET_KEY` environment variable before deploying to production.
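
One way to produce a suitable value is Python's standard-library `secrets` module. The sketch below is only an illustration and is not part of the template: the helper name `generate_secret_key` and the 50-byte default are assumptions.

```python
import secrets


def generate_secret_key(nbytes: int = 50) -> str:
    """Return a random URL-safe string that can be used as SECRET_KEY.

    Hypothetical helper, not part of the template; nbytes=50 is an assumed default.
    """
    return secrets.token_urlsafe(nbytes)


if __name__ == "__main__":
    # Print a fresh key to copy into the SECRET_KEY environment variable.
    print(generate_secret_key())
```

Generate the key once and store it with your deployment configuration. Regenerating it on every start would invalidate existing sessions.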

## Deploy to Databricks

### 1. Authenticate

```console
databricks auth login -p <PROFILE>
```

### 2. Deploy the bundle

```console
databricks bundle deploy -p <PROFILE> -t dev
```

Expect this first deploy to fail: the Lakebase database does not exist yet. Steps 3 and 4 create it and redeploy.

### 3. Create the Lakebase database

DABs do not yet support `postgres_databases` as a resource type. Create the database via the API instead.

1. Look up the role for your branch:

```console
databricks api get /api/2.0/postgres/projects/django/branches/development/roles -p <PROFILE>
```

2. Note the role `name` field, which should look something like `projects/django/branches/development/roles/rol-xxxx-xxxxxxxx`. Then, create the database:

```console
databricks api post \
'/api/2.0/postgres/projects/django/branches/development/databases?database_id=django-app' \
-p <PROFILE> \
--json '{"spec": {"postgres_database": "django_app", "role": "<ROLE NAME>"}}'
```
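
If you script this step, the request path and JSON body can be assembled programmatically. The following is a minimal sketch that only builds the two strings passed to `databricks api post`; it does not call the API. The helper name `build_create_database_request` is hypothetical, and the project, branch, and database values are the ones from the command above.

```python
import json
from urllib.parse import quote, urlencode


def build_create_database_request(
    project: str,
    branch: str,
    database_id: str,
    postgres_database: str,
    role_name: str,
) -> tuple[str, str]:
    """Return (path_with_query, json_body) for the create-database call.

    Hypothetical helper: it mirrors the CLI command shown above and
    performs no network access.
    """
    path = (
        f"/api/2.0/postgres/projects/{quote(project, safe='')}"
        f"/branches/{quote(branch, safe='')}/databases"
    )
    query = urlencode({"database_id": database_id})
    body = {"spec": {"postgres_database": postgres_database, "role": role_name}}
    return f"{path}?{query}", json.dumps(body)


if __name__ == "__main__":
    url, payload = build_create_database_request(
        "django",
        "development",
        "django-app",
        "django_app",
        "<ROLE NAME>",  # replace with the role `name` from step 1
    )
    print(url)
    print(payload)
```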

### 4. Deploy again

```console
databricks bundle deploy -p <PROFILE> -t dev
```

### 5. Start the app

```console
databricks bundle run django_app -p <PROFILE> -t dev
```

## Deploy as a git-backed app

Instead of syncing files to a workspace folder, you can deploy directly from a Git repository. The app reads code from a Git reference (branch, tag, or commit) each time you deploy. See the [Databricks docs](https://docs.databricks.com/aws/en/dev-tools/databricks-apps/deploy) for full details.

### 1. Create the app and configure the Git repository

In the Databricks UI:

1. Go to **Compute → Apps** and click **Create app**.
2. In the **Configure Git repository** step, enter your repository URL (e.g. `https://github.com/org/repo`) and select the Git provider.
3. For private repositories, click **Configure Git credential** on the app details page to give the app's service principal access. Public repositories do not require a credential.
4. Click **Create app**.

Or with the CLI:

```console
databricks apps create my-app-name \
--git-url https://github.com/org/repo \
--git-provider github
```

### 2. Deploy from Git

In the Databricks UI:

1. On the app details page, click **Deploy** and select **From Git**.
2. Enter the **Git reference** (`main`, a tag like `v1.0.0`, or a commit SHA).
3. Select the **Reference type** (branch, tag, or commit).
4. (Optional) Set **Source code path** if the app lives in a subdirectory of the repo (e.g. `django-postgres-app/`).
5. Click **Deploy**.

Or with the CLI:

```console
databricks apps deploy my-app-name \
--git-reference main \
--git-reference-type branch
```

To deploy from a subdirectory, add `--source-code-path django-postgres-app/`.

### 3. Redeploy after changes

Push your changes to the Git repository, then click **Deploy** again (or re-run the CLI deploy command). Databricks pulls the latest commit from the configured reference.

> **Note:** The Lakebase database still needs to be created separately, as described in step 3 of "Deploy to Databricks" above. Git-backed deployment only changes how the app source code is delivered; resource provisioning remains the same.

## Run locally

1. Look up the `host` and endpoint `name`:

```console
databricks api get /api/2.0/postgres/projects/django/branches/development/endpoints -p <PROFILE>
```

2. Start the app with `run-local`, passing connection details as environment variables.

```console
# Run the app (assumes a virtual environment at .venv in the project directory)
PATH=".venv/bin:$PATH" databricks apps run-local -p <PROFILE> \
--env PGENDPOINT=<ENDPOINT NAME> \
--env PGHOST=<HOST> \
--env PGPORT=5432 \
--env PGUSER=<YOUR USER> \
--env PGDATABASE=django_app \
--env PGSSLMODE=require \
--env DJANGO_DEV_EMAIL=<YOUR EMAIL>
```

`DJANGO_DEV_EMAIL` enables admin access during local development since the `X-Forwarded-Email` header is not present outside Databricks Apps. The authentication middleware falls back to this variable when set.
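
The fallback order can be summarized as a small pure function. This is a simplified sketch of the lookup performed by the middleware in `config/auth.py`, not the middleware itself; the function name `resolve_user_email` is hypothetical.

```python
import os


def resolve_user_email(meta, environ=os.environ):
    """Return the email to authenticate with, or None if there is none.

    Hypothetical helper. `meta` stands in for Django's request.META and
    `environ` for the process environment.
    """
    # Inside Databricks Apps the proxy sets X-Forwarded-Email.
    email = meta.get("HTTP_X_FORWARDED_EMAIL")
    if not email:
        # Outside Databricks Apps, fall back to the local-development variable.
        email = environ.get("DJANGO_DEV_EMAIL")
    return email or None
```

Because the header is checked first, setting `DJANGO_DEV_EMAIL` has no effect on requests that already carry `X-Forwarded-Email`.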
4 changes: 4 additions & 0 deletions django-postgres-app/app.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
command: ["bash", "entrypoint.sh"]
env:
  - name: PGENDPOINT
    valueFrom: postgres
Empty file.
64 changes: 64 additions & 0 deletions django-postgres-app/config/auth.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
"""
Databricks Apps authentication for Django.

Databricks Apps authenticates users before requests reach the app and
forwards the user's email in the X-Forwarded-Email header. This module
wires that header into Django's RemoteUser infrastructure so that:
  - Every authenticated Databricks user gets a Django user automatically.
  - If DJANGO_SUPERUSERS is set (comma-separated emails), those users
    are granted staff + superuser access on first login.
  - Otherwise, the first user to access the app becomes the superuser.
    Additional superusers can be managed via the admin portal.
"""

import os

from django.contrib.auth import authenticate, login, logout
from django.contrib.auth.backends import RemoteUserBackend


class DatabricksAppsMiddleware:
    """Authenticate users via the X-Forwarded-Email header set by Databricks Apps."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        email = request.META.get("HTTP_X_FORWARDED_EMAIL")
        if not email:
            email = os.environ.get("DJANGO_DEV_EMAIL")
        if email and not request.user.is_authenticated:
            user = authenticate(request, remote_user=email)
            if user:
                request.user = user
                login(request, user)
        elif not email and request.user.is_authenticated:
            logout(request)
        return self.get_response(request)


class DatabricksAppsBackend(RemoteUserBackend):
    create_unknown_user = True

    def configure_user(self, request, user, created=False):
        if created:
            from django.contrib.auth import get_user_model

            User = get_user_model()
            user.email = user.username

            superuser_csv = os.environ.get("DJANGO_SUPERUSERS", "")
            if superuser_csv.strip():
                superuser_emails = {
                    e.strip().lower() for e in superuser_csv.split(",") if e.strip()
                }
                is_superuser = user.email.lower() in superuser_emails
            else:
                # No explicit list: first user becomes superuser.
                is_superuser = not User.objects.filter(is_superuser=True).exists()

            if is_superuser:
                user.is_staff = True
                user.is_superuser = True
            user.save()
        return user
147 changes: 147 additions & 0 deletions django-postgres-app/config/settings.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,147 @@
"""
Django settings for the Databricks Apps + Lakebase template.

The only Databricks-specific configuration here is the database ENGINE
("lakebase") which handles OAuth token injection transparently.
Everything else is standard Django.
"""

import os
from pathlib import Path

from utils import get_schema_name

BASE_DIR = Path(__file__).resolve().parent.parent

# Security
# WARNING: Set a real SECRET_KEY via environment variable in production.
# The fallback below is only safe for local development.
SECRET_KEY = os.environ.get(
    "SECRET_KEY",
    "django-insecure-change-me-in-production",
)
DEBUG = os.environ.get("DEBUG", "").lower() in ("true", "1", "yes")
# Databricks Apps run behind a reverse proxy and aren't directly exposed to the
# public internet, so we accept all hosts at the Django layer. If you deploy this
# outside Databricks Apps, replace "*" with your real domain(s) to re-enable
# Django's Host header protection.
ALLOWED_HOSTS = ["*"]


# CSRF: Required for POST requests over HTTPS (Django 4.0+).
#
# Browsers send Origin/Referer for the app's *public* URL on *.databricksapps.com,
# not the workspace host in DATABRICKS_HOST (e.g. adb-....azuredatabricks.net).
# Wildcard entries match subdomains per Django's CSRF docs.
# Optional: set DJANGO_CSRF_TRUSTED_ORIGINS="https://a.com,https://b.com" for extras.
def _csrf_trusted_origins():
    origins = []
    raw = os.environ.get("DATABRICKS_HOST", "").strip().rstrip("/")
    if raw:
        if not raw.startswith(("http://", "https://")):
            raw = f"https://{raw}"
        origins.append(raw)
    extra = os.environ.get("DJANGO_CSRF_TRUSTED_ORIGINS", "")
    for part in extra.split(","):
        part = part.strip().rstrip("/")
        if part and part not in origins:
            origins.append(part)
    for pattern in (
        "https://*.databricksapps.com",
        "https://*.cloud.databricksapps.com",
    ):
        if pattern not in origins:
            origins.append(pattern)
    return origins


CSRF_TRUSTED_ORIGINS = _csrf_trusted_origins()

SESSION_COOKIE_SECURE = not DEBUG
CSRF_COOKIE_SECURE = not DEBUG

# Application definition
INSTALLED_APPS = [
    "django.contrib.admin",
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.sessions",
    "django.contrib.messages",
    "django.contrib.staticfiles",
    "todos",
]

MIDDLEWARE = [
    "django.middleware.security.SecurityMiddleware",
    "whitenoise.middleware.WhiteNoiseMiddleware",
    "django.contrib.sessions.middleware.SessionMiddleware",
    "django.middleware.common.CommonMiddleware",
    "django.middleware.csrf.CsrfViewMiddleware",
    "django.contrib.auth.middleware.AuthenticationMiddleware",
    "config.auth.DatabricksAppsMiddleware",
    "django.contrib.messages.middleware.MessageMiddleware",
    "django.middleware.clickjacking.XFrameOptionsMiddleware",
]

AUTHENTICATION_BACKENDS = [
    "config.auth.DatabricksAppsBackend",
]

ROOT_URLCONF = "config.urls"

TEMPLATES = [
    {
        "BACKEND": "django.template.backends.django.DjangoTemplates",
        "DIRS": [],
        "APP_DIRS": True,
        "OPTIONS": {
            "context_processors": [
                "django.template.context_processors.request",
                "django.contrib.auth.context_processors.auth",
                "django.contrib.messages.context_processors.messages",
            ],
        },
    },
]

WSGI_APPLICATION = "config.wsgi.application"


# Database
# https://docs.djangoproject.com/en/5.2/ref/settings/#databases
#
# The "lakebase" engine extends Django's PostgreSQL backend to inject
# OAuth tokens via the Databricks SDK. No password is needed in settings.


DATABASES = {
    "default": {
        "ENGINE": "lakebase",
        "NAME": os.environ.get("PGDATABASE", ""),
        "USER": os.environ.get("PGUSER", ""),
        "HOST": os.environ.get("PGHOST", ""),
        "PORT": os.environ.get("PGPORT", ""),
        "OPTIONS": {
            "sslmode": os.environ.get("PGSSLMODE", "require"),
            "options": f"-c search_path={get_schema_name()}",
        },
    }
}

# Internationalization
LANGUAGE_CODE = "en-us"
TIME_ZONE = "UTC"
USE_I18N = True
USE_TZ = True

# Static files
STATIC_URL = "static/"
STATIC_ROOT = BASE_DIR / "staticfiles"
STORAGES = {
    "staticfiles": {
        "BACKEND": "whitenoise.storage.CompressedManifestStaticFilesStorage",
    },
}

# Default primary key field type
DEFAULT_AUTO_FIELD = "django.db.models.BigAutoField"