Genie Chat Automated Setup

This directory contains scripts to automate the complete setup of the Genie Chat application.

What It Does

The setup script performs the following steps:

  1. Creates the Databricks App - Registers the app and obtains its service principal
  2. Creates a Serverless SQL Warehouse - A small, cost-optimized warehouse for queries
  3. Generates Sample Data - 2,000 sales records with geographic and time series data
  4. Creates Unity Catalog Table - Uploads data to {catalog}.{schema}.genie_demo_sales
  5. Creates a Genie Space - Configured with sample questions and column descriptions
  6. Grants Permissions - Automatically grants the app's service principal access to warehouse, table, and Genie Space
  7. Updates app.yaml - Configures the app with the Genie Space ID and resources
  8. Deploys the App - Uploads code and deploys to Databricks Apps
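The table portion of step 6 can be pictured as a few Unity Catalog GRANT statements. The sketch below only builds the SQL strings. The helper names `quote_ident` and `build_table_grants` are hypothetical, and it is an assumption that the script uses SQL for this at all; it may call the Databricks SDK or REST API instead. Warehouse and Genie Space access are typically granted through workspace permission settings rather than SQL, so they are not covered here.

```python
def quote_ident(name: str) -> str:
    """Backtick-quote an identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def build_table_grants(catalog: str, schema: str, table: str, principal: str) -> list:
    """Return the GRANT statements a principal needs to read one table.

    Hypothetical helper: not taken from the actual setup script.
    """
    cat, sch, tbl = (quote_ident(part) for part in (catalog, schema, table))
    who = quote_ident(principal)
    return [
        f"GRANT USE CATALOG ON CATALOG {cat} TO {who}",
        f"GRANT USE SCHEMA ON SCHEMA {cat}.{sch} TO {who}",
        f"GRANT SELECT ON TABLE {cat}.{sch}.{tbl} TO {who}",
    ]


if __name__ == "__main__":
    # "<service-principal-id>" is a placeholder for the app's service principal.
    for stmt in build_table_grants(
        "main", "genie_demo", "genie_demo_sales", "<service-principal-id>"
    ):
        print(stmt)
```

Reading a table in Unity Catalog requires privileges on the parent catalog and schema as well as on the table itself, which is why three statements are produced.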

Prerequisites

1. Node.js 18+ and npm

Required to build the frontend and backend.

# Check if installed
node --version  # Should be v18 or higher
npm --version

Install from nodejs.org if needed.

2. Python 3.8+

Required for the setup script.

# Check if installed
python3 --version
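The two version checks above could be scripted as a small preflight step. This is a minimal sketch, not part of the setup script; the function names `parse_major`, `node_is_supported`, and `python_is_supported` are hypothetical.

```python
import re
import shutil
import subprocess
import sys


def parse_major(version_text: str) -> int:
    """Extract the major version from output such as 'v18.17.0'."""
    match = re.search(r"(\d+)\.(\d+)", version_text)
    if match is None:
        raise ValueError(f"no version number found in {version_text!r}")
    return int(match.group(1))


def node_is_supported(version_text: str, minimum_major: int = 18) -> bool:
    """True when `node --version` output reports at least the minimum major version."""
    return parse_major(version_text) >= minimum_major


def python_is_supported(version_info=sys.version_info) -> bool:
    """True when the interpreter is Python 3.8 or newer."""
    return tuple(version_info[:2]) >= (3, 8)


if __name__ == "__main__":
    if shutil.which("node") is None:
        print("node not found on PATH")
    else:
        out = subprocess.run(
            ["node", "--version"], capture_output=True, text=True
        ).stdout
        print("node ok" if node_is_supported(out) else f"node too old: {out.strip()}")
    print("python ok" if python_is_supported() else "python too old")
```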

3. Databricks Authentication

Choose one method:

Option A: Databricks CLI (Recommended)

pip install databricks-cli
databricks configure

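Enter your workspace URL and personal access token when prompted.

Note: the databricks auth login and databricks current-user me commands used below are only available in the newer Databricks CLI (version 0.205 and above), not in the legacy databricks-cli pip package. If those commands are not recognized, install the newer CLI as described in the Databricks documentation.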

Option B: Environment Variables

export DATABRICKS_HOST=https://your-workspace.cloud.databricks.com
export DATABRICKS_TOKEN=dapi123456789...

Option C: OAuth (Browser-based)

export DATABRICKS_HOST=https://your-workspace.cloud.databricks.com
databricks auth login

This opens a browser for authentication - no token needed.

Verify authentication works:

databricks current-user me
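For Options B and C, a script has to read the host (and optionally the token) from the environment. The following is a sketch of one way to do that using only the standard library. The function names are hypothetical, and it is an assumption that a host given without a scheme should be treated as https.

```python
import os
from typing import Mapping, Optional, Tuple


def normalize_host(host: str) -> str:
    """Add an https:// scheme if missing and strip any trailing slash."""
    host = host.strip()
    if not host.startswith(("http://", "https://")):
        host = "https://" + host
    return host.rstrip("/")


def credentials_from_env(
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[str, Optional[str]]:
    """Return (host, token) read from DATABRICKS_HOST and DATABRICKS_TOKEN.

    The token is None when it is unset, for example when OAuth or a CLI
    profile is expected to supply the credentials instead.
    """
    env = os.environ if env is None else env
    host = env.get("DATABRICKS_HOST")
    if not host:
        raise RuntimeError("DATABRICKS_HOST is not set")
    return normalize_host(host), env.get("DATABRICKS_TOKEN") or None


if __name__ == "__main__":
    try:
        host, token = credentials_from_env()
        print(host, "(token set)" if token else "(no token, expecting OAuth/profile)")
    except RuntimeError as exc:
        print(f"Error: {exc}")
```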

4. Databricks Workspace Permissions

You need the following permissions:

  • CREATE CATALOG (or use an existing catalog you have access to)
  • CREATE SCHEMA (or use an existing schema)
  • CREATE TABLE
  • CREATE WAREHOUSE
  • CREATE GENIE SPACE

Usage

Quick Start

From the repository root:

./run_setup.sh \
  --workspace myworkspace.cloud.databricks.com \
  --catalog main \
  --schema genie_demo

All Options

# --skip-build skips the npm build if the app is already built.
# --skip-deploy skips deployment (setup only).
./run_setup.sh \
  --workspace myworkspace.cloud.databricks.com \
  --catalog main \
  --schema genie_demo \
  --table-name genie_demo_sales \
  --warehouse-name genie_demo_warehouse \
  --num-records 2000 \
  --model-endpoint databricks-meta-llama-3-3-70b-instruct \
  --app-name genie-chat \
  --skip-build \
  --skip-deploy

Options

| Option | Required | Default | Description |
| --- | --- | --- | --- |
| --workspace | Yes | - | Databricks workspace URL |
| --catalog | Yes | - | Unity Catalog name |
| --schema | Yes | - | Schema name |
| --table-name | No | genie_demo_sales | Demo table name |
| --warehouse-name | No | genie_demo_warehouse | Warehouse name |
| --num-records | No | 2000 | Number of sample records |
| --model-endpoint | No | databricks-meta-llama-3-1-70b-instruct | Model endpoint for chart recommendations |
| --app-name | No | genie-chat | Name for the Databricks App |
| --skip-build | No | false | Skip building frontend/backend |
| --skip-deploy | No | false | Skip app deployment |
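To make the required/default columns concrete, here is how the same options might be declared with Python's argparse. This is an illustrative sketch only: how run_setup.sh actually parses its arguments is not shown in this document, and `build_parser` is a hypothetical name. The defaults are copied from the table above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented options with their documented defaults."""
    parser = argparse.ArgumentParser(prog="run_setup.sh")
    parser.add_argument("--workspace", required=True, help="Databricks workspace URL")
    parser.add_argument("--catalog", required=True, help="Unity Catalog name")
    parser.add_argument("--schema", required=True, help="Schema name")
    parser.add_argument("--table-name", default="genie_demo_sales")
    parser.add_argument("--warehouse-name", default="genie_demo_warehouse")
    parser.add_argument("--num-records", type=int, default=2000)
    parser.add_argument(
        "--model-endpoint", default="databricks-meta-llama-3-1-70b-instruct"
    )
    parser.add_argument("--app-name", default="genie-chat")
    # Boolean flags: absent means False, present means True.
    parser.add_argument("--skip-build", action="store_true")
    parser.add_argument("--skip-deploy", action="store_true")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["--workspace", "myworkspace.cloud.databricks.com",
         "--catalog", "main", "--schema", "genie_demo"]
    )
    print(args)
```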

Sample Data

The generated genie_demo_sales table includes:

| Column | Type | Description |
| --- | --- | --- |
| id | BIGINT | Primary key |
| event_date | DATE | Sale date (2023-2024) |
| state_code | STRING | US state abbreviation (for choropleth maps) |
| state_name | STRING | Full state name |
| city | STRING | City name |
| latitude | DOUBLE | City latitude (for scatter geo maps) |
| longitude | DOUBLE | City longitude (for scatter geo maps) |
| category | STRING | Product category |
| region | STRING | US region (West, East, Midwest, South) |
| sales_amount | DOUBLE | Sale amount in USD |
| revenue | DOUBLE | Revenue/profit in USD |
| quantity | INT | Items sold |
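To give a feel for the shape of this data, the sketch below generates rows with the same twelve columns. It is not the script's actual generator: the city list, category names, value ranges, and the `generate_sales` function are all assumptions chosen for illustration.

```python
import random
from datetime import date, timedelta

# Hypothetical lookup: (state_code, state_name, city, latitude, longitude, region)
CITIES = [
    ("CA", "California", "Los Angeles", 34.05, -118.24, "West"),
    ("NY", "New York", "New York", 40.71, -74.01, "East"),
    ("IL", "Illinois", "Chicago", 41.88, -87.63, "Midwest"),
    ("TX", "Texas", "Houston", 29.76, -95.37, "South"),
]
CATEGORIES = ["Electronics", "Clothing", "Home", "Sports"]  # illustrative names


def generate_sales(num_records: int = 2000, seed: int = 42) -> list:
    """Return a list of dicts, one per sale, with dates in 2023-2024."""
    rng = random.Random(seed)  # seeded so repeated runs give identical data
    start = date(2023, 1, 1)
    span_days = (date(2024, 12, 31) - start).days
    rows = []
    for i in range(1, num_records + 1):
        code, state, city, lat, lon, region = rng.choice(CITIES)
        quantity = rng.randint(1, 20)
        sales = round(quantity * rng.uniform(5.0, 200.0), 2)
        rows.append({
            "id": i,
            "event_date": start + timedelta(days=rng.randint(0, span_days)),
            "state_code": code,
            "state_name": state,
            "city": city,
            "latitude": lat,
            "longitude": lon,
            "category": rng.choice(CATEGORIES),
            "region": region,
            "sales_amount": sales,
            # Assumed margin of 10-40% of the sale amount.
            "revenue": round(sales * rng.uniform(0.1, 0.4), 2),
            "quantity": quantity,
        })
    return rows


if __name__ == "__main__":
    for row in generate_sales(3):
        print(row)
```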

Test Queries

After setup, test these queries in the Genie Chat app:

  1. Choropleth Map: "Show total sales by state"
  2. Line Chart: "Show monthly revenue trend"
  3. Bar Chart: "Top 5 categories by sales"
  4. Scatter Geo: "Show all store locations on a map"
  5. Pie Chart: "What percentage of sales comes from each region?"

Troubleshooting

Authentication Errors

Error: Failed to authenticate with Databricks

Ensure Databricks CLI is configured:

databricks configure

Or set environment variables:

export DATABRICKS_HOST=https://myworkspace.cloud.databricks.com
export DATABRICKS_TOKEN=dapi...

Permission Errors

Error: User does not have CREATE on catalog

Contact your workspace admin to grant necessary permissions, or use an existing catalog/schema you have access to.

Warehouse Start Timeout

Error: Warehouse did not start within 300 seconds

This can happen during high-demand periods. Try again, or check the Databricks UI for warehouse status.
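A timeout like this usually comes from a poll-until-ready loop. The sketch below shows the general pattern under stated assumptions: `wait_for_state` and its parameters are hypothetical, `get_state` stands in for whatever call returns the warehouse status, and the 5-second poll interval is a guess. Only the 300-second limit comes from the error message above.

```python
import time
from typing import Callable


def wait_for_state(
    get_state: Callable[[], str],
    target: str = "RUNNING",
    timeout_s: float = 300.0,
    poll_s: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll get_state() until it returns target, or raise TimeoutError.

    clock and sleep are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if get_state() == target:
            return
        if clock() >= deadline:
            raise TimeoutError(
                f"Warehouse did not start within {timeout_s:g} seconds"
            )
        sleep(poll_s)


if __name__ == "__main__":
    # Simulated status source: two STARTING responses, then RUNNING.
    states = iter(["STARTING", "STARTING", "RUNNING"])
    wait_for_state(lambda: next(states), poll_s=0.01)
    print("warehouse is running")
```

If the timeout recurs, passing a larger `timeout_s` in a loop like this is the equivalent of simply retrying later.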

Manual Cleanup

To remove created resources:

-- Drop the table
DROP TABLE IF EXISTS {catalog}.{schema}.genie_demo_sales;

-- Delete the warehouse (via UI or CLI)
-- Compute → SQL Warehouses → Delete

-- Delete the Genie Space (via UI)
-- Genie → Click space → Settings → Delete