Build Databricks projects with AI coding assistants (Claude Code, Cursor, etc.) using MCP (Model Context Protocol).
The AI Dev Kit provides everything you need to build on Databricks using AI assistants:
- High-level Python functions for Databricks operations
- MCP server that exposes these functions as tools for AI assistants
- Skills that teach AI assistants best practices and patterns
Your Project
├── databricks-skills/  (Knowledge & Patterns: SKILLS teach Claude HOW to do things)
│     • dabs-writer
│     • sdp-writer
│     • synthetic-data-gen
│     • databricks-sdk
└── .claude/mcp.json  (MCP Server Config: TOOLS execute actions on Databricks)
        → databricks-mcp-server
            │
            │  SKILLS (knowledge) + MCP TOOLS (actions)
            ▼
Claude Code
    "Create a DAB with a DLT pipeline and deploy to dev/prod"
      → uses SKILLS to know the patterns and best practices
      → uses MCP TOOLS to execute SQL, create pipelines, etc.
            │
            │  MCP Protocol
            ▼
databricks-mcp-server
    Exposes Python functions as MCP tools via stdio transport
      • execute_sql, execute_sql_multi
      • get_table_details, list_warehouses
      • run_python_file_on_databricks
      • ka_create, mas_create, genie_create (Agent Bricks)
      • create_pipeline, start_pipeline (SDP)
      • ... and more
            │
            │  Python imports
            ▼
databricks-mcp-core
    Pure Python library with high-level Databricks functions
      ├── sql/  (SQL execution, warehouses, table stats)
      ├── unity_catalog/  (catalogs, schemas, tables)
      ├── compute/  (execution contexts, run code on clusters)
      ├── spark_declarative_pipelines/  (DLT/SDP pipeline management)
      └── agent_bricks/  (Genie, Knowledge Assistants, MAS)
            │
            │  API calls
            ▼
Databricks Workspace
To get started, clone the repository and install both packages:

# Clone the repository
git clone https://github.com/databricks-solutions/ai-dev-kit.git
cd ai-dev-kit
# Install the core library
cd databricks-mcp-core
uv pip install -e .
# Install the MCP server
cd ../databricks-mcp-server
uv pip install -e .

Next, configure authentication to your Databricks workspace, using either environment variables or a profile:

# Option 1: Environment variables
export DATABRICKS_HOST="https://your-workspace.cloud.databricks.com"
export DATABRICKS_TOKEN="your-token"
# Option 2: Use a profile from ~/.databrickscfg
export DATABRICKS_CONFIG_PROFILE="your-profile"

In your project directory, create `.claude/mcp.json`:
{
"mcpServers": {
"databricks": {
"command": "uv",
"args": ["run", "python", "-m", "databricks_mcp_server.server"],
"cwd": "/path/to/ai-dev-kit/databricks-mcp-server",
"defer_loading": true
}
}
}

Replace `/path/to/ai-dev-kit` with the actual path where you cloned the repo.
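Before starting Claude Code, you can optionally confirm that the credentials configured above resolve. This is a minimal sketch using the `databricks-sdk` package (a dependency of the kit); the file name is illustrative:

```python
# check_auth.py (hypothetical file name). Assumes DATABRICKS_HOST/DATABRICKS_TOKEN
# or DATABRICKS_CONFIG_PROFILE are set as described above.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # resolves credentials from env vars or ~/.databrickscfg
print(w.current_user.me().user_name)  # prints the identity your credentials map to
```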
Skills teach Claude best practices and patterns:
# In your project directory
curl -sSL https://raw.githubusercontent.com/databricks-solutions/ai-dev-kit/main/databricks-skills/install_skills.sh | bash

This installs to `.claude/skills/`:
- dabs-writer: Databricks Asset Bundles patterns
- sdp-writer: Spark Declarative Pipelines (DLT)
- synthetic-data-generation: Realistic test data generation
- databricks-python-sdk: SDK and API usage
Finally, start Claude Code from your project directory:

cd /path/to/your/project
claude

Claude now has both skills (knowledge) and MCP tools (actions) for Databricks!
| Component | Description |
|---|---|
| databricks-mcp-core | Pure Python library with Databricks functions |
| databricks-mcp-server | MCP server wrapping core functions as tools |
| databricks-skills | Skills for Claude Code with patterns & examples |
The core library (databricks-mcp-core) is framework-agnostic. While databricks-mcp-server exposes it via MCP for Claude Code, you can use the same functions with any AI agent framework.
Call the functions directly from Python:

from databricks_mcp_core.sql import execute_sql, get_table_details, TableStatLevel
results = execute_sql("SELECT * FROM my_catalog.my_schema.customers LIMIT 10")
stats = get_table_details(
catalog="my_catalog",
schema="my_schema",
table_names=["customers", "orders"],
table_stat_level=TableStatLevel.DETAILED
)

Or wrap the same functions as tools for LangChain:

from langchain_core.tools import tool
from databricks_mcp_core.sql import execute_sql, get_table_details
from databricks_mcp_core.file import upload_folder
@tool
def run_sql(query: str) -> list:
"""Execute a SQL query on Databricks and return results."""
return execute_sql(query)
@tool
def get_table_info(catalog: str, schema: str, tables: list[str]) -> dict:
"""Get schema and statistics for Databricks tables."""
return get_table_details(catalog, schema, tables).model_dump()
@tool
def upload_to_workspace(local_path: str, workspace_path: str) -> dict:
"""Upload a local folder to Databricks workspace."""
result = upload_folder(local_path, workspace_path)
return {"success": result.success, "files": result.total_files}
# Use with any LangChain agent
tools = [run_sql, get_table_info, upload_to_workspace]

The same pattern works with the OpenAI Agents SDK:

from agents import Agent, function_tool
from databricks_mcp_core.sql import execute_sql
from databricks_mcp_core.spark_declarative_pipelines.pipelines import (
create_pipeline, start_update, get_update
)
@function_tool
def run_sql(query: str) -> list:
"""Execute a SQL query on Databricks."""
return execute_sql(query)
@function_tool
def create_sdp_pipeline(name: str, catalog: str, schema: str, notebook_paths: list[str]) -> dict:
"""Create a Spark Declarative Pipeline."""
result = create_pipeline(name, f"/Workspace/{name}", catalog, schema, notebook_paths)
return {"pipeline_id": result.pipeline_id}
agent = Agent(
name="Databricks Agent",
tools=[run_sql, create_sdp_pipeline],
)

This separation allows you to:
- Use the same Databricks functions across different agent frameworks
- Build custom integrations without MCP overhead
- Test functions directly in Python scripts (see the sketch below)
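As a sketch of the last point, here is a minimal standalone script that exercises a core function with no agent framework or MCP server involved. The file name and query are illustrative, and it assumes the Databricks authentication configured in the setup section:

```python
# smoke_test.py (illustrative name): run with `uv run python smoke_test.py`.
from databricks_mcp_core.sql import execute_sql

if __name__ == "__main__":
    # A trivial query to confirm the core library can reach your workspace.
    rows = execute_sql("SELECT current_user() AS user, current_catalog() AS catalog")
    print(rows)
```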
For more detail, see the component READMEs:

- databricks-mcp-core README - Core library details, all functions
- databricks-mcp-server README - Server configuration
- databricks-skills README - Skills installation and usage
To work on the kit itself:

# Clone the repo
git clone https://github.com/databricks-solutions/ai-dev-kit.git
cd ai-dev-kit
# Install with uv
uv pip install -e databricks-mcp-core
uv pip install -e databricks-mcp-server
# Run tests
cd databricks-mcp-core
uv run pytest tests/integration/ -v

© 2025 Databricks, Inc. All rights reserved. The source in this project is provided subject to the Databricks License.

This project depends on the following third-party packages:
| Package | License | Copyright |
|---|---|---|
| databricks-sdk | Apache License 2.0 | Copyright (c) Databricks, Inc. |
| fastmcp | MIT License | Copyright (c) 2024 Jeremiah Lowin |
| pydantic | MIT License | Copyright (c) 2017 Samuel Colvin |
| sqlglot | MIT License | Copyright (c) 2022 Toby Mao |
| sqlfluff | MIT License | Copyright (c) 2019 Alan Cruickshank |