19 changes: 18 additions & 1 deletion .claude/skills/add-tools-langgraph/SKILL.md
@@ -7,6 +7,8 @@ description: "Add tools to your agent and grant required permissions in databric

> **Profile reminder:** All `databricks` CLI commands must include the profile from `.env`: `databricks <command> --profile <profile>`

> Don't have the resource yet? See **create-tools** skill first.

**After adding any MCP server to your agent, you MUST grant the app access in `databricks.yml`.**

Without this, you'll get permission errors when the agent tries to use the resource.
@@ -97,10 +99,25 @@ env:

**Critical:** Every `value_from` value must match a `name` field in `databricks.yml` resources.
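A quick way to catch mismatches before deploying is to compare the two files programmatically. A minimal sketch, assuming the parsed entries are plain Python dicts and sets (the exact shapes of `config.env` entries and `databricks.yml` resources here are illustrative, not the real schema):

```python
# Hypothetical sanity check: every `value_from` in config.env must name a
# resource declared in databricks.yml. Entry shapes are assumptions for
# illustration only.
def unmatched_value_from(env_entries: list[dict], resource_names: set[str]) -> list[str]:
    """Return value_from keys that have no matching resource name."""
    return [
        e["value_from"]
        for e in env_entries
        if "value_from" in e and e["value_from"] not in resource_names
    ]

# Example: one matched entry, one typo
env_entries = [
    {"name": "GENIE_SPACE_ID", "value_from": "genie-space"},
    {"name": "VS_INDEX", "value_from": "vector-serach-index"},  # typo
]
resource_names = {"genie-space", "vector-search-index"}
print(unmatched_value_from(env_entries, resource_names))  # → ['vector-serach-index']
```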

## MCP Error Handling

MCP tool calls can fail (network issues, permission errors, timeouts). Use `handle_tool_error` on MCP servers to catch errors and return them to the LLM instead of crashing the agent:

```python
DatabricksMCPServer(
name="genie",
url=f"{host}/api/2.0/mcp/genie/{space_id}",
handle_tool_error=True, # Return error messages to LLM instead of raising
timeout=60.0, # Increase timeout for slow tools like Genie
)
```

For local function tools defined with `@tool`, see `create-tools` skill > `examples/local-python-tools.md` for the `ToolException` + `handle_tool_error` pattern.

## Important Notes

- **MLflow experiment**: Already configured in template, no action needed
- **Multiple resources**: Add multiple entries under `resources:` list
- **Permission types vary**: Each resource type has specific permission values
- **Deploy + Run after changes**: Run both `databricks bundle deploy` AND `databricks bundle run agent_langgraph`
- **Deploy + Run after changes**: Run both `databricks bundle deploy` AND `databricks bundle run {{BUNDLE_NAME}}`
- **value_from matching**: Ensure `config.env` `value_from` values match `databricks.yml` resource `name` values
16 changes: 16 additions & 0 deletions .claude/skills/add-tools-openai/SKILL.md
@@ -7,6 +7,8 @@ description: "Add tools to your agent and grant required permissions in databric

> **Profile reminder:** All `databricks` CLI commands must include the profile from `.env`: `databricks <command> --profile <profile>`

> Don't have the resource yet? See **create-tools** skill first.

**After adding any MCP server to your agent, you MUST grant the app access in `databricks.yml`.**

Without this, you'll get permission errors when the agent tries to use the resource.
@@ -77,6 +79,20 @@ databricks apps update-permissions <mcp-server-app-name> \

See `examples/custom-mcp-server.md` for detailed steps.

## MCP Error Handling

MCP tool calls can fail (network issues, permission errors, timeouts). The OpenAI Agents SDK catches tool errors by default and returns the error message to the LLM. To customize timeout behavior for MCP servers:

```python
mcp_server = McpServer(
url=f"{host}/api/2.0/mcp/genie/{space_id}",
name="genie",
timeout=60.0, # Increase timeout for slow tools like Genie (default: 20s)
)
```

For local function tools, see `create-tools` skill > `examples/local-python-tools.md` for `failure_error_function` patterns.

## Important Notes

- **MLflow experiment**: Already configured in template, no action needed
26 changes: 26 additions & 0 deletions .claude/skills/create-tools/SKILL.md
@@ -0,0 +1,26 @@
---
name: create-tools
description: "Create Databricks resources that agents connect to as tools. Use when: (1) User needs to create a Genie space, vector search index, UC function, or UC connection, (2) User says 'create tool', 'set up genie', 'create vector search', 'register MCP server', (3) Before add-tools when the resource doesn't exist yet, (4) User asks 'what do I need to create before adding this tool'."
---

# Create Tool Resources

> This skill covers creating the Databricks resources your agent connects to.
> After creating a resource, use the **add-tools** skill to wire it into your agent and grant permissions.

## Which resource do you need?

| I want my agent to... | Resource to create | Guide |
|---|---|---|
| Answer questions about structured data | Genie space | `examples/genie-space.md` |
| Search documents / RAG | Vector Search index | `examples/vector-search-index.md` |
| Call custom SQL/Python logic | UC function | `examples/uc-function.md` |
| Connect to an external MCP server | UC connection | `examples/uc-connection.md` |
| Add inline Python tools | Local function tools | `examples/local-python-tools.md` |

## Workflow

1. **Discover** existing resources: `uv run discover-tools` (see **discover-tools** skill)
2. **Create** the resource if it doesn't exist (this skill)
3. **Add** the MCP server to your agent code + grant permissions (see **add-tools** skill)
4. **Deploy** (see **deploy** skill)
35 changes: 35 additions & 0 deletions .claude/skills/create-tools/examples/genie-space.md
@@ -0,0 +1,35 @@
# Create a Genie Space

Genie spaces let agents query structured data in Unity Catalog tables using natural language. A Genie space can include up to 30 tables or views.

## Create via Databricks UI

1. In your workspace, go to **Genie** in the left sidebar.
2. Click **New** to create a new Genie space.
3. Add the Unity Catalog tables or views your agent needs to query.
4. Configure instructions to guide how Genie interprets queries (optional but recommended).
5. Configure a default SQL warehouse: go to **Configure** > **Settings** > **Default warehouse**.
6. Share the space with the app's service principal:
- Click **Share** in the top right
- Enter the service principal name, click **Add**, and set the permission level to **CAN RUN**
- To find your app's service principal: `databricks apps get <app-name> --output json --profile <profile> | jq -r '.service_principal_name'`

## Find the space ID

The space ID is in the URL when viewing the Genie space:

```
https://<workspace>.databricks.com/genie/rooms/<space-id>?o=...
```

To list all Genie spaces via CLI:

```bash
databricks genie list-spaces --profile <profile>
```

## Next step

Wire the Genie space into your agent and grant permissions. See the **add-tools** skill and use `examples/genie-space.yaml` for the `databricks.yml` resource grant.

MCP URL: `{host}/api/2.0/mcp/genie/{space_id}` (OAuth scope for on-behalf-of-user auth: `genie`)
103 changes: 103 additions & 0 deletions .claude/skills/create-tools/examples/local-python-tools.md
@@ -0,0 +1,103 @@
# Local Python Function Tools

For operations that don't need external data sources or MCP servers, define tools directly in your agent code. These run in the same process as your agent — no resource creation or `databricks.yml` permissions needed.

## When to use local tools vs. MCP

- **Local tools**: Simple logic, API calls with custom auth, data transformations, utility functions
- **MCP tools**: When you need Databricks-managed auth, UC governance, or access to Databricks resources (tables, indexes, Genie)

## OpenAI Agents SDK

```python
from agents import Agent, function_tool

@function_tool
def get_current_time() -> str:
"""Get the current date and time in ISO format."""
from datetime import datetime
return datetime.now().isoformat()

@function_tool
def calculate_discount(price: float, percent: float) -> str:
"""Calculate a discounted price. Returns the new price after applying the discount."""
discounted = price * (1 - percent / 100)
return f"${discounted:.2f}"

agent = Agent(
name="My agent",
instructions="You are a helpful assistant.",
model="databricks-claude-sonnet-4-5",
tools=[get_current_time, calculate_discount],
)
```

## LangGraph

```python
from langchain_core.tools import tool

@tool
def get_current_time() -> str:
"""Get the current date and time in ISO format."""
from datetime import datetime
return datetime.now().isoformat()

@tool
def calculate_discount(price: float, percent: float) -> str:
"""Calculate a discounted price. Returns the new price after applying the discount."""
discounted = price * (1 - percent / 100)
return f"${discounted:.2f}"

# Pass to create_react_agent or add to your tools list
tools = [get_current_time, calculate_discount]
```

## Error handling

Both SDKs handle tool errors gracefully by default — the error message is returned to the LLM so it can retry or respond to the user. For custom error messages, use the patterns below.

### OpenAI Agents SDK

`@function_tool` includes a built-in `default_tool_error_function` that catches exceptions and returns `"An error occurred while running the tool. Error: {error}"` to the LLM. To customize:

```python
from agents import RunContextWrapper, function_tool

def handle_api_error(ctx: RunContextWrapper, error: Exception) -> str:
"""Return a helpful error message the LLM can act on."""
return f"Tool failed: {error}. Try a different query or ask the user for clarification."

@function_tool(failure_error_function=handle_api_error)
def call_external_api(query: str) -> str:
"""Call an external API."""
# If this raises, handle_api_error returns a message to the LLM
...
```

### LangGraph

LangGraph tools raise by default. To return errors to the LLM instead of crashing, raise `ToolException` and set `handle_tool_error`:

```python
from langchain_core.tools import tool, ToolException

@tool
def call_external_api(query: str) -> str:
"""Call an external API."""
try:
...
except Exception as e:
raise ToolException(f"API call failed: {e}. Try a different query.")

# Enable error handling on the tool
call_external_api.handle_tool_error = True
```

Set `handle_tool_error=True` for a generic message, or assign a string/callable for custom messages. Only `ToolException` is caught — other exceptions still raise.

## Tips

- The docstring becomes the tool description the LLM sees — make it clear and specific
- Type annotations on parameters help the LLM provide correct arguments
- Local tools can call the Databricks SDK, external APIs, or any Python library
65 changes: 65 additions & 0 deletions .claude/skills/create-tools/examples/uc-connection.md
@@ -0,0 +1,65 @@
# Create a UC Connection for External MCP Servers

Unity Catalog HTTP connections let you register external MCP servers so Databricks can securely proxy requests and manage credentials. After creating the connection, your agent accesses the external MCP server through a managed Databricks endpoint.

## Create the connection

### Option 1: Managed OAuth (Glean, GitHub, Atlassian, Google Drive, SharePoint)

For supported providers, Databricks manages the OAuth credentials. Create the connection in the Databricks UI:

1. Go to **Catalog** > **External Data** > **Connections**
2. Click **Create connection**
3. Select **HTTP** connection type
4. Choose **OAuth User to Machine Per User** auth type
5. Select the provider from the **OAuth Provider** drop-down
6. Configure scopes as needed

You can also install pre-built integrations from the **Databricks Marketplace**.

See [external MCP docs](https://docs.databricks.com/aws/en/generative-ai/mcp/external-mcp) for the full list of supported providers, scopes, and setup methods.

### Option 2: CLI with bearer token

```bash
databricks connections create --json '{
"name": "my-external-mcp",
"connection_type": "HTTP",
"options": {
"host": "https://mcp.example.com",
"base_path": "/api",
"bearer_token": "<your-token>"
}
}' --profile <profile>
```

### Option 3: CLI with OAuth M2M

```bash
databricks connections create --json '{
"name": "my-external-mcp",
"connection_type": "HTTP",
"options": {
"host": "https://mcp.example.com",
"base_path": "/mcp",
"client_id": "<client-id>",
"client_secret": "<client-secret>",
"token_endpoint": "https://auth.example.com/oauth/token",
"oauth_scope": "read write"
}
}' --profile <profile>
```

## Verify

```bash
databricks connections get my-external-mcp --profile <profile>
```

## Next step

Wire the external MCP server into your agent. See the **add-tools** skill and use `examples/uc-connection.yaml` for the `databricks.yml` resource grant.

MCP URL: `{host}/api/2.0/mcp/external/{connection_name}`

> You can also access the external server through the [UC connections proxy](https://docs.databricks.com/aws/en/query-federation/http#proxy), which works with any HTTP or MCP client and supports arbitrary sub-paths and all HTTP methods: `{host}/api/2.0/unity-catalog/connections/{connection_name}/proxy[/<sub-path>]`
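For illustration, a small helper that assembles the proxy URL from the pattern above (pure string building; the host and connection name are hypothetical):

```python
def uc_proxy_url(host: str, connection_name: str, sub_path: str = "") -> str:
    """Build the UC connections proxy URL, optionally with a sub-path."""
    base = f"{host}/api/2.0/unity-catalog/connections/{connection_name}/proxy"
    return f"{base}/{sub_path.lstrip('/')}" if sub_path else base

print(uc_proxy_url("https://ws.databricks.com", "my-external-mcp", "v1/search"))
# https://ws.databricks.com/api/2.0/unity-catalog/connections/my-external-mcp/proxy/v1/search
```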
67 changes: 67 additions & 0 deletions .claude/skills/create-tools/examples/uc-function.md
@@ -0,0 +1,67 @@
# Create a Unity Catalog Function

UC functions let agents run custom SQL or Python logic. Expose them as tools via the managed MCP server for UC functions.

## Option 1: SQL function (recommended for data lookups)

Run this in a SQL warehouse or notebook:

```sql
CREATE OR REPLACE FUNCTION catalog.schema.lookup_customer(
customer_name STRING COMMENT 'Name of the customer to look up'
)
RETURNS STRING
COMMENT 'Returns customer metadata including email and account ID. Use this when the user asks about a specific customer.'
RETURN SELECT CONCAT(
'Customer ID: ', customer_id, ', ',
'Email: ', email
)
FROM catalog.schema.customers
WHERE name = customer_name
LIMIT 1;
```

Via CLI:

```bash
databricks api post /api/2.0/sql/statements --json '{
"warehouse_id": "<warehouse-id>",
"statement": "CREATE OR REPLACE FUNCTION catalog.schema.my_func(...) ..."
}' --profile <profile>
```

## Option 2: Python function

```sql
CREATE OR REPLACE FUNCTION catalog.schema.analyze_text(
text STRING COMMENT 'Text to analyze'
)
RETURNS STRING
LANGUAGE PYTHON
COMMENT 'Analyzes text and returns a summary of key entities found.'
AS $$
# Python code runs in serverless compute
entities = [word for word in text.split() if word[0].isupper()]
return f"Found {len(entities)} potential entities: {', '.join(entities[:5])}"
$$;
```

## Writing effective tool descriptions

The `COMMENT` clause is critical — the LLM uses it to decide when to call the tool.

- **Function COMMENT**: Describe what the function does and when to use it
- **Parameter COMMENT**: Describe what values the parameter accepts
- Be specific: "Returns customer email and ID given a customer name" is better than "Looks up customer info"

## Verify

```bash
databricks functions get catalog.schema.my_func --profile <profile>
```

## Next step

Wire the UC function into your agent. See the **add-tools** skill and use `examples/uc-function.yaml` for the `databricks.yml` resource grant.

MCP URL: `{host}/api/2.0/mcp/functions/{catalog}/{schema}` (exposes all functions in the schema) or `{host}/api/2.0/mcp/functions/{catalog}/{schema}/{function_name}` (single function) (OAuth scope for on-behalf-of-user auth: `unity-catalog`)
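To make the two URL shapes concrete, a small helper sketch (only the path patterns above come from this doc; the host, catalog, and schema names are made up):

```python
from typing import Optional

def uc_functions_mcp_url(host: str, catalog: str, schema: str,
                         function_name: Optional[str] = None) -> str:
    """Build the managed MCP URL for UC functions: schema-wide or single function."""
    url = f"{host}/api/2.0/mcp/functions/{catalog}/{schema}"
    return f"{url}/{function_name}" if function_name else url

print(uc_functions_mcp_url("https://ws.databricks.com", "main", "tools"))
# https://ws.databricks.com/api/2.0/mcp/functions/main/tools
print(uc_functions_mcp_url("https://ws.databricks.com", "main", "tools", "lookup_customer"))
# https://ws.databricks.com/api/2.0/mcp/functions/main/tools/lookup_customer
```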