183 changes: 183 additions & 0 deletions .claude/skills/use-supervisor-api/SKILL.md
@@ -0,0 +1,183 @@
---
name: use-supervisor-api
description: "Replace the client-side agent loop with Databricks Supervisor API (hosted tools). Use when: (1) User asks about Supervisor API, (2) User wants Databricks to run the agent loop server-side, (3) Connecting Genie spaces, UC functions, agent endpoints, or MCP servers as hosted tools."
---

# Use the Databricks Supervisor API

The Supervisor API lets Databricks run the tool-selection and synthesis loop server-side. Instead of your agent managing tool calls and looping, you declare hosted tools and call `responses.create()` — Databricks handles the rest.

## When to Use

Use the Supervisor API when you want Databricks to manage the full agent loop for hosted tools: Genie spaces, UC functions, KA (Knowledge Assistant) agent endpoints, or MCP servers via UC connections.

**Limitations:**
- Cannot mix hosted tools with client-side function tools in the same request
- Inference parameters (e.g., `temperature`, `top_p`) are not supported when tools are passed

## Step 1: Install `databricks-openai`

Add to `pyproject.toml` if not already present:

```toml
[project]
dependencies = [
    ...
    "databricks-openai>=0.14.0",
    "databricks-sdk>=0.55.0",
]
```

Then run `uv sync`.

## Step 2: Declare Hosted Tools

Define your tools as a list of dicts. Run `uv run discover-tools` to find available resources in your workspace.

```python
TOOLS = [
    # Genie space — natural language queries over structured data
    {
        "type": "genie_space",
        "genie_space": {
            "description": "Query sales data using natural language",
            "space_id": "<genie-space-id>",
        },
    },
    # UC function — SQL or Python UDF
    {
        "type": "unity_catalog_function",
        "unity_catalog_function": {
            "name": "<catalog>.<schema>.<function_name>",
            "description": "Executes a custom UC function",
        },
    },
    # KA (Knowledge Assistant) endpoint — delegates to a Knowledge Assistant agent
    # Note: agent_endpoint only supports KA endpoints, not arbitrary agent serving
    # endpoints. KA endpoints use a specific ka_query protocol; regular
    # LangGraph/OpenAI agents do not.
    {
        "type": "agent_endpoint",
        "agent_endpoint": {
            "name": "my-ka-agent",
            "description": "A Knowledge Assistant agent",
            "endpoint_name": "<ka-serving-endpoint-name>",
        },
    },
    # External MCP server via UC connection
    {
        "type": "external_mcp_server",
        "external_mcp_server": {
            "description": "An external MCP server",
            "connection_name": "<uc-connection-name>",
        },
    },
]
```
```
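
As a sanity check before sending a request, a small hypothetical helper (not part of the Supervisor API or any Databricks library) can verify that every entry uses a hosted type, carries its matching config block, and that no client-side `function` tools are mixed in, which the API rejects:

```python
# Hypothetical validation helper, written for this guide. The set of hosted
# types comes from the tool list above; nothing here calls Databricks.
HOSTED_TYPES = {
    "genie_space",
    "unity_catalog_function",
    "agent_endpoint",
    "external_mcp_server",
}


def validate_hosted_tools(tools: list[dict]) -> list[str]:
    """Return a list of problems; an empty list means TOOLS looks well-formed."""
    problems = []
    for i, tool in enumerate(tools):
        ttype = tool.get("type")
        if ttype == "function":
            # Hosted and client-side tools cannot be mixed in one request.
            problems.append(f"tools[{i}]: client-side function tool mixed with hosted tools")
        elif ttype not in HOSTED_TYPES:
            problems.append(f"tools[{i}]: unknown hosted tool type {ttype!r}")
        elif ttype not in tool:
            # Each hosted tool nests its config under a key matching its type.
            problems.append(f"tools[{i}]: missing {ttype!r} config block")
    return problems
```

Running `validate_hosted_tools(TOOLS)` at module load catches shape mistakes before they surface as opaque API errors.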

## Step 3: Update `agent_server/agent.py`

Replace your existing invoke/stream handlers with the Supervisor API pattern. Remove any MCP client setup, LangGraph agents, or OpenAI Agents SDK runner code — the Supervisor API replaces the client-side loop entirely.

`use_ai_gateway=True` automatically resolves the correct AI Gateway endpoint for the workspace.

When deployed on Databricks Apps, the platform forwards the authenticated user's token via `x-forwarded-access-token`. Pass this to the Supervisor API so tool calls (e.g., Genie queries) run on behalf of the user rather than the app's service principal.

```python
import mlflow
from databricks.sdk import WorkspaceClient
from databricks.sdk.config import Config
from databricks_openai import DatabricksOpenAI
from mlflow.genai.agent_server import invoke, stream
from mlflow.types.responses import (
    ResponsesAgentRequest,
    ResponsesAgentResponse,
)

mlflow.openai.autolog()

MODEL = "databricks-claude-sonnet-4-5"
TOOLS = [...]  # From Step 2

# Resolve and cache the AI Gateway URL once at module load
_wc = WorkspaceClient()
_client = DatabricksOpenAI(workspace_client=_wc, use_ai_gateway=True)
_ai_gateway_base_url = str(_client.base_url)


def _get_client(obo_token: str | None = None) -> DatabricksOpenAI:
    """Return a client using the OBO token if provided, else the service principal."""
    if obo_token:
        obo_wc = WorkspaceClient(
            config=Config(host=_wc.config.host, token=obo_token)
        )
        return DatabricksOpenAI(workspace_client=obo_wc, base_url=_ai_gateway_base_url)
    return _client


def _obo_token(request: ResponsesAgentRequest) -> str | None:
    return (request.custom_inputs or {}).get("x-forwarded-access-token")


@invoke()
def invoke_handler(request: ResponsesAgentRequest) -> ResponsesAgentResponse:
    mlflow.update_current_trace(
        metadata={"mlflow.trace.session": request.context.conversation_id}
    )
    response = _get_client(_obo_token(request)).responses.create(
        model=MODEL,
        input=[i.model_dump() for i in request.input],
        tools=TOOLS,
        stream=False,
    )
    return ResponsesAgentResponse(
        output=[item.model_dump() for item in response.output]
    )


@stream()
def stream_handler(request: ResponsesAgentRequest):
    mlflow.update_current_trace(
        metadata={"mlflow.trace.session": request.context.conversation_id}
    )
    return _get_client(_obo_token(request)).responses.create(
        model=MODEL,
        input=[i.model_dump() for i in request.input],
        tools=TOOLS,
        stream=True,
    )
```

> **OBO note:** The `x-forwarded-access-token` is injected into `custom_inputs` by the app server middleware. No changes are needed to the client — the token arrives automatically when users call your deployed app.

## Step 4: Grant Permissions in `databricks.yml`

For each hosted tool, grant the corresponding resource access. See the **add-tools** skill for complete YAML examples.

| Tool type | Resource to grant |
|-----------|-------------------|
| `genie_space` | `genie_space` with `CAN_RUN` |
| `unity_catalog_function` | `uc_securable` (FUNCTION) with `EXECUTE` |
| `agent_endpoint` | `serving_endpoint` with `CAN_QUERY` (KA endpoints only) |
| `external_mcp_server` | `uc_securable` (CONNECTION) with `USE_CONNECTION` |

Contributor
Will update these to correspond with the latest:

Here is the **complete, up-to-date list** of all tool types supported by the Supervisor API (`POST /api/2.0/mas/responses`), based on the current codebase:

## Allowlisted tool types (via `/responses` endpoint)

These are the tool types accepted by `_validate_responses_tool_types()` — the default allowlist in `_RESPONSES_DEFAULT_ALLOWED_TOOL_TYPES`:

| # | Tool type (as sent in `tools[].type`) | Description |
|---|---|---|
| 1 | **`genie_space`** | Genie AI/BI space — queries a Genie room by `space_id` |
| 2 | **`serving_endpoint`** | Serving endpoint (e.g., Knowledge Agent) — calls a model serving endpoint |
| 3 | **`unity_catalog_function`** | Unity Catalog function — executes a UC function by its 3-level path |
| 4 | **`external_mcp_server`** | External MCP server via UC connection — connects to an MCP server through a UC connection name |
| 5 | **`databricks_apps_mcp`** | Databricks Apps MCP — connects to an MCP server hosted as a Databricks App |
| 6 | **`code_interpreter`** | Code interpreter — executes Python code in a sandboxed REPL |

## Additional tool types (handled in code but not in the `/responses` allowlist by default)

These are processed by `_build_supervisor_conf_from_openai_format()` but are gated behind the `databricks.tiles.mas.enabledTools` Safe flag or used internally:

| # | Tool type | Description |
|---|---|---|
| 7 | **`function`** | Client-defined function tools (OpenAI-compatible) — always allowed, passed to the LLM as callable functions |
| 8 | **`agent`** | Sub-agent tool — delegates to another agent |
| 9 | **`vector_search`** | Vector search index — searches a Databricks vector search index (internal, stored in `request_context`) |
| 10 | **`iretriever`** | Instructed retriever — semantic document retrieval with knowledge sources (internal, stored in `request_context`) |

## Alias mapping

The codebase also accepts **alternative names** for some tool types via `_TOOL_TYPE_ALIASES`. These are Supervisor Agent naming conventions that map to the internal MAS names:

| Alias (what you can send) | Maps to internally |
|---|---|
| `genie_space` | `genie` |
| `serving_endpoint` | `agent_endpoint` |
| `unity_catalog_function` | `uc_function` |
| `external_mcp_server` | `mcp` |
| `databricks_apps_mcp` | `mcp` |

So for example, sending `type: "genie_space"` or `type: "genie"` both work — the alias is resolved before processing.
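
The alias resolution described above can be sketched as a plain lookup. This is an illustration built from the table in this comment, not the service's actual code (the real mapping lives in `_TOOL_TYPE_ALIASES` inside the codebase):

```python
# Mapping reproduced from the alias table above; unknown types pass through.
_TOOL_TYPE_ALIASES = {
    "genie_space": "genie",
    "serving_endpoint": "agent_endpoint",
    "unity_catalog_function": "uc_function",
    "external_mcp_server": "mcp",
    "databricks_apps_mcp": "mcp",
}


def resolve_tool_type(tool_type: str) -> str:
    """Return the internal MAS name for a tool type, or the type unchanged."""
    return _TOOL_TYPE_ALIASES.get(tool_type, tool_type)
```

Both `resolve_tool_type("genie_space")` and `resolve_tool_type("genie")` end up as `"genie"`, matching the behavior described above.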

## Connector shorthands (sub-type of `external_mcp_server` / `mcp`)

When using `external_mcp_server` or `mcp` tool type, you can specify a `connector` shorthand instead of a `connection_name`. These auto-create system-managed UC connections:

| Connector shorthand | Service | Has built-in MCP server? |
|---|---|---|
| `google_drive` | Google Drive | Yes (uses internal MAS adapter) |
| `sharepoint` | SharePoint | Yes (uses internal MAS adapter) |
| `github_mcp` | GitHub | No (uses external MCP proxy) |
| `atlassian_mcp` | Atlassian (Jira/Confluence) | No (uses external MCP proxy) |
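
Based on the description above, a connector-shorthand tool declaration might look like the sketch below. The `connector` field name is taken from this comment and the exact schema may differ in practice:

```python
# Hypothetical example: GitHub via the external MCP proxy, using the
# `connector` shorthand instead of an explicit UC connection_name.
GITHUB_TOOL = {
    "type": "external_mcp_server",
    "external_mcp_server": {
        "description": "GitHub via the external MCP proxy",
        "connector": "github_mcp",  # auto-creates a system-managed UC connection
    },
}
```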

---

So your table should be updated to:

| Tool type | Resource to grant |
|-----------|-------------------|
| `genie_space` | `genie_space` with `CAN_RUN` |
| `unity_catalog_function` | `uc_securable` (FUNCTION) with `EXECUTE` |
| `serving_endpoint` | `serving_endpoint` with `CAN_QUERY` (KA endpoints only) |
| `external_mcp_server` | `uc_securable` (CONNECTION) with `USE_CONNECTION` |
| `databricks_apps_mcp` | *(Databricks App access)* |
| `code_interpreter` | *(no external resource — sandboxed REPL)* |
| `function` | *(client-side — no server resource)* |
| `agent` | *(sub-agent delegation)* |
| `vector_search` | *(vector search index access)* |
| `iretriever` | *(instructed retriever — knowledge sources)* |

Also grant `CAN_QUERY` on the `MODEL` serving endpoint:

```yaml
- name: 'model-endpoint'
  serving_endpoint:
    name: 'databricks-claude-sonnet-4-5'
    permission: 'CAN_QUERY'
```

## Step 5: Test and Deploy

```bash
uv run start-app # Test locally
databricks bundle deploy && databricks bundle run {{BUNDLE_NAME}} # Deploy
```

## Troubleshooting

**"Please ensure AI Gateway V2 is enabled"** — AI Gateway must be enabled for the workspace. Contact your Databricks account team.

**"Cannot mix hosted and client-side tools"** — Remove any `function`-type tools (Python callables) from `TOOLS`. All tools must be hosted types (`genie_space`, `unity_catalog_function`, `agent_endpoint`, `external_mcp_server`).

**"Parameter not supported when tools are provided"** — Remove `temperature`, `top_p`, or other inference parameters from the `responses.create()` call.
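
A hedged pre-flight filter for this last case (written for this guide, not part of any SDK; only `temperature` and `top_p` are named by the error, other parameters may also be rejected):

```python
# Drop inference parameters that the Supervisor API rejects when tools are passed.
_REJECTED_WITH_TOOLS = {"temperature", "top_p"}


def strip_unsupported(params: dict, has_tools: bool) -> dict:
    """Return params safe to pass to responses.create() given hosted tools."""
    if not has_tools:
        return dict(params)
    return {k: v for k, v in params.items() if k not in _REJECTED_WITH_TOOLS}
```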
1 change: 1 addition & 0 deletions .gitignore
@@ -189,6 +189,7 @@ mlflow.db
!.claude/skills/agent-langgraph-memory/
!.claude/skills/agent-openai-memory/
!.claude/skills/migrate-from-model-serving/
!.claude/skills/use-supervisor-api/
!.claude/skills/enable-feedback/
!.claude/AGENTS.md
!.claude/CLAUDE.md
3 changes: 3 additions & 0 deletions .scripts/sync-skills.py
@@ -55,6 +55,9 @@ def sync_template(template: str, config: dict):
# Deploy skill (with substitution)
copy_skill(SOURCE / "deploy", dest / "deploy", subs)

# Supervisor API skill (with substitution for bundle name in deploy command)
copy_skill(SOURCE / "use-supervisor-api", dest / "use-supervisor-api", subs)

# SDK-specific skills (with substitution for bundle name references)
if isinstance(sdk, list):
# Multiple SDKs: copy skills for each, keeping SDK suffix in name

1 change: 1 addition & 0 deletions agent-langgraph-long-term-memory/.gitignore
@@ -218,3 +218,4 @@ sketch
!.claude/skills/lakebase-setup/
!.claude/skills/agent-memory/
!.claude/skills/migrate-from-model-serving/
!.claude/skills/use-supervisor-api/