Add load-testing skill to all agent templates: Load Testing Template + Skill + Locust Load Testing Script + Dashboard Generation by jennsun · Pull Request #184 · databricks/app-templates

jennsun · 2026-04-10T01:59:09Z

Adds a shared load-testing Claude Code skill that helps users benchmark their Databricks Apps QPS using Locust. The skill guides users through mocking LLM calls, setting up load test scripts, running ramp-to-saturation tests, and viewing interactive dashboards.

For an example of the resulting app template, see #180.

Users can invoke the skill, or run the load-testing script directly with a command like:

cd /Users/jenny.sun/app-templates/agent-load-testing/load-test-scripts && \
export DATABRICKS_HOST="https://eng-ml-inference-team-eu-central-1.cloud.databricks.com/" && \
export DATABRICKS_CLIENT_ID="redacted" && \
export DATABRICKS_CLIENT_SECRET="redacted" && \
python3 run_load_test.py \
  --app-url https://agent-load-test-medium-w2-5763476629593792.aws.databricksapps.com/ \
  --app-url https://agent-load-test-medium-w4-5763476629593792.aws.databricksapps.com/ \
  --app-url https://agent-load-test-medium-w6-5763476629593792.aws.databricksapps.com/ \
  --app-url https://agent-load-test-medium-w8-5763476629593792.aws.databricksapps.com/ \
  --app-url https://agent-load-test-large-w6-5763476629593792.aws.databricksapps.com/ \
  --app-url https://agent-load-test-large-w8-5763476629593792.aws.databricksapps.com/ \
  --app-url https://agent-load-test-large-w10-5763476629593792.aws.databricksapps.com/ \
  --app-url https://agent-load-test-large-w12-5763476629593792.aws.databricksapps.com/ \
  --label medium_2w --label medium_4w --label medium_6w --label medium_8w \
  --label large_6w --label large_8w --label large_10w --label large_12w \
  --compute-size medium --compute-size medium --compute-size medium --compute-size medium \
  --compute-size large --compute-size large --compute-size large --compute-size large \
  --max-users 1000 --step-size 20 --step-duration 45 \
  --dashboard --run-name agent_app_1000_load_test

This lets them see progress while the load tests run.
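The command above repeats `--app-url`, `--label`, and `--compute-size` once per deployment, which suggests the values are matched up by position. The following is a minimal sketch of that pairing using `argparse`; it is an assumption about how such a CLI could behave, not the actual parsing code in `run_load_test.py`, and `parse_targets` and the `example-*.invalid` URLs are hypothetical.

```python
import argparse


def parse_targets(argv):
    """Pair repeated --app-url/--label/--compute-size flags by position.

    Hypothetical helper: the real run_load_test.py may parse differently.
    """
    parser = argparse.ArgumentParser()
    # action="append" collects every occurrence of a flag into a list.
    parser.add_argument("--app-url", action="append", required=True)
    parser.add_argument("--label", action="append")
    parser.add_argument("--compute-size", action="append")
    args = parser.parse_args(argv)

    urls = args.app_url
    labels = args.label or []
    sizes = args.compute_size or []
    if not (len(urls) == len(labels) == len(sizes)):
        parser.error("pass one --label and one --compute-size per --app-url")

    return [
        {"url": url, "label": label, "compute_size": size}
        for url, label, size in zip(urls, labels, sizes)
    ]


targets = parse_targets([
    "--app-url", "https://example-medium-w2.invalid/",
    "--app-url", "https://example-large-w8.invalid/",
    "--label", "medium_2w", "--label", "large_8w",
    "--compute-size", "medium", "--compute-size", "large",
])
for target in targets:
    print(target["label"], target["compute_size"], target["url"])
```

Under this reading, the order of the flags matters: the first `--label` describes the first `--app-url`, and a mismatch in counts is rejected up front rather than silently mislabeling results.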

At the end, it generates an HTML dashboard:
https://agent-app-load-test-results-5763476629593792.aws.databricksapps.com/
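To get a feel for how long a ramp like the one above takes, the step schedule can be worked out from `--max-users`, `--step-size`, and `--step-duration`. This is a rough sketch that assumes the ramp starts at one step's worth of users, holds each level for `step_duration` seconds, and runs all the way to `max_users`; the real script may stop earlier once an app saturates, and `ramp_schedule` is a hypothetical helper, not part of the PR.

```python
def ramp_schedule(max_users, step_size, step_duration):
    """Return (start_second, user_count) pairs for a stepwise ramp.

    Assumes every level is held for step_duration seconds (an assumption
    about the ramp shape, not taken from run_load_test.py).
    """
    levels = list(range(step_size, max_users + 1, step_size))
    # If max_users is not a multiple of step_size, finish at max_users.
    if not levels or levels[-1] != max_users:
        levels.append(max_users)
    return [(index * step_duration, users) for index, users in enumerate(levels)]


# Flags from the example command: --max-users 1000 --step-size 20 --step-duration 45
step_duration = 45
schedule = ramp_schedule(max_users=1000, step_size=20, step_duration=step_duration)
total_seconds = len(schedule) * step_duration

print(f"{len(schedule)} steps, {total_seconds / 60:.1f} minutes per target")
```

With those flags this works out to 50 steps and 37.5 minutes for one target, an upper bound under the assumptions above.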

Adds a shared /load-testing Claude Code skill that guides users through load testing their Databricks Apps to find maximum QPS. Includes mock LLM setup, Locust test scripts, deployment matrix, and dashboard generation.

- Add source skill at .claude/skills/load-testing/SKILL.md
- Add "load-testing" to sync-skills.py shared skills list
- Sync skill to all 5 registered agent templates
- Whitelist skill in .gitignore for all 7 agent templates (including advanced)
- Use `uv add` instead of `pip install` for dependency management
- Pass --workers directly to start-server (no wrapper script needed)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

bbqiu

lgtm with a few comments!

… specific results

- Make Step 2 (mocking) explicitly optional; users can test real agents E2E
- Lower defaults to keep tests under 1 hour (max_users 300, step_duration 30s)
- Change default workers from 4 to 2
- Switch all commands from `python` to `uv run`
- Remove internal-specific reference results (too specific to mocked LLM case)
- Re-sync skill to all templates

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

dhruv0811

Looks great!! It will be super useful for customers to be able to load test their agents easily now.

Left a few comments, also will need to rebase to use the new templates!

dhruv0811 · 2026-04-15T20:51:01Z

    copy_skill(SOURCE / "quickstart", dest / "quickstart", quickstart_subs)

    # Shared skills (no substitution needed)
-    for skill in ["run-locally", "discover-tools", "migrate-from-model-serving"]:


Can we exclude agent-non-conversational from getting this skill maybe? I think the input format we are using in this skill doesn't work with the non-conv template's expectations
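One way to express this kind of exclusion is a per-skill set of templates to skip, consulted while looping over the shared skills. The sketch below is illustrative only: `SHARED_SKILLS`, `EXCLUDED_TEMPLATES`, and `skills_for` are hypothetical names, not the actual contents of sync-skills.py, and `some-conversational-template` is a placeholder template name.

```python
# Shared skills copied into every template (list taken from the diff above,
# plus the new "load-testing" skill).
SHARED_SKILLS = [
    "run-locally",
    "discover-tools",
    "migrate-from-model-serving",
    "load-testing",
]

# Hypothetical exclusion map: skill name -> templates that should not get it.
EXCLUDED_TEMPLATES = {
    "load-testing": {"agent-non-conversational"},
}


def skills_for(template_name):
    """Return the shared skills that apply to one template."""
    return [
        skill
        for skill in SHARED_SKILLS
        if template_name not in EXCLUDED_TEMPLATES.get(skill, set())
    ]


print(skills_for("agent-non-conversational"))
print(skills_for("some-conversational-template"))
```

Keeping the exclusions in one map, rather than special-casing the loop body, would make it easy to see at a glance which templates opt out of which skill.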

dhruv0811 · 2026-04-15T20:53:11Z

+
+### Install Dependencies
+
+```bash


nit: can we clarify to either cd load-test-scripts/ or use a separate pyproject.toml, otherwise we end up adding locust and requests to the template's own pyproject.toml which feels like we are polluting the agent's prod setup.

dhruv0811 · 2026-04-15T20:54:18Z

Do we want to add load-test-runs/<run-name>/ to the gitignore out of the box as well?

dhruv0811 · 2026-04-15T21:08:39Z

+### Install Dependencies
+
+```bash
+uv add locust requests


Also seems that latest locust >=2.43 throws a RecursionError? The workaround was pinning locust>=2.32,<2.40 and urllib3<2.3. Maybe we can pin these in the instructions here too?
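A quick way to confirm that an environment respects these pins is to compare the installed versions against the suggested ranges. The following is a minimal sketch using only the standard library; `PINS`, `version_tuple`, and `within_pin` are hypothetical helpers, and the simple numeric comparison is an approximation of real version-specifier handling.

```python
import re
from importlib import metadata

# Ranges from the review comment: locust>=2.32,<2.40 and urllib3<2.3.
# Each entry is (inclusive lower bound or None, exclusive upper bound).
PINS = {
    "locust": ((2, 32), (2, 40)),
    "urllib3": (None, (2, 3)),
}


def version_tuple(text):
    """Turn '2.39.1' into (2, 39, 1), ignoring any non-numeric suffix."""
    return tuple(int(part) for part in re.findall(r"\d+", text)[:3])


def within_pin(name, version):
    """Check a version string against the pinned range for a package."""
    low, high = PINS[name]
    parsed = version_tuple(version)
    return (low is None or parsed >= low) and parsed < high


def check_installed():
    """Print whether each pinned package is installed and inside its range."""
    for name in PINS:
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            print(f"{name}: not installed")
            continue
        status = "ok" if within_pin(name, installed) else "outside pinned range"
        print(f"{name} {installed}: {status}")


check_installed()
```

Running this inside the load-test environment would flag a locust or urllib3 version that has drifted past the suggested bounds before a long test run is started.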

…gitignore runs

- Exclude agent-non-conversational from load-testing skill (incompatible input format)
- Use separate pyproject.toml in load-test-scripts/ to avoid polluting agent deps
- Pin locust>=2.32,<2.40 and urllib3<2.3 to avoid RecursionError in locust>=2.43
- Add load-test-runs/ to .gitignore in all templates

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

jennsun requested review from bbqiu and dhruv0811 April 10, 2026 02:45

jennsun changed the title ~~Add load-testing skill to all agent templates~~ Add load-testing skill to all agent templates: Load Testing Template + Skill + Locust Load Testing Script + Dashboard Generation Apr 10, 2026

jennsun mentioned this pull request Apr 10, 2026

Agents on Apps Load Testing Template + Skill + Locust Load Testing Script + Dashboard Generation #180

Closed

bbqiu reviewed Apr 14, 2026


Comment thread .claude/skills/load-testing/SKILL.md Outdated

bbqiu reviewed Apr 14, 2026


Comment thread .claude/skills/load-testing/SKILL.md Outdated

jennsun force-pushed the load-testing-skill branch from 589c6dc to fce48b0 Compare April 14, 2026 19:55

jennsun wants to merge 3 commits into databricks:main from jennsun:load-testing-skill
