Problem
The Python SDK ships CallableSubdomainTenantRouter (host-routing callback, PR #544) and LazyPlatformRouter (per-tenant platform factory, PR #547). The JS SDK ships createTenantRegistry, a higher-level primitive built on top of the same building blocks. Adopters running multi-tenant Python deployments (we're the most advanced one) end up reinventing the registry layer.
What JS has that Python doesn't
From @adcp/sdk/server's createTenantRegistry (see adcp-client/skills/build-decisioning-platform/advanced/MULTI-TENANT.md):
- Per-tenant health states: pending (registered, not yet validated, refused with 503), healthy (serving), unverified (was healthy, transient validation failure, graceful-degrade), disabled (permanent failure, refused until admin recheck()). Health is tracked per tenant, so one bad tenant doesn't block others.
- Runtime register(tenantId, config) / unregister(tenantId): add/remove tenants without restarting the process. Admin webhook saves a new tenant row, calls register, the next request to that host resolves it. We need this; today we plumb it ourselves through the admin flow.
- recheck(tenantId): re-validate a tenant after key rotation or config change. Status transitions disabled → healthy without a traffic gap.
- awaitFirstValidation: true is a boot-time semantic where register() doesn't return until the tenant has been validated, so the first request after register doesn't race the validation roundtrip.
- resolveByHost(host): synchronous lookup returning the registered server (or null). Composes naturally with serve().
JS-specific bits we don't need: JWKS validation (we use principal-token bearer auth, not JWT). The registry should be JWKS-agnostic: adopters that want JWKS can pass a validator, and adopters that don't can omit it.
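The four health states above can be read as a small state machine. The following is a minimal sketch, not the JS SDK's implementation: the Outcome enum, the next_health name, and the exact transition table (in particular, what a transient failure does to a pending or disabled tenant) are assumptions made for illustration.

```python
from enum import Enum


class Health(Enum):
    PENDING = "pending"        # registered, not yet validated
    HEALTHY = "healthy"        # serving
    UNVERIFIED = "unverified"  # was healthy, transient validation failure
    DISABLED = "disabled"      # permanent failure, needs an admin recheck


class Outcome(Enum):
    # Hypothetical classification of one validation attempt.
    OK = "ok"
    TRANSIENT_FAILURE = "transient_failure"
    PERMANENT_FAILURE = "permanent_failure"


def next_health(current: Health, outcome: Outcome) -> Health:
    """Return the state after one validation attempt (assumed transitions)."""
    if outcome is Outcome.OK:
        # Covers first validation and recheck (disabled -> healthy).
        return Health.HEALTHY
    if outcome is Outcome.PERMANENT_FAILURE:
        return Health.DISABLED
    # Transient failure: a tenant that was serving degrades gracefully,
    # any other state is left unchanged (assumption).
    if current in (Health.HEALTHY, Health.UNVERIFIED):
        return Health.UNVERIFIED
    return current
```

Keeping the transition function pure makes it easy to unit-test the semantics separately from whatever validator (JWKS or none) an adopter plugs in.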
What we have today
core/main.py builds:
- A CallableSubdomainTenantRouter with a 60-second TTL cache (host → Tenant lookup against our DB)
- A LazyPlatformRouter with a per-tenant DecisioningPlatform factory
- An ad-hoc admin-flow invalidate(host) call when a tenant is created / deactivated / has its subdomain rotated
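For readers unfamiliar with the pattern, the TTL-cache-plus-invalidate arrangement can be sketched roughly as follows. This is an illustration only and does not reproduce CallableSubdomainTenantRouter: the HostTTLCache class, its constructor parameters, and the injectable clock are hypothetical names introduced here.

```python
import time
from typing import Callable, Dict, Generic, Optional, Tuple, TypeVar

T = TypeVar("T")


class HostTTLCache(Generic[T]):
    """Cache host -> tenant lookups for a fixed TTL (hypothetical helper)."""

    def __init__(
        self,
        lookup: Callable[[str], Optional[T]],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._lookup = lookup      # e.g. a DB query keyed by host
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so tests can control time
        self._entries: Dict[str, Tuple[float, Optional[T]]] = {}

    def get(self, host: str) -> Optional[T]:
        now = self._clock()
        entry = self._entries.get(host)
        if entry is not None and now < entry[0]:
            return entry[1]        # still fresh, skip the lookup
        value = self._lookup(host)
        # Misses (None) are cached too, so unknown hosts don't hit the DB
        # on every request; that is a design choice of this sketch.
        self._entries[host] = (now + self._ttl, value)
        return value

    def invalidate(self, host: str) -> None:
        """Drop one host so the next get() re-reads the source of truth."""
        self._entries.pop(host, None)
```

The weakness this illustrates is that every admin path that changes a tenant has to remember to call invalidate(host); otherwise stale data is served until the TTL expires.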
What's missing relative to JS:
- No health states. A misconfigured tenant (e.g. GAM credentials missing) only fails on first request, with a generic 500 — there's no "this tenant is disabled" classifier the LB or admin UI can observe.
- No runtime register without invalidate-the-cache plumbing.
- No recheck(): config changes require either a process restart or manual cache invalidation.
- No first-validation boot semantic — the first request to a freshly-registered tenant pays the platform-build cost.
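To make the missing "classifier" concrete, here is one possible mapping from a tenant's health to an HTTP status plus a machine-readable reason that a load balancer or admin UI could observe. It is a sketch under assumptions: the classify function and the reason strings are hypothetical, and returning 503 for an unknown host simply mirrors the 503 used in the proposed resolve() function.

```python
from typing import Optional, Tuple

# States that keep serving traffic, with an observable reason.
SERVING = {"healthy": "ok", "unverified": "degraded"}
# States that are refused; both map to 503 in this sketch.
REFUSED = {"pending": "tenant_pending", "disabled": "tenant_disabled"}


def classify(health: Optional[str]) -> Tuple[int, str]:
    """Map a health state (or None for an unknown host) to (status, reason)."""
    if health is None:
        return 503, "tenant_unknown"
    if health in SERVING:
        return 200, SERVING[health]
    if health in REFUSED:
        return 503, REFUSED[health]
    raise ValueError(f"unexpected health state: {health!r}")
```

With something like this, a misconfigured tenant surfaces as a distinct, stable reason instead of a generic 500 on first request.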
Proposed SDK shape
import os

from adcp.server import TenantRegistry, BearerTokenAuth, serve

registry = TenantRegistry(
    default_serve_options={
        "name": "my-multi-tenant-host",
        "validation": {"requests": "strict", "responses": "strict"},
    },
    # Optional validator: adopters using JWT can pass a JWKS validator;
    # principal-token adopters pass None.
    validator=None,
    auto_validate=True,
)

# Register at boot (inside an async startup hook, since register() is awaited)
for tenant in load_tenants_from_db():
    await registry.register(
        tenant.id,
        agent_url=tenant.agent_url,
        platform=build_platform_for(tenant),
        await_first_validation=True,
    )

# Resolve per request
def resolve(ctx) -> AdcpServer:
    resolved = registry.resolve_by_host(ctx.host)
    if resolved is None or resolved.health == "disabled":
        raise HTTPException(503)
    return resolved.server

# Environment variables are strings, so convert the port explicitly
serve(resolve, auth=BearerTokenAuth(validate_token=...), port=int(os.environ["PORT"]))

# Runtime admin operations
await registry.register(tenant_id, agent_url=..., platform=...)  # add
registry.unregister(tenant_id)                                   # remove
await registry.recheck(tenant_id)                                # re-validate
status = registry.health(tenant_id)                              # observe
The internals reuse CallableSubdomainTenantRouter and LazyPlatformRouter — TenantRegistry is the higher-level primitive that composes them with health tracking and runtime mutation, matching the JS surface area.
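As a rough illustration of the surface proposed above, the following sketch implements register, unregister, recheck, health, and resolve_by_host over plain dicts. It is not the proposed implementation and deliberately does not use CallableSubdomainTenantRouter or LazyPlatformRouter, whose signatures are not shown in this issue. Everything here is an assumption: the SketchTenantRegistry and Resolved names, deriving the host from agent_url, storing the platform object as a stand-in for the built server, and treating any validator failure as permanent. Unlike auto_validate in the proposal, a tenant registered without await_first_validation simply stays pending until recheck() is called.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, Dict, Optional
from urllib.parse import urlsplit

# Hypothetical validator signature: returns True when the tenant is usable.
Validator = Callable[[str, Any], Awaitable[bool]]


@dataclass
class Resolved:
    tenant_id: str
    host: str
    server: Any              # stand-in for the per-tenant server object
    health: str = "pending"


class SketchTenantRegistry:
    def __init__(self, validator: Optional[Validator] = None) -> None:
        self._validator = validator
        self._by_id: Dict[str, Resolved] = {}
        self._by_host: Dict[str, Resolved] = {}

    async def register(self, tenant_id: str, *, agent_url: str, platform: Any,
                       await_first_validation: bool = False) -> None:
        self.unregister(tenant_id)  # replace any earlier registration
        host = urlsplit(agent_url).hostname or ""
        entry = Resolved(tenant_id, host, server=platform)
        self._by_id[tenant_id] = entry
        self._by_host[host] = entry
        if await_first_validation:
            # Don't return until validated, so the first request can't race it.
            await self.recheck(tenant_id)

    def unregister(self, tenant_id: str) -> None:
        entry = self._by_id.pop(tenant_id, None)
        if entry is not None:
            self._by_host.pop(entry.host, None)

    async def recheck(self, tenant_id: str) -> str:
        entry = self._by_id[tenant_id]  # KeyError for unknown tenants
        ok = True if self._validator is None else await self._validator(
            tenant_id, entry.server)
        entry.health = "healthy" if ok else "disabled"
        return entry.health

    def health(self, tenant_id: str) -> Optional[str]:
        entry = self._by_id.get(tenant_id)
        return None if entry is None else entry.health

    def resolve_by_host(self, host: str) -> Optional[Resolved]:
        return self._by_host.get(host)  # synchronous, no I/O


async def demo() -> list:
    async def validator(tenant_id: str, platform: Any) -> bool:
        return platform != "misconfigured"

    registry = SketchTenantRegistry(validator)
    await registry.register("t1", agent_url="https://a.example.com/mcp",
                            platform="gam", await_first_validation=True)
    await registry.register("t2", agent_url="https://b.example.com/mcp",
                            platform="misconfigured")
    seen = [registry.health("t1"), registry.health("t2")]
    seen.append(await registry.recheck("t2"))
    registry.unregister("t1")
    seen.append(registry.resolve_by_host("a.example.com"))
    return seen
```

Running asyncio.run(demo()) exercises the boot-time path (t1 validated before register returns), the pending path (t2), a recheck that disables a misconfigured tenant, and removal at runtime.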
Why this matters
- Multi-tenant SaaS deployments are the common shape for AdCP sellers. Every Python adopter at our scale will land here.
- Building it ourselves means each adopter has slightly-different health semantics, slightly-different admin webhook plumbing, slightly-different cache-invalidation rules. A canonical primitive collapses that variation.
- Closes the JS↔Python parity gap on the most-touched server-side primitive.
Files
- _resolve_tenant, build_subdomain_router, build_platform_for_tenant, build_router
- @adcp/sdk/server's tenant-registry.ts and tenant-store.ts.