Commit 1c4e57d

bokelley and claude authored
feat(adagents): fetch_agent_authorizations_from_directory for AAO inverse lookup (#769)
* feat(adagents): fetch_agent_authorizations_from_directory for AAO inverse lookup (closes #746)

Adds a client function for the AAO directory's `GET /v1/agents/{agent_url}/publishers` endpoint (adcp#4823 / #4828), the inverse-lookup path that returns the set of publishers whose adagents.json authorizes a given agent_url.

Result is a typed `AgentAuthorizationsDirectoryResult` (Pydantic, validated against the real wire body). A 404 from the directory is the "not indexed" answer and surfaces as a result with `publishers=[]`; timeouts raise `AdagentsTimeoutError`; malformed or schema-noncompliant responses raise `AdagentsValidationError`.

The directory's answer is *discovery*, not authorization; callers should still verify each returned `publisher_domain` via `fetch_adagents` before trusting the edge. Same SSRF gates apply (HTTPS only, DNS pre-check, private/reserved address ban, 5 MiB body cap, no redirect follow).

Also bumps the schema pin to 3.1.0-beta.2 so `schemas/cache/` includes `aao/agent-publishers.json`. Full Pydantic regen is deferred: datamodel-code-generator mis-resolves `../enums/channels.json` when the chain originates at a depth-0 schema (root-level `adagents.json` now transitively references the new `core/product-format-declaration.json`, which itself uses `../enums/...`). The hand-written models in this PR are scoped to the new endpoint; unblocking full regen is tracked separately.

Tests use `httpx.MockTransport` to exercise the real wire shape end-to-end and assert against `.model_validate()` on the Pydantic classes, covering happy path, 404 → empty, `since` cursor passthrough, timeout, malformed JSON, schema-mismatch, non-HTTPS guard, and 5xx surface.

Refs salesagent #511.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(adagents): scope PR to function add only; revert schema-pin bump

The 3.1.0-beta.2 bundle introduces spec drift well beyond this PR's scope: `cache_scope` becomes required on product responses, new `sponsored-intelligence` specialism + `search_brands` webhook task type, new `validate_input_brand_claims` endpoint. Each of those needs its own focused change (constant updates, fixture refreshes, capability surface work); bundling them with the AAO inverse-lookup function would block landing both.

This commit:
- Reverts ADCP_VERSION to 3.0.7 (the prior pin).
- Drops `schemas/cache/3.1.0-beta.2/` from the tree; the new `fetch_agent_authorizations_from_directory` works with hand-written Pydantic models and does not need the v3.1 bundle on disk.
- Regenerates `tests/fixtures/public_api_snapshot.json` to record the intentional new public exports (function + result types).

The v3.1 schema-pin bump (and the codegen `../`-resolution fix noted in the PR body) move to a separate PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
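The commit message describes a discover-then-verify flow: the directory only says where to look, and each returned `publisher_domain` should be re-checked before it is trusted. The sketch below illustrates that flow, together with paging through `next_cursor` by passing it back as `since`. It is a minimal sketch under stated assumptions, not the library's implementation: `Page`, `lookup`, `verify` and the canned `_PAGES` data are hypothetical stand-ins for `AgentAuthorizationsDirectoryResult`, `fetch_agent_authorizations_from_directory` and a `fetch_adagents`-based check, so that the example runs without network access.

```python
import asyncio
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Page:
    """Hypothetical stand-in for AgentAuthorizationsDirectoryResult."""

    publishers: List[str] = field(default_factory=list)
    next_cursor: Optional[str] = None


# Hypothetical canned directory pages, keyed by the `since` value sent.
_PAGES = {
    None: Page(["news.example", "sports.example"], next_cursor="c1"),
    "c1": Page(["blog.example"], next_cursor=None),
}


async def lookup(agent_url: str, since: Optional[str] = None) -> Page:
    """Stand-in for the directory call; an unknown cursor yields an empty page."""
    return _PAGES.get(since, Page())


async def verify(publisher_domain: str, agent_url: str) -> bool:
    """Stand-in for re-checking the publisher's own adagents.json.

    Assumption for illustration: sports.example no longer lists the agent.
    """
    return publisher_domain != "sports.example"


async def discover_and_verify(agent_url: str) -> List[str]:
    """Walk every directory page and keep only publishers that verify."""
    trusted: List[str] = []
    since: Optional[str] = None
    while True:
        page = await lookup(agent_url, since=since)
        for domain in page.publishers:
            if await verify(domain, agent_url):
                trusted.append(domain)
        if page.next_cursor is None:
            break
        since = page.next_cursor  # feed the cursor back as `since`
    return trusted


trusted = asyncio.run(discover_and_verify("https://agent.example"))
print(trusted)
```

The directory row for `sports.example` is dropped because the stand-in verification fails, which is the point of treating the directory as discovery rather than as the trust root.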
1 parent 062c6f0 commit 1c4e57d

4 files changed

Lines changed: 409 additions & 1 deletion

src/adcp/__init__.py

Lines changed: 10 additions & 0 deletions
@@ -16,13 +16,18 @@
     AdagentsFetchResult,
     AdagentsValidationReport,
     AdAgentsValidationResult,
+    AgentAuthorizationsDirectoryResult,
     AuthorizationContext,
+    DirectoryDiscoveryMethod,
+    DirectoryEdgeStatus,
+    DirectoryPublisherEntry,
     DiscoveryMethod,
     EntryErrorKind,
     domain_matches,
     fetch_adagents,
     fetch_adagents_with_cache,
     fetch_agent_authorizations,
+    fetch_agent_authorizations_from_directory,
     filter_revoked_selectors,
     get_all_properties,
     get_all_tags,
@@ -821,12 +826,17 @@ def get_adcp_version() -> str:
     "AdagentsEntryError",
     "AdagentsFetchResult",
     "AdagentsValidationReport",
+    "AgentAuthorizationsDirectoryResult",
     "AuthorizationContext",
+    "DirectoryDiscoveryMethod",
+    "DirectoryEdgeStatus",
+    "DirectoryPublisherEntry",
     "DiscoveryMethod",
     "EntryErrorKind",
     "fetch_adagents",
     "fetch_adagents_with_cache",
     "fetch_agent_authorizations",
+    "fetch_agent_authorizations_from_directory",
     "filter_revoked_selectors",
     "validate_adagents_domain",
     "validate_adagents_structure",

src/adcp/adagents.py

Lines changed: 171 additions & 1 deletion
@@ -14,12 +14,15 @@
 import re
 import socket
 from dataclasses import dataclass, field
+from datetime import datetime
 from typing import Any, Literal
-from urllib.parse import urlparse
+from urllib.parse import quote, urlparse

 import httpx
+from pydantic import Field

 from adcp.exceptions import AdagentsNotFoundError, AdagentsTimeoutError, AdagentsValidationError
+from adcp.types.base import AdCPBaseModel
 from adcp.validation import ValidationError, validate_adagents

 DiscoveryMethod = Literal["direct", "authoritative_location", "ads_txt_managerdomain"]
@@ -1773,3 +1776,170 @@ async def fetch_authorization_for_domain(

     # Build result dictionary, filtering out None values
     return {domain: ctx for domain, ctx in results if ctx is not None}
+
+
+# Wire schema for the AAO agent → publishers inverse-lookup endpoint
+# (`schemas/aao/agent-publishers.json`, adcp#4828). The publisher's own
+# adagents.json remains the trust root — these models describe a *discovery*
+# response, and callers SHOULD verify each `publisher_domain` against its
+# adagents.json via :func:`fetch_adagents` before trusting an authorization.
+DirectoryDiscoveryMethod = Literal[
+    "direct",
+    "authoritative_location",
+    "adagents_authoritative",
+    "ads_txt_managerdomain",
+]
+
+DirectoryEdgeStatus = Literal["authorized", "revoked"]
+
+
+class DirectoryPublisherEntry(AdCPBaseModel):
+    """One publisher row in an AAO directory inverse-lookup response."""
+
+    publisher_domain: str
+    discovery_method: DirectoryDiscoveryMethod
+    manager_domain: str | None = None
+    properties_authorized: int = Field(ge=0)
+    properties_total: int = Field(ge=0)
+    signing_keys_pinned: bool | None = None
+    status: DirectoryEdgeStatus
+    last_verified_at: datetime
+
+
+class AgentAuthorizationsDirectoryResult(AdCPBaseModel):
+    """Response envelope for ``GET /v1/agents/{agent_url}/publishers``.
+
+    Maps directly to ``schemas/aao/agent-publishers.json`` in the AdCP
+    bundle (adcp#4828). The directory is a discovery accelerator — each
+    ``publisher_domain`` row tells callers where to look; they SHOULD
+    verify the publisher's adagents.json directly before treating an
+    authorization as trusted.
+    """
+
+    agent_url: str
+    directory_indexed_at: datetime | None
+    publishers: list[DirectoryPublisherEntry] = Field(default_factory=list)
+    next_cursor: str | None = None
+
+
+# Per-page response cap. Matches MAX_POINTER_BYTES (5 MiB) — a directory
+# page is a small envelope; pagination handles bulk responses.
+MAX_DIRECTORY_PAGE_BYTES = 5 * 1024 * 1024
+
+
+async def fetch_agent_authorizations_from_directory(
+    agent_url: str,
+    *,
+    directory_url: str,
+    since: str | None = None,
+    timeout: float = 10.0,
+    client: httpx.AsyncClient | None = None,
+) -> AgentAuthorizationsDirectoryResult:
+    """Query an AAO directory for publishers that authorize ``agent_url``.
+
+    Calls ``GET {directory_url}/v1/agents/{agent_url}/publishers`` per the
+    AAO inverse-lookup contract (adcp#4823 / #4828) and returns the parsed
+    response. The directory's answer is *discovery*, not authorization:
+    callers should still verify each returned ``publisher_domain`` via
+    :func:`fetch_adagents` before treating an edge as trusted.
+
+    Args:
+        agent_url: The agent whose publisher authorizations are being
+            queried. Passed verbatim in the path; the directory echoes
+            back a canonicalized form on the response.
+        directory_url: HTTPS base URL of the AAO directory
+            (e.g. ``"https://aao.example.com"``). The ``/v1/agents/...``
+            path is appended; pass the directory's root, not a
+            request-specific path.
+        since: Optional opaque cursor or RFC 3339 timestamp from a prior
+            ``directory_indexed_at`` — passed through as ``?since=...``
+            to limit the result to edges that changed since that point.
+        timeout: Request timeout in seconds.
+        client: Optional shared ``httpx.AsyncClient`` for connection
+            pooling. Caller owns the client lifecycle.
+
+    Returns:
+        :class:`AgentAuthorizationsDirectoryResult`. On 404 from the
+        directory the function returns a result with ``publishers=[]``
+        and ``directory_indexed_at=None`` — directories MUST be allowed
+        to answer "I do not index this agent" without callers needing
+        to branch on exception type.
+
+    Raises:
+        AdagentsValidationError: If ``directory_url`` is malformed, the
+            response status is non-200/non-404, the body is not valid
+            JSON, or the body does not match the directory result schema.
+        AdagentsTimeoutError: If the request times out.
+
+    Notes:
+        - ``directory_url`` is gated through the same SSRF protection
+          (HTTPS only, DNS pre-check, private/reserved address ban) as
+          publisher-side fetches.
+        - Response bodies are capped at 5 MiB. Bulk responses paginate
+          via ``next_cursor``; pass that value as ``since`` on the next
+          call — same wire field, different semantics per the schema.
+    """
+    if not isinstance(agent_url, str) or not agent_url:
+        raise AdagentsValidationError("agent_url must be a non-empty string")
+    if not isinstance(directory_url, str) or not directory_url:
+        raise AdagentsValidationError("directory_url must be a non-empty string")
+
+    base = directory_url.rstrip("/")
+    if not base.startswith("https://"):
+        raise AdagentsValidationError(f"directory_url must be an HTTPS URL, got: {directory_url!r}")
+    _validate_redirect_url(f"{base}/v1/agents/_/publishers")
+
+    request_url = f"{base}/v1/agents/{quote(agent_url, safe='')}/publishers"
+    if since is not None:
+        request_url = f"{request_url}?since={quote(since, safe='')}"
+
+    parsed = urlparse(request_url)
+    await _dns_validate_host(
+        parsed.hostname or "", parsed.port or (443 if parsed.scheme == "https" else 80)
+    )
+
+    headers = {"User-Agent": "AdCP-Client/1.0", "Accept": "application/json"}
+
+    try:
+        if client is not None:
+            body, status_code, _ = await _stream_capped(
+                client, request_url, headers, timeout, MAX_DIRECTORY_PAGE_BYTES
+            )
+        else:
+            async with httpx.AsyncClient() as new_client:
+                body, status_code, _ = await _stream_capped(
+                    new_client, request_url, headers, timeout, MAX_DIRECTORY_PAGE_BYTES
+                )
+    except httpx.TimeoutException as e:
+        raise AdagentsTimeoutError(parsed.netloc, timeout) from e
+    except httpx.RequestError as e:
+        raise AdagentsValidationError(f"Failed to fetch agent-publishers directory: {e}") from e
+
+    if status_code == 404:
+        # Per adcp#4828, a directory that has not indexed this agent
+        # answers 404. Surface as an empty result so callers don't need
+        # to special-case the exception path for "no edges" — the
+        # protocol is intentionally permissive here.
+        return AgentAuthorizationsDirectoryResult(
+            agent_url=agent_url,
+            directory_indexed_at=None,
+            publishers=[],
+            next_cursor=None,
+        )
+
+    if status_code != 200:
+        raise AdagentsValidationError(f"Agent-publishers directory returned HTTP {status_code}")
+
+    try:
+        data = json.loads(body)
+    except json.JSONDecodeError as e:
+        raise AdagentsValidationError(
+            f"Invalid JSON in agent-publishers directory response: {str(e)[:200]}"
+        ) from e
+
+    try:
+        return AgentAuthorizationsDirectoryResult.model_validate(data)
+    except Exception as e:  # pydantic.ValidationError + any coercion failure
+        raise AdagentsValidationError(
+            f"Agent-publishers directory response failed schema validation: {e}"
+        ) from e
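The function above percent-encodes `agent_url` with `quote(..., safe='')` before placing it in the path, so the slashes and colon of the agent's own URL cannot be read as extra path segments. The sketch below isolates that step using only the standard library. `build_directory_request_url` is a hypothetical helper name introduced for illustration, it raises `ValueError` where the real code raises `AdagentsValidationError`, and it omits the SSRF and DNS checks entirely.

```python
from typing import Optional
from urllib.parse import quote, urlsplit


def build_directory_request_url(
    directory_url: str, agent_url: str, since: Optional[str] = None
) -> str:
    """Hypothetical helper mirroring the URL construction in the diff."""
    base = directory_url.rstrip("/")
    if not base.startswith("https://"):
        raise ValueError(f"directory_url must be an HTTPS URL, got: {directory_url!r}")
    # safe='' also encodes "/" and ":", keeping the agent URL in one path segment.
    url = f"{base}/v1/agents/{quote(agent_url, safe='')}/publishers"
    if since is not None:
        url = f"{url}?since={quote(since, safe='')}"
    return url


url = build_directory_request_url("https://aao.example.com/", "https://agent.example/mcp")
url_since = build_directory_request_url(
    "https://aao.example.com", "https://agent.example/mcp", since="2025-01-01T00:00:00Z"
)

# urlsplit leaves percent-encoding untouched, so the path has exactly four segments.
segments = urlsplit(url).path.split("/")
print(url)
print(url_since)
print(segments)
```

With the default `safe='/'`, the slashes in `agent_url` would pass through unencoded and the directory would see a different route, which is presumably why the diff overrides it.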

tests/fixtures/public_api_snapshot.json

Lines changed: 5 additions & 0 deletions
@@ -35,6 +35,7 @@
     "AdagentsValidationError",
     "AdagentsValidationReport",
     "AdvertiserIndustry",
+    "AgentAuthorizationsDirectoryResult",
     "AgentCapabilities",
     "AgentCompliance",
     "AgentConfig",
@@ -116,6 +117,9 @@
     "Destination",
     "DevicePlatform",
     "DeviceType",
+    "DirectoryDiscoveryMethod",
+    "DirectoryEdgeStatus",
+    "DirectoryPublisherEntry",
     "DiscoveryMethod",
     "DomainLookupResult",
     "Duration",
@@ -352,6 +356,7 @@
     "fetch_adagents",
     "fetch_adagents_with_cache",
     "fetch_agent_authorizations",
+    "fetch_agent_authorizations_from_directory",
     "filter_revoked_selectors",
     "generate_webhook_idempotency_key",
     "generated",
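The commit message says this fixture records the intentional new public exports. One plausible way such a snapshot is used is to diff the recorded names against the package's current exports and flag anything added or removed. The sketch below shows that comparison; it is an assumption about the general technique, not the repository's actual test. `diff_public_api` is a hypothetical helper, and the two name lists are small illustrative subsets rather than the real fixture contents.

```python
import json
from typing import List, Tuple

# Hypothetical recorded snapshot (a subset, as it might look before this commit).
recorded = json.loads(
    '["DiscoveryMethod", "fetch_adagents", "fetch_agent_authorizations"]'
)

# Hypothetical current public surface after the commit (same subset plus one export).
current_exports = [
    "DiscoveryMethod",
    "fetch_adagents",
    "fetch_agent_authorizations",
    "fetch_agent_authorizations_from_directory",
]


def diff_public_api(recorded: List[str], current: List[str]) -> Tuple[List[str], List[str]]:
    """Return (added, removed) names, each sorted for stable reporting."""
    recorded_set, current_set = set(recorded), set(current)
    return sorted(current_set - recorded_set), sorted(recorded_set - current_set)


added, removed = diff_public_api(recorded, current_exports)
print("added:", added)
print("removed:", removed)
```

Under this reading, a non-empty `added` list is what forces the fixture to be regenerated whenever a commit deliberately widens the public API.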
