Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
86 changes: 82 additions & 4 deletions unstract/sdk1/src/unstract/sdk1/adapters/base1.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@
import inspect
import logging
import os
import re
from abc import ABC, abstractmethod
from importlib import import_module
from typing import TYPE_CHECKING
Expand All @@ -15,6 +16,79 @@

logger = logging.getLogger(__name__)

# Anthropic models that have deprecated sampling parameters (`temperature`,
# `top_p`, `top_k`). The patterns are substring-matched against the model id
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LOW — comment drift: “substring-matched” no longer accurate

Lines 20 and 58 still describe matching as “substring-matched against the model id”, but the implementation now uses re.compile(...) + .search() with a trailing-edge lookahead. A pure substring match would (correctly) flag claude-opus-4-70; the current regex does not. Reword as e.g. “regex-searched against the normalized id, with a trailing-edge boundary so that claude-opus-4-70 does not match.” Keeps the comment honest after the anchor change in 7fb66f15.

# after lowercasing and normalizing `.` / `_` to `-`, so a single entry covers:
# - Native Anthropic `claude-opus-4-7`, `anthropic/claude-opus-4-7`
# - Bedrock foundation model `anthropic.claude-opus-4-7-<date>-v1:0`
# - Bedrock route prefixes `converse/...`, `invoke/...`
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LOW — misleading bullet: route prefixes are not part of the pattern

“Bedrock route prefixes converse/..., invoke/...” reads as if the regex enumerates prefixes; it does not. The regex matches the claude-opus-4-7 token regardless of leading text, so prefixes like converse/ / invoke/ / bedrock/converse/ are accepted only as a side-effect of the trailing-edge-only anchor. Either drop the bullet or rephrase as “prefixes such as converse/ and invoke/ pass through because the match is anchored only at the trailing edge.”

# - Bedrock cross-region profile `us.anthropic.claude-opus-4-7-...`,
# `eu.`, `apac.`, `global.` variants
# - Bedrock foundation-model ARN `arn:aws:bedrock:<region>::foundation-model/
# anthropic.claude-opus-4-7-...`
# - Bedrock inference-profile ARN `arn:aws:bedrock:<region>:<account>:
# inference-profile/us.anthropic.claude-opus-4-7-...`
# - Vertex AI `vertex_ai/claude-opus-4-7@<date>`
# - Azure AI Foundry deployments whose name embeds `claude-opus-4-7`
# Add new entries here when Anthropic deprecates sampling on more models.
# Patterns are anchored at the trailing edge: the version digit must be
# followed by end-of-string or one of `-:@/v`, so unrelated future ids like
# `claude-opus-4-70`, `claude-opus-4-75` do not match. The allowed delimiters
# cover the date suffix (`-20260101-v1:0`), Vertex `@<date>`, ARN paths, and
# the `v1:0` version tag.
# See https://platform.claude.com/docs/en/about-claude/models/whats-new-claude-4-7
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LOW — verify canonical docs URL

https://platform.claude.com/docs/en/about-claude/models/whats-new-claude-4-7 — the canonical Anthropic docs host is docs.claude.com / docs.anthropic.com; platform.claude.com is the console host and may not serve this docs path stably. Worth checking the link resolves before merge so the rationale comment doesn’t rot.

_SAMPLING_DEPRECATED_MODEL_PATTERNS: tuple[re.Pattern[str], ...] = (
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OPTIONAL (simplify) — tuple-of-patterns + _MODEL_ID_FIELDS are over-abstracted for one entry today

With exactly one regex and exactly two id-bearing fields, the indirection costs more than it saves. If you want to reduce surface:

_SAMPLING_DEPRECATED_MODEL_RE = re.compile(r"claude-opus-4-7(?=$|[-:@/v])")
…
def _strip_deprecated_sampling_params(validated):
    if _has_deprecated_sampling_params(validated.get("model")) \
       or _has_deprecated_sampling_params(validated.get("model_id")):
        for param in _DEPRECATED_SAMPLING_PARAMS:
            validated.pop(param, None)
    return validated

Going back to a tuple/loop when (if) a second deprecation lands is a 2-line change. Reasonable people will disagree — the current shape is readable; this is purely a YAGNI call. Skip if you prefer keeping the table for forward-compat.

re.compile(r"claude-opus-4-7(?=$|[-:@/v])"),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HIGH — anchor lookahead admits v-prefixed words, not just version tags

(?=$|[-:@/v]) puts a bare v in the character class, so the trailing-edge boundary fires for any token starting with the letter v — e.g. claude-opus-4-7verbose, claude-opus-4-7vnext, or any future suffix-with-v id would match (currently theoretical, but the comment on lines 34–38 explicitly claims the anchor blocks unrelated future ids).

Fix: require the version pattern explicitly, e.g.

re.compile(r"claude-opus-4-7(?=$|[-:@/]|v\d)")

Keeps …-v1:0 / …v1 valid while rejecting alpha continuations starting with v. Extends the anchor concern raised earlier by greptile rather than duplicating it.

)
_DEPRECATED_SAMPLING_PARAMS: tuple[str, ...] = ("temperature", "top_p", "top_k")
# Fields whose value can carry a model id. `model` is universal; `model_id` is
Comment thread
pk-zipstack marked this conversation as resolved.
# Bedrock's separate ARN field used for Application Inference Profile cost
# tracking — when callers route through an AIP, the standard model id often
# only appears here, not in `model`.
_MODEL_ID_FIELDS: tuple[str, ...] = ("model", "model_id")


def _has_deprecated_sampling_params(model: str | None) -> bool:
"""Return True when the model rejects sampling parameters.

Anthropic deprecated `temperature`, `top_p`, and `top_k` starting with
Claude Opus 4.7; sending any of them yields a 400 from Anthropic and from
the providers that proxy it (Bedrock, Azure AI Foundry, Vertex AI).

The check normalizes case and `.`/`_` separators to `-`, then substring-
matches against the patterns. This catches every format that embeds the
model id (foundation model ids, cross-region profiles, foundation-model
ARNs, inference-profile ARNs, Vertex `@`-suffixed ids).
Comment thread
pk-zipstack marked this conversation as resolved.

It does NOT catch:
- Bedrock Application Inference Profile ARNs (e.g.
`arn:aws:bedrock:...:application-inference-profile/abcd1234`), whose
tail is an opaque profile id — the underlying model is not recoverable
from the string. Pass the AIP ARN in `model_id` and keep the standard
model id in `model`, or the strip won't fire.
- Azure AI Foundry deployment names that omit the model id; rename the
deployment to include `claude-opus-4-7` so detection works.
"""
if not model:
return False
normalized = model.lower().replace(".", "-").replace("_", "-")
return any(rx.search(normalized) for rx in _SAMPLING_DEPRECATED_MODEL_PATTERNS)


def _strip_deprecated_sampling_params(validated: dict[str, "Any"]) -> dict[str, "Any"]:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MED — mutate-and-return is inconsistent with surrounding callers

The rest of this file consistently builds result_metadata = adapter_metadata.copy() (e.g. lines 264, 350, 584) when avoiding mutation. This helper mutates validated in place and returns the same dict; today’s four call sites are safe, but the asymmetry invites a future caller to do before = validated; after = _strip_deprecated_sampling_params(validated) and be surprised that before is after.

Fix (lowest-risk): rename to _strip_deprecated_sampling_params_inplace and return None, or shallow-copy and return the copy. The dict is small — a copy costs nothing and matches surrounding style.

Type-design note: invariant strength would also improve if the pattern ↔ forbidden-params link were structural rather than positional (parallel _SAMPLING_DEPRECATED_MODEL_PATTERNS and _DEPRECATED_SAMPLING_PARAMS tuples). Worth folding into ((pattern, frozenset(params)), …) only when a second deprecation lands with a different param set — not before.

"""Drop sampling params for models that reject them (e.g. Claude Opus 4.7).

Pydantic's `model_dump()` re-emits the default `temperature=0.1` even when
Comment thread
pk-zipstack marked this conversation as resolved.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LOW — docstring “why” understates the scope of the strip

“Pydantic’s model_dump() re-emits the default temperature=0.1” is the load-bearing reason for popping temperature, but top_p / top_k are not declared fields on BaseChatCompletionParameters and so are not subject to the same Pydantic-default re-emit. The strip’s reason for handling them is to defend against caller-supplied values flowing through **adapter_metadata. Worth one extra clause so future readers understand both halves of the invariant.

(Distinct from the existing pk-zipstack / chandrasekharan-zipstack thread on this line about whether litellm handles the cleanup upstream.)

the caller never set one, so the field must be popped after validation to
keep it off the wire. Checks both `model` and `model_id` so Bedrock callers
routing through an Application Inference Profile are covered when the
standard model id only appears in `model_id`.
"""
if any(_has_deprecated_sampling_params(validated.get(f)) for f in _MODEL_ID_FIELDS):
for param in _DEPRECATED_SAMPLING_PARAMS:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MED — post-strip temperature key invariant breaks validated["temperature"] consumers

BaseChatCompletionParameters.temperature defaults to Field(default=0.1), so before this PR every validate() return was guaranteed to carry temperature. After the strip, callers using validated["temperature"] (vs. .get) on an Opus 4.7 path now raise KeyError. Anything that reads this key for cost-tracking, telemetry, or UI display in the platform service / SDK runtime would silently break only on 4.7.

Fix: grep platform-service / SDK consumers of validate() output for direct ["temperature"] access and switch to .get("temperature"); document in this docstring that temperature may be absent post-strip so future consumers know the key is now optional.

validated.pop(param, None)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HIGH — silent no-op when sampling params are present but no model-id field matches

If a Bedrock Application Inference Profile ARN with an opaque tail (or an Azure Foundry deployment whose name omits the model id) reaches this helper while temperature/top_p/top_k are populated, the strip silently returns the dict unchanged. The failure surfaces only as a 400 from Anthropic after dispatch, with no breadcrumb tying it back to the strip path. Six months from now an operator hitting that 400 has no way to know detection was attempted-and-skipped vs. never attempted.

Fix: add a single logger.debug (or logger.warning if you want it to page operators) on the no-strip branch only when at least one of _DEPRECATED_SAMPLING_PARAMS is present in validated:

if matched:
    for p in _DEPRECATED_SAMPLING_PARAMS:
        validated.pop(p, None)
elif any(p in validated for p in _DEPRECATED_SAMPLING_PARAMS):
    logger.debug(
        "Sampling-strip skipped; model ids did not match deprecation patterns: %s",
        {f: validated.get(f) for f in _MODEL_ID_FIELDS if validated.get(f)},
    )

Keeps the AIP/Foundry fallback non-fatal but observable.

return validated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MED — zero committed tests for the new helpers or the four wired call sites

git diff origin/main…HEAD --name-only shows only base1.py — the PR description’s 24-positive / 8-negative / AIP-fallback / four-end-to-end matrix is ad-hoc local. Repo-wide grep for claude-opus-4-7, _strip_deprecated_sampling_params, _has_deprecated_sampling_params matches only this file. The Vertex AI wiring gap that surfaced in review (commit 5a4ea27f shipped without it; addressed in 7fb66f15) is exactly the failure mode a parametrized adapter test would have caught.

Suggested minimal lock-in (unstract/sdk1/tests/adapters/test_base1_sampling_strip.py):

  • test_has_deprecated_sampling_params_positivepytest.mark.parametrize over the 24 id formats from the PR description (native, anthropic. Bedrock, us./eu./apac./global. cross-region, foundation-model ARN, inference-profile ARN, vertex_ai/...@<date>, Azure deployment, converse/, invoke/, ./_ separators, mixed case).
  • test_has_deprecated_sampling_params_negativeclaude-opus-4-6, claude-opus-4-70, claude-opus-4-75, claude-sonnet-4-7, claude-haiku-4-5, gpt-4o, gemini-2.0-flash, opaque AIP ARN, None, "". The 4-70/4-75 cases lock the (?=$|[-:@/v]) anchor.
  • test_strip_checks_model_id_field_fallback — opus id in model only, in model_id only, both opaque (no strip).
  • test_validate_strips_for_<adapter> — four parametrized cases (AnthropicLLMParameters, AWSBedrockLLMParameters, AzureAIFoundryLLMParameters, VertexAILLMParameters), each asserting the three keys absent for opus 4.7 / present for opus 4.6. A future refactor that drops a single call site regresses silently without these.



def register_adapters(adapters: dict[str, dict[str, "Any"]], adapter_type: str) -> None:
"""Register all SDK v1 adapters of given type.
Expand Down Expand Up @@ -462,7 +536,7 @@ def validate(adapter_metadata: dict[str, "Any"]) -> dict[str, "Any"]:
if "thinking" in result_metadata:
validated_data["thinking"] = result_metadata["thinking"]

return validated_data
return _strip_deprecated_sampling_params(validated_data)

@staticmethod
def validate_model(adapter_metadata: dict[str, "Any"]) -> str:
Expand Down Expand Up @@ -538,7 +612,7 @@ def validate(adapter_metadata: dict[str, "Any"]) -> dict[str, "Any"]:
if enable_thinking and "thinking" in result_metadata:
validated["thinking"] = result_metadata["thinking"]

return validated
return _strip_deprecated_sampling_params(validated)

@staticmethod
def validate_model(adapter_metadata: dict[str, "Any"]) -> str:
Expand Down Expand Up @@ -627,7 +701,7 @@ def validate(adapter_metadata: dict[str, "Any"]) -> dict[str, "Any"]:
if enable_extended_context:
validated["extra_headers"] = {"anthropic-beta": "context-1m-2025-08-07"}

return validated
return _strip_deprecated_sampling_params(validated)

@staticmethod
def validate_model(adapter_metadata: dict[str, "Any"]) -> str:
Expand Down Expand Up @@ -837,7 +911,11 @@ def validate(adapter_metadata: dict[str, "Any"]) -> dict[str, "Any"]:
adapter_metadata
)

return AzureAIFoundryLLMParameters(**adapter_metadata).model_dump()
validated = AzureAIFoundryLLMParameters(**adapter_metadata).model_dump()
# Azure AI Foundry proxies Anthropic models that reject sampling params
# (e.g. Claude Opus 4.7); detection relies on the deployment name
# embedding the model id.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LOW — inline comment is redundant and understates the failure mode

The helper docstring already documents the Foundry deployment-name caveat, so the three lines here mostly restate it. If the comment stays, strengthen the second half: “detection relies on the deployment name embedding the Anthropic model id; opaque deployment names skip the strip and will surface as a 400 from the provider.” Otherwise delete — the helper call reads self-explanatorily.

return _strip_deprecated_sampling_params(validated)

@staticmethod
def validate_model(adapter_metadata: dict[str, "Any"]) -> str:
Expand Down