Skip to content

feat(agent-server): typed settings update endpoints + ETag/If-Match#3234

Open
rbren wants to merge 1 commit into
mainfrom
feature/typed-settings-update
Open

feat(agent-server): typed settings update endpoints + ETag/If-Match#3234
rbren wants to merge 1 commit into
mainfrom
feature/typed-settings-update

Conversation

@rbren
Copy link
Copy Markdown
Contributor

@rbren rbren commented May 12, 2026

This PR was opened by an AI agent (OpenHands) on behalf of the user.

What's wrong with PATCH /api/settings today

The current endpoint accepts:

class SettingsUpdateRequest(BaseModel):
    agent_settings_diff: dict[str, Any] | None = None
    conversation_settings_diff: dict[str, Any] | None = None

and applies the diffs via _deep_merge onto the persisted settings. A few of the consequences:

  1. The API contract is dict[str, Any]. Generated clients are useless; typos like lllm are silently merged and forgotten; field renames in OpenHandsAgentSettings aren't even breaking changes at the wire level.
  2. The public schema lies about itself. The internal SettingsUpdatePayload accepts an active_profile key that the public SettingsUpdateRequest doesn't expose. The handler casts past this mismatch.
  3. "Diff" is undocumented deep-merge. Dicts merge, lists replace, and there is no way to delete a key — to clear mcp.headers["Authorization"] you must overwrite the parent dict, which has all the read-modify-write hazards below.
  4. Discriminated-union variants are not respected. A PATCH that switches agent_kind: openhands → acp ends up with leftover OpenHands-only fields silently dropped by re-validation, with no warning.
  5. No optimistic concurrency. The file lock prevents on-disk corruption, not lost updates. Two clients each doing GET → modify → PATCH cheerfully clobber each other; the server has no ETag/If-Match/version to detect it. Most concerning case: list/dict fields (tools, mcp_config.mcpServers) which replace wholesale, so concurrent edits to different MCP servers silently undo one another.
  6. No way for a client to confirm what was stored. The PATCH response intentionally strips secrets and there is no version stamp, so verifying "did my change land?" requires a follow-up GET whose result may already include another writer's interleaved changes.
  7. Generic 422 with no field info. Even errors that carry no secret (max_iterations: 0) collapse to "Settings validation failed". The client cannot point at the bad field.

Idempotency is accidentally preserved for the common single-client case, but breaks under concurrent writers, encrypted-secret round-trips, and migration scenarios.

What this PR does

1. Typed update models in the SDK (openhands.sdk.settings.update_models)

class OpenHandsAgentSettingsUpdate(_UpdateBase):  # extra="forbid"
    agent_kind: Literal["openhands", "llm"] = "openhands"
    agent: str | None = None
    llm: LLMUpdate | None = None          # all-Optional partial of LLM
    tools: list[Tool] | None = None
    enable_sub_agents: bool | None = None
    enable_switch_llm_tool: bool | None = None
    mcp_config: MCPConfig | None = None
    agent_context: AgentContext | None = None
    condenser: CondenserSettingsUpdate | None = None
    verification: VerificationSettingsUpdate | None = None

class ACPAgentSettingsUpdate(_UpdateBase): ...
class ConversationSettingsUpdate(_UpdateBase): ...

AgentSettingsUpdate = Annotated[..., Discriminator(_agent_update_discriminator)]

With apply_agent_update(current, update) and apply_conversation_update(current, update) helpers that:

  • Merge only the fields the client explicitly sent (model_fields_set).
  • Treat nested partials (llm, condenser, verification) as one-level merges into the current sub-object, so e.g. {"llm": {"model": "gpt-4o"}} preserves the existing api_key.
  • Re-validate the merged result through the source model's full validator chain (constraints, secret validators, MCP env decryption — all the usual invariants).
  • Refuse to switch agent_kind (use PUT to switch variants).

2. New endpoints in settings_router.py

PATCH /api/settings/agent          — typed partial update (extra="forbid")
PUT   /api/settings/agent          — full replace (allows variant switching)
PATCH /api/settings/conversation   — typed partial update
PUT   /api/settings/conversation   — full replace

All four:

  • Read the body as a Pydantic model with extra="forbid" (typos → 422).
  • Honour If-Match against the global settings ETag.
  • Return the new resource state with ETag header in the response.
  • Reuse the existing file-locked store.update() so writes remain atomic and serialise with each other and with the legacy endpoint.

3. ETag / If-Match plumbing (_settings_etag.py)

  • compute_settings_etag() hashes a plaintext-canonical projection of PersistedSettings. Crucially not the on-disk bytes — Fernet nonces would make identical state look like a change on every save, defeating idempotency.
  • check_if_match() returns 412 Precondition Failed on stale If-Match, with the current ETag echoed in the response so the client can retry against the live state. If-Match: * is accepted (resource always exists, since settings default).
  • The shared _run_settings_update helper performs the If-Match check inside the file lock, so the ETag and the state the apply will run against are the same.

4. Bonus fix in api.py

The shared HTTPException handler in _add_exception_handlers was constructing the JSONResponse from status_code + detail but dropping exc.headers, so any handler-set response header on an error (e.g. our 412 ETag, or a WWW-Authenticate: ... on 401) was lost. Propagate it.

5. Legacy endpoint preserved + upgraded

PATCH /api/settings is unchanged in semantics and still accepts the old dict[str, Any] diff payload. In addition it now:

  • Honours If-Match and emits the new ETag header in success responses.
  • Emits Deprecation: true and Link: </api/settings/agent>; rel="successor-version", </api/settings/conversation>; rel="successor-version" (RFC 8594 / RFC 9745) so forward-compatible clients can migrate at their leisure.

No existing tests were modified; the legacy suite (28 tests) still passes unchanged.

Tests

tests/agent_server/test_typed_settings_router.py (21 tests) covers:

  • ETag stability across identical writes (the core idempotency property).
  • ETag changes on real change.
  • If-Match matching, stale → 412, wildcard.
  • Legacy PATCH /api/settings honouring If-Match and emitting the deprecation headers.
  • Typed PATCH: extra="forbid", variant invariance, agent_kind: "llm" alias, partial LLM preserving api_key.
  • PUT: full replace, variant switch (openhandsacp), 422 on invalid body.
  • Conversation: partial preserve, forbid extra, validation re-fires.
  • End-to-end concurrency: documented loss without If-Match, detected with If-Match.

Legacy: 28/28 tests/agent_server/test_settings_router.py still pass.
SDK: 87/87 tests/sdk/test_settings.py + tests/sdk/settings/ still pass.

(The 15 unrelated failures in tests/agent_server/test_skills_router.py and test_terminal_service.py exist on main before this PR.)

Compatibility

  • Wire-compatible: no existing request body shape changes; SettingsUpdateRequest is unchanged.
  • New response header (ETag) on GET /api/settings, PATCH /api/settings, and the new endpoints. Old clients that ignore unknown headers see no change.
  • Deprecation / Link on the legacy PATCH /api/settings response are advisory; old clients ignore them.

Follow-ups (not in this PR)

  • MCP servers should move to per-server CRUD (GET /api/settings/mcp-servers, GET/PUT/PATCH/DELETE /api/settings/mcp-servers/{name}). Today even the new typed PATCH still treats mcp_config as one blob: editing one server's env requires round-tripping every other server's encrypted secrets. The CRUD shape would also give per-server ETags so concurrent edits to different servers stop conflicting at all. Mirroring the existing /api/settings/secrets router design.
  • Strict-mode If-Match: a config flag to require If-Match on all settings writes, for deployments that want concurrency safety to be mandatory rather than opt-in.
  • Field-level 422 errors: today the typed handlers still collapse re-validation errors to a generic message to avoid leaking secret values in tracebacks. We could pass through loc + a sanitised msg for fields known not to carry secrets.

Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant Architectures Base Image Docs / Tags
java amd64, arm64 eclipse-temurin:17-jdk Link
python amd64, arm64 nikolaik/python-nodejs:python3.13-nodejs22-slim Link
golang amd64, arm64 golang:1.21-bookworm Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:7d67356-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-7d67356-python \
  ghcr.io/openhands/agent-server:7d67356-python

All tags pushed for this build

ghcr.io/openhands/agent-server:7d67356-golang-amd64
ghcr.io/openhands/agent-server:7d67356e72102f378012ac81dbd67f569485f8cd-golang-amd64
ghcr.io/openhands/agent-server:feature-typed-settings-update-golang-amd64
ghcr.io/openhands/agent-server:7d67356-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:7d67356-golang-arm64
ghcr.io/openhands/agent-server:7d67356e72102f378012ac81dbd67f569485f8cd-golang-arm64
ghcr.io/openhands/agent-server:feature-typed-settings-update-golang-arm64
ghcr.io/openhands/agent-server:7d67356-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:7d67356-java-amd64
ghcr.io/openhands/agent-server:7d67356e72102f378012ac81dbd67f569485f8cd-java-amd64
ghcr.io/openhands/agent-server:feature-typed-settings-update-java-amd64
ghcr.io/openhands/agent-server:7d67356-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:7d67356-java-arm64
ghcr.io/openhands/agent-server:7d67356e72102f378012ac81dbd67f569485f8cd-java-arm64
ghcr.io/openhands/agent-server:feature-typed-settings-update-java-arm64
ghcr.io/openhands/agent-server:7d67356-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:7d67356-python-amd64
ghcr.io/openhands/agent-server:7d67356e72102f378012ac81dbd67f569485f8cd-python-amd64
ghcr.io/openhands/agent-server:feature-typed-settings-update-python-amd64
ghcr.io/openhands/agent-server:7d67356-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:7d67356-python-arm64
ghcr.io/openhands/agent-server:7d67356e72102f378012ac81dbd67f569485f8cd-python-arm64
ghcr.io/openhands/agent-server:feature-typed-settings-update-python-arm64
ghcr.io/openhands/agent-server:7d67356-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:7d67356-golang
ghcr.io/openhands/agent-server:7d67356e72102f378012ac81dbd67f569485f8cd-golang
ghcr.io/openhands/agent-server:feature-typed-settings-update-golang
ghcr.io/openhands/agent-server:7d67356-golang_tag_1.21-bookworm
ghcr.io/openhands/agent-server:7d67356-java
ghcr.io/openhands/agent-server:7d67356e72102f378012ac81dbd67f569485f8cd-java
ghcr.io/openhands/agent-server:feature-typed-settings-update-java
ghcr.io/openhands/agent-server:7d67356-eclipse-temurin_tag_17-jdk
ghcr.io/openhands/agent-server:7d67356-python
ghcr.io/openhands/agent-server:7d67356e72102f378012ac81dbd67f569485f8cd-python
ghcr.io/openhands/agent-server:feature-typed-settings-update-python
ghcr.io/openhands/agent-server:7d67356-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim

About Multi-Architecture Support

  • Each variant tag (e.g., 7d67356-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 7d67356-python-amd64) are also available if needed

Adds typed PATCH/PUT endpoints for agent and conversation settings as a
discoverable, concurrency-safe alternative to the legacy untyped
dict[str, Any] PATCH /api/settings:

  PATCH /api/settings/agent          — typed partial update (extra='forbid')
  PUT   /api/settings/agent          — full replace (variant switching)
  PATCH /api/settings/conversation   — typed partial update
  PUT   /api/settings/conversation   — full replace

Other changes:

* SDK exposes OpenHandsAgentSettingsUpdate, ACPAgentSettingsUpdate,
  ConversationSettingsUpdate, LLMUpdate etc. + apply_agent_update /
  apply_conversation_update helpers.
* All settings endpoints now emit an ETag header (computed over the
  plaintext-canonical state so it survives per-save Fernet nonces) and
  honour If-Match for optimistic concurrency (412 on mismatch, with the
  current ETag echoed back).
* The legacy PATCH /api/settings is preserved as-is for backwards
  compatibility and additionally honours If-Match plus emits
  Deprecation / Link headers pointing at the typed successors.
* The shared HTTPException handler in api.py now propagates exc.headers
  so the 412 ETag actually reaches the client.

See PR description for a fuller analysis of the issues this addresses.

Co-authored-by: openhands <openhands@all-hands.dev>
@github-actions
Copy link
Copy Markdown
Contributor

Python API breakage checks — ✅ PASSED

Result:PASSED

Action log

@github-actions
Copy link
Copy Markdown
Contributor

REST API breakage checks (OpenAPI) — ✅ PASSED

Result:PASSED

Action log

Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ QA Report: PASS

All typed settings endpoints work correctly with ETag/If-Match support for optimistic concurrency control.

Does this PR achieve its stated goal?

Yes. This PR successfully adds ETag/If-Match support to prevent lost updates and provides new typed endpoints (PATCH/PUT /api/settings/agent and /api/settings/conversation) with strict extra="forbid" validation. The legacy endpoint now honors If-Match and signals deprecation via headers. All functional requirements verified through real HTTP requests to a running agent-server.

Phase Result
Environment Setup make build succeeded, agent-server started on port 8000
CI Status ⚠️ pre-commit failing, other checks in progress (not blocking functional verification)
Functional Verification ✅ All 15 endpoint behaviors verified via curl
Functional Verification

Test 1: GET /api/settings returns ETag header

Verification:
Ran curl -i http://127.0.0.1:8000/api/settings:

HTTP/1.1 200 OK
etag: "95f24b306974722b34e0ae4614b1f9ea"

This confirms the GET endpoint now returns a quoted ETag header that clients can capture for subsequent If-Match requests.


Test 2: ETag stability across identical saves

Step 1 — Establish baseline:
Made two identical PATCH requests (setting llm.model to gpt-4o).

Result:

First ETag: "8179e34f3475c871d432bba8ee4aecbf"
Second ETag: "8179e34f3475c871d432bba8ee4aecbf"
✓ ETags match - stable

This proves ETags are computed over plaintext-canonical state (not encrypted bytes), so identical saves produce identical ETags — a critical improvement over byte-level hashing.


Test 3: ETag changes on real content change

Step 1 — Baseline ETag:

ETag before: "8179e34f3475c871d432bba8ee4aecbf"

Step 2 — Change model to gpt-4:
Ran PATCH /api/settings/agent with {"llm":{"model":"gpt-4"}}.

Result:

ETag after: "ecbafbb7009b615cf6375f2a056cbd66"
✓ ETags differ - change detected

ETags correctly change when content changes, enabling clients to detect concurrent modifications.


Test 4: If-Match validation - 412 on stale ETag

Step 1 — Capture current ETag:

Captured ETag: "ecbafbb7009b615cf6375f2a056cbd66"

Step 2 — Another client moves state forward:
Made a PATCH changing model to claude-sonnet-4-20250514.

Step 3 — Attempt update with stale ETag:
Ran PATCH /api/settings/agent with If-Match: "ecbafbb7009b615cf6375f2a056cbd66".

Result:

HTTP/1.1 412 Precondition Failed
etag: "02c53e400a17e6325149228fb822de82"
{"detail":"If-Match does not match the current settings ETag. Re-fetch /api/settings and retry with the new ETag."}

The 412 response correctly rejects stale updates and returns the current ETag so clients can retry. This is the core optimistic concurrency feature working.


Test 5: If-Match validation - success with current ETag

Verification:
Captured current ETag and immediately made a PATCH with that ETag in If-Match header.

Result:

HTTP/1.1 200 OK
etag: "8179e34f3475c871d432bba8ee4aecbf"

Update succeeded when If-Match matches current state.


Test 6: Legacy PATCH /api/settings honors If-Match and signals deprecation

Verification:
Ran PATCH /api/settings (legacy endpoint) with valid If-Match header.

Result:

HTTP/1.1 200 OK
etag: "ecbafbb7009b615cf6375f2a056cbd66"
deprecation: true
link: </api/settings/agent>; rel="successor-version", </api/settings/conversation>; rel="successor-version"

The legacy endpoint:

  1. Honors If-Match for optimistic concurrency
  2. Returns Deprecation: true header (RFC 8594)
  3. Returns Link headers pointing to typed successors

Backward compatibility maintained while signaling migration path.


Test 7: PATCH /api/settings/agent preserves untouched fields (api_key)

Step 1 — Set API key:
Used legacy PATCH to set llm.api_key to sk-test-original-key.

Step 2 — Change only model via typed PATCH:
Ran PATCH /api/settings/agent with {"llm":{"model":"gpt-4o"}}.

Step 3 — Verify API key preserved:
Fetched settings with X-Expose-Secrets: plaintext header.

Result:

api_key: sk-test-original-key
model: gpt-4o

This is the key practical benefit — clients can update model names without round-tripping secrets.


Test 8: extra="forbid" validation rejects unknown fields

Verification:
Sent PATCH /api/settings/agent with typo: {"lllm":{"model":"gpt-4o"}}.

Result:

HTTP/1.1 422 Unprocessable Content
{"detail":"Agent settings update validation failed"}

Unknown fields fail loudly instead of being silently ignored (major improvement over legacy deep-merge).


Test 9: Variant invariance - PATCH cannot switch agent_kind

Verification:
Attempted PATCH /api/settings/agent with {"agent_kind":"acp"} while current kind is openhands.

Result:

HTTP/1.1 422 Unprocessable Content
{"detail":"PATCH cannot switch agent_kind (current='openhands', requested='acp'); use PUT to replace the whole agent settings object."}

Variant switching is correctly blocked on PATCH, with clear guidance to use PUT.


Test 10: PUT /api/settings/agent can switch agent_kind

Step 1 — Switch to ACP:
Ran PUT /api/settings/agent with {"agent_kind":"acp","acp_server":"claude-code",...}.

Result:

{"agent_kind": "acp", "acp_server": "claude-code"}

Step 2 — Switch back to OpenHands:
Ran PUT /api/settings/agent with {"agent_kind":"openhands",...}.

Result:

{"agent_kind": "openhands", "agent": "CodeActAgent"}

PUT successfully switches variants in both directions.


Test 11: PATCH /api/settings/conversation preserves untouched fields

Step 1 — Set initial state:
Used PUT /api/settings/conversation to set {"max_iterations":100,"confirmation_mode":true}.

Step 2 — Partially update:
Ran PATCH /api/settings/conversation with {"max_iterations":200}.

Result:

{"max_iterations": 200, "confirmation_mode": true}

Partial update changed only the specified field; confirmation_mode preserved.


Test 12: Conversation settings extra="forbid" validation

Verification:
Sent PATCH /api/settings/conversation with typo: {"max_iterationz":300}.

Result:

HTTP/1.1 422 Unprocessable Content
{"detail":[{"type":"extra_forbidden","loc":["body","max_iterationz"],"msg":"Extra inputs are not permitted"}]}

Typo rejected with detailed Pydantic validation error.


Test 13: If-Match: * wildcard succeeds

Verification:
Sent PATCH /api/settings/agent with If-Match: *.

Result:

HTTP/1.1 200 OK

Wildcard succeeds because settings resource always exists (defaults returned when no file on disk).


Test 14: PUT validation rejects invalid values

Verification:
Sent PUT /api/settings/conversation with {"max_iterations":0} (violates ge=1 constraint).

Result:

HTTP/1.1 422 Unprocessable Content
{"detail":[{"type":"greater_than_equal","loc":["body","max_iterations"],"msg":"Input should be greater than or equal to 1"}]}

Constraints enforced on full PUT.


Test 15: Write endpoints return ETag in response

Verification:
Checked response headers from PATCH and PUT operations.

Result:

PATCH response: etag: "a2933dbe8b07c7ea757b4ac7c38660ec"
PUT response: etag: "78157b2cf9e85cfc1cd3a0aa2cb177a8"

Clients receive new ETag immediately without follow-up GET.

Issues Found

None.

Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Acceptable - Solid implementation of optimistic concurrency with typed endpoints. One critical bug in dict/list merge semantics needs fixing, plus test coverage gaps.


[CRITICAL ISSUES]

🔴 Data Structure / Merge Semantics Bug: The apply logic violates documented behavior for collections.

🔴 Missing Critical Test Coverage: Tests don't verify key security and semantic behaviors.

[IMPROVEMENT OPPORTUNITIES]

🟠 Test Coverage Gaps: Several important scenarios untested.

🟠 Documentation Clarity: Some edge cases and design decisions lack explanation.


[RISK ASSESSMENT]

⚠️ Risk Assessment: 🟡 MEDIUM

This PR changes settings persistence semantics and adds optimistic concurrency. Key risk factors:

  • Data Loss Risk: Dict merge bug could cause settings corruption for acp_env and mcp_config fields
  • Breaking Change: New typed endpoints deprecate legacy PATCH (though backward compat maintained)
  • Security Surface: If-Match headers and ETag computation over plaintext secrets (secure, but novel)
  • Test Coverage: Critical behaviors like secret redaction and PUT replace semantics not fully tested

The implementation is well-architected (atomic If-Match checks, proper audit logging, RFC-compliant headers), but the merge bug and test gaps need addressing before merge.

Recommendation: Fix the critical dict merge bug and add the missing tests. The concurrency design is sound, but we need proof that it works correctly for all collection types.


VERDICT:

Needs rework: Fix the critical merge bug and add missing test coverage first.

KEY INSIGHT:

The ETag-over-plaintext-canonical approach elegantly solves Fernet nonce instability, but the partial update merge logic doesn't distinguish between "nested objects to merge" (llm, condenser) and "collections to replace" (acp_env, tools, mcp_config), creating a subtle data corruption vector.

# One-level merge: explicit keys in value override existing keys.
nested = dict(existing)
nested.update(value)
merged[key] = nested
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Critical - Data Structure Bug: This code merges ALL dicts (lines 278-282), but the module docstring (lines 21-22) explicitly states:

"Lists and dicts (tools, acp_env, mcp_config) replace the current value when present."

This means acp_env: dict[str, str] will be incorrectly merged instead of replaced.

Example failure:

  • Current: {"OLD_KEY": "old_val"}
  • Update: {"NEW_KEY": "new_val"}
  • Expected: {"NEW_KEY": "new_val"} (replace)
  • Actual: {"OLD_KEY": "old_val", "NEW_KEY": "new_val"} (merge)

This breaks the documented semantics and could leave stale environment variables that the user thought were removed.

Fix: Only merge nested update models (instances of _UpdateBase), not arbitrary dicts:

for key, value in overlay.items():
    existing = base.get(key)
    # Only merge nested update models (llm, condenser, verification)
    if isinstance(value, dict) and isinstance(existing, dict):
        # Check if the source field expects an update model (not a plain dict)
        source_field = type(update).model_fields.get(key)
        if source_field and hasattr(source_field.annotation, '__mro__') and _UpdateBase in source_field.annotation.__mro__:
            # This is a nested update model - merge it
            nested = dict(existing)
            nested.update(value)
            merged[key] = nested
        else:
            # Plain dict/list - replace it
            merged[key] = value
    else:
        merged[key] = value

Or simpler: check isinstance(update.model_fields[key].annotation, type) and issubclass(...) to identify update model fields.

)
assert response.status_code == 200

# Read back with plaintext exposure: api_key must still be 'sk-original'.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Critical - Missing Security Test: This test verifies api_key is preserved via X-Expose-Secrets: plaintext, but doesn't test the default redaction behavior.

Add before line 213:

# Verify default GET redacts the secret
redacted = client.get('/api/settings').json()
assert redacted['agent_settings']['llm']['api_key'] == '**********'

This ensures PATCH responses don't accidentally leak secrets (per settings_router.py:279 design).

new_etag = response.headers["ETag"]
assert new_etag.startswith('"')

# PUT does *not* honour the legacy "extra ignored" behaviour — it
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Critical - Missing Semantic Test: test_put_agent_full_replace doesn't verify PUT truly replaces (vs merges).

Add:

# Seed a field that's not in the PUT payload
client.patch('/api/settings/agent', json={'agent_kind': 'openhands', 'agent': 'CustomAgent'})

# PUT without that field
response = client.put(
    '/api/settings/agent',
    json={
        'agent_kind': 'openhands',
        'llm': {'model': 'gpt-4o', 'usage_id': 'u'},
    },
)
assert response.status_code == 200

# Verify 'agent' was reset to default (not preserved)
final = client.get('/api/settings').json()['agent_settings']
assert final['agent'] == 'CodeActAgent'  # default, not 'CustomAgent'

This is essential to prove PUT vs PATCH semantics.

assert agent_settings["llm"]["api_key"] == "sk-original"


def test_patch_agent_unknown_field_returns_422(client):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 Important - Missing Collection Replace Test: No test verifies lists/dicts are replaced (not merged) per update_models.py:21-22.

Add:

def test_patch_replaces_collections_not_merge(client):
    # Seed tools
    client.patch('/api/settings/agent', json={'agent_kind': 'openhands', 'tools': [Tool(name='terminal').model_dump()]})
    
    # PATCH with different tools - should REPLACE
    response = client.patch('/api/settings/agent', json={'agent_kind': 'openhands', 'tools': [Tool(name='browser').model_dump()]})
    
    tools = response.json()['agent_settings']['tools']
    assert len(tools) == 1
    assert tools[0]['name'] == 'browser'  # Not ['terminal', 'browser']

This directly tests the bug found in update_models.py:278.

)
assert response.status_code == 200

# Read back with plaintext exposure: api_key must still be 'sk-original'.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 Important - Missing Deep Merge Test: test_patch_agent_partial_preserves_api_key only tests one LLM field. Should verify:

# After line 213, add:
# Verify other LLM fields preserved
assert agent_settings['llm'].get('base_url') is None  # Preserved
assert agent_settings['llm'].get('timeout') is not None  # Preserved

# Test nested object preservation (condenser, verification)
client.patch('/api/settings/agent', json={'agent_kind': 'openhands', 'condenser': {'enabled': False}})
response = client.patch('/api/settings/agent', json={'agent_kind': 'openhands', 'llm': {'model': 'gpt-3.5'}})
assert response.json()['agent_settings']['condenser']['enabled'] is False  # Preserved across llm update

json={"max_iterations": 300},
headers={"If-Match": etag},
)
assert b.status_code == 412
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 Important - Missing Retry Happy Path: test_concurrent_clients_detected_with_if_match shows detection but not resolution.

Add after line 377:

# Client B retries with the new ETag from the 412 response
fresh_etag = b.headers['ETag']
retry = client.patch(
    '/api/settings/conversation',
    json={'max_iterations': 300},
    headers={'If-Match': fresh_etag},
)
assert retry.status_code == 200
assert client.get('/api/settings').json()['conversation_settings']['max_iterations'] == 300

This proves the 412 response carries a valid ETag for retry.

json={"agent_kind": "openhands", "llm": {"model": "gpt-4o"}},
)
etag2 = client.get("/api/settings").headers["ETag"]
assert etag1 != etag2
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 Important - Missing Restart Stability Test: ETag stability tests don't verify persistence across server restarts (the "not over encrypted bytes" claim).

Add:

def test_etag_stable_across_restarts(temp_persistence_dir, secret_key):
    # First app instance
    config1 = Config(static_files_path=None, session_api_keys=[], secret_key=SecretStr(secret_key))
    client1 = TestClient(create_app(config1))
    client1.patch('/api/settings/agent', json={'agent_kind': 'openhands', 'llm': {'model': 'gpt-4'}})
    etag1 = client1.get('/api/settings').headers['ETag']
    
    # Simulate restart - new app instance, same persistence dir
    reset_stores()
    config2 = Config(static_files_path=None, session_api_keys=[], secret_key=SecretStr(secret_key))
    client2 = TestClient(create_app(config2))
    etag2 = client2.get('/api/settings').headers['ETag']
    
    assert etag1 == etag2  # Proves ETag computed over plaintext, not Fernet ciphertext

fields: dict[str, tuple[Any, Any]] = {}
for fname, finfo in model_cls.model_fields.items():
ann = finfo.annotation
if ann is None:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Suggestion: The if ann is None: continue check skips fields with None annotation without explanation. When would a properly-defined Pydantic field have None as its annotation (not Optional[X])?

Either add a comment explaining the edge case this handles, or remove if defensive.

class ConversationSettingsUpdate(_UpdateBase):
"""Typed partial update for :class:`ConversationSettings`."""

max_iterations: int | None = Field(default=None, ge=1)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Suggestion: ConversationSettingsUpdate.max_iterations includes ge=1 constraint, but _make_partial() docstring (lines 88-89) says "Constraints ... are NOT carried over".

This inconsistency could confuse developers. Either:

  1. Remove the constraint (match the stated philosophy), or
  2. Update docs to clarify hand-written update models may include constraints for defense-in-depth.

canonical = settings.model_dump(
mode="json", context={"expose_secrets": "plaintext"}
)
blob = json.dumps(canonical, sort_keys=True, separators=(",", ":")).encode("utf-8")
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Suggestion: SHA256 truncated to 32 hex chars (128 bits) without explanation. While 128 bits provides strong collision resistance, consider adding a comment:

digest = hashlib.sha256(blob).hexdigest()[:32]  # 128 bits sufficient for collision resistance while keeping headers compact

@github-actions
Copy link
Copy Markdown
Contributor

Coverage

Coverage Report •
FileStmtsMissCoverMissing
openhands-agent-server/openhands/agent_server
   api.py2362191%97, 99–104, 106, 108, 110, 144, 156, 171, 177, 425, 428, 432–434, 436, 442
   settings_router.py1841094%272, 274–275, 526–527, 637, 639–640, 678, 683
openhands-sdk/openhands/sdk/settings
   __init__.py21290%181, 183
   update_models.py1003763%96, 170–173, 175–176, 196, 215, 226–230, 232–233, 260–263, 270–271, 273, 275–278, 280–282, 284, 288, 290, 304–307
TOTAL27133625776% 

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants