feat(agent-server): typed settings update endpoints + ETag/If-Match#3234
feat(agent-server): typed settings update endpoints + ETag/If-Match#3234rbren wants to merge 1 commit into
Conversation
Adds typed PATCH/PUT endpoints for agent and conversation settings as a discoverable, concurrency-safe alternative to the legacy untyped dict[str, Any] PATCH /api/settings: PATCH /api/settings/agent — typed partial update (extra='forbid') PUT /api/settings/agent — full replace (variant switching) PATCH /api/settings/conversation — typed partial update PUT /api/settings/conversation — full replace Other changes: * SDK exposes OpenHandsAgentSettingsUpdate, ACPAgentSettingsUpdate, ConversationSettingsUpdate, LLMUpdate etc. + apply_agent_update / apply_conversation_update helpers. * All settings endpoints now emit an ETag header (computed over the plaintext-canonical state so it survives per-save Fernet nonces) and honour If-Match for optimistic concurrency (412 on mismatch, with the current ETag echoed back). * The legacy PATCH /api/settings is preserved as-is for backwards compatibility and additionally honours If-Match plus emits Deprecation / Link headers pointing at the typed successors. * The shared HTTPException handler in api.py now propagates exc.headers so the 412 ETag actually reaches the client. See PR description for a fuller analysis of the issues this addresses. Co-authored-by: openhands <openhands@all-hands.dev>
Python API breakage checks — ✅ PASSEDResult: ✅ PASSED |
REST API breakage checks (OpenAPI) — ✅ PASSEDResult: ✅ PASSED |
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
All typed settings endpoints work correctly with ETag/If-Match support for optimistic concurrency control.
Does this PR achieve its stated goal?
Yes. This PR successfully adds ETag/If-Match support to prevent lost updates and provides new typed endpoints (PATCH/PUT /api/settings/agent and /api/settings/conversation) with strict extra="forbid" validation. The legacy endpoint now honors If-Match and signals deprecation via headers. All functional requirements verified through real HTTP requests to a running agent-server.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build succeeded, agent-server started on port 8000 |
| CI Status | |
| Functional Verification | ✅ All 15 endpoint behaviors verified via curl |
Functional Verification
Test 1: GET /api/settings returns ETag header
Verification:
Ran curl -i http://127.0.0.1:8000/api/settings:
HTTP/1.1 200 OK
etag: "95f24b306974722b34e0ae4614b1f9ea"
This confirms the GET endpoint now returns a quoted ETag header that clients can capture for subsequent If-Match requests.
Test 2: ETag stability across identical saves
Step 1 — Establish baseline:
Made two identical PATCH requests (setting llm.model to gpt-4o).
Result:
First ETag: "8179e34f3475c871d432bba8ee4aecbf"
Second ETag: "8179e34f3475c871d432bba8ee4aecbf"
✓ ETags match - stable
This proves ETags are computed over plaintext-canonical state (not encrypted bytes), so identical saves produce identical ETags — a critical improvement over byte-level hashing.
Test 3: ETag changes on real content change
Step 1 — Baseline ETag:
ETag before: "8179e34f3475c871d432bba8ee4aecbf"
Step 2 — Change model to gpt-4:
Ran PATCH /api/settings/agent with {"llm":{"model":"gpt-4"}}.
Result:
ETag after: "ecbafbb7009b615cf6375f2a056cbd66"
✓ ETags differ - change detected
ETags correctly change when content changes, enabling clients to detect concurrent modifications.
Test 4: If-Match validation - 412 on stale ETag
Step 1 — Capture current ETag:
Captured ETag: "ecbafbb7009b615cf6375f2a056cbd66"
Step 2 — Another client moves state forward:
Made a PATCH changing model to claude-sonnet-4-20250514.
Step 3 — Attempt update with stale ETag:
Ran PATCH /api/settings/agent with If-Match: "ecbafbb7009b615cf6375f2a056cbd66".
Result:
HTTP/1.1 412 Precondition Failed
etag: "02c53e400a17e6325149228fb822de82"
{"detail":"If-Match does not match the current settings ETag. Re-fetch /api/settings and retry with the new ETag."}
The 412 response correctly rejects stale updates and returns the current ETag so clients can retry. This is the core optimistic concurrency feature working.
Test 5: If-Match validation - success with current ETag
Verification:
Captured current ETag and immediately made a PATCH with that ETag in If-Match header.
Result:
HTTP/1.1 200 OK
etag: "8179e34f3475c871d432bba8ee4aecbf"
Update succeeded when If-Match matches current state.
Test 6: Legacy PATCH /api/settings honors If-Match and signals deprecation
Verification:
Ran PATCH /api/settings (legacy endpoint) with valid If-Match header.
Result:
HTTP/1.1 200 OK
etag: "ecbafbb7009b615cf6375f2a056cbd66"
deprecation: true
link: </api/settings/agent>; rel="successor-version", </api/settings/conversation>; rel="successor-version"
The legacy endpoint:
- Honors If-Match for optimistic concurrency
- Returns
Deprecation: trueheader (RFC 8594) - Returns
Linkheaders pointing to typed successors
Backward compatibility maintained while signaling migration path.
Test 7: PATCH /api/settings/agent preserves untouched fields (api_key)
Step 1 — Set API key:
Used legacy PATCH to set llm.api_key to sk-test-original-key.
Step 2 — Change only model via typed PATCH:
Ran PATCH /api/settings/agent with {"llm":{"model":"gpt-4o"}}.
Step 3 — Verify API key preserved:
Fetched settings with X-Expose-Secrets: plaintext header.
Result:
api_key: sk-test-original-key
model: gpt-4o
This is the key practical benefit — clients can update model names without round-tripping secrets.
Test 8: extra="forbid" validation rejects unknown fields
Verification:
Sent PATCH /api/settings/agent with typo: {"lllm":{"model":"gpt-4o"}}.
Result:
HTTP/1.1 422 Unprocessable Content
{"detail":"Agent settings update validation failed"}
Unknown fields fail loudly instead of being silently ignored (major improvement over legacy deep-merge).
Test 9: Variant invariance - PATCH cannot switch agent_kind
Verification:
Attempted PATCH /api/settings/agent with {"agent_kind":"acp"} while current kind is openhands.
Result:
HTTP/1.1 422 Unprocessable Content
{"detail":"PATCH cannot switch agent_kind (current='openhands', requested='acp'); use PUT to replace the whole agent settings object."}
Variant switching is correctly blocked on PATCH, with clear guidance to use PUT.
Test 10: PUT /api/settings/agent can switch agent_kind
Step 1 — Switch to ACP:
Ran PUT /api/settings/agent with {"agent_kind":"acp","acp_server":"claude-code",...}.
Result:
{"agent_kind": "acp", "acp_server": "claude-code"}Step 2 — Switch back to OpenHands:
Ran PUT /api/settings/agent with {"agent_kind":"openhands",...}.
Result:
{"agent_kind": "openhands", "agent": "CodeActAgent"}PUT successfully switches variants in both directions.
Test 11: PATCH /api/settings/conversation preserves untouched fields
Step 1 — Set initial state:
Used PUT /api/settings/conversation to set {"max_iterations":100,"confirmation_mode":true}.
Step 2 — Partially update:
Ran PATCH /api/settings/conversation with {"max_iterations":200}.
Result:
{"max_iterations": 200, "confirmation_mode": true}Partial update changed only the specified field; confirmation_mode preserved.
Test 12: Conversation settings extra="forbid" validation
Verification:
Sent PATCH /api/settings/conversation with typo: {"max_iterationz":300}.
Result:
HTTP/1.1 422 Unprocessable Content
{"detail":[{"type":"extra_forbidden","loc":["body","max_iterationz"],"msg":"Extra inputs are not permitted"}]}
Typo rejected with detailed Pydantic validation error.
Test 13: If-Match: * wildcard succeeds
Verification:
Sent PATCH /api/settings/agent with If-Match: *.
Result:
HTTP/1.1 200 OK
Wildcard succeeds because settings resource always exists (defaults returned when no file on disk).
Test 14: PUT validation rejects invalid values
Verification:
Sent PUT /api/settings/conversation with {"max_iterations":0} (violates ge=1 constraint).
Result:
HTTP/1.1 422 Unprocessable Content
{"detail":[{"type":"greater_than_equal","loc":["body","max_iterations"],"msg":"Input should be greater than or equal to 1"}]}
Constraints enforced on full PUT.
Test 15: Write endpoints return ETag in response
Verification:
Checked response headers from PATCH and PUT operations.
Result:
PATCH response: etag: "a2933dbe8b07c7ea757b4ac7c38660ec"
PUT response: etag: "78157b2cf9e85cfc1cd3a0aa2cb177a8"
Clients receive new ETag immediately without follow-up GET.
Issues Found
None.
all-hands-bot
left a comment
There was a problem hiding this comment.
🟡 Acceptable - Solid implementation of optimistic concurrency with typed endpoints. One critical bug in dict/list merge semantics needs fixing, plus test coverage gaps.
[CRITICAL ISSUES]
🔴 Data Structure / Merge Semantics Bug: The apply logic violates documented behavior for collections.
🔴 Missing Critical Test Coverage: Tests don't verify key security and semantic behaviors.
[IMPROVEMENT OPPORTUNITIES]
🟠 Test Coverage Gaps: Several important scenarios untested.
🟠 Documentation Clarity: Some edge cases and design decisions lack explanation.
[RISK ASSESSMENT]
This PR changes settings persistence semantics and adds optimistic concurrency. Key risk factors:
- Data Loss Risk: Dict merge bug could cause settings corruption for
acp_envandmcp_configfields - Breaking Change: New typed endpoints deprecate legacy PATCH (though backward compat maintained)
- Security Surface: If-Match headers and ETag computation over plaintext secrets (secure, but novel)
- Test Coverage: Critical behaviors like secret redaction and PUT replace semantics not fully tested
The implementation is well-architected (atomic If-Match checks, proper audit logging, RFC-compliant headers), but the merge bug and test gaps need addressing before merge.
Recommendation: Fix the critical dict merge bug and add the missing tests. The concurrency design is sound, but we need proof that it works correctly for all collection types.
VERDICT:
❌ Needs rework: Fix the critical merge bug and add missing test coverage first.
KEY INSIGHT:
The ETag-over-plaintext-canonical approach elegantly solves Fernet nonce instability, but the partial update merge logic doesn't distinguish between "nested objects to merge" (llm, condenser) and "collections to replace" (acp_env, tools, mcp_config), creating a subtle data corruption vector.
| # One-level merge: explicit keys in value override existing keys. | ||
| nested = dict(existing) | ||
| nested.update(value) | ||
| merged[key] = nested |
There was a problem hiding this comment.
🔴 Critical - Data Structure Bug: This code merges ALL dicts (lines 278-282), but the module docstring (lines 21-22) explicitly states:
"Lists and dicts (tools, acp_env, mcp_config) replace the current value when present."
This means acp_env: dict[str, str] will be incorrectly merged instead of replaced.
Example failure:
- Current:
{"OLD_KEY": "old_val"} - Update:
{"NEW_KEY": "new_val"} - Expected:
{"NEW_KEY": "new_val"}(replace) - Actual:
{"OLD_KEY": "old_val", "NEW_KEY": "new_val"}(merge)
This breaks the documented semantics and could leave stale environment variables that the user thought were removed.
Fix: Only merge nested update models (instances of _UpdateBase), not arbitrary dicts:
for key, value in overlay.items():
existing = base.get(key)
# Only merge nested update models (llm, condenser, verification)
if isinstance(value, dict) and isinstance(existing, dict):
# Check if the source field expects an update model (not a plain dict)
source_field = type(update).model_fields.get(key)
if source_field and hasattr(source_field.annotation, '__mro__') and _UpdateBase in source_field.annotation.__mro__:
# This is a nested update model - merge it
nested = dict(existing)
nested.update(value)
merged[key] = nested
else:
# Plain dict/list - replace it
merged[key] = value
else:
merged[key] = valueOr simpler: check isinstance(update.model_fields[key].annotation, type) and issubclass(...) to identify update model fields.
| ) | ||
| assert response.status_code == 200 | ||
|
|
||
| # Read back with plaintext exposure: api_key must still be 'sk-original'. |
There was a problem hiding this comment.
🔴 Critical - Missing Security Test: This test verifies api_key is preserved via X-Expose-Secrets: plaintext, but doesn't test the default redaction behavior.
Add before line 213:
# Verify default GET redacts the secret
redacted = client.get('/api/settings').json()
assert redacted['agent_settings']['llm']['api_key'] == '**********'This ensures PATCH responses don't accidentally leak secrets (per settings_router.py:279 design).
| new_etag = response.headers["ETag"] | ||
| assert new_etag.startswith('"') | ||
|
|
||
| # PUT does *not* honour the legacy "extra ignored" behaviour — it |
There was a problem hiding this comment.
🔴 Critical - Missing Semantic Test: test_put_agent_full_replace doesn't verify PUT truly replaces (vs merges).
Add:
# Seed a field that's not in the PUT payload
client.patch('/api/settings/agent', json={'agent_kind': 'openhands', 'agent': 'CustomAgent'})
# PUT without that field
response = client.put(
'/api/settings/agent',
json={
'agent_kind': 'openhands',
'llm': {'model': 'gpt-4o', 'usage_id': 'u'},
},
)
assert response.status_code == 200
# Verify 'agent' was reset to default (not preserved)
final = client.get('/api/settings').json()['agent_settings']
assert final['agent'] == 'CodeActAgent' # default, not 'CustomAgent'This is essential to prove PUT vs PATCH semantics.
| assert agent_settings["llm"]["api_key"] == "sk-original" | ||
|
|
||
|
|
||
| def test_patch_agent_unknown_field_returns_422(client): |
There was a problem hiding this comment.
🟠 Important - Missing Collection Replace Test: No test verifies lists/dicts are replaced (not merged) per update_models.py:21-22.
Add:
def test_patch_replaces_collections_not_merge(client):
# Seed tools
client.patch('/api/settings/agent', json={'agent_kind': 'openhands', 'tools': [Tool(name='terminal').model_dump()]})
# PATCH with different tools - should REPLACE
response = client.patch('/api/settings/agent', json={'agent_kind': 'openhands', 'tools': [Tool(name='browser').model_dump()]})
tools = response.json()['agent_settings']['tools']
assert len(tools) == 1
assert tools[0]['name'] == 'browser' # Not ['terminal', 'browser']This directly tests the bug found in update_models.py:278.
| ) | ||
| assert response.status_code == 200 | ||
|
|
||
| # Read back with plaintext exposure: api_key must still be 'sk-original'. |
There was a problem hiding this comment.
🟠 Important - Missing Deep Merge Test: test_patch_agent_partial_preserves_api_key only tests one LLM field. Should verify:
# After line 213, add:
# Verify other LLM fields preserved
assert agent_settings['llm'].get('base_url') is None # Preserved
assert agent_settings['llm'].get('timeout') is not None # Preserved
# Test nested object preservation (condenser, verification)
client.patch('/api/settings/agent', json={'agent_kind': 'openhands', 'condenser': {'enabled': False}})
response = client.patch('/api/settings/agent', json={'agent_kind': 'openhands', 'llm': {'model': 'gpt-3.5'}})
assert response.json()['agent_settings']['condenser']['enabled'] is False # Preserved across llm update| json={"max_iterations": 300}, | ||
| headers={"If-Match": etag}, | ||
| ) | ||
| assert b.status_code == 412 |
There was a problem hiding this comment.
🟠 Important - Missing Retry Happy Path: test_concurrent_clients_detected_with_if_match shows detection but not resolution.
Add after line 377:
# Client B retries with the new ETag from the 412 response
fresh_etag = b.headers['ETag']
retry = client.patch(
'/api/settings/conversation',
json={'max_iterations': 300},
headers={'If-Match': fresh_etag},
)
assert retry.status_code == 200
assert client.get('/api/settings').json()['conversation_settings']['max_iterations'] == 300This proves the 412 response carries a valid ETag for retry.
| json={"agent_kind": "openhands", "llm": {"model": "gpt-4o"}}, | ||
| ) | ||
| etag2 = client.get("/api/settings").headers["ETag"] | ||
| assert etag1 != etag2 |
There was a problem hiding this comment.
🟠 Important - Missing Restart Stability Test: ETag stability tests don't verify persistence across server restarts (the "not over encrypted bytes" claim).
Add:
def test_etag_stable_across_restarts(temp_persistence_dir, secret_key):
# First app instance
config1 = Config(static_files_path=None, session_api_keys=[], secret_key=SecretStr(secret_key))
client1 = TestClient(create_app(config1))
client1.patch('/api/settings/agent', json={'agent_kind': 'openhands', 'llm': {'model': 'gpt-4'}})
etag1 = client1.get('/api/settings').headers['ETag']
# Simulate restart - new app instance, same persistence dir
reset_stores()
config2 = Config(static_files_path=None, session_api_keys=[], secret_key=SecretStr(secret_key))
client2 = TestClient(create_app(config2))
etag2 = client2.get('/api/settings').headers['ETag']
assert etag1 == etag2 # Proves ETag computed over plaintext, not Fernet ciphertext| fields: dict[str, tuple[Any, Any]] = {} | ||
| for fname, finfo in model_cls.model_fields.items(): | ||
| ann = finfo.annotation | ||
| if ann is None: |
There was a problem hiding this comment.
🟡 Suggestion: The if ann is None: continue check skips fields with None annotation without explanation. When would a properly-defined Pydantic field have None as its annotation (not Optional[X])?
Either add a comment explaining the edge case this handles, or remove if defensive.
| class ConversationSettingsUpdate(_UpdateBase): | ||
| """Typed partial update for :class:`ConversationSettings`.""" | ||
|
|
||
| max_iterations: int | None = Field(default=None, ge=1) |
There was a problem hiding this comment.
🟡 Suggestion: ConversationSettingsUpdate.max_iterations includes ge=1 constraint, but _make_partial() docstring (lines 88-89) says "Constraints ... are NOT carried over".
This inconsistency could confuse developers. Either:
- Remove the constraint (match the stated philosophy), or
- Update docs to clarify hand-written update models may include constraints for defense-in-depth.
| canonical = settings.model_dump( | ||
| mode="json", context={"expose_secrets": "plaintext"} | ||
| ) | ||
| blob = json.dumps(canonical, sort_keys=True, separators=(",", ":")).encode("utf-8") |
There was a problem hiding this comment.
🟡 Suggestion: SHA256 truncated to 32 hex chars (128 bits) without explanation. While 128 bits provides strong collision resistance, consider adding a comment:
digest = hashlib.sha256(blob).hexdigest()[:32] # 128 bits sufficient for collision resistance while keeping headers compact
Coverage Report •
|
||||||||||||||||||||||||||||||||||||||||
This PR was opened by an AI agent (OpenHands) on behalf of the user.
What's wrong with
PATCH /api/settingstodayThe current endpoint accepts:
and applies the diffs via
_deep_mergeonto the persisted settings. A few of the consequences:dict[str, Any]. Generated clients are useless; typos likelllmare silently merged and forgotten; field renames inOpenHandsAgentSettingsaren't even breaking changes at the wire level.SettingsUpdatePayloadaccepts anactive_profilekey that the publicSettingsUpdateRequestdoesn't expose. The handlercasts past this mismatch.mcp.headers["Authorization"]you must overwrite the parent dict, which has all the read-modify-write hazards below.agent_kind: openhands → acpends up with leftover OpenHands-only fields silently dropped by re-validation, with no warning.GET → modify → PATCHcheerfully clobber each other; the server has no ETag/If-Match/version to detect it. Most concerning case: list/dict fields (tools,mcp_config.mcpServers) which replace wholesale, so concurrent edits to different MCP servers silently undo one another.GETwhose result may already include another writer's interleaved changes.max_iterations: 0) collapse to"Settings validation failed". The client cannot point at the bad field.Idempotency is accidentally preserved for the common single-client case, but breaks under concurrent writers, encrypted-secret round-trips, and migration scenarios.
What this PR does
1. Typed update models in the SDK (
openhands.sdk.settings.update_models)With
apply_agent_update(current, update)andapply_conversation_update(current, update)helpers that:model_fields_set).llm,condenser,verification) as one-level merges into the current sub-object, so e.g.{"llm": {"model": "gpt-4o"}}preserves the existingapi_key.agent_kind(usePUTto switch variants).2. New endpoints in
settings_router.pyAll four:
extra="forbid"(typos → 422).If-Matchagainst the global settings ETag.ETagheader in the response.store.update()so writes remain atomic and serialise with each other and with the legacy endpoint.3. ETag / If-Match plumbing (
_settings_etag.py)compute_settings_etag()hashes a plaintext-canonical projection ofPersistedSettings. Crucially not the on-disk bytes — Fernet nonces would make identical state look like a change on every save, defeating idempotency.check_if_match()returns412 Precondition Failedon staleIf-Match, with the current ETag echoed in the response so the client can retry against the live state.If-Match: *is accepted (resource always exists, since settings default)._run_settings_updatehelper performs the If-Match check inside the file lock, so the ETag and the state the apply will run against are the same.4. Bonus fix in
api.pyThe shared
HTTPExceptionhandler in_add_exception_handlerswas constructing theJSONResponsefromstatus_code+detailbut droppingexc.headers, so any handler-set response header on an error (e.g. our 412 ETag, or aWWW-Authenticate: ...on 401) was lost. Propagate it.5. Legacy endpoint preserved + upgraded
PATCH /api/settingsis unchanged in semantics and still accepts the olddict[str, Any]diff payload. In addition it now:If-Matchand emits the newETagheader in success responses.Deprecation: trueandLink: </api/settings/agent>; rel="successor-version", </api/settings/conversation>; rel="successor-version"(RFC 8594 / RFC 9745) so forward-compatible clients can migrate at their leisure.No existing tests were modified; the legacy suite (28 tests) still passes unchanged.
Tests
tests/agent_server/test_typed_settings_router.py(21 tests) covers:If-Matchmatching, stale → 412, wildcard.PATCH /api/settingshonouringIf-Matchand emitting the deprecation headers.extra="forbid", variant invariance,agent_kind: "llm"alias, partial LLM preservingapi_key.openhands→acp), 422 on invalid body.If-Match, detected withIf-Match.Legacy: 28/28
tests/agent_server/test_settings_router.pystill pass.SDK: 87/87
tests/sdk/test_settings.py+tests/sdk/settings/still pass.(The 15 unrelated failures in
tests/agent_server/test_skills_router.pyandtest_terminal_service.pyexist onmainbefore this PR.)Compatibility
SettingsUpdateRequestis unchanged.ETag) onGET /api/settings,PATCH /api/settings, and the new endpoints. Old clients that ignore unknown headers see no change.Deprecation/Linkon the legacyPATCH /api/settingsresponse are advisory; old clients ignore them.Follow-ups (not in this PR)
GET /api/settings/mcp-servers,GET/PUT/PATCH/DELETE /api/settings/mcp-servers/{name}). Today even the new typed PATCH still treatsmcp_configas one blob: editing one server'senvrequires round-tripping every other server's encrypted secrets. The CRUD shape would also give per-server ETags so concurrent edits to different servers stop conflicting at all. Mirroring the existing/api/settings/secretsrouter design.If-Match: a config flag to requireIf-Matchon all settings writes, for deployments that want concurrency safety to be mandatory rather than opt-in.loc+ a sanitisedmsgfor fields known not to carry secrets.Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
eclipse-temurin:17-jdknikolaik/python-nodejs:python3.13-nodejs22-slimgolang:1.21-bookwormPull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64 docker pull ghcr.io/openhands/agent-server:7d67356-pythonRun
All tags pushed for this build
About Multi-Architecture Support
7d67356-python) is a multi-arch manifest supporting both amd64 and arm647d67356-python-amd64) are also available if needed