Skip to content

fix(api): lenient response decoder (round-3 P0)#22

Merged
caballeto merged 2 commits into
mainfrom
fix/lenient-response-decoder
May 6, 2026
Merged

fix(api): lenient response decoder (round-3 P0)#22
caballeto merged 2 commits into
mainfrom
fix/lenient-response-decoder

Conversation

@caballeto
Copy link
Copy Markdown
Member

Summary

Round-3 DevEx caught a production-blocking bug: the API added `currentStatus` to the monitor response and the provider's strict decoder rejected every `terraform apply` with `decoding response (POST /api/v1/monitors): json: unknown field "currentStatus"`. Worse, the monitor was created server-side before the decoder rejected the response, leaving orphans with no Terraform state record.

This PR switches the response decoder from strict (`DisallowUnknownFields`) to lenient with drift logging, eliminating the entire class of "new API field bricks every old provider release" failure mode.

What changed

  • `decodeStrict` → `decodeResponse` (lenient `json.Unmarshal`).
  • Drift signal preserved: `logUnknownTopLevelKeys` walks the body's top-level keys after decode and emits `tflog.Warn` for any key not on the target struct. Operators with `TF_LOG=warn` see the same signal strict decoding used to surface as a hard error.
  • Helper unwraps `SingleValueResponse[T]` and `TableResponse[T]` envelopes so warnings name user-facing DTO fields, not the wire wrapper.
  • Orphan-leak symptom goes away as a side-effect: decode now succeeds, state is written normally.

Tests

  • `TestCreate_LenientDecode_TolerantOfNewResponseFields` — reproducer of the round-3 P0; now green.
  • `TestCreate_LenientDecode_StillRejectsMalformedJSON` — guards the contract against truly broken bodies.
  • `TestJsonFieldNames_EnvelopeUnwrap` — drift helper unwraps envelopes.
  • `TestLogUnknownTopLevelKeys_PicksUpDrift` — sanity check.

Full suite green locally: `go test ./...` → 4 packages pass.

Test plan

  • Unit tests pass
  • CI green
  • Round-3 reverify: fresh user runs `terraform init/plan/apply/destroy` against `api.devhelm.io` with no friction (will run after release)

Made with Cursor

caballeto and others added 2 commits May 6, 2026 13:17
mono#369 adds MonitorStatus enum (UP/DOWN/DEGRADED/PAUSED/UNKNOWN) and
IncidentFilterParamsSeverity (DOWN/DEGRADED), which causes oapi-codegen
to namespace-qualify previously-unique enum constants:

- generated.Email          → generated.EmailChannelConfigChannelTypeEmail
- generated.Down           → generated.TriggerRuleSeverityDown
- generated.Degraded       → generated.TriggerRuleSeverityDegraded

It also makes managedBy nullable on Create/Update DTOs (now *string
pointer), and adds managedBy to Create/Update Status Page, Resource
Group, and Alert Channel DTOs.

Changes:
- Refresh spec + regenerate types
- Update enum constant references to qualified names
- Wrap CreateMonitorRequest.ManagedBy in pointer (and similar for
  status page, resource group, alert channel) — provider continues to
  hardcode TERRAFORM for attribution
- Allow-list new managed_by fields in schema-vs-DTO audit
- Update monitor unit test for pointer ManagedBy

Co-authored-by: Cursor <cursoragent@cursor.com>
The strict decoder (`json.Decoder.DisallowUnknownFields`) made every
released provider version one accidental API field away from a hard
break. Round-3 DevEx caught this in production: the API added
`currentStatus` to the monitor response and every `terraform apply`
against prod failed with `decoding response (POST /api/v1/monitors):
json: unknown field "currentStatus"`. Worse, the monitor was created
server-side before the decoder rejected the response, leaving an
orphan with no Terraform state record.

Switch to a lenient decoder that silently ignores unknown fields and
logs them at WARN via tflog, so:

- Additive API changes never brick old provider versions again.
- Operators running with TF_LOG=warn still see the drift signal that
  strict decoding used to surface as a hard error.
- The orphan-leak symptom goes away as a side-effect: the original
  failure path (POST succeeds → decode fails → state is not written)
  no longer triggers, because decode now succeeds.

Drift visibility is preserved by walking the body's top-level keys
after decode and emitting `tflog.Warn` for any key that isn't on the
target struct's json tags. The helper unwraps `SingleValueResponse[T]`
and `TableResponse[T]` envelopes so warnings name the user-facing
DTO fields rather than the wire wrapper.

Tests:
- TestCreate_LenientDecode_TolerantOfNewResponseFields — the round-3
  reproducer, now green.
- TestCreate_LenientDecode_StillRejectsMalformedJSON — keeps the
  contract honest for actually-broken responses.
- TestJsonFieldNames_EnvelopeUnwrap — drift-warn helper unwraps the
  envelope correctly.
- TestLogUnknownTopLevelKeys_PicksUpDrift — sanity-checks the helper
  on the failure mode that bit us.

Co-authored-by: Cursor <cursoragent@cursor.com>
@caballeto caballeto merged commit 73de2be into main May 6, 2026
6 checks passed
@caballeto caballeto deleted the fix/lenient-response-decoder branch May 6, 2026 17:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant