@@ -25,19 +25,53 @@ downstream calls behave on the wire.
2525
2626---
2727
28+ ## Architecture at a glance
29+
30+ ```mermaid
31+ flowchart LR
32+ Client[Client<br/>curl / browser] --> Ctrl[Customer360Controller]
33+ Ctrl --> Agg[Customer360AggregatorService]
34+
35+ Agg -->|sync| SapPartner[SAP OData<br/>A_BusinessPartner]
36+ Agg -->|async| SapAddr[SAP OData<br/>to_BusinessPartnerAddress]
37+ Agg -->|async| SapRole[SAP OData<br/>to_BusinessPartnerRole]
38+ Agg -->|async| PgTags[(Postgres<br/>customer_tag)]
39+ Agg -->|async| PgNotes[(Postgres<br/>customer_note)]
40+
41+ SapPartner -.->|HTTP/1.1 + TLS<br/>keep-alive| SAPAPI[SAP Sandbox]
42+ SapAddr -.-> SAPAPI
43+ SapRole -.-> SAPAPI
44+ PgTags -.->|JDBC + TLS| PG[(Postgres 16)]
45+ PgNotes -.-> PG
46+ ```
47+
48+ A few things to notice about this shape:
49+
50+ - The synchronous SAP `A_BusinessPartner` fetch runs first and acts as the
51+ existence check — if it fails the whole request short-circuits.
52+ - The four remaining calls (two SAP nav collections + two Postgres
53+ queries) run **in parallel** via `CompletableFuture.allOf` dispatched on
54+ the dedicated `sapCallExecutor` thread pool.
55+ - All three SAP calls hit the same host (same SNI) and share a
56+ connection-pooled Apache `HttpComponents5` client with keep-alive; the
57+ two Postgres calls share a HikariCP pool.
58+ - The five results are merged into a single JSON envelope and returned to
59+ the caller in one trip — one inbound request, five concurrent backend
60+ conversations, one response.
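
The flow in the bullets above (one synchronous existence check, then a four-way parallel fan-out joined with `CompletableFuture.allOf`, then a merge) can be sketched roughly as follows. This is a minimal illustration, not the repository's actual `Customer360AggregatorService`: the `Backends` interface, its method names, the envelope keys, and the stub data are all hypothetical, and only `sapCallExecutor` and `CompletableFuture.allOf` are taken from the description.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class FanOutSketch {

    // Hypothetical stand-ins for the five backend calls; not the repo's real types.
    interface Backends {
        Optional<String> fetchPartner(String id);   // synchronous SAP A_BusinessPartner lookup
        List<String> fetchAddresses(String id);     // SAP to_BusinessPartnerAddress
        List<String> fetchRoles(String id);         // SAP to_BusinessPartnerRole
        List<String> fetchTags(String id);          // Postgres customer_tag
        List<String> fetchNotes(String id);         // Postgres customer_note
    }

    static Optional<Map<String, Object>> aggregate(
            String id, Backends backends, ExecutorService sapCallExecutor) {
        // Step 1: the existence check runs first; a miss short-circuits the request.
        Optional<String> partner = backends.fetchPartner(id);
        if (partner.isEmpty()) {
            return Optional.empty();
        }

        // Step 2: dispatch the four remaining calls in parallel on the dedicated pool.
        CompletableFuture<List<String>> addresses =
                CompletableFuture.supplyAsync(() -> backends.fetchAddresses(id), sapCallExecutor);
        CompletableFuture<List<String>> roles =
                CompletableFuture.supplyAsync(() -> backends.fetchRoles(id), sapCallExecutor);
        CompletableFuture<List<String>> tags =
                CompletableFuture.supplyAsync(() -> backends.fetchTags(id), sapCallExecutor);
        CompletableFuture<List<String>> notes =
                CompletableFuture.supplyAsync(() -> backends.fetchNotes(id), sapCallExecutor);

        // Step 3: wait for all four, then merge the five results into one envelope.
        CompletableFuture.allOf(addresses, roles, tags, notes).join();

        Map<String, Object> envelope = new LinkedHashMap<>();
        envelope.put("partner", partner.get());
        envelope.put("addresses", addresses.join());
        envelope.put("roles", roles.join());
        envelope.put("tags", tags.join());
        envelope.put("notes", notes.join());
        return Optional.of(envelope);
    }

    // Runs the aggregation against in-memory stubs so the sketch needs no network.
    static Optional<Map<String, Object>> demo(String id) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            Backends stub = new Backends() {
                public Optional<String> fetchPartner(String i) {
                    return "missing".equals(i) ? Optional.empty() : Optional.of("partner-" + i);
                }
                public List<String> fetchAddresses(String i) { return List.of("address-1"); }
                public List<String> fetchRoles(String i) { return List.of("role-1"); }
                public List<String> fetchTags(String i) { return List.of("tag-1"); }
                public List<String> fetchNotes(String i) { return List.of("note-1"); }
            };
            return aggregate(id, stub, pool);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo("11"));
        System.out.println(demo("missing"));
    }
}
```

A real service would likely add per-call timeouts and error handling around `join()`; the sketch leaves those out to keep the control flow visible.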
61+
62+ ---
63+
2864## Why this shape is interesting for Keploy
2965
3066The service is deliberately structured to exercise the trickiest parts of
3167Keploy's interception layer in a single flow.
3268
3369- **Parallel outbound TLS** — every `/360` request opens 3 concurrent HTTPS
3470 connections to the SAP sandbox plus 2 concurrent TLS-enabled Postgres
35- queries. This shape reliably surfaces parser-level concurrency bugs.
71+ queries, giving Keploy a dense concurrency pattern to capture and replay.
3672- **Chunked HTTP/1.1 + keep-alive reuse** — SAP's sandbox returns chunked
37- responses over a reused keep-alive connection. This is the path that
38- exposed a 60-second idle-timeout stall inside Keploy (a single `/360`
39- request went from ~50 s down to ~586 ms after the fix). See
40- [keploy/keploy#4110](https://github.com/keploy/keploy/pull/4110).
73+ responses over a reused keep-alive connection, so the recorded mocks
74+ preserve the same wire shape your service sees in production.
4175- **Schema diversity in a single repo** — GET / POST / DELETE verbs, JSON
4276 request bodies, a custom `X-Correlation-Id` header, actuator health
4377 probes, both chunked and Content-Length responses, and the OpenAPI
@@ -46,6 +80,19 @@ Keploy's interception layer in a single flow.
4680 connection pool, which exercises the v3 Postgres parser's
4781 prepared-statement cache handling and pool-reuse semantics.
4882
83+ ### Why Keploy?
84+
85+ - Captures live production-shape traffic, including the concurrent SAP
86+ fan-out, without hand-written mocks.
87+ - Replays the exact same multi-TLS concurrency pattern inside CI, so
88+ regressions in the real HTTP/Postgres stack are caught before release.
89+ - Auto-detects non-deterministic fields (timestamps, correlation IDs) and
90+ marks them as noise.
91+ - In-cluster mode spins up an ephemeral replica and runs the test set
92+ automatically on every new pod version — no manual test writing.
93+ - No code changes to the Spring Boot app — Keploy sits in the network
94+ path via eBPF.
95+
4996---
5097
5198## Requirements
@@ -199,13 +246,10 @@ Classic Spring Boot layering, with one custom wrinkle for the fan-out:
199246- **`502 SAP upstream error` on `/360`.** Check `SAP_API_KEY`; the SAP
200247 sandbox also rate-limits at roughly 120 requests/minute. The built-in
201248 Resilience4j circuit breaker will open if you punch through that.
202- - **Recording stalls / `/360` takes ~60 s.** You're probably on Keploy
203- < v3.3, which had an HTTP chunked-terminator bug on keep-alive reuse.
204- Upgrade to v3.3.x or newer (fixed in
205- [keploy/keploy#4110](https://github.com/keploy/keploy/pull/4110)).
206- - **Tests fail only on `X-Correlation-Id`.** Make sure the header is in
207- `test.globalNoise.global` in `keploy.yml`; it's generated per request
208- and can never match otherwise.
249+ - **Tests drift on `X-Correlation-Id`.** Configure `X-Correlation-Id`
250+ as noise in `keploy.yml` under `globalNoise.header.X-Correlation-Id`.
251+ Keploy matches header names case-insensitively, so you can use any
252+ casing.
209253- **`ImagePullBackOff` / `ErrImageNeverPull` in kind.** You forgot to
210254 `kind load docker-image customer360:local` — run `./deploy_kind.sh build`.
211255- **Liveness probe flaps at startup.** The 40 s `startupProbe` grace is