Skip to content

Ai gateway#2937

Open
predic8 wants to merge 44 commits into
masterfrom
ai-gateway
Open

Ai gateway#2937
predic8 wants to merge 44 commits into
masterfrom
ai-gateway

Conversation

@predic8
Copy link
Copy Markdown
Member

@predic8 predic8 commented May 7, 2026

Summary by CodeRabbit

  • New Features
    • Added LLM Gateway interceptor supporting Claude, Google Gemini, and OpenAI providers.
    • Implemented token-based rate limiting with per-user quotas and configurable reset periods.
    • Enabled API key sharing with per-user token budgets and model allowlisting.
    • Added input/output token limit enforcement and usage tracking.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 7, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds an LLM gateway interceptor with provider abstractions (OpenAI, Claude, Google), SSE parsing, JSON helpers, token usage tracking and stores, error creators, tutorials, and integration/unit tests.

Changes

LLM Gateway Integration

Layer / File(s) Summary
Core contracts and utilities
core/.../provider/*(LLM*), core/.../json/JsonUtil.java, core/.../util/http/SSEParser.java
Introduces LLMRequest/Response/Provider, error creator interface/base, JSON helpers, and SSE streaming parser with tests.
Provider adapters
core/.../provider/openai/*, core/.../provider/claude/*, core/.../provider/google/*
Adds provider-specific request parsing, token estimates, SSE response handling, and error response builders.
Gateway interceptor orchestration
core/.../ai/LLMGatewayInterceptor.java
Validates/authenticates, enforces token/model limits, rewrites API keys, forwards JSON, processes usage, and persists it.
Usage tracking and stores
core/.../ai/store/*, core/.../util/jdbc/AbstractJdbcSupport.java
Defines Usage/AiApiUser, in-memory and JDBC stores, and JDBC connection helper.
Events and parsing
core/.../ai/*Event.java
Parses provider SSE events into structured event types.
Tutorials and tests
distribution/tutorials/ai/llm-gateway/**, distribution/...TutorialTest.java
Adds tutorials (OpenAI/Claude/Google) with scripts and comprehensive integration tests.
Minor infrastructure
annot/.../ObjectBinder.java, core/.../security/AbstractSecurityScheme.java, MCP/jsonrpc headers
Instantiation helper and import/header cleanups; no behavior change in security/MCP/jsonrpc parts.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant Gateway as LLMGatewayInterceptor
  participant Provider as LLMProvider
  participant Upstream as Upstream LLM API
  participant Store as AiApiStore

  Client->>Gateway: HTTP request (JSON, API key)
  Gateway->>Provider: getLLMRequest(exchange)
  Gateway->>Store: getUser(token)? / checkLimit(...)
  alt Auth/limits fail
    Gateway-->>Client: Error response (auth/limits/model/input)
  else OK
    Gateway->>Provider: getLLMResponse(exchange, postProcessor)
    Provider->>Upstream: Forward adapted request
    Upstream-->>Provider: Streamed/JSON response
    Provider-->>Gateway: LLMResponse (usage, events)
    Gateway->>Store: store(user, usage)
    Gateway-->>Client: Upstream response
  end
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Suggested labels

7.x

Suggested reviewers

  • christiangoerdes
  • rrayst

Poem

A rabbit taps keys with a gentle thrum,
Streams turn to sense as the tokens come.
Claude, OpenAI, and Gemini sing—
The gateway conducts like a velvet spring.
Limits are counted, errors well-spun,
Burrows of tests say, “It’s done!” 🐇✨

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch ai-gateway

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 13

🧹 Nitpick comments (6)
annot/src/main/java/com/predic8/membrane/annot/yaml/parsing/binding/ObjectBinder.java (1)

102-108: 💤 Low value

instantiate declares checked exceptions that are already swallowed by bind's catch-all

The helper correctly surfaces NoSuchMethodException as a descriptive error. Minor note: InvocationTargetException, InstantiationException, and IllegalAccessException are declared in the throws clause, but since instantiate is private and its only caller (bind) already catches Throwable, these declarations are noise. Consider removing them, which also removes the need for bind to implicitly rely on the catch-all for those paths.

🔧 Optional cleanup
-    private static <T> `@NotNull` T instantiate(Class<T> clazz) throws InvocationTargetException, InstantiationException, IllegalAccessException {
+    private static <T> `@NotNull` T instantiate(Class<T> clazz) {
         try {
             return clazz.getConstructor().newInstance();
         } catch (NoSuchMethodException e) {
             throw new ConfigurationParsingException("Class %s does not have a public no-arg constructor.".formatted(clazz.getName()));
+        } catch (InvocationTargetException | InstantiationException | IllegalAccessException e) {
+            throw new ConfigurationParsingException(e);
         }
     }

This keeps all exception handling inside the helper and makes the contract self-contained, while bind()'s catch (ConfigurationParsingException e) still applies the source-file context as before.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@annot/src/main/java/com/predic8/membrane/annot/yaml/parsing/binding/ObjectBinder.java`
around lines 102 - 108, The instantiate helper should not declare
InvocationTargetException, InstantiationException, or IllegalAccessException
since its caller bind already swallows Throwable; instead, remove those
exceptions from the method signature and catch InvocationTargetException,
InstantiationException, and IllegalAccessException inside instantiate,
converting them to a ConfigurationParsingException with a clear message
(including clazz.getName() and the original exception as the cause); keep the
existing NoSuchMethodException handling as-is; update nothing in bind beyond
removing the throws dependency so bind continues to catch
ConfigurationParsingException for source-file context.
core/src/main/java/com/predic8/membrane/core/util/jdbc/AbstractJdbcSupport.java (1)

60-71: Open @TODO: migrate subclass connections to getConnection()

The comment on line 60 documents that subclasses should be migrated to this method but haven't been yet. Without that migration the deduplication benefit is unrealised.

Would you like me to open a tracking issue for migrating existing subclasses to use getConnection()?

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/util/jdbc/AbstractJdbcSupport.java`
around lines 60 - 71, Subclasses are still opening connections directly instead
of using AbstractJdbcSupport.getConnection(), preventing centralized error
handling and deduplication; update all subclasses that currently call
datasource.getConnection() or manage SQLException to instead call
getConnection() so they benefit from the
ClassNotFoundException-to-ConfigurationException mapping provided by
getRootCause() and ConfigurationException, remove duplicate try/catch logic from
those subclasses, and ensure imports and tests reflect the consolidated behavior
(look for usages of datasource.getConnection(), direct new SQLException
handling, and any class-specific connection helpers to replace with
getConnection()).
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java (1)

16-16: 💤 Low value

Consider documenting the checkLimit return value contract

It's unclear whether the return value means "remaining tokens," "0 means deny," or something else. A brief Javadoc would help implementors (e.g., JDBCAiApiUsageStore currently returns 0 always, which may be interpreted as "limit reached").

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java`
at line 16, Add a clear Javadoc to the AiApiStore.checkLimit(AiApiUser user)
method that defines the return-value contract (e.g., that it returns the number
of remaining allowed tokens/requests, that 0 means no remaining allowance and
should be treated as "deny", and any special sentinel values like -1 mean
"unlimited"), and update the JDBCAiApiUsageStore implementation to conform to
that contract (it currently returns 0 always which can be misinterpreted);
reference the method name checkLimit and the JDBCAiApiUsageStore implementation
when making the change so callers and implementors know the exact semantics.
core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java (2)

91-108: 💤 Low value

Optional: replace bare RuntimeException with a typed exception.

new RuntimeException(e) in setJsonResponse/getJson swallows the failure category; a domain-specific exception (or at least an IllegalStateException) makes upstream handling/logging clearer and avoids mixing programmer errors with checked-IO failures.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java`
around lines 91 - 108, Replace the bare RuntimeException throws in
setJsonResponse and getJson with a more specific unchecked exception (e.g.,
IllegalStateException or a domain-specific exception) so failure semantics are
clearer; update setJsonResponse to throw IllegalStateException with a contextual
message when JsonProcessingException occurs, and update getJson to throw
IllegalStateException (or a new OpenAIApiException) for both the "Expected JSON
Object" case and the caught IOException, including the original exception as the
cause and useful context (referencing methods setJsonResponse, getJson, om,
Exchange and Flow to locate the code).

71-89: ⚡ Quick win

Recommended: avoid parse/re-serialize when no transformation is needed.

getJson (line 71) and setJsonResponse (line 87) run on every request even when both maxOutputTokens and maxInputTokens are 0 (i.e., nothing to mutate). This adds latency and forces every request through this interceptor to be a JSON object — a non-JSON request will throw RuntimeException from getJson and surface as an internal server error. Skip the body work when there is nothing to do.

♻️ Proposed shape
-        var json = getJson(exc, REQUEST);
-
-        if (maxOutputTokens != 0) {
-            json.put(MAX_OUTPUT_TOKENS, maxOutputTokens);
-        }
-
-        if (maxInputTokens != 0) {
-            var input = json.get("input");
-            if (input != null) {
-                var estimated = estimateTokens(input.asText());
-                if (estimated > maxInputTokens) {
-                    exc.setResponse(contextLengthExceeded(maxInputTokens, estimated));
-                    return RETURN;
-                }
-            }
-        }
-        setJsonResponse(exc, json);
-        return CONTINUE;
+        if (maxOutputTokens == 0 && maxInputTokens == 0) {
+            return CONTINUE;
+        }
+        var json = getJson(exc, REQUEST);
+        if (maxInputTokens != 0) {
+            var input = json.get("input");
+            if (input != null) {
+                var estimated = estimateTokens(input.asText());
+                if (estimated > maxInputTokens) {
+                    exc.setResponse(contextLengthExceeded(maxInputTokens, estimated));
+                    return RETURN;
+                }
+            }
+        }
+        if (maxOutputTokens != 0) {
+            json.put(MAX_OUTPUT_TOKENS, maxOutputTokens);
+        }
+        setJsonResponse(exc, json);
+        return CONTINUE;

Also note: input-token check happens after max_output_tokens is written into json; it's safer to validate first so a rejected request never has its body mutated.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java`
around lines 71 - 89, Detect when both maxOutputTokens and maxInputTokens are
zero and short-circuit the interceptor before calling getJson or setJsonResponse
so you skip parsing/serializing entirely; locate the logic in
OpenAIAPIInterceptor.java around the getJson(...) and setJsonResponse(...) calls
and return CONTINUE immediately if both limits are 0. Additionally, when
maxInputTokens is > 0 or maxOutputTokens is > 0, validate the input token count
first (use estimateTokens on the request input via json.get("input") and call
exc.setResponse(contextLengthExceeded(...))/return RETURN if exceeded) before
mutating the json with json.put(MAX_OUTPUT_TOKENS, ...), ensuring getJson is
only invoked when you will actually inspect or modify the body.
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java (1)

11-11: 💤 Low value

Minor: formatting nit on @MCElement attribute list.

Missing space after the comma in component = false, id="simple-ai-api-store" (name="simpleStore",component). Cosmetic only.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`
at line 11, The `@MCElement` annotation on the SimpleAiApiStore class has a
missing space after the comma in its attribute list; update the annotation on
SimpleAiApiStore (the `@MCElement`(...) line) to read name="simpleStore",
component = false, id="simple-ai-api-store" (ensure there is a space after the
comma between attributes and keep attribute names/values unchanged) so the
formatting is consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@core/src/main/java/com/predic8/membrane/core/interceptor/ai/AiUtil.java`:
- Around line 45-48: The current check in AiUtil that uses
ah.indexOf(BEARER_PREFIX) allows "Bearer" to appear anywhere in the header
(e.g., "XBearer ..."); change the logic to ensure the Authorization header value
starts with the Bearer scheme by using ah.trim().startsWith(BEARER_PREFIX) (or
require index == 0 after trimming) before extracting the token in the method
that currently performs the indexOf check; after confirming the prefix at the
start, strip the exact prefix and any leading whitespace to return the token.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java`:
- Line 65: The stored property MEMBRANE_AI_USERTOKEN currently holds an
Optional<AiApiUser> (set via exc.setProperty(..., user)) but is later read with
exc.getProperty(MEMBRANE_AI_USERTOKEN, String.class) for per-user tracking;
change the stored representation to the actual bearer/user token string so the
typed accessor matches: when setting MEMBRANE_AI_USERTOKEN store
user.get().getToken() (or whichever String field represents the user/bearer)
instead of the Optional, and ensure any code that reads it
(exc.getProperty(MEMBRANE_AI_USERTOKEN, String.class) and
SimpleAiApiStore.store(String user, ...)) expects that String token
consistently.
- Around line 147-149: The getAiStore() accessor currently returns null; change
it to return the configured AiApiStore instance used by the interceptor (e.g.,
return the aiStore/aiApiStore field or delegate to the existing getter used by
other accessors) so reflective access and tests see the actual store; update the
OpenAIAPIInterceptor#getAiStore method to return the interceptor's stored
AiApiStore instance (matching the field name used in the class) instead of null.
- Around line 111-126: handleResponse currently calls store.store(...)
unconditionally which causes an NPE when no aiStore is configured; update the
OpenAIAPIInterceptor.handleResponse method to guard the usage-storage logic with
a null-check (if (store != null)) before calling store.store and getUsage,
mirroring the guard used in handleRequest, and ensure you still continue
processing (return CONTINUE) when store is null; reference symbols:
handleResponse, store, store.store(...), getUsage(...), MEMBRANE_AI_USERTOKEN.
- Around line 128-134: In OpenAIAPIInterceptor.getUsage, the current calls
usage.get("input_tokens").asInt() etc. will NPE if any field is missing; change
to use the safe JSON path accessor (e.g., usage.path("input_tokens").asInt(0))
for each token field so missing keys fall back to 0, and keep the existing guard
for a completely missing usage node (return new Usage(0,0,0)). Ensure you
construct the Usage with the three safe asInt(0) results.
- Around line 120-123: The current check json.get("error").isNull() causes an
NPE when the "error" field is absent; change to a null-safe path lookup using
JsonNode.path(): obtain JsonNode error = json.path("error") and then check both
!error.isMissingNode() && !error.isNull() before returning CONTINUE (update the
code in OpenAIAPIInterceptor where json.get("error") is used).

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAiApiUtil.java`:
- Around line 25-35: contextLengthExceeded currently serializes ErrorBody
directly producing a flat JSON; change it to wrap the ErrorBody inside an
ErrorEnvelope (same pattern used by authenticationFailed() and
tokenLimitExceeded()) before calling createJson so the response shape becomes
{"error":{...}}; update the return in contextLengthExceeded to build the
badRequest().json(createJson(new ErrorEnvelope(new ErrorBody(...)))) response
using the existing ErrorEnvelope and ErrorBody classes.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java`:
- Around line 17-38: AiApiLimit has concurrent mutations on tokens and
nextReset; make mutations atomic by synchronizing access: mark both checkLimit()
and addTokens(long) as synchronized so reads/writes to tokens and nextReset
happen under the same intrinsic lock for AiApiLimit, ensuring the reset branch
in checkLimit() and the this.tokens += tokens update are thread-safe;
alternatively, replace long tokens with AtomicLong and guard nextReset/reset
logic with a synchronized block, but keep checkLimit() and addTokens() logically
synchronized on the instance so tokens and nextReset stay consistent.
- Around line 22-24: The constructor sets nextReset using period which defaults
to 0, causing immediate resets; change initialization so nextReset is computed
lazily in checkLimit() (or validate period>0 in init()). Specifically, in class
AiApiLimit defer setting nextReset in the constructor, and in method
checkLimit() detect an uninitialized state (e.g., period <= 0 or nextReset ==
null/equals now sentinel) and then set nextReset = now().plusSeconds(period)
(after ensuring period>0) and initialize tokens appropriately; alternatively add
an init() validation in AiApiLimit to assert period > 0 and throw/configure
before any checkLimit() runs. Ensure references: AiApiLimit constructor, field
nextReset, field period, method checkLimit, and optional init().

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/JDBCAiApiUsageStore.java`:
- Around line 53-61: The methods in JDBCAiApiUsageStore — getUser(String) and
checkLimit(AiApiUser) — are stubbed and silently break
authentication/rate-limiting; replace the current bodies so they throw
UnsupportedOperationException with a clear message like "getUser not implemented
in JDBCAiApiUsageStore" / "checkLimit not implemented in JDBCAiApiUsageStore"
(or alternatively add explicit TODO comments) so callers fail fast instead of
being silently denied; ensure messages reference the method names and suggest
implementing them or using a different store, and keep the existing store()
functionality intact.
- Around line 13-22: The CREATE_TABLE_SQL in JDBCAiApiUsageStore uses
PostgreSQL's "BIGINT GENERATED ALWAYS AS IDENTITY" which breaks MySQL/MariaDB;
update createTablesIfNotExist()/JDBCAiApiUsageStore to use database-specific
DDL: detect the JDBC URL/DatabaseMetaData and emit "BIGINT AUTO_INCREMENT
PRIMARY KEY" for MySQL/MariaDB and keep the IDENTITY form for PostgreSQL, or
alternatively switch to a portable approach (e.g., remove DB identity and
generate IDs in application code) so the CREATE_TABLE_SQL constant is
chosen/constructed per database engine.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`:
- Around line 20-33: store(...) and checkLimit(...) currently use a single
shared AiApiLimit (limit) so per-user quotas are not enforced and concurrent
updates are unsafe; change to maintain per-user limits (e.g., a
ConcurrentHashMap<String, AiApiLimit> or Map<AiApiUser, AiApiLimit> keyed by
token/username) and update store(user, usage) to add tokens to the caller's
AiApiLimit, and change checkLimit(AiApiUser) to consult that user's AiApiLimit;
also make AiApiLimit thread-safe (use AtomicLong/LongAdder for counters or
synchronize access) so addTokens/checkLimit are safe under concurrent requests,
or if you intend a global budget instead rename the field/methods to reflect a
shared limit and document it.
- Around line 16-28: The users field in SimpleAiApiStore is not initialized,
causing getUser(String) to NPE on users.stream(); fix by defensively
initializing the users field (private List<AiApiUser> users = new ArrayList<>())
or by guarding getUser(...) with a null/empty check (e.g., return
Optional.empty() if users is null) and ensure imports (java.util.ArrayList) are
added; update any related logic (store(...) and getUser(...)) to work with the
initialized/guarded users list.

---

Nitpick comments:
In
`@annot/src/main/java/com/predic8/membrane/annot/yaml/parsing/binding/ObjectBinder.java`:
- Around line 102-108: The instantiate helper should not declare
InvocationTargetException, InstantiationException, or IllegalAccessException
since its caller bind already swallows Throwable; instead, remove those
exceptions from the method signature and catch InvocationTargetException,
InstantiationException, and IllegalAccessException inside instantiate,
converting them to a ConfigurationParsingException with a clear message
(including clazz.getName() and the original exception as the cause); keep the
existing NoSuchMethodException handling as-is; update nothing in bind beyond
removing the throws dependency so bind continues to catch
ConfigurationParsingException for source-file context.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java`:
- Around line 91-108: Replace the bare RuntimeException throws in
setJsonResponse and getJson with a more specific unchecked exception (e.g.,
IllegalStateException or a domain-specific exception) so failure semantics are
clearer; update setJsonResponse to throw IllegalStateException with a contextual
message when JsonProcessingException occurs, and update getJson to throw
IllegalStateException (or a new OpenAIApiException) for both the "Expected JSON
Object" case and the caught IOException, including the original exception as the
cause and useful context (referencing methods setJsonResponse, getJson, om,
Exchange and Flow to locate the code).
- Around line 71-89: Detect when both maxOutputTokens and maxInputTokens are
zero and short-circuit the interceptor before calling getJson or setJsonResponse
so you skip parsing/serializing entirely; locate the logic in
OpenAIAPIInterceptor.java around the getJson(...) and setJsonResponse(...) calls
and return CONTINUE immediately if both limits are 0. Additionally, when
maxInputTokens is > 0 or maxOutputTokens is > 0, validate the input token count
first (use estimateTokens on the request input via json.get("input") and call
exc.setResponse(contextLengthExceeded(...))/return RETURN if exceeded) before
mutating the json with json.put(MAX_OUTPUT_TOKENS, ...), ensuring getJson is
only invoked when you will actually inspect or modify the body.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java`:
- Line 16: Add a clear Javadoc to the AiApiStore.checkLimit(AiApiUser user)
method that defines the return-value contract (e.g., that it returns the number
of remaining allowed tokens/requests, that 0 means no remaining allowance and
should be treated as "deny", and any special sentinel values like -1 mean
"unlimited"), and update the JDBCAiApiUsageStore implementation to conform to
that contract (it currently returns 0 always which can be misinterpreted);
reference the method name checkLimit and the JDBCAiApiUsageStore implementation
when making the change so callers and implementors know the exact semantics.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`:
- Line 11: The `@MCElement` annotation on the SimpleAiApiStore class has a missing
space after the comma in its attribute list; update the annotation on
SimpleAiApiStore (the `@MCElement`(...) line) to read name="simpleStore",
component = false, id="simple-ai-api-store" (ensure there is a space after the
comma between attributes and keep attribute names/values unchanged) so the
formatting is consistent.

In
`@core/src/main/java/com/predic8/membrane/core/util/jdbc/AbstractJdbcSupport.java`:
- Around line 60-71: Subclasses are still opening connections directly instead
of using AbstractJdbcSupport.getConnection(), preventing centralized error
handling and deduplication; update all subclasses that currently call
datasource.getConnection() or manage SQLException to instead call
getConnection() so they benefit from the
ClassNotFoundException-to-ConfigurationException mapping provided by
getRootCause() and ConfigurationException, remove duplicate try/catch logic from
those subclasses, and ensure imports and tests reflect the consolidated behavior
(look for usages of datasource.getConnection(), direct new SQLException
handling, and any class-specific connection helpers to replace with
getConnection()).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5199bbff-c303-4ad8-a638-e79ae07a87e6

📥 Commits

Reviewing files that changed from the base of the PR and between 1af0001 and d792f5f.

📒 Files selected for processing (12)
  • annot/src/main/java/com/predic8/membrane/annot/yaml/parsing/binding/ObjectBinder.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/AiUtil.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAiApiUtil.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiUser.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/JDBCAiApiUsageStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/Usage.java
  • core/src/main/java/com/predic8/membrane/core/security/AbstractSecurityScheme.java
  • core/src/main/java/com/predic8/membrane/core/util/jdbc/AbstractJdbcSupport.java

Comment thread core/src/main/java/com/predic8/membrane/core/interceptor/ai/AiUtil.java Outdated
Comment thread core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java Outdated
Comment on lines +13 to +22
private static final String CREATE_TABLE_SQL = """
CREATE TABLE IF NOT EXISTS ai_api_usage (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
username VARCHAR(255) NOT NULL,
input_tokens INT NOT NULL,
output_tokens INT NOT NULL,
total_tokens INT NOT NULL,
created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
)
""";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🌐 Web query:

Does MySQL support BIGINT GENERATED ALWAYS AS IDENTITY syntax in CREATE TABLE?

💡 Result:

No, MySQL does not support the BIGINT GENERATED ALWAYS AS IDENTITY syntax in CREATE TABLE. The official MySQL documentation for versions 8.0 [1][2][3] and 8.4 [4] shows no such syntax. Instead, MySQL uses AUTO_INCREMENT for auto-generating unique values in integer columns, including BIGINT. Examples include BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY [5][6][7]. GENERATED ALWAYS AS requires an expression like (expr) [VIRTUAL|STORED], not IDENTITY [2]. This syntax is from PostgreSQL [8], and MySQL users convert it to AUTO_INCREMENT [9]. As of MySQL 8.4 (latest per release notes [4]), no support exists.

Citations:


BIGINT GENERATED ALWAYS AS IDENTITY is not supported by MySQL/MariaDB

The SQL:2003 syntax GENERATED ALWAYS AS IDENTITY is specific to PostgreSQL (10+). MySQL/MariaDB do not support this; they use AUTO_INCREMENT instead. Running this DDL against MySQL will fail at startup during createTablesIfNotExist().

If multi-database support is required, use database-specific DDL generation or a more portable approach (e.g., application-layer ID generation or separate schema scripts per database).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/JDBCAiApiUsageStore.java`
around lines 13 - 22, The CREATE_TABLE_SQL in JDBCAiApiUsageStore uses
PostgreSQL's "BIGINT GENERATED ALWAYS AS IDENTITY" which breaks MySQL/MariaDB;
update createTablesIfNotExist()/JDBCAiApiUsageStore to use database-specific
DDL: detect the JDBC URL/DatabaseMetaData and emit "BIGINT AUTO_INCREMENT
PRIMARY KEY" for MySQL/MariaDB and keep the IDENTITY form for PostgreSQL, or
alternatively switch to a portable approach (e.g., remove DB identity and
generate IDs in application code) so the CREATE_TABLE_SQL constant is
chosen/constructed per database engine.

predic8 added 5 commits May 7, 2026 15:58
- Introduced thread-safe `AtomicLong` for token management.
- Synchronized reset logic in `AiApiLimit`.
- Improved error handling and null checks in OpenAI API interactions.
- Default-initialized user list in `SimpleAiApiStore`.
- Fixed getter for `AiApiStore` in interceptor.
- Removed `AiUtil` and replaced with modular `AiProvider` interface.
- Added provider implementations: `Claude`, `OpenAI`.
- Updated `OpenAIAPIInterceptor` to use configurable providers and enforce model restrictions.
- Introduced `NoAiApiLimit` for simplified limit management.
- Enhanced error handling with model validation in `OpenAiApiUtil`.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (2)
core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java (1)

69-69: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Property type mismatch breaks usage tracking (AiApiUser stored, String retrieved).

You store user object in MEMBRANE_AI_USERTOKEN but later read it as String.class; this will yield null/cast issues and wrong accounting keys.

Also applies to: 137-137

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java`
at line 69, You’re storing an AiApiUser object into the MEMBRANE_AI_USERTOKEN
property via exc.setProperty but other code expects a String; change the set to
store a String key instead (e.g., exc.setProperty(MEMBRANE_AI_USERTOKEN,
user.getId() or user.getUsername()/getToken()) so the later retrieval as
String.class works), and make the same change at the other occurrence around the
second site (the other set at line ~137) so both storage and retrieval use the
same String identifier rather than an AiApiUser instance.
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java (1)

21-34: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Quota accounting is global, not per user.

user is ignored for limit tracking, so one heavy caller can exhaust quota for everyone. This defeats per-user enforcement.

Based on learnings: verify shared-state protections are preserved when logic is split or delegated; here the shared limit state is central and should be explicitly scoped per user.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`
around lines 21 - 34, The quota accounting currently uses a single shared
"limit" object so store(String user, Usage usage) and checkLimit(AiApiUser user)
ignore the user and apply/globalize tokens; change this to a per-user limit map
(e.g., Map<String, Limit> or Map<AiApiUser, Limit>) and update store to look up
the caller's Limit by token/user and call addTokens on that per-user Limit, and
update checkLimit(AiApiUser user) to query that same per-user Limit; ensure the
map uses a concurrent implementation (ConcurrentHashMap) or other
synchronization to protect shared state and initialize a Limit for a user on
first use (using get-or-create semantics) so existing methods getUser, store,
and checkLimit operate per-user rather than against the single "limit" field.
🧹 Nitpick comments (1)
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/NoAiApiLimit.java (1)

3-8: ⚡ Quick win

NoAiApiLimit enforces a fixed limit of 1000 instead of "no limit".

The class name implies unlimited access, but checkLimit() returns a hardcoded 1000. Since checkLimit() semantically represents remaining quota and is used to reject requests when the value drops to zero, this effectively caps all users at 1000 tokens. Either rename the class to reflect the actual limit (e.g., DefaultAiApiLimit) or return Long.MAX_VALUE (or similar unbounded sentinel) to match the "no limit" semantics.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/NoAiApiLimit.java`
around lines 3 - 8, The class NoAiApiLimit currently returns a hardcoded 1000 in
checkLimit(), which contradicts its "no limit" name and caps usage; update the
implementation of NoAiApiLimit.checkLimit() to return an unbounded sentinel such
as Long.MAX_VALUE (or another agreed "no limit" value) so it truly represents
unlimited quota, or alternatively rename the class (e.g., DefaultAiApiLimit) and
keep the 1000 if the intention is a fixed default limit; reference NoAiApiLimit
and its checkLimit() method (and AiApiLimit base type) when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java`:
- Around line 80-89: The current logic only calls
provider.estimateInputTokens(json) if json.get("input") is present, which skips
providers that estimate from other fields; change the block in
OpenAIAPIInterceptor so that whenever maxInputTokens != 0 you call
provider.estimateInputTokens(json) unconditionally (remove the input null
check), then compare the returned estimated value to maxInputTokens and, if
exceeded, call exc.setResponse(contextLengthExceeded(maxInputTokens, estimated))
and return RETURN; ensure you keep the same symbols: maxInputTokens,
provider.estimateInputTokens(json), exc.setResponse(...),
contextLengthExceeded(...), and RETURN.
- Around line 56-57: provider is used without validation and can NPE when
calling provider.getApiKey(...) or provider.setApiKey(...); add a non-null
validation either at startup in the OpenAIAPIInterceptor.init() method (throw a
clear configuration exception if provider is null) or add a request-time guard
in the request handling path that checks if provider == null and returns a 5xx
response with a clear error message; update references to
provider.getApiKey(...) and provider.setApiKey(...) to only be executed after
this check so no NPE can occur.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/OpenAI.java`:
- Around line 35-42: In OpenAI.java, the Bearer token extraction currently uses
ah.indexOf(BEARER_PREFIX) which matches the prefix anywhere; change the logic in
the method that reads ah to require a strict prefix (e.g., after trimming
leading whitespace the header must startWith BEARER_PREFIX or match a
case‑insensitive regex like ^Bearer\s+(\S+)$) and then extract the token from
that exact location (using BEARER_PREFIX.length()), returning null for any
header that does not strictly start with the Bearer prefix or that yields an
empty token; reference BEARER_PREFIX and the variable ah when making this
change.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`:
- Around line 27-29: getUser in SimpleAiApiStore uses u.getToken().equals(token)
which can NPE if a configured AiApiUser has a null token; change the equality
check to Objects.equals(u.getToken(), token) inside the getUser method to safely
handle nulls and add an import for java.util.Objects if not already present so
the stream filter becomes safe.

---

Duplicate comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java`:
- Line 69: You’re storing an AiApiUser object into the MEMBRANE_AI_USERTOKEN
property via exc.setProperty but other code expects a String; change the set to
store a String key instead (e.g., exc.setProperty(MEMBRANE_AI_USERTOKEN,
user.getId() or user.getUsername()/getToken()) so the later retrieval as
String.class works), and make the same change at the other occurrence around the
second site (the other set at line ~137) so both storage and retrieval use the
same String identifier rather than an AiApiUser instance.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`:
- Around line 21-34: The quota accounting currently uses a single shared "limit"
object so store(String user, Usage usage) and checkLimit(AiApiUser user) ignore
the user and apply/globalize tokens; change this to a per-user limit map (e.g.,
Map<String, Limit> or Map<AiApiUser, Limit>) and update store to look up the
caller's Limit by token/user and call addTokens on that per-user Limit, and
update checkLimit(AiApiUser user) to query that same per-user Limit; ensure the
map uses a concurrent implementation (ConcurrentHashMap) or other
synchronization to protect shared state and initialize a Limit for a user on
first use (using get-or-create semantics) so existing methods getUser, store,
and checkLimit operate per-user rather than against the single "limit" field.

---

Nitpick comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/NoAiApiLimit.java`:
- Around line 3-8: The class NoAiApiLimit currently returns a hardcoded 1000 in
checkLimit(), which contradicts its "no limit" name and caps usage; update the
implementation of NoAiApiLimit.checkLimit() to return an unbounded sentinel such
as Long.MAX_VALUE (or another agreed "no limit" value) so it truly represents
unlimited quota, or alternatively rename the class (e.g., DefaultAiApiLimit) and
keep the 1000 if the intention is a fixed default limit; reference NoAiApiLimit
and its checkLimit() method (and AiApiLimit base type) when making the change.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2ac0a859-8e9a-4f2a-bead-13d81b96b81b

📥 Commits

Reviewing files that changed from the base of the PR and between 8fa0d95 and 356a4c5.

📒 Files selected for processing (9)
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAiApiUtil.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AiProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/Claude.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/OpenAI.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/NoAiApiLimit.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java
✅ Files skipped from review due to trivial changes (2)
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AiProvider.java
🚧 Files skipped from review as they are similar to previous changes (1)
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java

Comment thread core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/OpenAI.java Outdated
@membrane-ci-server
Copy link
Copy Markdown

This pull request needs "/ok-to-test" from an authorized committer.

…bstraction

- Added `AiApiRequest` and `AiApiResponse` abstractions for request/response handling.
- Introduced `AbstractAiApiRequest` and `AbstractAiApiResponse` as base classes.
- Implemented providers: `Google`, `OpenAI`, and `Claude` with concrete request/response handling.
- Updated `AiProvider` to handle request/response creation.
- Refactored `OpenAIAPIInterceptor` to leverage request/response abstraction and enforce contract restrictions.
- Enhanced `JsonUtil` with helper methods for JSON body parsing and updates.
- Updated `AiApiStore` and related classes for improved usage tracking and user abstraction.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

♻️ Duplicate comments (1)
core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractAiApiRequest.java (1)

40-47: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Bearer token extraction is too permissive — matches anywhere in the header value.

indexOf(BEARER_PREFIX) accepts headers like "Basic abc Bearer xyz" and extracts "xyz" as the token. The standard requires the scheme to be at the start of the value.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractAiApiRequest.java`
around lines 40 - 47, The current extraction uses ah.indexOf(BEARER_PREFIX)
which wrongly matches the scheme anywhere in the header; change the logic to
require the scheme at the start (e.g., use ah.startsWith(BEARER_PREFIX) or a
regex anchored with ^) before calling ah.substring(...) and trimming, and return
null if the header does not begin with BEARER_PREFIX; keep using BEARER_PREFIX,
ah.substring(...), and token.isEmpty() checks but only after validating the
prefix is at the start.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/ClaudeAiRequest.java`:
- Around line 15-35: The method estimateInputTokens uses an int accumulator
named tokens which can overflow before being returned as a long; change the
accumulator declaration in estimateInputTokens from int to long and ensure all
constants/updates use long arithmetic (e.g., use 1000L and ensure divisions
produce long results or cast operands) when adding system, message, block and
image token estimates so no intermediate int overflow occurs when computing the
total from json.path("system") and json.path("messages").

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/GoogleAiRequest.java`:
- Around line 107-118: getGenerationConfig() can NPE because json may be null
(see estimateInputTokens guard); update getGenerationConfig() (used by
setMaxOutputTokens) to first ensure json is non-null and is an ObjectNode
(create/replace it with a new ObjectNode when null or not an object) before
calling json.get("generationConfig"), then return or create the
"generationConfig" ObjectNode; reference methods: getGenerationConfig,
setMaxOutputTokens and the field json.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/OpenAiAiRequest.java`:
- Around line 15-17: The estimateInputTokens() method in OpenAiAiRequest
currently uses json.path("input").asText("") which returns an empty string for
Chat Completions (no "input" field) and for ArrayNode inputs, making token
checks no-op; update estimateInputTokens() to inspect both "input" and
"messages": if "messages" is an array (Chat Completions) iterate it and
concatenate or sum the lengths of text-like fields (e.g., each message's
"content" or all string values) to compute total chars, if "input" is an array
iterate elements and for each element use asText("") for strings or extract
textual fields from objects before summing; finally compute tokens as
ceil(totalChars / 4.0) and return that long so OpenAIAPIInterceptor's
maxInputTokens check works correctly.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/JDBCAiApiUsageStore.java`:
- Line 15: The SQL DDL string in JDBCAiApiUsageStore (created in the
createTablesIfNotExist block) contains a Java text-block literal with a `//
`@TODO` ...` comment which is being embedded into the SQL and breaks parsing;
remove the `// `@TODO` ...` from the SQL text block or replace it with a valid SQL
comment (e.g., `/* ... */` or `-- ...`), ensuring the `id BIGINT GENERATED
ALWAYS AS IDENTITY PRIMARY KEY` line ends cleanly (no stray `//` token) so the
DDL executes successfully.

In `@core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java`:
- Around line 89-98: getJsonObject currently throws a bare
RuntimeException/IOException on non-JSON or non-object bodies which bubbles out
and crashes the pipeline; change getJsonObject to return Optional<ObjectNode>
(signature: getJsonObject(Message msg) -> Optional<ObjectNode>) and implement it
to return Optional.empty() for non-object JSON or when parsing fails (wrap IO
errors in UncheckedIOException only if you must rethrow); then update callers
(notably AbstractAiApiResponse constructor and any usages in
OpenAIAPIInterceptor) to check the Optional and convert an empty result into a
structured error response or a controlled exception (e.g.,
IllegalArgumentException) so the interceptor can map it to an HTTP error instead
of letting an unchecked RuntimeException propagate.

---

Duplicate comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractAiApiRequest.java`:
- Around line 40-47: The current extraction uses ah.indexOf(BEARER_PREFIX) which
wrongly matches the scheme anywhere in the header; change the logic to require
the scheme at the start (e.g., use ah.startsWith(BEARER_PREFIX) or a regex
anchored with ^) before calling ah.substring(...) and trimming, and return null
if the header does not begin with BEARER_PREFIX; keep using BEARER_PREFIX,
ah.substring(...), and token.isEmpty() checks but only after validating the
prefix is at the start.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b118fe28-0fcb-4a38-a1db-897ea2b30c18

📥 Commits

Reviewing files that changed from the base of the PR and between 356a4c5 and 863e5cc.

📒 Files selected for processing (20)
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractAiApiRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractAiApiResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/AiApiRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/AiApiResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AiProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/Claude.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/ClaudeAiRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/ClaudeAiResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/Google.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/GoogleAiRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/GoogleAiResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/OpenAI.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/OpenAiAiRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/OpenAiAiResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiUser.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/JDBCAiApiUsageStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java
✅ Files skipped from review due to trivial changes (4)
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/OpenAiAiResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/AiApiRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/ClaudeAiResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java
🚧 Files skipped from review as they are similar to previous changes (1)
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiUser.java

Comment thread core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java Outdated
…ponents

- Updated `checkLimit` method to consider input and output tokens.
- Improved token calculation logic in AI request providers.
- Enhanced JSON parsing in `JsonUtil` with Optional for safer operations.
- Added detailed error handling in `OpenAIAPIInterceptor` for invalid requests.
- Refined token estimation logic with safety margins and JSON structure considerations.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java`:
- Around line 69-72: The setter setPeriod(int period) must validate that period
> 0 and must update the `@GuardedBy`("lock") field nextReset while holding the
lock; modify setPeriod to throw an IllegalArgumentException for non-positive
period values and move the assignment of nextReset = now().plusSeconds(period)
inside the same critical section that protects nextReset (use the class's lock
or synchronized(lock)) so both period and nextReset are updated atomically under
the lock.
- Around line 39-53: The reset-check and token updates must be made atomic: move
the nextReset null/now.isAfter(nextReset) check and the subsequent
tokens.set(0)/nextReset update inside the same synchronized(lock) block and
ensure any token-modifying methods use that same lock; specifically wrap the
block that computes and returns maxTokens - tokens.get() - tokensForNextRequest
in synchronized(lock) (the method containing nextReset) and change
addTokens(long tokens) to synchronize on lock as well (or perform its add inside
synchronized(lock)) so tokens, nextReset and tokensForNextRequest are always
updated/read under the same lock.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`:
- Around line 22-23: Replace the INFO-level log that prints per-user identifiers
in SimpleAiApiStore (the log.info call that uses user.getName() and usage) with
a lower-verbosity, non-identifying message: change log.info(...) to
log.debug(...) (or log.trace(...) if you prefer) and remove the user.getName()
from the message so only non-sensitive data like usage.totalTokens() or a
redacted indicator is logged; keep the limit.addTokens(usage.totalTokens())
behavior unchanged.
- Around line 40-42: The setter setUsers currently allows assigning null to the
users field which later causes NPEs in getUser when it calls users.stream();
update setUsers to guard against null by assigning an empty immutable list
(e.g., Collections.emptyList() or List.of()) when the incoming users argument is
null, so that getUser can safely call users.stream(); keep the field name users
and the method getUser unchanged so callers remain consistent.

In `@core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java`:
- Around line 98-103: The code in JsonUtil.java reads a JsonNode with "JsonNode
jsonNode = om.readTree(msg.getBodyAsStreamDecoded())" but dereferences jsonNode
via jsonNode.getNodeType(), which can NPE when readTree returns null for empty
input; update the logic in the method containing that line to first check "if
(jsonNode == null) return Optional.empty();" (and/or log that input was empty)
before any access, and only call getNodeType() when jsonNode is non-null; also
keep the existing branch that returns Optional.of(on) when the node is an
ObjectNode and return Optional.empty() for any other node types.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f29cb49e-f9a8-4c53-b7fc-24c1061b037c

📥 Commits

Reviewing files that changed from the base of the PR and between 863e5cc and 2933dad.

📒 Files selected for processing (11)
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractAiApiRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractAiApiResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/ClaudeAiRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/OpenAiAiRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/JDBCAiApiUsageStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/NoAiApiLimit.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java
🚧 Files skipped from review as they are similar to previous changes (4)
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/NoAiApiLimit.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/ClaudeAiRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractAiApiRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java

Comment thread core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java Outdated
Comment thread core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java Outdated
Comment thread core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java Outdated
predic8 added 10 commits May 8, 2026 13:28
- Synchronized token management methods in `AiApiLimit` to ensure thread safety.
- Adjusted log levels for `SimpleAiApiStore` to reduce verbosity.
- Added PostgreSQL dependency to the distribution.
- Updated logging configuration to set debug level for AI interceptors.
- Introduced `SSEUtil` for parsing Server-Sent Events (SSE) from chunks.
- Enhanced `AbstractAiApiRequest` to handle JSON requests conditionally.
- Deprecated and replaced `max_output_tokens` usage in specific providers.
- Improved stream support in `OpenAiAiRequest` with response usage tracking.
- Refactored token limit logic in `OpenAIAPIInterceptor` for better flow.
- Renamed `AiApiRequest` to `LLMRequest` and `AiApiResponse` to `LLMResponse`.
- Updated providers (`Google`, `OpenAI`, `Claude`) to align with `LLMProvider` interface.
- Refactored `OpenAIAPIInterceptor` to `LLMGatewayInterceptor` and related utilities.
- Removed `SSEUtil` and replaced with `SSEParser`.
- Improved streaming and token usage handling in `AbstractLLMResponse`.
- Modularized LLM responses for `Claude` and `OpenAI` providers.
- Replaced `LLMResponse` interface and `AbstractLLMResponse` with updated abstractions.
- Added `ChatCompletionsSSEParser` for advanced SSE chunk handling.
- Introduced specific SSE event classes: `ChatCompletionEvent`, `ChatCompletionDoneEvent`, `ResponsesApiEvent`.
- Renamed and restructured classes for consistency in AI namespace.
- Improved token usage tracking and event-based streaming.
…acking logic

- Deleted `ChatCompletionsSSEParser` and related classes/methods.
- Simplified `ChatCompletionEvent` by removing token usage parsing.
- Updated tool extraction logic in `AbstractLLMRequest` to handle function-specific tools.
@predic8
Copy link
Copy Markdown
Member Author

predic8 commented May 18, 2026

@coderabbitai plz review

@predic8 predic8 closed this May 18, 2026
@predic8 predic8 reopened this May 18, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 18, 2026

@predic8 Sure! I'll start a full review of the PR right away.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 17

♻️ Duplicate comments (1)
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleLLMRequest.java (1)

108-118: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

setMaxOutputTokens() can still NPE when json is null.

On Line 113, json.get("generationConfig") dereferences json unguarded. This is the same underlying issue previously flagged and appears to persist after the rename.

Proposed fix
     `@Override`
     public void setMaxOutputTokens(int maxOutputTokens) {
+        if (json == null || json.isNull()) {
+            return;
+        }
         getGenerationConfig().put("max_output_tokens", maxOutputTokens);
     }
 
     private ObjectNode getGenerationConfig() {
+        if (!(json instanceof ObjectNode objectJson)) {
+            json = com.fasterxml.jackson.databind.node.JsonNodeFactory.instance.objectNode();
+            objectJson = (ObjectNode) json;
+        }
-        var gc = json.get("generationConfig");
+        var gc = objectJson.get("generationConfig");
         if (gc instanceof ObjectNode objectNode) {
             return objectNode;
         }
-        return json.putObject("generationConfig");
+        return objectJson.putObject("generationConfig");
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleLLMRequest.java`
around lines 108 - 118, setMaxOutputTokens can still NPE because
getGenerationConfig calls json.get(...) without ensuring json is initialized;
update getGenerationConfig (used by setMaxOutputTokens) to first ensure the
field json is non-null (initialize it to an ObjectNode or call a helper that
creates it) before accessing json.get("generationConfig"), and then return or
create the "generationConfig" ObjectNode; reference the methods
getGenerationConfig and setMaxOutputTokens and the json field when making the
change.
🧹 Nitpick comments (1)
core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMApiUtil.java (1)

64-72: ⚡ Quick win

Catch JsonProcessingException specifically instead of broad Exception.

ObjectMapper.writeValueAsString() throws JsonProcessingException on serialization failures. Catching the specific exception type makes the error handling intent clearer and follows best practices for exception handling.

♻️ Proposed fix
+import com.fasterxml.jackson.core.JsonProcessingException;
+
 public static String createJson(Object o) {
     try {
         return om.writeValueAsString(o);
-    } catch (Exception e) {
+    } catch (JsonProcessingException e) {
         return """
                 { "error": "Could not create JSON" }
                 """;
     }
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMApiUtil.java`
around lines 64 - 72, The createJson method currently catches a broad Exception;
change the catch to handle com.fasterxml.jackson.core.JsonProcessingException
specifically for om.writeValueAsString(o) to make error handling precise
(replace the generic catch(Exception e) with catch(JsonProcessingException e))
and keep the same fallback JSON return and any logging if present.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractLLMEvent.java`:
- Around line 35-40: In AbstractLLMEvent, the Optional holding the parsed JSON
(the variable named opt used after sse.data()) is dereferenced unguarded (var
json = opt.get()), which can throw NoSuchElementException; update the code in
the method that handles SSE parsing to check opt.isEmpty() and bail out (e.g.,
return or skip processing) after logging the "Unknown event format" message so
you never call opt.get() when empty, ensuring safe handling of non-object or
failed parses.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/ChatCompletionEvent.java`:
- Around line 41-42: In ChatCompletionEvent (around the log.debug call), avoid
logging raw tool-call arguments from fn.path("arguments").asText(); instead
build a safe, redacted string (e.g., replace sensitive patterns or mask the
content and/or truncate to a max length like 100 chars with "..." appended) and
log that; update the log.debug invocation to use the redacted/truncated value so
the source (fn.path("arguments").asText()) is never written raw to logs.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java`:
- Around line 65-77: The input token limit is bypassed when no aiStore is
configured because inputTokens is only set inside the store branch; compute and
enforce input token counting regardless of store presence: in
LLMGatewayInterceptor, call aiReq.estimateInputTokens() when
exc.getRequest().isPOSTRequest() before or outside the store != null branch (or
duplicate the estimate in the else) so inputTokens holds the actual value for
the later maxInputTokens check; apply the same change for the second occurrence
referenced around lines 92-97 so the maxInputTokens check always uses the real
estimated inputTokens even when store is null.
- Around line 116-119: The callback passed to provider.getLLMResponse currently
calls store.store(...) whenever store != null, which can pass a null AiApiUser
if MEMBRANE_AI_USER isn't set on the exchange; change the callback to first read
the user from the exchange (exc.getProperty(MEMBRANE_AI_USER, AiApiUser.class)),
check that the user is not null (and only then call store.store(user,
res.getUsage())), leaving the existing store != null guard intact so persistence
only happens when both store and a valid AiApiUser are present.
- Around line 144-146: The getDisplayName method in LLMGatewayInterceptor
currently returns the stale provider-specific string "OpenAI API"; update
LLMGatewayInterceptor.getDisplayName() to return a provider-agnostic name (e.g.,
"LLM Gateway" or "AI Gateway") so the interceptor no longer advertises a single
provider, and adjust any related UI/tooltip text retrieval (methods referencing
getDisplayName) to remain consistent with the new neutral label.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AbstractLLMRequest.java`:
- Around line 81-83: getModel() can throw an NPE because the backing json field
may be null for non-JSON requests; update AbstractLLMRequest.getModel to first
check whether json is null (or json.path("model") is missing) and return a safe
value (e.g., null or an empty string) instead of dereferencing json, or
otherwise handle non-JSON requests upstream. Locate the json field and the
method getModel() in class AbstractLLMRequest and add a null-check/guard that
avoids calling json.path(...) when json==null and documents the chosen fallback
behavior.
- Around line 65-72: The current extraction in AbstractLLMRequest uses
ah.indexOf(BEARER_PREFIX) which accepts "Bearer" anywhere; change it to parse
the Authorization header strictly as a Bearer scheme: trim ah, verify the header
starts with the bearer scheme (case-insensitive) followed by whitespace (e.g.
check ah.trim().toLowerCase().startsWith((BEARER_PREFIX + " ").toLowerCase()) or
use regionMatches(true,...)), and only then take the substring after
BEARER_PREFIX, trim it and return null on empty; update the token extraction
logic around BEARER_PREFIX and the variable ah accordingly.
- Around line 35-40: The mapping in AbstractLLMRequest that extracts tool names
from tools.valueStream() is unsafe because chained get(...).get(...).asText()
can NPE or throw for malformed entries; update the lambda in the stream to
defensively check each node before accessing "function" and "name" (e.g., verify
node.isObject() / has("function") and has("name") and that name is textual or
non-null), return the function.name when present and valid, otherwise the
top-level name when valid, and fallback to a safe default or skip the entry
instead of calling asText() blindly so malformed tool objects no longer throw in
get(...).get(...).asText().

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AbstractLLMResponse.java`:
- Around line 61-71: The lambda passed to terminal.ifPresent in
AbstractLLMResponse currently returns early on a "[DONE]" event which skips
calling postProcessor and can leave json null causing isError() to NPE; change
the handling so that when event.data() equals "[DONE]" you do not return early
but instead assign json to a safe empty/error JsonNode (e.g.,
JsonNodeFactory.instance.objectNode().put("done", true) or a descriptive error
node) and then call postProcessor.accept(AbstractLLMResponse.this); apply the
same fix to the other terminal handling block referenced (the similar code at
lines ~75-77) so postProcessor is always invoked and json is always non-null
before isError() is called.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeLLMRequest.java`:
- Around line 21-67: The code assumes the field json is non-null in
setMaxOutputTokens and estimateInputTokens which leads to NPEs when the request
body is absent or parsing failed; update both setMaxOutputTokens and
estimateInputTokens to guard against a null json (either early-return with
sensible defaults or lazily initialize json to an empty ObjectNode) before
accessing json.path(...), and ensure any subsequent logic (e.g.,
thinking/budget_tokens adjustments and token estimation) handles the empty json
case safely; reference the json field and the methods setMaxOutputTokens and
estimateInputTokens when making the change.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeLLMResponse.java`:
- Around line 48-50: The current log.debug call in ClaudeLLMResponse that prints
inputJson (log.debug("Tool {} with {}", tool, inputJson.toString())) must not
log raw payloads; change it to log a redacted or summarized form instead (e.g.,
log tool name plus either the set of top-level keys, input size, or a sanitized
version) by replacing the direct inputJson.toString() usage with a safe helper
(e.g., redactToolInput(inputJson) or summarizeToolInput(inputJson)) that
strips/omits sensitive values; update the log.debug invocation to use the helper
and keep the tool identifier (tool) intact so callers can locate the change.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/MessageDelta.java`:
- Around line 25-31: The Usage object in MessageDelta is currently constructed
using only md.inputTokens and md.outputTokens, omitting billable cache tokens;
update the Usage creation in the MessageDelta parsing logic so that it includes
md.cacheCreationInputTokens and md.cacheReadInputTokens (i.e., total =
inputTokens + outputTokens + cacheCreationInputTokens + cacheReadInputTokens) so
billing/rate-limit accounting is correct for the MessageDelta instance (look for
the MessageDelta class and the md.usage = new Usage(...) line).

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAiLLMRequest.java`:
- Around line 25-33: The method estimateInputTokens uses the field json without
guarding for null; add a null-check at the start of estimateInputTokens (or a
guard in any helper it calls) so that if json is null the method returns 0 (or
an appropriate default) instead of calling json.path(...) and risking an NPE;
update estimateInputTokens to check json before calling
countText(json.path("input")), estimateChatCompletitions(), and
countText(json.path("system")) so those invocations only run when json is
non-null.
- Around line 17-20: The current code in OpenAiLLMRequest uses
json.putObject("stream_options") which unconditionally replaces any existing
stream_options; change this to merge instead: check for an existing JsonObject
via json.getJsonObject("stream_options") (or equivalent), if it exists and is an
object call put("include_usage", true) on it, otherwise create a new JsonObject
with include_usage=true and assign it (only when
json.path("stream").asBoolean(false) && isChatCompletionsRequest(exchange));
keep the check/behavior around isChatCompletionsRequest(exchange) intact.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAiLLMResponse.java`:
- Around line 34-42: The getUsage() logic in OpenAiLLMResponse is ignoring
returned values from getInputTokens/getOutputTokens and is reading total_tokens
from the wrong JSON node; assign the results of getInputTokens(usage) and
getOutputTokens(usage) to inputTokens and outputTokens respectively, and read
totalTokens from usage.path("total_tokens") (not json.path("total_tokens")) for
both the new and older chat completions branches so usage.total_tokens is used
consistently.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/ResponsesApiEvent.java`:
- Around line 29-31: The current log statement in ResponsesApiEvent.java logs
raw function-call arguments via on.path("arguments").asText() at INFO; change
this to a non-production level (use log.debug) and avoid printing full
payloads—log only metadata such as the function name (on.path("name").asText()),
argument size or keys, or a redacted/truncated version of arguments (e.g., first
N characters or replace values with "<REDACTED>") instead of the raw text;
update the log invocation that references on.path("name") and
on.path("arguments") accordingly so sensitive data is not emitted at INFO.

In `@core/src/main/java/com/predic8/membrane/core/util/http/SSEParser.java`:
- Line 38: The current log call in SSEParser that does log.debug("Parsing SSE
chunk: {}", chunk) exposes raw streamed payloads; change it to avoid printing
chunk contents and instead log safe metadata: compute the chunk size (e.g.,
chunk.length() or bytes) and a simple event count (e.g., count occurrences of
"data:" or empty-line delimiters) and use those values in the debug message.
Update the logging in the SSEParser code path (referencing the chunk variable
and the log instance) to emit only size and event-count metadata and remove any
direct inclusion of the chunk string.

---

Duplicate comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleLLMRequest.java`:
- Around line 108-118: setMaxOutputTokens can still NPE because
getGenerationConfig calls json.get(...) without ensuring json is initialized;
update getGenerationConfig (used by setMaxOutputTokens) to first ensure the
field json is non-null (initialize it to an ObjectNode or call a helper that
creates it) before accessing json.get("generationConfig"), and then return or
create the "generationConfig" ObjectNode; reference the methods
getGenerationConfig and setMaxOutputTokens and the json field when making the
change.

---

Nitpick comments:
In `@core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMApiUtil.java`:
- Around line 64-72: The createJson method currently catches a broad Exception;
change the catch to handle com.fasterxml.jackson.core.JsonProcessingException
specifically for om.writeValueAsString(o) to make error handling precise
(replace the generic catch(Exception e) with catch(JsonProcessingException e))
and keep the same fallback JSON return and any logging if present.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 11071819-f5de-4c68-9b96-669afb387962

📥 Commits

Reviewing files that changed from the base of the PR and between 863e5cc and 062ac5c.

📒 Files selected for processing (33)
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractLLMEvent.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/ChatCompletionDoneEvent.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/ChatCompletionEvent.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMApiUtil.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/ResponsesApiEvent.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AbstractLLMMessage.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AbstractLLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AbstractLLMResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/LLMProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/LLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/LLMResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeLLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeLLMResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ContentBlockDelta.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ContentBlockStart.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/MessageDelta.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ToolUse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleLLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleLLMResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAIProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAiLLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAiLLMResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/JDBCAiApiUsageStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/NoAiApiLimit.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/util/http/SSEParser.java
  • core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java
  • core/src/test/java/com/predic8/membrane/core/util/http/SSEParserTest.java
✅ Files skipped from review due to trivial changes (3)
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ContentBlockStart.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ToolUse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/ChatCompletionDoneEvent.java
🚧 Files skipped from review as they are similar to previous changes (5)
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/NoAiApiLimit.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/JDBCAiApiUsageStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
distribution/tutorials/ai/llm-gateway/membrane.cmd (1)

1-25: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Critical: Convert to Windows line endings (CRLF).

The file uses Unix line endings (LF-only), which causes cmd.exe to have unpredictable GOTO/CALL label resolution behavior on Windows. This is a documented parsing issue where cmd.exe fails to find batch labels at certain offsets when using LF-only line endings instead of CRLF. Since this file contains GOTO statements, it is vulnerable to label-not-found failures.

Configure git to enforce CRLF for all .cmd files in the repository:

*.cmd text eol=crlf

Alternatively, convert this file directly by changing line endings from LF to CRLF.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@distribution/tutorials/ai/llm-gateway/membrane.cmd` around lines 1 - 25, The
batch script (labels :search_up, :found, :notfound and the call to
"%MEMBRANE_HOME%\scripts\run-membrane.cmd") must use Windows CRLF endings
because cmd.exe misparses GOTO/CALL labels with LF-only files; convert this
file's line endings from LF to CRLF (or add a .gitattributes entry like "*.cmd
text eol=crlf" to enforce CRLF) and re-save membrane.cmd so all lines end with
CRLF to ensure the GOTO and CALL label resolution works reliably.
♻️ Duplicate comments (8)
core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java (2)

96-107: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Handle null return from readTree() before calling getNodeType().

On line 102, node.getNodeType() will throw a NullPointerException when ObjectMapper.readTree(String) returns null for empty input. Jackson's documented behavior is to return null when no content is available.

Proposed fix
 public static Optional<ObjectNode> getJsonObject(String s) {
     try {
         var node = om.readTree(s);
         if (node instanceof ObjectNode on) {
             return Optional.of(on);
         }
-        log.debug("Expected JSON Object but got: {}",node.getNodeType());
+        if (node != null) {
+            log.debug("Expected JSON Object but got: {}", node.getNodeType());
+        } else {
+            log.debug("Expected JSON Object but got: null/empty input");
+        }
     } catch (Exception e) {
         log.debug("Error reading JSON: {}", e.getMessage());
     }
     return empty();
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java` around
lines 96 - 107, The getJsonObject method can get a null node from
om.readTree(s); add a null check right after var node = om.readTree(s) in
JsonUtil.getJsonObject to avoid calling node.getNodeType() on null: if node is
null log a debug message like "Empty JSON input" (or similar) and return
Optional.empty(); otherwise proceed with the existing instanceof ObjectNode
check and the existing debug log for non-object node types.

114-125: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Fix typo in method name and handle null return from readTree().

Two issues in this helper method:

  1. Typo: The method name getJsonObjectFromSteam should be getJsonObjectFromStream.
  2. NPE risk: On line 120, node.getNodeType() will throw a NullPointerException when ObjectMapper.readTree(InputStream) returns null for empty input.
Proposed fix
-    private static Optional<ObjectNode> getJsonObjectFromSteam(InputStream obj) {
+    private static Optional<ObjectNode> getJsonObjectFromStream(InputStream obj) {
         try {
             var node = om.readTree(obj);
             if (node instanceof ObjectNode on) {
                 return Optional.of(on);
             }
-            log.info("Expected JSON Object but got: {}",node.getNodeType());
+            if (node != null) {
+                log.info("Expected JSON Object but got: {}", node.getNodeType());
+            } else {
+                log.info("Expected JSON Object but got: null/empty input");
+            }
         } catch (Exception e) {
             log.info("Error reading JSON: {}", e.getMessage());
         }
         return empty();
     }

And update the caller on line 111:

     public static Optional<ObjectNode> getJsonObject(Message msg) {
-       return getJsonObjectFromSteam(msg.getBodyAsStreamDecoded());
+       return getJsonObjectFromStream(msg.getBodyAsStreamDecoded());
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java` around
lines 114 - 125, Rename the helper method getJsonObjectFromSteam to
getJsonObjectFromStream and update all callers to use the new name; inside the
method, after var node = om.readTree(obj), check if node == null and log a clear
message (e.g. "Empty JSON input") instead of calling node.getNodeType(), then
only test instanceof ObjectNode (ObjectNode on) and return Optional.of(on) or
Optional.empty() as before, and keep exception handling but consider logging the
exception message.
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/AbstractOpenAiLLMRequest.java (1)

14-25: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add a null-guard before token estimation.

estimateInputTokens() dereferences json directly. If request parsing fails or body is missing, this can throw before graceful invalid-request handling.

💡 Proposed fix
 `@Override`
 public long estimateInputTokens() {
+    if (json == null || json.isNull()) {
+        return 0;
+    }
 
     long chars = countText(json.path("input"));
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/AbstractOpenAiLLMRequest.java`
around lines 14 - 25, The method estimateInputTokens() currently dereferences
the field json directly; add a null/missing-node guard at the start of
AbstractOpenAiLLMRequest. Specifically, in estimateInputTokens() check if json
is null or json.isMissingNode()/json.isNull() and return 0 (or the same graceful
invalid-request sentinel used elsewhere) before calling countText(...) or
countJsonSize(...); this prevents NPEs during parsing failures and keeps
subsequent calls to countText, estimateChatCompletitions, countJsonSize safe.
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java (2)

27-30: ⚠️ Potential issue | 🟠 Major

Don't log user identifiers and usage at INFO by default.

This writes per-user identifiers and token-consumption details into normal production logs on every request. Keep it at DEBUG and drop the identifier, or default logUsage to false. Based on learnings: debug-level logging with sensitive raw content is exempt in this codebase, but higher log levels should still be flagged.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`
around lines 27 - 30, The store method in SimpleAiApiStore currently logs
per-user identifiers and usage at INFO when logUsage is true; change this to
avoid sensitive data exposure by (1) defaulting the logUsage field to false and
(2) if logging is needed, use DEBUG level and remove user identifiers — e.g.
replace the log.info call in SimpleAiApiStore.store with a log.debug message
that does not include user.getName() or raw usage content (you may log only
aggregated, non-identifying metrics such as usage.totalTokens()), and ensure
limit.addTokens(usage.totalTokens()) remains unchanged.

63-70: ⚠️ Potential issue | 🟠 Major

The new description promises per-user limits, but the implementation is shared.

limit is a single store-wide instance, and both store(...) and checkLimit(...) ignore the caller when accounting. One user can still exhaust the budget for everyone, so either key limits by user or rename this to a shared/global limit.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`
around lines 63 - 70, The doc claims per-user limits but the field `limit` in
SimpleAiApiStore and methods `store(...)` and `checkLimit(...)` use a single
shared instance; fix by switching to per-caller limits (e.g., change the `limit`
field to a Map keyed by caller id and update `setLimit(AiApiLimit)` or add
`setLimit(String caller, AiApiLimit)` and make `store(...)` and
`checkLimit(...)` look up the caller's AiApiLimit) or, if global behavior is
intended, update the documentation and parameter names to make it clear this is
a shared/global limit; ensure all references in SimpleAiApiStore (field `limit`,
`setLimit`, `store`, `checkLimit`) are updated consistently to use the chosen
approach.
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AbstractLLMRequest.java (1)

53-66: ⚠️ Potential issue | 🟠 Major

Parse Authorization as a strict Bearer scheme.

indexOf("Bearer") still accepts malformed headers where Bearer appears later in the value, so an unintended suffix can be treated as the API key. Require the scheme at the start of the header and only then strip the token.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AbstractLLMRequest.java`
around lines 53 - 66, In getApiKey(), don't use indexOf(BEARER_PREFIX) — require
the Authorization header to start with the Bearer scheme and perform a
case-insensitive check before extracting the token; specifically, verify the
header (exchange.getRequest().getHeader().getAuthorization()) startsWith the
BEARER_PREFIX in a case-insensitive way (or use regionMatches(true,...,
BEARER_PREFIX.length())), then substring from BEARER_PREFIX.length(), trim and
return null if empty; if the header does not start with the scheme return null.
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/JDBCAiApiUsageStore.java (1)

16-20: ⚠️ Potential issue | 🟠 Major

The DDL is still PostgreSQL-only.

The TODO is accurate, but BIGINT GENERATED ALWAYS AS IDENTITY will still fail when this store is pointed at MySQL/MariaDB. If this class is meant to stay JDBC-generic, generate database-specific DDL or switch to a portable key strategy.

Does MySQL or MariaDB support `BIGINT GENERATED ALWAYS AS IDENTITY` in `CREATE TABLE`?
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/JDBCAiApiUsageStore.java`
around lines 16 - 20, The current CREATE_TABLE_SQL constant in
JDBCAiApiUsageStore uses PostgreSQL-only syntax (BIGINT GENERATED ALWAYS AS
IDENTITY); replace the single static DDL with a DB-aware creation path: detect
the database product via the JDBC Connection/DatabaseMetaData (in the method
that initializes the store), and pick DDL per product—e.g. for PostgreSQL keep
"BIGINT GENERATED ALWAYS AS IDENTITY", for MySQL/MariaDB use "BIGINT
AUTO_INCREMENT PRIMARY KEY", and for unknown DBs either create a plain "BIGINT
PRIMARY KEY" plus a sequence or switch to a portable key (UUID VARCHAR(36))
strategy; update any code that references CREATE_TABLE_SQL to use this chosen
SQL string at runtime.
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AbstractLLMResponse.java (1)

61-71: ⚠️ Potential issue | 🟠 Major

Keep json non-null before postProcessor runs.

processTerminalEvent() is now responsible for terminal handling, but the base implementation is empty and postProcessor still runs unconditionally afterwards. On the streaming path that leaves isError() able to dereference a null json.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AbstractLLMResponse.java`
around lines 61 - 71, The postProcessor is invoked unconditionally after
terminal handling which allows isError() to dereference a null json on streaming
paths; modify the terminal.ifPresent block so that either
processTerminalEvent(SSEParser.SSEEvent) guarantees json is set (or initializes
json to an empty/object node) before returning, or only call
postProcessor.accept(AbstractLLMResponse.this) after confirming json != null
(i.e. check json != null || initialize it) — update the logic around
terminal.ifPresent, processTerminalEvent, postProcessor, and isError to ensure
json is non-null before postProcessor runs.
🧹 Nitpick comments (2)
core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java (1)

120-122: ⚡ Quick win

Prefer DEBUG level for expected non-error conditions.

Lines 120 and 122 log at INFO level when the body is not a JSON object or fails to parse. Since getJsonObject returns Optional.empty() for these cases (indicating they are expected, handled conditions rather than errors), and the String overload (line 102-104) uses DEBUG level for consistency, these should also use DEBUG level to reduce log noise in production.

Proposed change
-            log.info("Expected JSON Object but got: {}",node.getNodeType());
+            log.debug("Expected JSON Object but got: {}", node.getNodeType());
         } catch (Exception e) {
-            log.info("Error reading JSON: {}", e.getMessage());
+            log.debug("Error reading JSON: {}", e.getMessage());
         }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java` around
lines 120 - 122, Change the two INFO logs in JsonUtil to DEBUG because non-JSON
bodies and parse failures are expected and handled; specifically in the JsonUtil
class's getJsonObject(...) flow replace the log.info("Expected JSON Object but
got: {}", node.getNodeType()) and the catch block log.info("Error reading JSON:
{}", e.getMessage()) with debug-level logging (log.debug(...)) to match the
String overload behavior and reduce noise.
core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java (1)

147-151: 💤 Low value

Logging guard condition mismatch.

The condition checks log.isInfoEnabled() but then logs at DEBUG level. The guard should match the actual log level.

♻️ Proposed fix
-            if (log.isInfoEnabled() && user != null) {
+            if (log.isDebugEnabled() && user != null) {
                 log.debug("Token usage of user {}: {}", user, res.getUsage());
             } else {
                 log.info("Token usage: {}", res.getUsage());
             }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java`
around lines 147 - 151, Logging guard mismatch: the code in
LLMGatewayInterceptor checks log.isInfoEnabled() but calls log.debug(...) when
user != null; change the guard to log.isDebugEnabled() so the debug call is
properly protected (i.e., replace log.isInfoEnabled() with
log.isDebugEnabled()), keeping the else branch that calls log.info(...) with
res.getUsage() and preserving the user check and message text.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java`:
- Line 111: In LLMGatewayInterceptor fix the mismatched log: the debug message
currently says "max-tokens from client" but logs aiReq.getModel(); change the
logged value to the correct method (aiReq.getRequestedMaxOutputTokens()) if the
intent is to log max tokens, or change the message to reference the model (e.g.,
"model from client") if the intent is to log aiReq.getModel(); update the
log.debug call accordingly so message and argument match.
- Around line 226-230: The Javadoc block in LLMGatewayInterceptor contains a
misspelled tag "`@desciptions`" for the "models" parameter; update the Javadoc to
use the correct "`@description`" tag (the comment above the models param inside
class LLMGatewayInterceptor) so documentation generators will pick up the
description. Ensure only the tag spelling is changed and the rest of the comment
content remains unchanged.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleLLMRequest.java`:
- Around line 53-58: The getter getRequestedMaxOutputTokens in GoogleLLMRequest
reads generationConfig.maxOutputTokens but the writer uses
generationConfig.max_output_tokens; update the reader to read the same
snake_case field (generationConfig.max_output_tokens) or normalize both by
checking both keys (max_output_tokens then maxOutputTokens) so the requested max
output tokens are correctly read; modify
GoogleLLMRequest.getRequestedMaxOutputTokens accordingly to match the writer's
field name or to support both variants.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAiLLMChatCompletionsRequest.java`:
- Around line 46-48: The read/write mismatch uses "max_completion_tokens" when
writing and "max_tokens" when reading; update the getter in
OpenAiLLMChatCompletionsRequest (getRequestedMaxOutputTokens) to read the same
field used when writing (use "max_completion_tokens" instead of "max_tokens"),
ensuring both serialization and deserialization reference the identical token
key so requested max output tokens are consistently detected.
- Around line 30-43: The getTools() method currently maps non-function entries
to null and can NPE on missing "function" or "name" nodes; modify getTools() to
first guard tools == null (keep returning emptyList()), then stream
tools.valueStream() and filter only nodes where "type" is missing or equals
"function", and also verify node.has("function") &&
node.get("function").has("name") before extracting the name; skip any entries
that fail these checks so the resulting List<String> contains only valid tool
names and never nulls (refer to getToolsNode() and getTools()).

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAiLLMResponsesRequest.java`:
- Around line 15-27: In getTools(), filter out any non-function or malformed
tool nodes and avoid nulls/NPEs by first checking getToolsNode() for null, then
for each node require has("type") &&
"function".equals(node.get("type").asText()) && node.has("name") before
extracting node.get("name").asText(); skip nodes that fail these checks so the
returned List<String> contains only valid tool names; refer to getTools() and
getToolsNode() when making the changes.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAIProvider.java`:
- Around line 18-34: The provider currently uses a shared mutable field
isResponsesAPI set in getLLMRequest and later read in getLLMResponse, which is
not thread-safe; remove the class field and instead compute the request type
locally in each method by calling isResponsesApi(exchange) inside getLLMRequest
and again inside getLLMResponse, returning OpenAiLLMResponsesRequest or
OpenAiLLMChatCompletionsRequest from getLLMRequest and creating an
OpenAiLLMResponsesResponse or the chat response in getLLMResponse based on the
local boolean; ensure no shared state remains (delete isResponsesAPI) so
concurrent exchanges each determine their adapter from isResponsesApi(exchange).

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java`:
- Around line 63-70: The Javadoc for setMaxTokens in AiApiLimit claims "0 (no
limit)" but checkLimit() treats maxTokens as a numeric counter (so 0 disallows
all requests); update checkLimit() to treat maxTokens <= 0 as unlimited by
early-returning/ignoring token checks when this.maxTokens <= 0, or alternatively
change the default comment to reflect that 0 currently forbids requests; modify
the logic in the checkLimit() method (and any token-subtraction paths) to skip
decrementing/limit-checks when maxTokens <= 0 so the documented "no limit"
behavior is honored.

---

Outside diff comments:
In `@distribution/tutorials/ai/llm-gateway/membrane.cmd`:
- Around line 1-25: The batch script (labels :search_up, :found, :notfound and
the call to "%MEMBRANE_HOME%\scripts\run-membrane.cmd") must use Windows CRLF
endings because cmd.exe misparses GOTO/CALL labels with LF-only files; convert
this file's line endings from LF to CRLF (or add a .gitattributes entry like
"*.cmd text eol=crlf" to enforce CRLF) and re-save membrane.cmd so all lines end
with CRLF to ensure the GOTO and CALL label resolution works reliably.

---

Duplicate comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AbstractLLMRequest.java`:
- Around line 53-66: In getApiKey(), don't use indexOf(BEARER_PREFIX) — require
the Authorization header to start with the Bearer scheme and perform a
case-insensitive check before extracting the token; specifically, verify the
header (exchange.getRequest().getHeader().getAuthorization()) startsWith the
BEARER_PREFIX in a case-insensitive way (or use regionMatches(true,...,
BEARER_PREFIX.length())), then substring from BEARER_PREFIX.length(), trim and
return null if empty; if the header does not start with the scheme return null.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AbstractLLMResponse.java`:
- Around line 61-71: The postProcessor is invoked unconditionally after terminal
handling which allows isError() to dereference a null json on streaming paths;
modify the terminal.ifPresent block so that either
processTerminalEvent(SSEParser.SSEEvent) guarantees json is set (or initializes
json to an empty/object node) before returning, or only call
postProcessor.accept(AbstractLLMResponse.this) after confirming json != null
(i.e. check json != null || initialize it) — update the logic around
terminal.ifPresent, processTerminalEvent, postProcessor, and isError to ensure
json is non-null before postProcessor runs.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/AbstractOpenAiLLMRequest.java`:
- Around line 14-25: The method estimateInputTokens() currently dereferences the
field json directly; add a null/missing-node guard at the start of
AbstractOpenAiLLMRequest. Specifically, in estimateInputTokens() check if json
is null or json.isMissingNode()/json.isNull() and return 0 (or the same graceful
invalid-request sentinel used elsewhere) before calling countText(...) or
countJsonSize(...); this prevents NPEs during parsing failures and keeps
subsequent calls to countText, estimateChatCompletitions, countJsonSize safe.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/JDBCAiApiUsageStore.java`:
- Around line 16-20: The current CREATE_TABLE_SQL constant in
JDBCAiApiUsageStore uses PostgreSQL-only syntax (BIGINT GENERATED ALWAYS AS
IDENTITY); replace the single static DDL with a DB-aware creation path: detect
the database product via the JDBC Connection/DatabaseMetaData (in the method
that initializes the store), and pick DDL per product—e.g. for PostgreSQL keep
"BIGINT GENERATED ALWAYS AS IDENTITY", for MySQL/MariaDB use "BIGINT
AUTO_INCREMENT PRIMARY KEY", and for unknown DBs either create a plain "BIGINT
PRIMARY KEY" plus a sequence or switch to a portable key (UUID VARCHAR(36))
strategy; update any code that references CREATE_TABLE_SQL to use this chosen
SQL string at runtime.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`:
- Around line 27-30: The store method in SimpleAiApiStore currently logs
per-user identifiers and usage at INFO when logUsage is true; change this to
avoid sensitive data exposure by (1) defaulting the logUsage field to false and
(2) if logging is needed, use DEBUG level and remove user identifiers — e.g.
replace the log.info call in SimpleAiApiStore.store with a log.debug message
that does not include user.getName() or raw usage content (you may log only
aggregated, non-identifying metrics such as usage.totalTokens()), and ensure
limit.addTokens(usage.totalTokens()) remains unchanged.
- Around line 63-70: The doc claims per-user limits but the field `limit` in
SimpleAiApiStore and methods `store(...)` and `checkLimit(...)` use a single
shared instance; fix by switching to per-caller limits (e.g., change the `limit`
field to a Map keyed by caller id and update `setLimit(AiApiLimit)` or add
`setLimit(String caller, AiApiLimit)` and make `store(...)` and
`checkLimit(...)` look up the caller's AiApiLimit) or, if global behavior is
intended, update the documentation and parameter names to make it clear this is
a shared/global limit; ensure all references in SimpleAiApiStore (field `limit`,
`setLimit`, `store`, `checkLimit`) are updated consistently to use the chosen
approach.

In `@core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java`:
- Around line 96-107: The getJsonObject method can get a null node from
om.readTree(s); add a null check right after var node = om.readTree(s) in
JsonUtil.getJsonObject to avoid calling node.getNodeType() on null: if node is
null log a debug message like "Empty JSON input" (or similar) and return
Optional.empty(); otherwise proceed with the existing instanceof ObjectNode
check and the existing debug log for non-object node types.
- Around line 114-125: Rename the helper method getJsonObjectFromSteam to
getJsonObjectFromStream and update all callers to use the new name; inside the
method, after var node = om.readTree(obj), check if node == null and log a clear
message (e.g. "Empty JSON input") instead of calling node.getNodeType(), then
only test instanceof ObjectNode (ObjectNode on) and return Optional.of(on) or
Optional.empty() as before, and keep exception handling but consider logging the
exception message.

---

Nitpick comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java`:
- Around line 147-151: Logging guard mismatch: the code in LLMGatewayInterceptor
checks log.isInfoEnabled() but calls log.debug(...) when user != null; change
the guard to log.isDebugEnabled() so the debug call is properly protected (i.e.,
replace log.isInfoEnabled() with log.isDebugEnabled()), keeping the else branch
that calls log.info(...) with res.getUsage() and preserving the user check and
message text.

In `@core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java`:
- Around line 120-122: Change the two INFO logs in JsonUtil to DEBUG because
non-JSON bodies and parse failures are expected and handled; specifically in the
JsonUtil class's getJsonObject(...) flow replace the log.info("Expected JSON
Object but got: {}", node.getNodeType()) and the catch block log.info("Error
reading JSON: {}", e.getMessage()) with debug-level logging (log.debug(...)) to
match the String overload behavior and reduce noise.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b6ebb35a-67df-4b65-bb9e-4708b5c8d540

📥 Commits

Reviewing files that changed from the base of the PR and between 8ea5320 and 97e597c.

📒 Files selected for processing (28)
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/ResponsesApiEvent.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AbstractLLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AbstractLLMResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/LLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeLLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeLLMResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleLLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/AbstractOpenAiLLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAIProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAiChatCompletionsLLMResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAiLLMChatCompletionsRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAiLLMResponsesRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAiLLMResponsesResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiUser.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/JDBCAiApiUsageStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/NoAiApiLimit.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java
  • distribution/tutorials/ai/llm-gateway/10-Basic-LLM-Gateway.yaml
  • distribution/tutorials/ai/llm-gateway/max-input.json
  • distribution/tutorials/ai/llm-gateway/max-output.json
  • distribution/tutorials/ai/llm-gateway/membrane.cmd
  • distribution/tutorials/ai/llm-gateway/membrane.sh
  • distribution/tutorials/ai/llm-gateway/simple.json
✅ Files skipped from review due to trivial changes (4)
  • distribution/tutorials/ai/llm-gateway/max-output.json
  • distribution/tutorials/ai/llm-gateway/simple.json
  • distribution/tutorials/ai/llm-gateway/max-input.json
  • distribution/tutorials/ai/llm-gateway/10-Basic-LLM-Gateway.yaml

Comment thread core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java Outdated
predic8 added 3 commits May 20, 2026 13:21
…acking

- Removed redundant `isResponsesAPI` variable in `OpenAIProvider`.
- Optimized tool extraction in `OpenAiLLMResponsesRequest` and `OpenAiLLMChatCompletionsRequest`.
- Updated `AiApiLimit` to support unlimited tokens with `MAX_VALUE`.
- Replaced `token` with `apiKey` in `AiApiUser` along with added `tokens` field.
- Improved JSON parsing logic in `JsonUtil` with better exception handling and logging.
- Adjusted output token parameter naming in multiple request classes for consistency.
…r handling

- Deleted `AiApiLimit` and `NoAiApiLimit` classes, consolidating token management into `SimpleAiApiStore`.
- Introduced `LLMErrorCreator` and its implementations (`OpenAiErrorCreator`, etc.) for reusable error generation.
- Refactored `LLMGatewayInterceptor` to utilize provider-specific error creators, simplifying token and model validation.
- Enhanced `SimpleAiApiStore` with token reset functionality and user-specific token tracking.
- Updated tutorials and examples to align with this refactored approach.
- Added `resetTokensUsedInPeriod` for user-specific token reset in `SimpleAiApiStore`.
- Improved inline documentation for methods across `AiApiUser`, `AiApiStore`, and `LLMGatewayInterceptor`.
- Updated parameter descriptions for clarity and consistency.
@predic8
Copy link
Copy Markdown
Member Author

predic8 commented May 20, 2026

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 20, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java (2)

154-156: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

store.store(user, ...) may receive null user for non-POST requests.

For non-POST requests, handleRequest returns at line 77 without setting MEMBRANE_AI_USER on the exchange. When handleResponse executes, user will be null (line 148). If store != null, store.store(null, ...) is called.

SimpleAiApiStore.store() calls user.getName() (line 40) and user.addTokensUsedInPeriod() (line 41), which will NPE on null user.

Consider guarding:

         if (store != null) {
-            store.store(user, res.getUsage());
+            if (user != null) {
+                store.store(user, res.getUsage());
+            }
         }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java`
around lines 154 - 156, In LLMGatewayInterceptor, handleResponse calls
store.store(user, ...) but user can be null for non-POST requests because
MEMBRANE_AI_USER is only set in handleRequest for POST; update handleResponse to
check for null before calling store.store (e.g., only call store.store(...) when
user != null) or provide a safe default user; reference the symbols
LLMGatewayInterceptor.handleResponse, MEMBRANE_AI_USER, store.store(...) and
SimpleAiApiStore.store (which calls user.getName() and
user.addTokensUsedInPeriod()) to locate the code to guard against a null user
and avoid the NPE.

149-153: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Inverted log-level logic appears to suppress detailed user logs.

When log.isInfoEnabled() && user != null, the code logs at DEBUG level (line 150). When user is null or info logging is disabled, it logs at INFO level (line 152). This seems inverted — typically you'd want to log user-specific details at DEBUG and generic messages at INFO.

Was the intent to log user details only when DEBUG is enabled?

-            if (log.isInfoEnabled() && user != null) {
-                log.debug("Token usage of user {}: {}", user, res.getUsage());
+            if (log.isDebugEnabled() && user != null) {
+                log.debug("Token usage for user {}: {}", user, res.getUsage());
             } else {
                 log.info("Token usage: {}", res.getUsage());
             }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java`
around lines 149 - 153, The log-level branch in LLMGatewayInterceptor (around
the token-usage logging) is inverted: it checks log.isInfoEnabled() && user !=
null but then calls log.debug(...) for user-specific details and log.info(...)
for the generic case; change the condition to check log.isDebugEnabled() && user
!= null so detailed per-user information is only logged when DEBUG is enabled,
and keep the generic usage message at INFO when DEBUG is not enabled or user is
null (update the condition near the token usage logging where res.getUsage() is
printed).
♻️ Duplicate comments (1)
core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java (1)

105-111: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Handle empty-input readTree() results explicitly instead of relying on caught NPEs.

At Line 109 and Line 133, node.getNodeType() is called without a null check. For empty input, Jackson readTree(...) can return null, so this currently falls into catch (Exception) via NPE. It still returns Optional.empty(), but through exception-driven flow and weaker diagnostics.

💡 Suggested minimal fix
@@
-        try {
-            var node = om.readTree(jsonString);
+        try {
+            var node = om.readTree(jsonString);
+            if (node == null) {
+                log.debug("Expected JSON Object but got: empty body");
+                return empty();
+            }
             if (node instanceof ObjectNode on) {
                 return Optional.of(on);
             }
-            log.debug("Expected JSON Object but got: {}",node.getNodeType());
+            log.debug("Expected JSON Object but got: {}", node.getNodeType());
         } catch (Exception e) {
-            log.debug("Error reading JSON: {}", e.getMessage());
+            log.debug("Error reading JSON", e);
         }
@@
-        try {
-            var node = om.readTree(obj);
+        try {
+            var node = om.readTree(obj);
+            if (node == null) {
+                log.debug("Expected JSON Object but got: empty body");
+                return empty();
+            }
             if (node instanceof ObjectNode on) {
                 return Optional.of(on);
             }
-            log.debug("Expected JSON Object but got: {}",node.getNodeType());
+            log.debug("Expected JSON Object but got: {}", node.getNodeType());
         } catch (Exception e) {
-            log.debug("Error reading JSON: {}", e.getMessage());
+            log.debug("Error reading JSON", e);
         }

Also applies to: 129-135

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java` around
lines 105 - 111, In JsonUtil.java, avoid NPEs from om.readTree(...) returning
null by adding null checks before calling node.getNodeType(): after var node =
om.readTree(jsonString) check if (node == null) and log a clear debug message
like "Empty JSON input" (or include jsonString info) and return
Optional.empty(); do the same for the second occurrence around lines 129-135
where node.getNodeType() is invoked; leave the Optional.empty() return behavior
unchanged but replace relying on caught NPEs with explicit null handling and
improved logs.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeProvider.java`:
- Around line 29-32: getErrorCreator() currently returns null which can break
error-path handling; change ClaudeProvider.getErrorCreator() to return a
non-null LLMErrorCreator implementation (e.g., a concrete ClaudeLLMErrorCreator
class or an anonymous/inner LLMErrorCreator instance) that maps
provider-specific errors to the gateway LLM error model and implements required
methods used by callers (so callers no longer receive null). Ensure the returned
creator is instantiated in ClaudeProvider and handles common Claude error shapes
to be null-safe.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleProvider.java`:
- Around line 29-32: The getErrorCreator() method in GoogleProvider currently
returns null; replace it with a concrete LLMErrorCreator instance (e.g., return
new DefaultLLMErrorCreator() or implement and return a
GoogleProviderErrorCreator) so provider error generation never receives null;
implement the concrete class if it doesn't exist, ensure it implements
LLMErrorCreator, and return that instance from GoogleProvider.getErrorCreator().

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiUser.java`:
- Around line 35-37: In AiApiUser.checkLimit(long tokensNeededForRequest) the
subtraction treats tokens==0 as a finite allowance and can yield negative
remaining; change the logic to treat tokens==0 as "no limit" — if this.tokens ==
0 return a large positive allowance (or explicitly return Long.MAX_VALUE or a
sentinel positive value) so tokensUsedInPeriod.get() and tokensNeededForRequest
do not cause denial; update checkLimit to first check this.tokens == 0 and only
perform the subtraction when tokens > 0, keeping the method name and behavior
otherwise.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`:
- Around line 54-61: The code iterates the users list inside synchronized(lock)
but setUsers and getUser mutate/read users without synchronization; synchronize
setUsers (and any user-accessing methods like getUser) on the same lock or
switch users to a thread-safe collection; specifically, wrap the body of
setUsers(...) and getUser(...) in synchronized(lock) (or alternatively change
the users field to a CopyOnWriteArrayList and update usages) so that the
iteration in the block that calls
users.forEach(AiApiUser::resetTokensUsedInPeriod) cannot race with concurrent
replacements of the users reference.

In `@distribution/tutorials/ai/llm-gateway/20-Sharing-API-Keys.yaml`:
- Around line 17-19: The tutorial references a missing helper file
"wrong-model.json" in the "3. Wrong Model" step which will cause a
file-not-found error; either add a new file named wrong-model.json to the
tutorial directory containing an invalid model payload (e.g., model:
"invalid-model-name", input: "Hello, AI!"), or update the step text (# 3. Wrong
Model) to instruct users to create the request inline (show the JSON to paste)
or remove the step entirely—modify whichever you choose and ensure the curl
sample still points to "wrong-model.json" or to an inline -d payload
accordingly.
- Around line 38-41: Update the model identifiers listed under the models array:
replace "gpt-5-nano" with "gpt-5.4-nano" and "gpt-5-mini" with "gpt-5.4-mini" so
the models array contains "gpt-5.4", "gpt-5.4-nano", and "gpt-5.4-mini" (modify
the models block shown in the diff).

---

Outside diff comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java`:
- Around line 154-156: In LLMGatewayInterceptor, handleResponse calls
store.store(user, ...) but user can be null for non-POST requests because
MEMBRANE_AI_USER is only set in handleRequest for POST; update handleResponse to
check for null before calling store.store (e.g., only call store.store(...) when
user != null) or provide a safe default user; reference the symbols
LLMGatewayInterceptor.handleResponse, MEMBRANE_AI_USER, store.store(...) and
SimpleAiApiStore.store (which calls user.getName() and
user.addTokensUsedInPeriod()) to locate the code to guard against a null user
and avoid the NPE.
- Around line 149-153: The log-level branch in LLMGatewayInterceptor (around the
token-usage logging) is inverted: it checks log.isInfoEnabled() && user != null
but then calls log.debug(...) for user-specific details and log.info(...) for
the generic case; change the condition to check log.isDebugEnabled() && user !=
null so detailed per-user information is only logged when DEBUG is enabled, and
keep the generic usage message at INFO when DEBUG is not enabled or user is null
(update the condition near the token usage logging where res.getUsage() is
printed).

---

Duplicate comments:
In `@core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java`:
- Around line 105-111: In JsonUtil.java, avoid NPEs from om.readTree(...)
returning null by adding null checks before calling node.getNodeType(): after
var node = om.readTree(jsonString) check if (node == null) and log a clear debug
message like "Empty JSON input" (or include jsonString info) and return
Optional.empty(); do the same for the second occurrence around lines 129-135
where node.getNodeType() is invoked; leave the Optional.empty() return behavior
unchanged but replace relying on caught NPEs with explicit null handling and
improved logs.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 0f476e7b-3ad5-4b76-b421-3c8fabe19123

📥 Commits

Reviewing files that changed from the base of the PR and between 97e597c and 0af0f35.

📒 Files selected for processing (18)
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AbstractLLMErrorCreator.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/LLMErrorCreator.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/LLMProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/LLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleLLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAIProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAiErrorCreator.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAiLLMChatCompletionsRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAiLLMResponsesRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiUser.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/JDBCAiApiUsageStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java
  • distribution/tutorials/ai/llm-gateway/20-Sharing-API-Keys.yaml

- Added synchronized blocks to `SimpleAiApiStore` for thread-safe access to user data.
- Introduced `invalidRequestError` to `LLMErrorCreator` and implemented it in `OpenAiErrorCreator`.
- Allowed unlimited tokens for users with `MAX_VALUE` in `AiApiUser`.
- Simplified logic in `LLMGatewayInterceptor` for token and model validation.
- Updated tutorial JSON with test input for validation.
@predic8
Copy link
Copy Markdown
Member Author

predic8 commented May 20, 2026

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 20, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiUser.java (1)

75-78: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Documentation mismatch: Javadoc says default is 0 but code defaults to MAX_VALUE.

The @default 0 (no limit) comment no longer matches the actual default of Long.MAX_VALUE set at line 16.

📝 Proposed fix
     /**
      * `@description` Number of tokens that the user has available within the current period.
-     * `@default` 0 (no limit)
+     * `@default` Long.MAX_VALUE (no limit)
      * `@param` tokens available tokens
      */
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiUser.java`
around lines 75 - 78, The Javadoc for AiApiUser.tokens incorrectly states the
default is 0 while the code initializes the tokens field to Long.MAX_VALUE;
update the Javadoc to reflect the actual default (e.g., "`@default` Long.MAX_VALUE
(no limit)") so docs match the implementation of the tokens field in class
AiApiUser, or alternatively change the tokens field default to 0 if the intended
behavior was different—adjust either the comment or the field initialization for
consistency.
🧹 Nitpick comments (1)
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java (1)

88-90: 💤 Low value

getUsers() reads @GuardedBy("lock") field without synchronization.

For consistency with the other accessors/mutators, this getter should also synchronize on lock or return a defensive copy.

♻️ Proposed fix
     public List<AiApiUser> getUsers() {
-        return users;
+        synchronized (lock) {
+            return users;
+        }
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`
around lines 88 - 90, getUsers() currently returns the `@GuardedBy`("lock") field
users without synchronization; update getUsers() in SimpleAiApiStore to either
synchronize on lock (e.g., synchronized(lock) { return users; }) or, preferably,
return a defensive copy while holding the lock (e.g., synchronized(lock) {
return new ArrayList<>(users); }) so access is consistent with other
synchronized accessors/mutators that use lock.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiUser.java`:
- Around line 75-78: The Javadoc for AiApiUser.tokens incorrectly states the
default is 0 while the code initializes the tokens field to Long.MAX_VALUE;
update the Javadoc to reflect the actual default (e.g., "`@default` Long.MAX_VALUE
(no limit)") so docs match the implementation of the tokens field in class
AiApiUser, or alternatively change the tokens field default to 0 if the intended
behavior was different—adjust either the comment or the field initialization for
consistency.

---

Nitpick comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`:
- Around line 88-90: getUsers() currently returns the `@GuardedBy`("lock") field
users without synchronization; update getUsers() in SimpleAiApiStore to either
synchronize on lock (e.g., synchronized(lock) { return users; }) or, preferably,
return a defensive copy while holding the lock (e.g., synchronized(lock) {
return new ArrayList<>(users); }) so access is consistent with other
synchronized accessors/mutators that use lock.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f9675200-8f29-4e10-b89c-3241ee03df4b

📥 Commits

Reviewing files that changed from the base of the PR and between 0af0f35 and 329bb80.

📒 Files selected for processing (6)
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/LLMErrorCreator.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAiErrorCreator.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiUser.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java
  • distribution/tutorials/ai/llm-gateway/wrong-model.json

predic8 added 4 commits May 20, 2026 17:56
- Added Claude-specific error handling with `ClaudeErrorCreator` and `ClaudeErrorResponse`.
- Introduced `10-Basic-LLM-Gateway.yaml` tutorial for Claude integration.
- Enhanced token usage tracking in `ClaudeLLMResponse`.
- Updated examples and tutorials to support both OpenAI and Claude.
…ring and token limit examples

- Added `10-Basic-LLM-Gateway.yaml` and `20-Sharing-API-Keys.yaml` tutorials for Google Gemini.
- Enhanced Claude tutorials with improved key handling and token limit examples.
- Introduced `GoogleErrorCreator` for detailed error handling in Google LLM Gateway.
- Updated `LLMGatewayInterceptor` and token tracking logic to reflect effective max token handling.
- Modified existing OpenAI and Claude examples for consistency and clarity.
@predic8
Copy link
Copy Markdown
Member Author

predic8 commented May 21, 2026

@coderabitai review

…tutorials

- Introduced `AbstractAiTutorialTest` base class and provider-specific extensions for easier test creation.
- Added integration tests for basic gateway setups and API key sharing for Claude, OpenAI, and Google Gemini.
- Simulated upstream mock APIs to enable testing token limits, key handling, and input/output transformations.
@predic8
Copy link
Copy Markdown
Member Author

predic8 commented May 21, 2026

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 21, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
distribution/tutorials/ai/llm-gateway/google/membrane.cmd (1)

1-25: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Convert to CRLF line endings.

Windows cmd.exe parses .cmd/.bat files in 512-byte chunks and LF-only line endings can cause label resolution (:search_up, :found, :notfound) to misbehave depending on file size and chunk boundaries. Ensure this file is committed with CRLF (e.g., add a .gitattributes entry like *.cmd text eol=crlf or run unix2dos).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@distribution/tutorials/ai/llm-gateway/google/membrane.cmd` around lines 1 -
25, This file uses LF-only endings which can break label parsing in cmd scripts;
convert distribution/tutorials/ai/llm-gateway/google/membrane.cmd to CRLF line
endings and ensure the repo enforces CRLF for .cmd files (e.g., add or update
.gitattributes with a rule like "*.cmd text eol=crlf" and re-commit the fixed
file) so labels such as :search_up, :found and :notfound resolve correctly when
executed; alternatively run a tool like unix2dos on the file and commit the
CRLF-converted version.
♻️ Duplicate comments (3)
core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java (1)

149-150: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard usage persistence when no authenticated user is present.

On Line 150, store.store(user, ...) is called even when user is null (e.g., request paths that never set MEMBRANE_AI_USER), which can break persistence or store invalid records.

Proposed fix
-            if (store != null) {
+            if (store != null && user != null) {
                 store.store(user, res.getUsage());
             }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java`
around lines 149 - 150, The persistence call may be invoked with a null
authenticated user: guard the store.store(user, res.getUsage()) call so it only
runs when both store != null and the user object (from MEMBRANE_AI_USER) is
non-null; update the condition around the call in LLMGatewayInterceptor (the
block containing store.store, user and res.getUsage()) to check user != null
before calling store.store to avoid persisting invalid records.
distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/google/SharingApiKeysGoogleTutorialTest.java (1)

180-218: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Same test-isolation concern as the Claude variant.

This alicesTokenBudgetIsExhaustedWhileBobIsUnaffected shares Alice's gateway-side token counter with the other Alice-tests in the same class. If JUnit 5 happens to execute another Alice-test first, Alice's running total will not start at 0 and the budget arithmetic in the Javadoc no longer holds. Please ensure the base class resets per-user usage between @Test methods (see the same comment on the Claude variant).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/google/SharingApiKeysGoogleTutorialTest.java`
around lines 180 - 218, The test
alicesTokenBudgetIsExhaustedWhileBobIsUnaffected relies on a fresh per-user
token counter but current test runs share Alice's gateway-side state across
methods; add a test teardown that resets per-user usage between `@Test` methods
(e.g., an `@AfterEach` in the common base test class used by
SharingApiKeysGoogleTutorialTest) so the gateway's user token counters are
reinitialized before each test, or call the existing reset method on the shared
token-store (e.g., TokenUsageStore.reset(), UsageTracker.clear(), or similar)
from that teardown to guarantee Alice's counter starts at 0 for every test.
distribution/tutorials/ai/llm-gateway/openai/membrane.cmd (1)

1-25: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Convert to CRLF line endings (duplicate of google/membrane.cmd).

Same issue as the Google variant: this .cmd file is committed with LF-only endings, which can cause cmd.exe's 512-byte chunked parser to mis-handle labels and goto/call targets. Switch to CRLF (e.g., via a .gitattributes rule *.cmd text eol=crlf).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@distribution/tutorials/ai/llm-gateway/openai/membrane.cmd` around lines 1 -
25, This .cmd script (labels :search_up, :found, :notfound and uses call/goto)
is committed with LF-only endings which can break cmd.exe parsing; convert the
file to CRLF line endings and add a repository rule to enforce this (e.g., add a
.gitattributes entry like "*.cmd text eol=crlf") or re-save the file with CRLF
so the labels and call "%MEMBRANE_HOME%\scripts\run-membrane.cmd" %* work
correctly.
🧹 Nitpick comments (3)
distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/AbstractAiTutorialTest.java (1)

46-47: ⚡ Quick win

Avoid fixed mock port to reduce CI/test flakiness.

Line 46 hard-codes 3100; concurrent test runs or occupied local ports can make this fail nondeterministically.

💡 Proposed fix
+import java.net.ServerSocket;
@@
-    protected static final int MOCK_LLM_PORT = 3100;
+    protected int mockLlmPort;
@@
-    protected String mockApiUrl() {
-        return "http://localhost:" + MOCK_LLM_PORT;
+    protected String mockApiUrl() {
+        return "http://localhost:" + mockLlmPort;
     }
@@
     private void startMockLlmApi() throws Exception {
+        mockLlmPort = findFreePort();
         var si = new StaticInterceptor();
@@
-        var sp = new ServiceProxy(new ServiceProxyKey(MOCK_LLM_PORT), null, 0);
+        var sp = new ServiceProxy(new ServiceProxyKey(mockLlmPort), null, 0);
@@
         mockRouter.start();
     }
+
+    private static int findFreePort() throws Exception {
+        try (ServerSocket socket = new ServerSocket(0)) {
+            return socket.getLocalPort();
+        }
+    }

Also applies to: 93-95, 111-112

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/AbstractAiTutorialTest.java`
around lines 46 - 47, The test currently hardcodes MOCK_LLM_PORT = 3100 in
AbstractAiTutorialTest which causes CI flakiness; change this to use a
dynamically allocated port: remove the fixed final constant, add an instance
field (e.g., mockLlmPort) and initialize it at test setup by binding a
ServerSocket to port 0 (or using an OS-assigned ephemeral port) to retrieve the
actual port, then update all references that used MOCK_LLM_PORT (including
usages around the earlier noted blocks) to use the new mockLlmPort field (or a
getter like getMockLlmPort()) so each test run uses a free, non-conflicting
port.
distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/openai/SharingApiKeysOpenAiTutorialTest.java (1)

112-113: 💤 Low value

Tutorial YAML uses a literal placeholder as the upstream API key.

This assertion confirms the tutorial's 20-Sharing-API-Keys.yaml ships with <<Replace with your API_KEY>> as the actual upstream apiKey value. If a user copies the tutorial to a real deployment without substituting the placeholder, the gateway will forward Bearer <<Replace with your API_KEY>> to OpenAI and get an opaque 401. Consider asserting that the tutorial YAML includes a prominent "REPLACE BEFORE USE" comment, or fail-fast in the gateway when the configured apiKey still contains <<…>> markers. Not a blocker for the tests themselves.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/openai/SharingApiKeysOpenAiTutorialTest.java`
around lines 112 - 113, The tutorial YAML ships with a literal placeholder API
key (`<<Replace with your API_KEY>>`) which can be accidentally forwarded; add a
fail-fast check where the gateway loads the upstream apiKey (the code path that
the test SharingApiKeysOpenAiTutorialTest exercises / whatever class constructs
the OpenAI upstream config) to detect placeholder markers like "<<" or ">>" or
the substring "Replace with your API_KEY" and throw a clear
IllegalStateException (or log an error and exit) explaining the developer must
replace the placeholder in 20-Sharing-API-Keys.yaml before use; update the
configuration loader/constructor that sets the apiKey so it validates and
rejects placeholder values at startup.
distribution/tutorials/ai/llm-gateway/google/20-Sharing-API-Keys.yaml (1)

13-19: 💤 Low value

Consider illustrating bob's key in the example curls.

Scenario #2 mentions that "Bob should still be able to send requests", but every example curl uses alice's key abc123. Adding one example with bob's key (qwertz) would make the multi-user demo concretely runnable and avoid confusion.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@distribution/tutorials/ai/llm-gateway/google/20-Sharing-API-Keys.yaml` around
lines 13 - 19, The examples all use Alice's API key (abc123) but the scenario
text references Bob remaining able to send requests; add or replace one curl
example to show Bob's key (qwertz) being used — e.g., a curl to
/v1beta/models/gpt-5:generateContent with header "x-goog-api-key: qwertz" near
the Token Limit Exceeded / Wrong Model examples so readers can run the
multi-user demo (references: alice, bob, abc123, qwertz, endpoint
/v1beta/models/gpt-5:generateContent).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java`:
- Around line 93-99: The current effectiveMaxTokens calculation using
Math.min(aiReq.getRequestedMaxOutputTokens(), maxOutputTokens) can produce 0 or
-1 when sentinel/unlimited values are present and thus corrupt limit accounting;
change the logic in LLMGatewayInterceptor to treat -1 as "unlimited" and compute
effectiveMaxTokens like: if aiReq.getRequestedMaxOutputTokens() == -1 use
maxOutputTokens; else if maxOutputTokens == -1 use
aiReq.getRequestedMaxOutputTokens(); otherwise use Math.min(requested,
maxOutputTokens); then ensure effectiveMaxTokens is non-negative (e.g., clamp to
0) before calling store.checkLimit(user, inputTokens, effectiveMaxTokens) and
when passing the required token count into
errorCreator.tokenLimitExceeded(inputTokens + effectiveMaxTokens, remaining,
store.getRemainingResetTime()) so the required-token value is never negative or
sentinel.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleErrorCreator.java`:
- Around line 24-33: The rate-limit error message in GoogleErrorCreator builds a
string using tokenRequired, tokenRemaining, and tokenResetInSeconds and
currently formats tokenRemaining directly; change the formatting to clamp
tokenRemaining to a non-negative value (e.g., use Math.max(0, tokenRemaining) or
an equivalent local variable like visibleTokenRemaining) before calling
envelope/json so the message never shows negative "only -5 remain"; update the
formatted argument used in the envelope call where tokenRemaining is passed.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`:
- Around line 89-91: getUsers() currently returns the mutable `@GuardedBy`("lock")
field users directly, breaking the synchronization contract; change
SimpleAiApiStore.getUsers() to avoid leaking the guarded list by making and
returning a defensive copy (or an unmodifiable view) while holding lock: inside
synchronized(lock) create a new List (e.g., new ArrayList<>(users)) or
Collections.unmodifiableList(new ArrayList<>(users)) and return that, ensuring
callers cannot mutate the original users field outside the lock.

In `@distribution/tutorials/ai/llm-gateway/google/10-Basic-LLM-Gateway.yaml`:
- Line 5: Update the comment that currently reads "# Replace <<API_KEY>> with
your OpenAI API key." to correctly reference Google/Gemini credentials for this
Google Gemini tutorial; locate the comment string "# Replace <<API_KEY>> with
your OpenAI API key." in the 10-Basic-LLM-Gateway.yaml file and replace it with
a clear phrase such as "# Replace <<API_KEY>> with your Google API key
(Gemini)". Ensure the new wording explicitly mentions Google or Gemini so it
matches the tutorial context.

In `@distribution/tutorials/ai/llm-gateway/google/20-Sharing-API-Keys.yaml`:
- Around line 25-27: The comment in the tutorial snippet contains a grammar
typo: replace the phrase "even it was requested" with "even though it was
requested" in the comment block (the curl example comment in
20-Sharing-API-Keys.yaml that mentions "totalTokens should not exceed 200 even
it was requested in max-output.json") so the sentence reads "...totalTokens
should not exceed 200 even though it was requested in max-output.json".

---

Outside diff comments:
In `@distribution/tutorials/ai/llm-gateway/google/membrane.cmd`:
- Around line 1-25: This file uses LF-only endings which can break label parsing
in cmd scripts; convert
distribution/tutorials/ai/llm-gateway/google/membrane.cmd to CRLF line endings
and ensure the repo enforces CRLF for .cmd files (e.g., add or update
.gitattributes with a rule like "*.cmd text eol=crlf" and re-commit the fixed
file) so labels such as :search_up, :found and :notfound resolve correctly when
executed; alternatively run a tool like unix2dos on the file and commit the
CRLF-converted version.

---

Duplicate comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java`:
- Around line 149-150: The persistence call may be invoked with a null
authenticated user: guard the store.store(user, res.getUsage()) call so it only
runs when both store != null and the user object (from MEMBRANE_AI_USER) is
non-null; update the condition around the call in LLMGatewayInterceptor (the
block containing store.store, user and res.getUsage()) to check user != null
before calling store.store to avoid persisting invalid records.

In
`@distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/google/SharingApiKeysGoogleTutorialTest.java`:
- Around line 180-218: The test alicesTokenBudgetIsExhaustedWhileBobIsUnaffected
relies on a fresh per-user token counter but current test runs share Alice's
gateway-side state across methods; add a test teardown that resets per-user
usage between `@Test` methods (e.g., an `@AfterEach` in the common base test class
used by SharingApiKeysGoogleTutorialTest) so the gateway's user token counters
are reinitialized before each test, or call the existing reset method on the
shared token-store (e.g., TokenUsageStore.reset(), UsageTracker.clear(), or
similar) from that teardown to guarantee Alice's counter starts at 0 for every
test.

In `@distribution/tutorials/ai/llm-gateway/openai/membrane.cmd`:
- Around line 1-25: This .cmd script (labels :search_up, :found, :notfound and
uses call/goto) is committed with LF-only endings which can break cmd.exe
parsing; convert the file to CRLF line endings and add a repository rule to
enforce this (e.g., add a .gitattributes entry like "*.cmd text eol=crlf") or
re-save the file with CRLF so the labels and call
"%MEMBRANE_HOME%\scripts\run-membrane.cmd" %* work correctly.

---

Nitpick comments:
In
`@distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/AbstractAiTutorialTest.java`:
- Around line 46-47: The test currently hardcodes MOCK_LLM_PORT = 3100 in
AbstractAiTutorialTest which causes CI flakiness; change this to use a
dynamically allocated port: remove the fixed final constant, add an instance
field (e.g., mockLlmPort) and initialize it at test setup by binding a
ServerSocket to port 0 (or using an OS-assigned ephemeral port) to retrieve the
actual port, then update all references that used MOCK_LLM_PORT (including
usages around the earlier noted blocks) to use the new mockLlmPort field (or a
getter like getMockLlmPort()) so each test run uses a free, non-conflicting
port.

In
`@distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/openai/SharingApiKeysOpenAiTutorialTest.java`:
- Around line 112-113: The tutorial YAML ships with a literal placeholder API
key (`<<Replace with your API_KEY>>`) which can be accidentally forwarded; add a
fail-fast check where the gateway loads the upstream apiKey (the code path that
the test SharingApiKeysOpenAiTutorialTest exercises / whatever class constructs
the OpenAI upstream config) to detect placeholder markers like "<<" or ">>" or
the substring "Replace with your API_KEY" and throw a clear
IllegalStateException (or log an error and exit) explaining the developer must
replace the placeholder in 20-Sharing-API-Keys.yaml before use; update the
configuration loader/constructor that sets the apiKey so it validates and
rejects placeholder values at startup.

In `@distribution/tutorials/ai/llm-gateway/google/20-Sharing-API-Keys.yaml`:
- Around line 13-19: The examples all use Alice's API key (abc123) but the
scenario text references Bob remaining able to send requests; add or replace one
curl example to show Bob's key (qwertz) being used — e.g., a curl to
/v1beta/models/gpt-5:generateContent with header "x-goog-api-key: qwertz" near
the Token Limit Exceeded / Wrong Model examples so readers can run the
multi-user demo (references: alice, bob, abc123, qwertz, endpoint
/v1beta/models/gpt-5:generateContent).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 80d72558-1bcf-4c21-a2dd-5a2266de02f2

📥 Commits

Reviewing files that changed from the base of the PR and between 329bb80 and 38acaa3.

📒 Files selected for processing (74)
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/LLMErrorCreator.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeErrorCreator.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeErrorResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeLLMResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleErrorCreator.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleLLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleLLMResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAiErrorCreator.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiUser.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/ExchangeToolSupport.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/ExchangeUtils.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/MCPUtil.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpPayloadSanitizer.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpSchemaBuilder.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpSessionContext.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpSessionManager.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpToolDefinition.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpToolHandler.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpToolRegistry.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/MembraneMCPServer.java
  • core/src/main/java/com/predic8/membrane/core/jsonrpc/JSONRPCRequest.java
  • core/src/main/java/com/predic8/membrane/core/jsonrpc/JSONRPCResponse.java
  • core/src/main/java/com/predic8/membrane/core/jsonrpc/JSONRPCUtil.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPInitialize.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPInitializeResponse.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPInitialized.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPNotification.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPPing.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPRequest.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPResponse.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPToolsCall.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPToolsCallResponse.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPToolsList.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPToolsListResponse.java
  • core/src/test/java/com/predic8/membrane/core/interceptor/mcp/MembraneMCPServerTest.java
  • core/src/test/java/com/predic8/membrane/core/jsonrpc/JSONRPCRequestTest.java
  • core/src/test/java/com/predic8/membrane/core/jsonrpc/JSONRPCResponseTest.java
  • core/src/test/java/com/predic8/membrane/core/mcp/MCPInitializeTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/AbstractAiTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/claude/BasicClaudeLLMGatewayTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/claude/SharingApiKeysTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/google/AbstractGoogleTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/google/BasicGoogleLLMGatewayTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/google/SharingApiKeysGoogleTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/openai/AbstractOpenAiTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/openai/BasicOpenAiLLMGatewayTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/openai/SharingApiKeysOpenAiTutorialTest.java
  • distribution/tutorials/ai/llm-gateway/claude/10-Basic-LLM-Gateway.yaml
  • distribution/tutorials/ai/llm-gateway/claude/20-Sharing-API-Keys.yaml
  • distribution/tutorials/ai/llm-gateway/claude/max-input.json
  • distribution/tutorials/ai/llm-gateway/claude/max-output.json
  • distribution/tutorials/ai/llm-gateway/claude/membrane.cmd
  • distribution/tutorials/ai/llm-gateway/claude/membrane.sh
  • distribution/tutorials/ai/llm-gateway/claude/simple.json
  • distribution/tutorials/ai/llm-gateway/claude/wrong-model.json
  • distribution/tutorials/ai/llm-gateway/google/10-Basic-LLM-Gateway.yaml
  • distribution/tutorials/ai/llm-gateway/google/20-Sharing-API-Keys.yaml
  • distribution/tutorials/ai/llm-gateway/google/max-input.json
  • distribution/tutorials/ai/llm-gateway/google/max-output.json
  • distribution/tutorials/ai/llm-gateway/google/membrane.cmd
  • distribution/tutorials/ai/llm-gateway/google/membrane.sh
  • distribution/tutorials/ai/llm-gateway/google/simple.json
  • distribution/tutorials/ai/llm-gateway/openai/10-Basic-LLM-Gateway.yaml
  • distribution/tutorials/ai/llm-gateway/openai/20-Sharing-API-Keys.yaml
  • distribution/tutorials/ai/llm-gateway/openai/max-input.json
  • distribution/tutorials/ai/llm-gateway/openai/max-output.json
  • distribution/tutorials/ai/llm-gateway/openai/membrane.cmd
  • distribution/tutorials/ai/llm-gateway/openai/membrane.sh
  • distribution/tutorials/ai/llm-gateway/openai/simple.json
  • distribution/tutorials/ai/llm-gateway/openai/wrong-model.json
💤 Files with no reviewable changes (5)
  • distribution/tutorials/ai/llm-gateway/openai/simple.json
  • distribution/tutorials/ai/llm-gateway/claude/membrane.cmd
  • distribution/tutorials/ai/llm-gateway/openai/wrong-model.json
  • distribution/tutorials/ai/llm-gateway/openai/max-input.json
  • distribution/tutorials/ai/llm-gateway/claude/membrane.sh
✅ Files skipped from review due to trivial changes (39)
  • distribution/tutorials/ai/llm-gateway/claude/max-output.json
  • distribution/tutorials/ai/llm-gateway/claude/wrong-model.json
  • distribution/tutorials/ai/llm-gateway/google/simple.json
  • distribution/tutorials/ai/llm-gateway/openai/max-output.json
  • distribution/tutorials/ai/llm-gateway/claude/max-input.json
  • distribution/tutorials/ai/llm-gateway/claude/20-Sharing-API-Keys.yaml
  • distribution/tutorials/ai/llm-gateway/google/max-output.json
  • distribution/tutorials/ai/llm-gateway/claude/simple.json
  • distribution/tutorials/ai/llm-gateway/openai/10-Basic-LLM-Gateway.yaml
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPToolsList.java
  • distribution/tutorials/ai/llm-gateway/claude/10-Basic-LLM-Gateway.yaml
  • core/src/test/java/com/predic8/membrane/core/jsonrpc/JSONRPCRequestTest.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPRequest.java
  • core/src/main/java/com/predic8/membrane/core/jsonrpc/JSONRPCResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpToolHandler.java
  • core/src/main/java/com/predic8/membrane/core/jsonrpc/JSONRPCUtil.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPInitializeResponse.java
  • core/src/test/java/com/predic8/membrane/core/mcp/MCPInitializeTest.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPToolsListResponse.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPToolsCall.java
  • core/src/test/java/com/predic8/membrane/core/interceptor/mcp/MembraneMCPServerTest.java
  • core/src/test/java/com/predic8/membrane/core/jsonrpc/JSONRPCResponseTest.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPResponse.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPToolsCallResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpToolRegistry.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpToolDefinition.java
  • core/src/main/java/com/predic8/membrane/core/jsonrpc/JSONRPCRequest.java
  • distribution/tutorials/ai/llm-gateway/openai/20-Sharing-API-Keys.yaml
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/ExchangeToolSupport.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/MembraneMCPServer.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpSchemaBuilder.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/ExchangeUtils.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpSessionManager.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpPayloadSanitizer.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpSessionContext.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/MCPUtil.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPInitialize.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPInitialized.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPNotification.java

Comment thread distribution/tutorials/ai/llm-gateway/google/10-Basic-LLM-Gateway.yaml Outdated
Comment thread distribution/tutorials/ai/llm-gateway/google/20-Sharing-API-Keys.yaml Outdated
…for LLM Gateway

- Ensure thread-safe access to users in `SimpleAiApiStore` with `List.copyOf`.
- Introduce `visibleRemaining` to handle non-negative token values in `GoogleErrorCreator`.
- Add configuration validation in `LLMGatewayInterceptor` to enforce API key substitution.
- Enhance token limit handling to adjust output tokens dynamically in `LLMGatewayInterceptor`.
- Update Google and Claude tutorials with clearer instructions for API key usage and token limits.
@predic8
Copy link
Copy Markdown
Member Author

predic8 commented May 21, 2026

/ok-to-test

predic8 and others added 6 commits May 22, 2026 08:01
- Introduced `StreamingOpenAiLLMGatewayTutorialTest` with SSE mocking and validation.
- Added JSON fixtures (`stream.json`, `max-output-stream.json`) for testing streaming requests.
- Enhanced base test framework to support `text/event-stream` responses.
- Updated `LLMGatewayInterceptor` to handle streaming scenarios with capped tokens.
- Replaced raw API key placeholders with `TEST_API_KEY` constant in tutorial tests to ensure consistency.
- Added `TEST_API_KEY` to `AbstractAiTutorialTest` for upstream key substitution verification.
- Updated `log4j2.xml` to limit logging to `com.predic8.membrane.core.interceptor.ai`.
- Introduced PostgreSQL dependency in `pom.xml` for future enhancements.
@predic8 predic8 requested a review from rrayst May 22, 2026 14:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants