
feat(dotAI): Dot AI LangChain4J integration (WIP - First provider)#35150

Open
ihoffmann-dot wants to merge 14 commits into main from dot-ai-langchain-integration

Conversation

@ihoffmann-dot
Member

Summary

Replaces the direct OpenAI HTTP client (OpenAIClient) with a LangChain4J abstraction layer, enabling multi-provider AI support in dotCMS. This PR covers Phase 1: OpenAI via LangChain4J.

Changes

  • LangChain4jAIClient.java: New AIProxiedClient implementation that delegates chat, embeddings, and image requests to LangChain4J models.
  • LangChain4jModelFactory.java: Factory that builds ChatModel, EmbeddingModel, and ImageModel instances from a ProviderConfig. Only place with provider-specific builder logic.
  • ProviderConfig.java: Deserializable POJO for the providerConfig JSON secret (per provider section: model, apiKey, endpoint, maxTokens, maxCompletionTokens, etc.).
  • AppConfig.java: Replaced legacy individual-field secrets (apiKey, model, etc.) with a single providerConfig JSON string. isEnabled() now only checks this field.
  • AIAppValidator.java: Removed the OpenAI /v1/models validation call, which is incompatible with the multi-provider architecture.
  • CompletionsResource.java: Updated /api/v1/ai/completions/config to derive model names and config values from AppConfig getters instead of iterating raw AppKeys.
  • dotAI.yml: Removed legacy hidden fields; added providerConfig as the single configuration entry point.
  • Tests: Added unit tests for ProviderConfig, LangChain4jModelFactory, and LangChain4jAIClient; updated AIProxyClientTest integration test to use providerConfig-based setup.

Motivation

The previous implementation was tightly coupled to OpenAI's API contract (hardcoded HTTP calls, OpenAI-specific parameters, model validation via /v1/models). LangChain4J provides a provider-agnostic model interface, allowing future phases to add Azure OpenAI, AWS Bedrock, and Vertex AI without touching the core request/response flow.

The providerConfig JSON secret replaces multiple individual secrets with a single structured configuration, supporting per-section (chat/embeddings/image) provider and model settings.
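
For illustration, a hypothetical providerConfig payload with per-section settings (section and field names are inferred from this PR's description; the real schema may differ):

```json
{
  "chat": {
    "provider": "openai",
    "model": "gpt-4o",
    "apiKey": "sk-placeholder",
    "endpoint": "https://api.openai.com/v1",
    "maxTokens": 4096
  },
  "embeddings": {
    "provider": "openai",
    "model": "text-embedding-3-small",
    "apiKey": "sk-placeholder"
  },
  "image": {
    "provider": "openai",
    "model": "dall-e-3",
    "apiKey": "sk-placeholder"
  }
}
```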

Related Issue

EPIC: dotAI Multi-Provider Support #33970

@github-actions

❌ Issue Linking Required

This PR could not be linked to an issue. All PRs must be linked to an issue for tracking purposes.

How to fix this:

Option 1: Add keyword to PR body (Recommended - auto-removes this comment)
Edit this PR description and add one of these lines:

  • This PR fixes #123 or Fixes: #123

  • This PR closes #123 or Closes: #123

  • This PR resolves #123 or Resolves: #123

  • Other supported keywords: fix, fixed, close, closed, resolve, resolved

Option 2: Link via GitHub UI (Note: won't clear the failed check)

  1. Go to the PR → Development section (right sidebar)

  2. Click "Link issue" and select an existing issue

  3. Push a new commit or re-run the workflow to clear the failed check

Option 3: Use branch naming
Create a new branch with one of these patterns:

  • 123-feature-description (number at start)

  • issue-123-feature-description (issue-number at start)

  • feature-issue-123 (issue-number anywhere)

Why is this required?

Issue linking ensures proper tracking and documentation and helps maintain project history. It connects your code changes to the problem they solve.

This comment was automatically generated by the issue linking workflow

@github-actions github-actions bot added the Area : Backend PR changes Java/Maven backend code label Mar 27, 2026
@claude
Contributor

claude bot commented Mar 27, 2026

Claude finished @ihoffmann-dot's task in 5m 5s


Code Review

  • Read key changed files
  • Review LangChain4jAIClient, LangChain4jModelFactory, ProviderConfig
  • Review AppConfig, AppKeys, dotAI.yml changes
  • Review CompletionsResource, AIAppValidator, AIModels changes
  • Review tests
  • Post findings

Findings

1. Security: API keys exposed in UI — dotAI.yml

providerConfig is declared hidden: false, so the entire JSON blob (including apiKey, and future secretAccessKey/accessKeyId for Bedrock) is displayed in plain text in the dotCMS apps UI. The old apiKey field was hidden: true. This is a regression; the field should be hidden: true.

2. Security: Incomplete credential redaction in /config endpoint — CompletionsResource.java:175

providerConfig.replaceAll("\"apiKey\"\\s*:\\s*\"[^\"]+\"", "\"apiKey\":\"*****\"")

This only redacts the apiKey field. For Bedrock configs (secretAccessKey, accessKeyId) or Vertex AI, those credentials will be returned in plaintext from the /api/v1/ai/completions/config GET endpoint. The redaction must cover all credential fields, not just apiKey.
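
One way to broaden the redaction, sketched with an alternation over known credential field names. The field-name set here is an assumption; a deny-list like this still needs updating per provider, so an allow-list of safe fields may ultimately be more robust.

```java
import java.util.regex.Pattern;

public class CredentialRedactor {
    // Assumed credential field names; extend when Bedrock/Vertex configs land.
    private static final Pattern CREDENTIAL_FIELDS = Pattern.compile(
            "\"(apiKey|secretAccessKey|accessKeyId|serviceAccountKey)\"\\s*:\\s*\"[^\"]*\"");

    public static String redact(final String providerConfigJson) {
        return CREDENTIAL_FIELDS.matcher(providerConfigJson)
                .replaceAll("\"$1\":\"*****\"");
    }

    public static void main(final String[] args) {
        final String json = "{\"chat\":{\"provider\":\"bedrock\","
                + "\"accessKeyId\":\"AKIAEXAMPLE\",\"secretAccessKey\":\"s3cr3t\"}}";
        System.out.println(redact(json));
        // → {"chat":{"provider":"bedrock","accessKeyId":"*****","secretAccessKey":"*****"}}
    }
}
```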

3. NPE crash: computeIfAbsent with null-returning factory — LangChain4jAIClient.java:118-152

LangChain4jModelFactory.build* returns null when config.getProvider() == null. ConcurrentHashMap.computeIfAbsent treats a null result from the mapping function as "no mapping": nothing is cached and the call itself returns null, so the client dereferences a null model immediately afterward. A request where the provider field is omitted from the JSON will crash with a NullPointerException rather than giving a useful error.

// In LangChain4jModelFactory.buildChatModel():
if (config == null || config.getProvider() == null) {
    return null;   // ← computeIfAbsent passes this null straight back; the caller then NPEs
}

The factory should throw IllegalArgumentException instead of returning null, consistent with the default branch in the switch.
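
For reference, a minimal sketch of the JDK behavior with no dotCMS classes involved: the map stores nothing when the mapping function yields null and hands null back, so the failure only surfaces when the "model" is used.

```java
import java.util.concurrent.ConcurrentHashMap;

public class NullFactoryDemo {
    public static void main(final String[] args) {
        final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();

        // Mapping function returns null: no entry is stored, null is returned.
        final String model = cache.computeIfAbsent("providerConfigJson", key -> null);

        System.out.println("cached entries: " + cache.size()); // → cached entries: 0
        System.out.println(model == null);                     // → true

        // The NPE happens at the first use of the "model", far from the real
        // cause: model.length() would throw NullPointerException here.
    }
}
```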

4. Unbounded memory leak in model caches — LangChain4jAIClient.java:59-61

The three ConcurrentHashMap caches are keyed on the full providerConfigJson string (including the plaintext API key). There is no size limit, TTL, or eviction. In a multi-site installation where each site has its own providerConfig, or whenever providerConfig is updated, stale model instances accumulate permanently. Consider using a size-bounded cache (e.g. Caffeine, or a simple LRU wrapper).
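
A minimal size-bounded LRU sketch over LinkedHashMap; Caffeine would add TTL and better concurrency, and the cap and synchronization strategy here are illustrative assumptions rather than tuned values.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

public final class BoundedModelCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    private BoundedModelCache(final int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true gives LRU iteration order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(final Map.Entry<K, V> eldest) {
        return size() > maxEntries; // evict the least-recently-used entry past the cap
    }

    public static <K, V> Map<K, V> create(final int maxEntries) {
        // Coarse synchronization; fine for a handful of model instances per node.
        return Collections.synchronizedMap(new BoundedModelCache<>(maxEntries));
    }
}
```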

5. getProvider() hardcoded to OPEN_AI — LangChain4jAIClient.java:71

public AIProvider getProvider() {
    return AIProvider.OPEN_AI;
}

The class is intended to be the single multi-provider client. Returning OPEN_AI for Bedrock or Vertex requests will mislead any code that branches on provider identity. The method should either be removed from the interface contract, or resolved dynamically from ProviderConfig.

6. isEnabled() Javadoc says dual-check, implementation is single-check — AppConfig.java:311

/**
 * Returns true when either a {@code providerConfig} JSON is present (new path)
 * or all legacy URL + API key fields are populated (old path).   ← Javadoc
 */
public boolean isEnabled() {
    return StringUtils.isNotBlank(providerConfig);  // ← only new path
}

The Javadoc describes the safe dual-check that would protect rollback, but the implementation only checks providerConfig. The previous review comment already flagged the rollback risk; the Javadoc just makes the gap more visible. Either fix the implementation to match the docs, or fix the docs to be honest that legacy keys are no longer checked.
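
The dual-check the Javadoc describes could look like the sketch below. The legacy field names are assumptions based on this PR's description, not the real AppConfig.

```java
public class AppConfigSketch {
    private final String providerConfig; // new-path JSON secret
    private final String legacyApiUrl;   // hypothetical legacy field names
    private final String legacyApiKey;

    public AppConfigSketch(final String providerConfig,
                           final String legacyApiUrl,
                           final String legacyApiKey) {
        this.providerConfig = providerConfig;
        this.legacyApiUrl = legacyApiUrl;
        this.legacyApiKey = legacyApiKey;
    }

    private static boolean notBlank(final String s) {
        return s != null && !s.isBlank();
    }

    /**
     * True when the new providerConfig is present OR the legacy fields are,
     * so un-migrated sites keep working on N and migrated sites can leave the
     * legacy fields populated for rollback safety.
     */
    public boolean isEnabled() {
        return notBlank(providerConfig)
                || (notBlank(legacyApiUrl) && notBlank(legacyApiKey));
    }
}
```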

7. validateAIConfig is a complete no-op — AIAppValidator.java:49-51

public void validateAIConfig(final AppConfig appConfig, final String userId) {
    appConfig.debugLogger(getClass(), () -> "AI configuration validation delegated to LangChain4J");
}

LangChain4J validates credentials at request time, not eagerly. This means a misconfigured providerConfig (wrong API key, bad model name) is invisible until the first real request fails at runtime. The old /v1/models call was removed for good reason, but there's now zero upfront feedback on a broken config. At a minimum, AIAppValidator should attempt to parse and validate the ProviderConfig structure without making a network call.
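
A structural check with no network call might look like this sketch, assuming Jackson is available (it is widely used in dotCMS) and assuming the section and field names from this PR's description; the real ProviderConfig schema may differ.

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public final class ProviderConfigSanityCheck {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    /** Structural validation only — no network call, no credential check. */
    public static void validate(final String providerConfigJson) {
        final JsonNode root;
        try {
            root = MAPPER.readTree(providerConfigJson);
        } catch (final Exception e) {
            throw new IllegalArgumentException("providerConfig is not valid JSON", e);
        }
        for (final String section : new String[] {"chat", "embeddings", "image"}) {
            final JsonNode node = root.get(section);
            if (node == null) {
                continue; // sections are optional; only validate what is present
            }
            if (!node.hasNonNull("provider") || !node.hasNonNull("model")) {
                throw new IllegalArgumentException(
                        "providerConfig section '" + section
                                + "' must define 'provider' and 'model'");
            }
        }
    }
}
```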

8. providerConfig JSON parsed redundantly on AppConfig construction — AppConfig.java:62-70

The JSON is parsed three separate times (once for "chat", "image", and "embeddings") via buildModelFromProviderConfig. One parse + three root.get(section) calls would be equivalent.
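
The single-parse variant could be sketched as follows (Jackson assumed; the JSON shape is hypothetical):

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public final class SingleParseDemo {
    public static void main(final String[] args) throws Exception {
        final String providerConfigJson =
                "{\"chat\":{\"model\":\"gpt-4o\"},"
                + "\"embeddings\":{\"model\":\"text-embedding-3-small\"}}";

        // Parse once...
        final JsonNode root = new ObjectMapper().readTree(providerConfigJson);

        // ...then cheap tree lookups per section instead of three full parses.
        final JsonNode chat = root.get("chat");
        final JsonNode embeddings = root.get("embeddings");
        final JsonNode image = root.get("image"); // null: section not configured

        System.out.println(chat.get("model").asText());       // → gpt-4o
        System.out.println(embeddings.get("model").asText()); // → text-embedding-3-small
        System.out.println(image == null);                    // → true
    }
}
```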

9. Missing section in providerConfig throws unchecked — LangChain4jAIClient.java:241

If a site only configures chat but not embeddings, any embeddings request will throw IllegalArgumentException("Missing 'embeddings' section in providerConfig") which is caught somewhere up the stack as a 500. There should be a clear, user-facing error or a graceful "not configured" response for unconfigured sections, rather than an unchecked exception with an internal message leaking config structure.
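
One option is to surface unconfigured sections as an empty Optional and let the resource layer map that to a clear 4xx instead of letting an unchecked exception bubble up as a 500. The class and method names below are illustrative only.

```java
import java.util.Map;
import java.util.Optional;

public final class SectionLookup {
    private final Map<String, Object> sections; // stand-in for parsed providerConfig

    public SectionLookup(final Map<String, Object> sections) {
        this.sections = sections;
    }

    /**
     * Empty Optional instead of an unchecked throw: the REST layer can answer
     * with e.g. 400 "embeddings is not configured for this site".
     */
    public Optional<Object> section(final String name) {
        return Optional.ofNullable(sections.get(name));
    }
}
```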


Minor

  • LangChain4jModelFactory uses switch with a default that embeds "Supported in Phase 1: openai" in the error message. That string will be wrong in Phase 2 without touching call sites — consider removing it or using a computed list.
  • toMessages() silently drops the content of messages that are empty strings; LLMs may behave unexpectedly with empty message content.
  • AppKeys still contains TEXT_MODEL_NAMES, TEXT_MODEL_TOKENS_PER_MINUTE, etc. which are no longer used anywhere (the old model-fetch flow was removed). These are dead entries that will confuse future readers.

@ihoffmann-dot ihoffmann-dot self-assigned this Mar 27, 2026
@claude
Contributor

claude bot commented Mar 27, 2026

Pull Request Unsafe to Rollback!!!

  • Category: M-3 — REST / Headless API Contract Change

  • Risk Level: 🟡 MEDIUM

  • Why it's unsafe: The /api/v1/ai/completions/config GET endpoint response shape has changed: the new code explicitly builds a narrow response map (replacing the previous full AppKeys iteration), so fields such as textModelNames, textModelTokensPerMinute, textModelApiPerMinute, imageModelNames, embeddingsModelNames, and related model-config keys are no longer returned. A cached Angular frontend page or integration client built against N's response shape will not find the providerConfig-derived fields after a rollback to N-1, while clients still expecting N-1's full AppKeys shape receive none of those fields on N.

    Additionally, AppConfig.isEnabled() was changed to check only providerConfig (the new secret). If a user deploys N and configures only providerConfig (leaving the legacy apiKey empty), rolling back to N-1 causes isEnabled() to return false and the entire dotAI integration is silently disabled until the legacy apiKey field is manually re-populated.

  • Code that makes it unsafe:

    • dotCMS/src/main/java/com/dotcms/ai/rest/CompletionsResource.java — getConfig() now builds the response from explicit getters rather than iterating all AppKeys; many previously-returned keys are absent from the new response map.
    • dotCMS/src/main/java/com/dotcms/ai/app/AppConfig.java — isEnabled() now returns StringUtils.isNotBlank(providerConfig) only (line ~220), dropping the legacy apiKey + URL check that N-1 uses to decide whether the AI app is active.
    • dotCMS/src/main/resources/apps/dotAI.yml — the apiKey param (previously required: true, hidden: true) was removed from the schema; providerConfig added as the sole activation field.
  • Alternative (if possible): Keep AppConfig.isEnabled() dual-checking: return true when either providerConfig or the legacy apiKey is set, so N-1 continues to work for sites that have not yet migrated. For the /config endpoint, preserve the previously-returned model-config keys (or alias them) alongside the new providerConfig field, so clients built against N-1's response contract do not receive unexpected nulls after rollback. Follow the two-phase removal pattern: ship N with both old and new config paths active, remove the legacy path in N+1 once N-2 is outside the rollback window.

@ihoffmann-dot ihoffmann-dot marked this pull request as ready for review March 30, 2026 23:15