Skip to content

fix: normalize data component schemas for cross-provider LLM compatibility#2764

Open
anubra266 wants to merge 1 commit intomainfrom
prd-6070
Open

fix: normalize data component schemas for cross-provider LLM compatibility#2764
anubra266 wants to merge 1 commit intomainfrom
prd-6070

Conversation

@anubra266
Copy link
Contributor

@anubra266 anubra266 commented Mar 18, 2026

Problem

Data component JSON schemas pass our validation but fail at runtime when passed to OpenAI and Anthropic structured output:

  • Anthropic: rejects minimum/maximum (and exclusiveMinimum, exclusiveMaximum, multipleOf) on number/integer types
  • OpenAI strict-mode: requires every key in properties to also appear in required — optional fields like limitations caused rejections

This affected both component generation in the UI and runtime agent generation when a data component is attached to a subagent.

…ility

Adds stripUnsupportedConstraints + makeAllPropertiesRequired to agents-core
as a shared normalizeDataComponentSchema utility, fixing two provider errors:

- Anthropic: rejects minimum/maximum on number types in structured output
- OpenAI: requires all properties to appear in the required array

Applies normalization in both affected code paths:
- Runtime agent generation (schema-builder.ts)
- Component generation UI (generate-render/route.ts, previously had no normalization)

Also wraps buildDataComponentsSchema in a try/catch in generate.ts so an
invalid schema no longer crashes the server — falls back to generation
without structured output instead.
@vercel
Copy link

vercel bot commented Mar 18, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
agents-api Ready Ready Preview, Comment Mar 18, 2026 11:30pm
agents-docs Ready Ready Preview, Comment Mar 18, 2026 11:30pm
agents-manage-ui Ready Ready Preview, Comment Mar 18, 2026 11:30pm

Request Review

@changeset-bot
Copy link

changeset-bot bot commented Mar 18, 2026

⚠️ No Changeset found

Latest commit: 58e5272

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@pullfrog
Copy link
Contributor

pullfrog bot commented Mar 18, 2026

TL;DR — Data component JSON Schemas are now normalized before being sent to LLMs, stripping numeric constraints that Anthropic rejects (minimum/maximum) and ensuring all properties are in required for OpenAI strict-mode. The normalization logic lives in a new shared normalizeDataComponentSchema utility in agents-core.

Key changes

  • normalizeDataComponentSchema — New shared utility in agents-core that composes two transforms: stripUnsupportedConstraints (removes numeric range/step keywords) and makeAllPropertiesRequired (promotes all properties to required, wrapping optional ones as nullable).
  • schema-builder.ts — Agent generation now uses the shared normalizeDataComponentSchema instead of SchemaProcessor.makeAllPropertiesRequired, picking up the constraint-stripping step.
  • generate.tsbuildDataComponentsSchema call is wrapped in try/catch so a schema normalization failure logs an error and falls back to no structured output instead of crashing the generation.
  • generate-render/route.ts — The manage UI's data component preview endpoint also normalizes props before converting to Zod, fixing the same cross-provider issue in the preview flow.
  • SchemaProcessor — Gains a static stripUnsupportedConstraints delegate method, keeping the class API consistent while deferring to the shared implementation.

Summary | 6 files | 1 commit | base: mainprd-6070


normalizeDataComponentSchema in agents-core

Before: No shared normalization — SchemaProcessor.makeAllPropertiesRequired handled the OpenAI required constraint in-place but nothing stripped Anthropic-unsupported numeric keywords.
After: A composable pipeline in schema-conversion.ts: stripUnsupportedConstraintsmakeAllPropertiesRequired, exported as the single-call normalizeDataComponentSchema.

Both transforms recurse into properties, items, anyOf, oneOf, and allOf. stripUnsupportedConstraints removes minimum, maximum, exclusiveMinimum, exclusiveMaximum, and multipleOf from number/integer nodes. makeAllPropertiesRequired adds every property key to required and wraps originally-optional fields in { anyOf: [<schema>, { type: 'null' }] }.

Why strip numeric constraints instead of mapping them to descriptions? Anthropic's structured output API hard-rejects these keywords rather than ignoring them. Stripping is the simplest cross-provider fix — the LLM still sees the description text if the schema author included one.

schema-conversion.ts · schema-normalization.test.ts


Callers updated: schema-builder.ts and generate-render/route.ts

Before: schema-builder.ts called SchemaProcessor.makeAllPropertiesRequired; the manage UI preview route passed raw dataComponent.props to z.fromJSONSchema.
After: Both sites call normalizeDataComponentSchema, getting constraint stripping and required-promotion in one step.

The generate.ts wrapper now catches errors from buildDataComponentsSchema so a malformed schema degrades gracefully instead of aborting the entire generation.

schema-builder.ts · generate.ts · generate-render/route.ts

Pullfrog  | View workflow run | Triggered by Pullfrogpullfrog.com𝕏

Copy link
Contributor

@pullfrog pullfrog bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good fix — the new shared normalization utilities are well-structured, the try/catch fallback in generate.ts prevents server crashes, and the test coverage is solid. Two items worth addressing: a duplication concern and a gap in the artifact-component-schema.ts call sites.

Pullfrog  | Fix all ➔Fix 👍s ➔View workflow runpullfrog.com𝕏

T extends JSONSchema.BaseSchema | Record<string, unknown> | null | undefined,
>(schema: T): T {
return stripUnsupportedConstraints(schema as Record<string, unknown>) as T;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SchemaProcessor.makeAllPropertiesRequired (line 232) is now a near-identical duplicate of the standalone makeAllPropertiesRequired in agents-core/schema-conversion.ts. Unlike stripUnsupportedConstraints which already delegates here, makeAllPropertiesRequired does not — it still carries its own recursive implementation.

Consider having the static method delegate to the shared function (same pattern as stripUnsupportedConstraints on line 289) or removing it entirely and updating callers in artifact-component-schema.ts to import from agents-core directly. Keeping two independent copies is a maintenance hazard.

Comment on lines +285 to +289
/**
* Strips JSON Schema constraints that are not supported by all LLM providers.
* Delegates to the shared `stripUnsupportedConstraints` utility from agents-core.
*/
static stripUnsupportedConstraints<
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This delegates stripUnsupportedConstraints to the shared util — good. But makeAllPropertiesRequired right above (line 232) does not delegate; it has its own full copy of the logic. These two methods should follow the same pattern for consistency.

Comment on lines +91 to +135
export function stripUnsupportedConstraints<T extends Record<string, unknown> | null | undefined>(
schema: T
): T {
if (!schema || typeof schema !== 'object') {
return schema;
}

const stripped: any = { ...schema };

if (stripped.type === 'number' || stripped.type === 'integer') {
delete stripped.minimum;
delete stripped.maximum;
delete stripped.exclusiveMinimum;
delete stripped.exclusiveMaximum;
delete stripped.multipleOf;
}

if (stripped.properties && typeof stripped.properties === 'object') {
const strippedProperties: any = {};
for (const [key, value] of Object.entries(stripped.properties)) {
strippedProperties[key] = stripUnsupportedConstraints(value as Record<string, unknown>);
}
stripped.properties = strippedProperties;
}

if (stripped.items) {
stripped.items = stripUnsupportedConstraints(stripped.items as Record<string, unknown>);
}
if (Array.isArray(stripped.anyOf)) {
stripped.anyOf = stripped.anyOf.map((s: any) =>
stripUnsupportedConstraints(s as Record<string, unknown>)
);
}
if (Array.isArray(stripped.oneOf)) {
stripped.oneOf = stripped.oneOf.map((s: any) =>
stripUnsupportedConstraints(s as Record<string, unknown>)
);
}
if (Array.isArray(stripped.allOf)) {
stripped.allOf = stripped.allOf.map((s: any) =>
stripUnsupportedConstraints(s as Record<string, unknown>)
);
}

return stripped;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neither stripUnsupportedConstraints nor makeAllPropertiesRequired recurse into $defs / definitions or follow $ref pointers. If a user defines a data component schema that uses JSON Schema references (e.g., shared types via $defs), constraints inside those definitions will survive stripping.

This is likely fine for current data component schemas (they tend to be flat), but worth noting as a known limitation — or adding a $defs/definitions traversal for completeness.

const shouldStream = ctx.isDelegatedAgent ? undefined : ctx.streamHelper;

const dataComponentsSchema = hasStructuredOutput ? buildDataComponentsSchema(ctx) : null;
let dataComponentsSchema: ReturnType<typeof buildDataComponentsSchema> | null = null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: ReturnType<typeof buildDataComponentsSchema> resolves to z.ZodType<any>, so let dataComponentsSchema: z.ZodType<any> | null = null would be more readable and avoids coupling to the function name.

Comment on lines +60 to +63
// Normalize for cross-provider compatibility:
// - strips Anthropic-unsupported constraints (minimum/maximum on numbers)
// - ensures all properties are in required (OpenAI strict-mode requirement)
const normalizedProps = normalizeDataComponentSchema(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These three comment lines restate what the code does and what the JSDoc on normalizeDataComponentSchema already documents. Per codebase conventions, consider removing them.

Copy link
Contributor

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review Summary

(4) Total Issues | Risk: Medium

🟠⚠️ Major (3) 🟠⚠️

🟠 1) SchemaProcessor.ts Duplicate makeAllPropertiesRequired should delegate to agents-core

Issue: The PR correctly established a delegation pattern for stripUnsupportedConstraints (lines 289-293), but makeAllPropertiesRequired (lines 232-283) remains a full 50-line duplicate of the identical implementation in agents-core/src/utils/schema-conversion.ts:145-196.

Why: Code duplication creates maintenance burden — if a bug is found in one implementation, it must be fixed in both. The delegation pattern is already established in this file; makeAllPropertiesRequired should follow the same pattern for consistency. This is especially concerning since both implementations handle the same edge cases (nullable detection, anyOf handling).

Fix: Replace the full implementation with a delegation wrapper (same pattern as stripUnsupportedConstraints):

import { makeAllPropertiesRequired as makeAllPropertiesRequiredCore } from '@inkeep/agents-core';

static makeAllPropertiesRequired<T extends JSONSchema.BaseSchema | Record<string, unknown> | null | undefined>(schema: T): T {
  return makeAllPropertiesRequiredCore(schema as Record<string, unknown>) as T;
}

Then remove lines 232-283.

Refs:

🟠 2) generate.ts:228-238 Silent fallback masks failures from users

Issue: When buildDataComponentsSchema() fails, the error is logged but generation silently continues without structured output. Users who configured data components expecting JSON responses will receive unstructured text instead, with no visible indication that anything went wrong.

Why: This creates a debugging nightmare. Users see their agent "working" but not producing the expected data components. The catch block swallows ALL exceptions indiscriminately — schema validation errors, type errors, or unexpected runtime bugs are all hidden. The only diagnostic path is server log correlation, which most users cannot access.

Fix: Consider surfacing the failure to users. Options:

  1. Fail fast — throw a user-facing error explaining the schema issue
  2. Graceful with warning — add span.setAttribute('generation.structured_output_fallback', true) and send a warning via ctx.streamHelper if available
  3. At minimum — include dataComponentIds in the error log for debugging

Refs:

🟠 3) generate-render/route.ts:63-66 Missing error handling for schema normalization

Issue: normalizeDataComponentSchema() and z.fromJSONSchema() are called without specific error handling. If these fail, the outer try/catch returns a generic HTTP 500 with no context about which data component failed or why.

Why: When "Generate Component" fails in the Agent Builder UI, users cannot self-diagnose. They see a generic error toast with no actionable information. This generates support tickets that could be avoided with better error messaging.

Fix: Add specific error handling that returns HTTP 400 with actionable feedback:

try {
  normalizedProps = normalizeDataComponentSchema(dataComponent.props as Record<string, unknown>);
  mockDataSchema = z.fromJSONSchema(normalizedProps);
} catch (error) {
  console.error('Schema normalization failed:', { dataComponentId, dataComponentName: dataComponent.name, error });
  return new Response(JSON.stringify({
    error: 'Invalid data component schema',
    message: 'Please check that all field types are valid.',
  }), { status: 400, headers: { 'Content-Type': 'application/json' } });
}

Refs:

Inline Comments:

  • 🟠 Major: SchemaProcessor.ts:293 Duplicate makeAllPropertiesRequired should delegate
  • 🟠 Major: generate.ts:228-238 Silent fallback from structured to unstructured output
  • 🟠 Major: generate-render/route.ts:63-66 Missing error handling for schema normalization

🟡 Minor (1) 🟡

🟡 1) schema-normalization.test.ts Additional edge case coverage recommended

Issue: The test suite covers core functionality well but misses some edge cases handled by the implementation: oneOf/allOf recursive stripping (only anyOf tested), nullable: true property handling, and deeply nested schemas (3+ levels).

Why: These code paths exist in the implementation but aren't validated by tests. Future refactoring could silently break them.

Fix: Consider adding tests for oneOf/allOf stripping and nullable: true handling. See inline comment for examples.

Refs:

Inline Comments:

  • 🟡 Minor: schema-normalization.test.ts:175 Consider adding edge case coverage

💭 Consider (2) 💭

💭 1) artifact-component-schema.ts:105,159 Partial normalization inconsistency

Issue: This file calls SchemaProcessor.makeAllPropertiesRequired but NOT stripUnsupportedConstraints, meaning artifact component schemas only get OpenAI compatibility fixes but NOT Anthropic compatibility fixes. By contrast, schema-builder.ts and generate-render/route.ts now use the full normalizeDataComponentSchema helper.

Why: Split-world scenario where artifact components behave differently from data components. If artifact schemas include numeric constraints, they'll fail with Anthropic models.

Fix: Consider using normalizeDataComponentSchema here as well for consistency, or document why partial normalization is intentional for artifact schemas.

💭 2) Missing changeset

Issue: The changeset-bot flagged this PR has no changeset. This PR modifies agents-core, agents-api, and agents-manage-ui with bug fixes that affect runtime behavior.

Fix: Add a changeset: pnpm bump patch --pkg agents-core --pkg agents-api "Fix data component schema normalization for cross-provider LLM compatibility"


💡 APPROVE WITH SUGGESTIONS

Summary: The core fix is solid — the new normalizeDataComponentSchema utility correctly addresses the Anthropic/OpenAI compatibility issues. However, the PR introduces code duplication (SchemaProcessor.makeAllPropertiesRequired should delegate to agents-core like stripUnsupportedConstraints does) and the silent fallback behavior in generate.ts could mask configuration issues from users. Address the duplication and consider improving error visibility.

Discarded (3)
Location Issue Reason Discarded
schema-conversion.ts:91-136 Unsafe any type casting in recursive schema processing Pragmatic choice for JSON Schema processing; type guards at entry points provide sufficient safety
schema-conversion.ts:1 File should be named schema-normalization.ts Cosmetic naming suggestion; file already exports schema conversion utilities so current name is acceptable
generate.ts:303 hasStructuredOutput variable is misleading after fallback Minor observability concern; the variable reflects config intent, not runtime state — acceptable semantics
Reviewers (4)
Reviewer Returned Main Findings Consider While You're Here Inline Comments Pending Recs Discarded
pr-review-standards 4 1 0 0 1 0 2
pr-review-tests 8 0 0 0 1 0 7
pr-review-errors 3 1 0 0 2 0 0
pr-review-consistency 3 1 2 0 0 0 0
Total 18 3 2 0 4 0 9

T extends JSONSchema.BaseSchema | Record<string, unknown> | null | undefined,
>(schema: T): T {
return stripUnsupportedConstraints(schema as Record<string, unknown>) as T;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 MAJOR: makeAllPropertiesRequired should also delegate to agents-core

Issue: stripUnsupportedConstraints was correctly refactored to delegate to the shared utility (lines 289-293), but makeAllPropertiesRequired (lines 232-283) remains a full duplicate of the identical implementation in agents-core/src/utils/schema-conversion.ts:145-196.

Why: Code duplication creates maintenance burden — bug fixes must be applied in two places. The delegation pattern is already established here for stripUnsupportedConstraints, so makeAllPropertiesRequired should follow the same pattern for consistency.

Fix:

Suggested change
}
static stripUnsupportedConstraints<
T extends JSONSchema.BaseSchema | Record<string, unknown> | null | undefined,
>(schema: T): T {
return stripUnsupportedConstraints(schema as Record<string, unknown>) as T;
}
/**
* Makes all properties required recursively throughout the schema.
* Delegates to the shared `makeAllPropertiesRequired` utility from agents-core.
*/
static makeAllPropertiesRequired<
T extends JSONSchema.BaseSchema | Record<string, unknown> | null | undefined,
>(schema: T): T {
return makeAllPropertiesRequiredCore(schema as Record<string, unknown>) as T;
}

Then update the import at the top of the file:

import { stripUnsupportedConstraints, makeAllPropertiesRequired as makeAllPropertiesRequiredCore } from '@inkeep/agents-core';

And remove the duplicate implementation at lines 232-283.

Refs:

Comment on lines +228 to +238
let dataComponentsSchema: ReturnType<typeof buildDataComponentsSchema> | null = null;
if (hasStructuredOutput) {
try {
dataComponentsSchema = buildDataComponentsSchema(ctx);
} catch (err) {
logger.error(
{ agentId: ctx.config.id, err },
'Failed to build data components schema — continuing without structured output'
);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 MAJOR: Silent fallback from structured to unstructured output

Issue: When buildDataComponentsSchema() fails, the error is logged but generation silently continues without structured output. Users configured data components expecting JSON output will receive unstructured text instead, with no visible indication of the failure.

Why: This degrades user experience silently. The user configured data components, expects structured output, but receives plain text. The catch block swallows ALL exception types — it could hide schema validation errors, type errors, or unexpected runtime bugs. Debugging requires server log correlation.

Fix: Consider one of these approaches:

Option A — Fail fast (recommended for data integrity):

let dataComponentsSchema: ReturnType<typeof buildDataComponentsSchema> | null = null;
if (hasStructuredOutput) {
  try {
    dataComponentsSchema = buildDataComponentsSchema(ctx);
  } catch (err) {
    logger.error(
      { agentId: ctx.config.id, dataComponentIds: ctx.config.dataComponents?.map(dc => dc.id), err },
      'Data component schema build failed'
    );
    throw new Error(
      'Failed to build structured output schema from data components. ' +
      'Please review your data component schemas in the agent builder.'
    );
  }
}

Option B — Surface warning to user (graceful with awareness):

// Add span attribute for observability
span.setAttribute('generation.structured_output_fallback', true);
// Send warning via stream helper if available
if (ctx.streamHelper) {
  ctx.streamHelper.sendWarning('Structured output unavailable due to schema issues. Returning text response.');
}

Refs:

  • PR description — mentions "graceful fallback" but doesn't address user visibility

Comment on lines +63 to +66
const normalizedProps = normalizeDataComponentSchema(
dataComponent.props as Record<string, unknown>
);
const mockDataSchema = z.fromJSONSchema(normalizedProps);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 MAJOR: Missing error handling for schema normalization

Issue: normalizeDataComponentSchema() and z.fromJSONSchema() are called without specific error handling. If normalization fails, the outer try/catch (line 124) catches it and returns a generic HTTP 500 with error.message. Users see "Internal server error" with no context about which data component failed or why.

Why: When clicking "Generate Component" in the Agent Builder UI fails, users cannot self-diagnose. They see a generic error toast with no actionable information. Every schema issue becomes a support ticket.

Fix: Add specific error handling with actionable feedback:

let normalizedProps: Record<string, unknown>;
let mockDataSchema: z.ZodType<any>;

try {
  normalizedProps = normalizeDataComponentSchema(
    dataComponent.props as Record<string, unknown>
  );
  mockDataSchema = z.fromJSONSchema(normalizedProps);
} catch (error) {
  console.error('Schema normalization failed:', {
    dataComponentId,
    dataComponentName: dataComponent.name,
    error: error instanceof Error ? error.message : String(error),
  });
  
  return new Response(
    JSON.stringify({
      error: 'Invalid data component schema',
      message: 'The data component schema could not be processed. Please check that all field types are valid.',
      details: error instanceof Error ? error.message : undefined,
    }),
    { 
      status: 400,
      headers: { 'Content-Type': 'application/json' },
    }
  );
}

Refs:

expect(result.properties.b).not.toHaveProperty('minimum');
expect(result.properties.b).not.toHaveProperty('maximum');
});
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Minor: Consider adding test coverage for edge cases

Issue: The test suite covers the core functionality well, but could benefit from additional edge cases:

  1. oneOf/allOf recursive stripping — only anyOf is tested (line 76-82)
  2. nullable: true property handling — the code checks for this (schema-conversion.ts:167) but no test validates it
  3. Deeply nested schemas (3+ levels) — only one level of nesting is tested

Why: These edge cases are handled in the implementation but not validated by tests. Future refactoring could silently break them.

Fix: Consider adding these test cases:

it('strips recursively in oneOf', () => {
  const schema = {
    oneOf: [{ type: 'number', minimum: 0 }, { type: 'string' }]
  };
  const result = stripUnsupportedConstraints(schema) as any;
  expect(result.oneOf[0]).not.toHaveProperty('minimum');
});

it('does not double-wrap properties already marked as nullable', () => {
  const schema = {
    type: 'object',
    properties: { field: { type: 'string', nullable: true } },
    required: []
  };
  const result = makeAllPropertiesRequired(schema) as any;
  expect(result.properties.field).not.toHaveProperty('anyOf');
});

Refs:

@github-actions github-actions bot deleted a comment from claude bot Mar 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant