
Breaking change in Document AI output despite same processor version (production impact) #33458

@mordeccai

Description


We are experiencing a severe and unacceptable regression in Document AI behavior that is actively impacting production users.

Without any changes on our side (no code changes, no processor version changes, no retraining), identical documents that had been processed reliably for months are now producing dramatically incorrect results.

Specifically:

  • The system is extracting numerous duplicate and invalid entities
  • The number of detected parties has increased significantly without justification
  • Previously clean, deterministic outputs are now noisy and unreliable

This is not a minor fluctuation; it is a clear degradation in output quality that breaks downstream logic and confuses real users.
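To make the duplicate-entity symptom concrete, here is a minimal check we run over a processed document's entity list. The field names (`type_`, `mention_text`) follow the Document AI Python client's `Document.Entity` attributes; the sample data below is hypothetical, constructed only to illustrate the kind of duplication we now see:

```python
from collections import Counter

def find_duplicate_entities(entities):
    """Count (type, mention) pairs that appear more than once in a
    Document AI entity list (plain dicts used here as stand-ins for
    documentai.Document.Entity objects)."""
    counts = Counter((e["type_"], e["mention_text"]) for e in entities)
    return {pair: n for pair, n in counts.items() if n > 1}

# Hypothetical output resembling the regression: the same party
# extracted twice from one identical input document.
entities = [
    {"type_": "party", "mention_text": "Acme Corp"},
    {"type_": "party", "mention_text": "Acme Corp"},
    {"type_": "party", "mention_text": "Beta LLC"},
]
print(find_duplicate_entities(entities))  # {('party', 'Acme Corp'): 2}
```

On the same documents, this check returned an empty result for months and now flags multiple duplicated parties.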

It is extremely concerning that such a change can occur without any version change or notice. We rely on processor versioning for stability, and this behavior undermines that expectation.
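For context, our requests already pin an explicit processor version by addressing the fully qualified `processorVersions` resource name, which is exactly what we understood to guarantee stable behavior. A sketch of how we build that name (the project, location, processor, and version IDs below are hypothetical):

```python
def processor_version_name(project, location, processor, version):
    """Build the fully qualified Document AI processor-version resource
    name used to pin process requests to a single model version."""
    return (f"projects/{project}/locations/{location}"
            f"/processors/{processor}/processorVersions/{version}")

# Hypothetical IDs for illustration.
name = processor_version_name("my-project", "us", "abc123", "pretrained-v1.0")
print(name)
# projects/my-project/locations/us/processors/abc123/processorVersions/pretrained-v1.0
```

If output can change underneath a pinned version name, the pinning mechanism no longer provides the stability guarantee it is documented to provide.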

Has any change been made to model behavior, OCR processing, or entity extraction logic affecting existing processor versions?

This issue is urgent and is currently impacting live systems. We expect a prompt, clear explanation and a fix.
