fix: normalize usageSummary.date format and add explicit ES mapping #26693

Open
kojikokojiko wants to merge 2 commits into open-metadata:main from kojikokojiko:fix/usage-summary-date-es-parse-error

Conversation

@kojikokojiko
Contributor

Problem

Elasticsearch/OpenSearch throws parse errors when reindexing entities with usageSummary:

failed to parse field [usageSummary.date] of type [date] in document with id '...'
Preview of field's value: '2026-03-19 00:00:00'

Root cause (two issues):

  1. usageSummary.date was not defined in the ES index mappings, so ES fell back to dynamic mapping and inferred the type as date with the default format (strict_date_optional_time). That format accepts only ISO 8601 values (T-separated), not space-separated datetime strings.

  2. MySQL JDBC's ResultSet.getString() can return DATE columns as "yyyy-MM-dd HH:mm:ss" (with a time component), depending on driver configuration. That string does not match the dynamically inferred ES format, so parsing fails.

Fix

1. Java: normalize date output (root cause)

Changed r.getString("usageDate") to r.getDate("usageDate").toString() in all three affected mappers: UsageDetailsMapper and UsageDetailsWithIdMapper in CollectionDAO, and UsageDetailsMapper in UsageRepository.

java.sql.Date.toString() always returns "yyyy-MM-dd" regardless of JDBC driver behavior.

Affected files:

  • CollectionDAO.java (2 mappers)
  • UsageRepository.java (1 mapper)
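
To illustrate why the switch to getDate() stabilizes the output, here is a minimal sketch. The class name UsageDateSketch and the normalize helper are hypothetical and are not code from this PR; the null guard in particular is an assumption about how a null usageDate might be handled. It relies only on the documented behavior of java.sql.Date.toString() and java.sql.Timestamp.toString().

```java
import java.sql.Date;
import java.sql.Timestamp;

// Hypothetical illustration only; not the mapper code changed in this PR.
final class UsageDateSketch {
    private UsageDateSketch() {}

    // java.sql.Date.toString() is specified to format as yyyy-mm-dd,
    // so the result does not depend on how the driver renders strings.
    // The null guard is an assumption added for this sketch.
    static String normalize(Date usageDate) {
        return usageDate == null ? null : usageDate.toString();
    }

    public static void main(String[] args) {
        Date date = Date.valueOf("2026-03-19");
        Timestamp timestamp = Timestamp.valueOf("2026-03-19 00:00:00");

        System.out.println(normalize(date));      // 2026-03-19
        // A timestamp-style string keeps the time part, which is the
        // shape that the dynamically mapped ES field rejected.
        System.out.println(timestamp.toString()); // 2026-03-19 00:00:00.0
    }
}
```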

2. ES mapping: add explicit usageSummary.date field (defensive fix)

Added explicit date field definition to usageSummary.properties in all index mapping JSON files across all languages (en, jp, zh, ru) for all entities that have usageSummary (table, dashboard, mlmodel, database, databaseSchema, container, and others).

"date": {
  "type": "date",
  "format": "strict_date_optional_time||yyyy-MM-dd HH:mm:ss||epoch_millis"
}

This ensures that even if legacy data with "yyyy-MM-dd HH:mm:ss" format exists, reindexing will not fail.
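
The `||` separator in the format string means the formats are tried in order until one matches. The sketch below approximates that fallback with java.time. It is not Elasticsearch's parser: DateFormatFallbackSketch and its methods are hypothetical names, values are assumed to be UTC, and the first step only covers plain ISO dates and T-separated local datetimes rather than everything strict_date_optional_time accepts (offsets such as "Z" are not handled here).

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical approximation of "try each format in order"; not ES code.
final class DateFormatFallbackSketch {
    private static final DateTimeFormatter SPACE_SEPARATED =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    private DateFormatFallbackSketch() {}

    static Instant parse(String value) {
        // 1. ISO date or T-separated datetime (rough stand-in for
        //    strict_date_optional_time).
        try {
            return LocalDate.parse(value).atStartOfDay(ZoneOffset.UTC).toInstant();
        } catch (DateTimeParseException ignored) {
            // fall through to the next format
        }
        try {
            return LocalDateTime.parse(value).toInstant(ZoneOffset.UTC);
        } catch (DateTimeParseException ignored) {
            // fall through to the next format
        }
        // 2. Legacy space-separated form: yyyy-MM-dd HH:mm:ss
        try {
            return LocalDateTime.parse(value, SPACE_SEPARATED).toInstant(ZoneOffset.UTC);
        } catch (DateTimeParseException ignored) {
            // fall through to the next format
        }
        // 3. epoch_millis
        try {
            return Instant.ofEpochMilli(Long.parseLong(value));
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Unparseable date: " + value, e);
        }
    }

    // Convenience wrapper: true if any of the three formats matches.
    static boolean accepts(String value) {
        try {
            parse(value);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("2026-03-19"));          // 2026-03-19T00:00:00Z
        System.out.println(parse("2026-03-19 00:00:00")); // 2026-03-19T00:00:00Z
        System.out.println(accepts("19/03/2026"));        // false
    }
}
```

With only the first step in place, the space-separated value would be rejected, which mirrors the original parse error.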

Why both fixes are needed

  • Fix 1 alone: new data is safe, but reindexing legacy data with old format would still fail
  • Fix 2 alone: ES accepts both formats, but root cause (inconsistent date serialization) remains
  • Both together: complete fix for new and existing data

Testing

  • Verified ES mapping is applied correctly on local Docker environment
  • Confirmed "2026-03-19" (new format) indexes successfully ✅
  • Confirmed "2026-03-19 00:00:00" (legacy format) indexes successfully ✅ (was failing before)

Fixes #26678

🤖 Generated with Claude Code

@github-actions
Contributor

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows
will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!

@github-actions
Contributor

github-actions bot commented Mar 23, 2026

🟡 Playwright Results — all passed (15 flaky)

✅ 3417 passed · ❌ 0 failed · 🟡 15 flaky · ⏭️ 208 skipped

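| Shard | Passed | Failed | Flaky | Skipped |
|---|---|---|---|---|
| 🟡 Shard 1 | 453 | 0 | 2 | 2 |
| 🟡 Shard 2 | 603 | 0 | 2 | 32 |
| 🟡 Shard 3 | 611 | 0 | 5 | 27 |
| 🟡 Shard 4 | 601 | 0 | 2 | 47 |
| ✅ Shard 5 | 587 | 0 | 0 | 67 |
| 🟡 Shard 6 | 562 | 0 | 4 | 33 |
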
🟡 15 flaky test(s) (passed on retry)
  • Pages/AuditLogs.spec.ts › should apply both User and EntityType filters simultaneously (shard 1, 1 retry)
  • Pages/UserCreationWithPersona.spec.ts › Create user with persona and verify on profile (shard 1, 1 retry)
  • Features/BulkEditEntity.spec.ts › Glossary (shard 2, 1 retry)
  • Features/BulkImport.spec.ts › Database (shard 2, 1 retry)
  • Features/IncidentManager.spec.ts › Complete Incident lifecycle with table owner (shard 3, 1 retry)
  • Features/Permissions/GlossaryPermissions.spec.ts › Team-based permissions work correctly (shard 3, 1 retry)
  • Flow/ExploreDiscovery.spec.ts › Should display deleted assets when showDeleted is checked and deleted is not present in queryFilter (shard 3, 1 retry)
  • Flow/PersonaDeletionUserProfile.spec.ts › User profile loads correctly before and after persona deletion (shard 3, 1 retry)
  • Flow/PersonaFlow.spec.ts › Set default persona for team should work properly (shard 3, 1 retry)
  • Pages/Customproperties-part2.spec.ts › entityReferenceList shows item count, scrollable list, no expand toggle (shard 4, 1 retry)
  • Pages/Domains.spec.ts › Verify Domain entity API calls do not include invalid domains field in glossary term assets (shard 4, 1 retry)
  • Pages/ODCSImportExport.spec.ts › Multi-object ODCS contract - object selector shows all schema objects (shard 6, 1 retry)
  • Pages/ServiceEntity.spec.ts › Glossary Term Add, Update and Remove (shard 6, 1 retry)
  • Pages/Users.spec.ts › Permissions for table details page for Data Consumer (shard 6, 1 retry)
  • VersionPages/ServiceEntityVersionPage.spec.ts › Api Collection (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

@harshach harshach requested a review from mohityadav766 March 23, 2026 13:18
@kojikokojiko kojikokojiko force-pushed the fix/usage-summary-date-es-parse-error branch 2 times, most recently from ad9b3d6 to 502c253 Compare March 24, 2026 05:25
@kojikokojiko
Contributor Author

@harshach
How can I fix the maven-collate-ci error?

kojikokojiko and others added 2 commits March 24, 2026 15:38
… parse errors

MySQL JDBC may return DATE columns as 'yyyy-MM-dd HH:mm:ss' depending on
driver configuration, causing Elasticsearch to fail parsing usageSummary.date
when the field is dynamically mapped as strict_date_optional_time.

- Use r.getDate() instead of r.getString() in UsageDetailsMapper and
  UsageDetailsWithIdMapper (CollectionDAO) and UsageRepository to ensure
  the date is always normalized to 'yyyy-MM-dd' format via java.sql.Date.toString()
- Add explicit 'date' field mapping to all 36 ES/OS index mapping files with
  format 'strict_date_optional_time||yyyy-MM-dd HH:mm:ss||epoch_millis' to
  handle both formats defensively

Fixes open-metadata#26678

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Covers the fix for usageSummary.date parse errors in Elasticsearch.
Tests verify that getDate().toString() always returns yyyy-MM-dd format
for both non-null and null usageDate values, across all three mappers:
- CollectionDAO.UsageDAO.UsageDetailsMapper
- CollectionDAO.UsageDAO.UsageDetailsWithIdMapper
- UsageRepository.UsageDetailsMapper

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
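
For readers who want to reproduce this kind of check without a database, the following is a rough sketch of how a mapper's date handling could be exercised against a stubbed ResultSet. Everything here is hypothetical: UsageDateMapperSketch, readUsageDate, and stubResultSet are illustrative names, the null handling is an assumption, and the tests added in this PR may be structured differently (for example with a mocking library).

```java
import java.lang.reflect.Proxy;
import java.sql.Date;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical sketch; not the test code added in this PR.
final class UsageDateMapperSketch {
    private UsageDateMapperSketch() {}

    // Stand-in for the date handling inside a row mapper. The checked
    // SQLException is wrapped only to keep the sketch's call sites short.
    static String readUsageDate(ResultSet r) {
        try {
            Date usageDate = r.getDate("usageDate");
            // Null guard is an assumption made for this sketch.
            return usageDate == null ? null : usageDate.toString();
        } catch (SQLException e) {
            throw new IllegalStateException(e);
        }
    }

    // Dynamic-proxy stub that answers getDate(...) and rejects all
    // other ResultSet calls, so no JDBC driver is needed.
    static ResultSet stubResultSet(Date value) {
        return (ResultSet) Proxy.newProxyInstance(
                UsageDateMapperSketch.class.getClassLoader(),
                new Class<?>[] {ResultSet.class},
                (proxy, method, args) -> {
                    if (method.getName().equals("getDate")) {
                        return value;
                    }
                    throw new UnsupportedOperationException(method.getName());
                });
    }

    public static void main(String[] args) {
        System.out.println(readUsageDate(stubResultSet(Date.valueOf("2026-03-19")))); // 2026-03-19
        System.out.println(readUsageDate(stubResultSet(null)));                       // null
    }
}
```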
@kojikokojiko kojikokojiko force-pushed the fix/usage-summary-date-es-parse-error branch from f07a57e to a666955 Compare March 24, 2026 10:08
@gitar-bot

gitar-bot bot commented Mar 24, 2026

Code Review ✅ Approved

Normalizes usageSummary.date format and adds explicit Elasticsearch mapping to prevent parse errors. No issues found.

Labels

safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Search-Issue] Issue in ES/OS

3 participants