Fixes 25682: Add server-side XSS sanitization for user input by GhaziBenDahmane · Pull Request #25683 · open-metadata/OpenMetadata

GhaziBenDahmane · 2026-02-03T14:07:15Z

Describe your changes:

Fixes 25682

Sanitization to avoid XSS is currently done in the frontend before inserting the data instead of being done in the backend, so a bad actor can insert JS code that will be executed via the api.

Type of change:

Checklist:

I have read the CONTRIBUTING document.
My PR title is Fixes <issue-number>: <short explanation>
I have commented on my code, particularly in hard-to-understand areas.
For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

Summary by Gitar

New sanitization utility:
- InputSanitizer in util/InputSanitizer.java uses OWASP HTML Sanitizer with entity link preservation via regex
Server-side XSS protection:
- Applied sanitize() at 20+ locations including EntityRepository, FeedRepository, ColumnRepository, and entity mappers
Comprehensive test coverage:
- InputSanitizerTest with 16 test cases covering XSS vectors, safe HTML preservation, and edge cases

_{This will update automatically on new commits.}

github-actions · 2026-02-03T14:07:43Z

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows
will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!

gitar-bot · 2026-02-03T14:12:10Z

openmetadata-service/src/main/java/org/openmetadata/service/util/InputSanitizer.java

+          .allowAttributes("style", "class")
+          .onElements("div", "span", "p", "pre", "code")


⚠️ Security: Allowing style attribute enables CSS-based attacks

Details

The style attribute is allowed on multiple elements (div, span, p, pre, code), which can enable CSS-based attacks. While the OWASP HTML Sanitizer provides some protection, allowing arbitrary style values can lead to:

Data exfiltration via CSS selectors and url() in backgrounds

UI redressing where attackers overlay fake UI elements

User tracking through unique CSS-based fingerprinting

Content hiding/manipulation to confuse users

Recommendation: Either remove the style attribute entirely, or use .allowStyling(CssSchema.DEFAULT) which provides safer CSS handling. If custom styling is required, consider allowlisting specific CSS properties:

.allowAttributes("style") .matching(StyleAttributePolicy.allowedProperties("color", "font-weight", "text-align")) .onElements("div", "span", "p", "pre", "code")

Alternatively, rely on class attributes (which are already allowed) with predefined CSS classes for styling.

Was this helpful? React with 👍 / 👎

github-actions · 2026-02-04T13:44:45Z

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows
will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!

gitar-bot · 2026-02-04T13:49:10Z

🔍 CI failure analysis for a37c613: All CI jobs continue to fail due to missing 'safe to test' label - no change from previous analysis.

Issue

All CI jobs remain blocked and continue to fail immediately without executing any tests or builds.

Root Cause

The PR is from an external contributor (GhaziBenDahmane) and still lacks the "safe to test" label required by OpenMetadata's CI security workflow. This security gate prevents unauthorized code execution in CI pipelines for external contributions.

Current Status

Latest commit: a37c613 (merged with main)
Label status: Still missing "safe to test" label
CI execution: No actual validation has occurred - all jobs exit immediately
Recent analysis: 10 additional job failures examined, all show identical root cause

Details

Every CI job continues to exit with the same error:

Error! This pull request does not contain any of the valid labels: ['safe to test']
Exiting with an error code

All affected jobs (including newly analyzed failures):

Checkstyle: java-checkstyle, py-checkstyle
Maven builds: maven-postgresql-ci, maven-collate-ci, maven-sonarcloud-ci
Integration tests: integration-tests-mysql-elasticsearch, integration-tests-postgres-opensearch
Python tests: py-run-tests (3.10, 3.11), py-run-build-tests
E2E tests: playwright-ci-postgresql (all 6 shards)
Test reports: Multiple Test Report jobs

Impact

No code validation, linting, testing, or building has occurred because the security gate prevents execution. The XSS sanitization implementation remains unvalidated by CI.

Code Review 👍 Approved with suggestions 0 resolved / 2 findings

Good XSS sanitization implementation using OWASP library with comprehensive test coverage. Two previous security findings remain unaddressed: the style attribute enables CSS-based attacks, and the entity link placeholder could conflict with user input.

⚠️

Security: Allowing `style` attribute enables CSS-based attacks

📄 openmetadata-service/src/main/java/org/openmetadata/service/util/InputSanitizer.java:38-39

The style attribute is allowed on multiple elements (div, span, p, pre, code), which can enable CSS-based attacks. While the OWASP HTML Sanitizer provides some protection, allowing arbitrary style values can lead to:

Data exfiltration via CSS selectors and url() in backgrounds
UI redressing where attackers overlay fake UI elements
User tracking through unique CSS-based fingerprinting
Content hiding/manipulation to confuse users

Recommendation: Either remove the style attribute entirely, or use .allowStyling(CssSchema.DEFAULT) which provides safer CSS handling. If custom styling is required, consider allowlisting specific CSS properties:

.allowAttributes("style")
.matching(StyleAttributePolicy.allowedProperties("color", "font-weight", "text-align"))
.onElements("div", "span", "p", "pre", "code")

Alternatively, rely on class attributes (which are already allowed) with predefined CSS classes for styling.

💡 Security: Entity link placeholder could conflict with user input

📄 openmetadata-service/src/main/java/org/openmetadata/service/util/InputSanitizer.java:32-33

The placeholder pattern __OM_ENTITY_LINK_%d__ used for entity link preservation could theoretically appear in legitimate user input, causing unintended behavior. While unlikely in practice, this is a defense-in-depth concern.

Scenario: If a user includes text like __OM_ENTITY_LINK_0__ in their content, and the sanitization process runs, this text would be replaced with the first extracted entity link (if any) or cause a mismatch.

Recommendation: Use a more complex placeholder that includes a random/unique component:

// Option 1: Include a UUID-based prefix
private static final String ENTITY_LINK_PLACEHOLDER = "__OM_EL_" + UUID.randomUUID().toString().substring(0, 8) + "_%d__";

// Option 2: Use invisible Unicode characters (less preferable but harder to accidentally match)
private static final String ENTITY_LINK_PLACEHOLDER = "\u200B__OM_ENTITY_LINK_%d__\u200B";

This is a low-risk edge case, but addressing it improves robustness.

Tip

Comment Gitar fix CI or enable auto-apply: gitar auto-apply:on

Options

Auto-apply is off → Gitar will not commit updates to this branch.
Display: compact → Showing less information.

Comment with these commands to change:

`Auto-apply`	`Compact`
`gitar auto-apply:on`	`gitar display:verbose`

_{Was this helpful? React with 👍 / 👎 | Gitar}

fix(security): Add server-side XSS sanitization for user input

c3c554c

GhaziBenDahmane requested a review from a team as a code owner February 3, 2026 14:07

GhaziBenDahmane had a problem deploying to test February 3, 2026 14:07 — with GitHub Actions Failure

gitar-bot bot reviewed Feb 3, 2026

View reviewed changes

Merge branch 'main' into fix/server-side-xss-sanitization

a37c613

GhaziBenDahmane had a problem deploying to test February 4, 2026 13:44 — with GitHub Actions Failure

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fixes 25682: Add server-side XSS sanitization for user input#25683

Fixes 25682: Add server-side XSS sanitization for user input#25683
GhaziBenDahmane wants to merge 2 commits intoopen-metadata:mainfrom
GhaziBenDahmane:fix/server-side-xss-sanitization

GhaziBenDahmane commented Feb 3, 2026 •

edited by gitar-bot bot

Loading

Uh oh!

github-actions bot commented Feb 3, 2026

Uh oh!

gitar-bot bot Feb 3, 2026

Uh oh!

github-actions bot commented Feb 4, 2026

Uh oh!

gitar-bot bot commented Feb 4, 2026 •

edited

Loading

Issue

Root Cause

Current Status

Details

Impact

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		.allowAttributes("style", "class")
		.onElements("div", "span", "p", "pre", "code")

Conversation

GhaziBenDahmane commented Feb 3, 2026 • edited by gitar-bot bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Describe your changes:

Type of change:

Checklist:

Summary by Gitar

Uh oh!

github-actions bot commented Feb 3, 2026

Uh oh!

gitar-bot bot Feb 3, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Feb 4, 2026

Uh oh!

gitar-bot bot commented Feb 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Issue

Root Cause

Current Status

Details

Impact

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

GhaziBenDahmane commented Feb 3, 2026 •

edited by gitar-bot bot

Loading

gitar-bot bot commented Feb 4, 2026 •

edited

Loading