Skip to content

Fixes 25682: Add server-side XSS sanitization for user input#25683

Open
GhaziBenDahmane wants to merge 2 commits intoopen-metadata:mainfrom
GhaziBenDahmane:fix/server-side-xss-sanitization
Open

Fixes 25682: Add server-side XSS sanitization for user input#25683
GhaziBenDahmane wants to merge 2 commits intoopen-metadata:mainfrom
GhaziBenDahmane:fix/server-side-xss-sanitization

Conversation

@GhaziBenDahmane
Copy link

@GhaziBenDahmane GhaziBenDahmane commented Feb 3, 2026

Describe your changes:

Fixes 25682

Sanitization to avoid XSS is currently done in the frontend before inserting the data instead of being done in the backend, so a bad actor can insert JS code that will be executed via the api.

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

Summary by Gitar

  • New sanitization utility:
    • InputSanitizer in util/InputSanitizer.java uses OWASP HTML Sanitizer with entity link preservation via regex
  • Server-side XSS protection:
    • Applied sanitize() at 20+ locations including EntityRepository, FeedRepository, ColumnRepository, and entity mappers
  • Comprehensive test coverage:
    • InputSanitizerTest with 16 test cases covering XSS vectors, safe HTML preservation, and edge cases

This will update automatically on new commits.


@github-actions
Copy link
Contributor

github-actions bot commented Feb 3, 2026

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows
will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!

Comment on lines +38 to +39
.allowAttributes("style", "class")
.onElements("div", "span", "p", "pre", "code")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Security: Allowing style attribute enables CSS-based attacks

Details

The style attribute is allowed on multiple elements (div, span, p, pre, code), which can enable CSS-based attacks. While the OWASP HTML Sanitizer provides some protection, allowing arbitrary style values can lead to:

  1. Data exfiltration via CSS selectors and url() in backgrounds
  2. UI redressing where attackers overlay fake UI elements
  3. User tracking through unique CSS-based fingerprinting
  4. Content hiding/manipulation to confuse users

Recommendation: Either remove the style attribute entirely, or use .allowStyling(CssSchema.DEFAULT) which provides safer CSS handling. If custom styling is required, consider allowlisting specific CSS properties:

.allowAttributes("style")
.matching(StyleAttributePolicy.allowedProperties("color", "font-weight", "text-align"))
.onElements("div", "span", "p", "pre", "code")

Alternatively, rely on class attributes (which are already allowed) with predefined CSS classes for styling.


Was this helpful? React with 👍 / 👎

@github-actions
Copy link
Contributor

github-actions bot commented Feb 4, 2026

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows
will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!

@gitar-bot
Copy link

gitar-bot bot commented Feb 4, 2026

🔍 CI failure analysis for a37c613: All CI jobs continue to fail due to missing 'safe to test' label - no change from previous analysis.

Issue

All CI jobs remain blocked and continue to fail immediately without executing any tests or builds.

Root Cause

The PR is from an external contributor (GhaziBenDahmane) and still lacks the "safe to test" label required by OpenMetadata's CI security workflow. This security gate prevents unauthorized code execution in CI pipelines for external contributions.

Current Status

  • Latest commit: a37c613 (merged with main)
  • Label status: Still missing "safe to test" label
  • CI execution: No actual validation has occurred - all jobs exit immediately
  • Recent analysis: 10 additional job failures examined, all show identical root cause

Details

Every CI job continues to exit with the same error:

Error! This pull request does not contain any of the valid labels: ['safe to test']
Exiting with an error code

All affected jobs (including newly analyzed failures):

  • Checkstyle: java-checkstyle, py-checkstyle
  • Maven builds: maven-postgresql-ci, maven-collate-ci, maven-sonarcloud-ci
  • Integration tests: integration-tests-mysql-elasticsearch, integration-tests-postgres-opensearch
  • Python tests: py-run-tests (3.10, 3.11), py-run-build-tests
  • E2E tests: playwright-ci-postgresql (all 6 shards)
  • Test reports: Multiple Test Report jobs

Impact

No code validation, linting, testing, or building has occurred because the security gate prevents execution. The XSS sanitization implementation remains unvalidated by CI.

Code Review 👍 Approved with suggestions 0 resolved / 2 findings

Good XSS sanitization implementation using OWASP library with comprehensive test coverage. Two previous security findings remain unaddressed: the style attribute enables CSS-based attacks, and the entity link placeholder could conflict with user input.

⚠️ Security: Allowing `style` attribute enables CSS-based attacks

📄 openmetadata-service/src/main/java/org/openmetadata/service/util/InputSanitizer.java:38-39

The style attribute is allowed on multiple elements (div, span, p, pre, code), which can enable CSS-based attacks. While the OWASP HTML Sanitizer provides some protection, allowing arbitrary style values can lead to:

  1. Data exfiltration via CSS selectors and url() in backgrounds
  2. UI redressing where attackers overlay fake UI elements
  3. User tracking through unique CSS-based fingerprinting
  4. Content hiding/manipulation to confuse users

Recommendation: Either remove the style attribute entirely, or use .allowStyling(CssSchema.DEFAULT) which provides safer CSS handling. If custom styling is required, consider allowlisting specific CSS properties:

.allowAttributes("style")
.matching(StyleAttributePolicy.allowedProperties("color", "font-weight", "text-align"))
.onElements("div", "span", "p", "pre", "code")

Alternatively, rely on class attributes (which are already allowed) with predefined CSS classes for styling.

💡 Security: Entity link placeholder could conflict with user input

📄 openmetadata-service/src/main/java/org/openmetadata/service/util/InputSanitizer.java:32-33

The placeholder pattern __OM_ENTITY_LINK_%d__ used for entity link preservation could theoretically appear in legitimate user input, causing unintended behavior. While unlikely in practice, this is a defense-in-depth concern.

Scenario: If a user includes text like __OM_ENTITY_LINK_0__ in their content, and the sanitization process runs, this text would be replaced with the first extracted entity link (if any) or cause a mismatch.

Recommendation: Use a more complex placeholder that includes a random/unique component:

// Option 1: Include a UUID-based prefix
private static final String ENTITY_LINK_PLACEHOLDER = "__OM_EL_" + UUID.randomUUID().toString().substring(0, 8) + "_%d__";

// Option 2: Use invisible Unicode characters (less preferable but harder to accidentally match)
private static final String ENTITY_LINK_PLACEHOLDER = "\u200B__OM_ENTITY_LINK_%d__\u200B";

This is a low-risk edge case, but addressing it improves robustness.

Tip

Comment Gitar fix CI or enable auto-apply: gitar auto-apply:on

Options

Auto-apply is off → Gitar will not commit updates to this branch.
Display: compact → Showing less information.

Comment with these commands to change:

Auto-apply Compact
gitar auto-apply:on         
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant