Content CSV import can produce duplicate-looking tags on updated content and new content #35124

@syedATdot

Description

Problem Statement

When bulk updating content via CSV import, tag fields can end up showing duplicate-looking tags (same label appears more than once on a content item).

This appears to be a content import path issue (not the Tags v2 import endpoint), likely caused by how tag field values are processed during content import:

  • Tag values are added via content import logic (ImportUtil/addContentletTagInode path).
  • The import flow may not consistently resolve tag creation/lookup against the site’s effective tagStorage host.
  • If a tag name already exists under a different host context (for example SYSTEM_HOST vs site tag storage host), import can create/use another tag record with the same name.
  • During bulk update, tags are added as relations and may not be treated as a strict replace operation, which can make duplicates visible in UI.

This is related in theme to duplicate tag handling in #34548, but affects a different code path (content CSV import).
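
The suspected mechanism can be modeled in a few lines. This is only an illustrative sketch in Python, not dotCMS code: `TagStore`, `effective_tag_storage`, the `site-a` identifier, and the fallback to the site's own host are all hypothetical names and assumptions made for the example. It shows how looking up a tag by the content's own host, instead of the site's configured tag storage host, yields a second record with the same name.

```python
from dataclasses import dataclass, field

SYSTEM_HOST = "SYSTEM_HOST"


@dataclass(frozen=True)
class Tag:
    name: str
    host_id: str


@dataclass
class TagStore:
    """In-memory stand-in for the tag table, keyed by (lowercased name, host)."""
    tags: dict = field(default_factory=dict)

    def get_or_create(self, name: str, host_id: str) -> Tag:
        key = (name.strip().lower(), host_id)
        if key not in self.tags:
            self.tags[key] = Tag(name.strip(), host_id)
        return self.tags[key]


def effective_tag_storage(site_id: str, tag_storage_by_site: dict) -> str:
    """Return the host that stores tags for a site.

    Falling back to the site itself is an assumption of this sketch.
    """
    return tag_storage_by_site.get(site_id, site_id)


store = TagStore()
# Hypothetical configuration: site-a keeps its tags on SYSTEM_HOST.
tag_storage_by_site = {"site-a": SYSTEM_HOST}

# A tag that already exists before the import runs.
existing = store.get_or_create("susa", SYSTEM_HOST)

# Suspected faulty path: lookup by the content's own host creates a second record.
wrong = store.get_or_create("susa", "site-a")

# Intended path: resolve the storage host first, then look up the tag.
right = store.get_or_create(
    "susa", effective_tag_storage("site-a", tag_storage_by_site)
)
```

In this model `wrong` is a new record with the same label as `existing`, while `right` is the existing record itself. If both end up related to one content item, the UI would show the label twice.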

Steps to Reproduce

https://drive.google.com/file/d/1AiFmX1EZ_N352pV0rwjOf14qQN9NWnNM/view?usp=drive_link

  • Configure a site where tag storage can differ from direct host usage (or use data where existing tags are already under a different host context).
  • Ensure at least one tag exists already (example: susa) in one host/tag-storage context.
  • Export content from the dotCMS backend as CSV, then prepare an update CSV that includes a tag field for existing content rows (use repeated or common tags such as susa, state abbreviations, etc.).
  • Run Content Import (bulk update existing content) using that CSV.
  • Open one of the updated content items (for example row 3 from the import).
  • Observe tag chips in UI: same tag label can appear more than once (duplicate-looking tags).

Acceptance Criteria

  • Content CSV import should treat tag updates deterministically and avoid duplicate-looking tags on content.
  • Tag lookup/creation during content import should consistently resolve against the effective tagStorage host for the target site.
  • Import should not create parallel tag records with identical names across mismatched host contexts for the same intended tag.
  • A content item should show each logical tag only once after import (idempotent behavior on repeated imports).
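
The criteria above could be expressed as a small behavioral check. The following is a minimal sketch in Python, not the dotCMS implementation: `parse_tag_cell`, `replace_tags`, and `append_tags` are hypothetical helpers, and the comma-separated cell format and case-insensitive name matching are assumptions. It contrasts a naive append with a replace that binds every tag to one storage host and keeps each label once.

```python
def parse_tag_cell(cell):
    """Split a comma-separated tag cell into trimmed, non-empty names."""
    return [part.strip() for part in cell.split(",") if part.strip()]


def replace_tags(cell, storage_host):
    """Build the full tag list for a content item from one CSV cell.

    Tags are (name, host) tuples. Names are compared case-insensitively and
    every tag is bound to storage_host, so each label appears only once.
    """
    seen = set()
    result = []
    for name in parse_tag_cell(cell):
        key = name.lower()
        if key not in seen:
            seen.add(key)
            result.append((name, storage_host))
    return result


def append_tags(current, cell, host):
    """Contrast case: add relations without replacing or deduplicating."""
    return current + [(name, host) for name in parse_tag_cell(cell)]


existing = [("susa", "SYSTEM_HOST")]
cell = "susa, CA, susa"

# Naive append under a mismatched host: the label "susa" now appears three times.
appended = append_tags(existing, cell, "site-a")

# Replace semantics: the result depends only on the cell and the storage host,
# so running the same import twice gives the same tag list.
replaced_once = replace_tags(cell, "SYSTEM_HOST")
replaced_twice = replace_tags(cell, "SYSTEM_HOST")
```

Because `replace_tags` ignores the item's previous relations, repeated imports of the same row cannot accumulate duplicates, which is the idempotent behavior the last criterion asks for.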

dotCMS Version

latest

Severity

Medium - Some functionality impacted

Links

https://dotcms.freshdesk.com/a/tickets/35999
