ATLAS-5279: changes for handling rename propagation #609

Open
bhor-sanket wants to merge 4 commits into apache:master from bhor-sanket:ATLAS-5279

@bhor-sanket bhor-sanket commented Apr 29, 2026

What changes were proposed in this pull request?

Background

In Apache Atlas, when an entity such as a table or view is renamed (e.g., via hooks from systems like Trino), only the parent entity’s attributes are updated. However, dependent entities (such as columns) continue to retain the old qualifiedName, leading to inconsistency in metadata.

Since qualifiedName is a derived unique attribute that often embeds the parent entity name, any rename operation requires corresponding updates across all dependent entities to maintain correctness and lineage consistency.

Problem Statement

  • The issue is currently observed with specific hooks (e.g., Trino hook), where rename events do not result in updates to dependent entities
  • Rename operations update only the primary entity (table/view)
  • Dependent entities (e.g., columns) are not updated
  • Results in stale or inconsistent qualifiedName values
  • Impacts downstream use cases like lineage, discovery, and governance

How the patch resolves it

This patch enables Apache Atlas to automatically propagate rename changes across dependent entities, ensuring metadata consistency without relying on hooks to emit updates for every downstream entity.

Atlas derives rename propagation behavior from type definitions. It identifies downstream entities in the graph, understands how their unique attributes (typically qualifiedName) are constructed, and determines which segment of that attribute needs to change when an upstream entity is renamed.

During a partial update, when the qualifiedName of an entity changes, Atlas follows the configured relationships, recomputes the corresponding attribute for each affected dependent entity using the defined autoComputeFormat, and persists those updates within the same transaction. This ensures consistent metadata updates across the graph.
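The recompute step can be pictured with a minimal sketch. The `{placeholder}` template syntax, the `FormatSketch` class, and the segment names below are assumptions made for illustration; this description does not show the actual autoComputeFormat syntax, so the real format may look different.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration: the "{segment}" placeholder syntax is assumed,
// not taken from the Atlas patch.
final class FormatSketch {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    // Fills every {segment} in the format with its value from the map.
    static String compute(String format, Map<String, String> segments) {
        Matcher m = PLACEHOLDER.matcher(format);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = segments.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("missing segment: " + m.group(1));
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed column format; only the table segment changes on rename.
        String format = "{schema}.{table}.{column}";
        Map<String, String> segments = new LinkedHashMap<>();
        segments.put("schema", "sales");
        segments.put("table", "sales_data");
        segments.put("column", "amount");
        System.out.println(compute(format, segments));

        // After the table rename, swap one segment and recompute the whole value.
        segments.put("table", "sales_data_latest");
        System.out.println(compute(format, segments));
    }
}
```

Recomputing from a format, rather than doing a plain string replace on the old value, avoids accidental matches when the old table name also appears in another segment.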

High-Level Implementation

Model & Storage

  • Relationship ends can define propagateRename to indicate participation in rename propagation, and optionally propagateAttributes to map attributes on dependent entities.
  • Entity types can define attributeDefOverrides to customize attributes like qualifiedName with the appropriate autoComputeFormat.
  • These overrides are persisted on the type vertex to ensure durability across restarts and upgrades.

Type Initialization (resolveReferences)

  • Each AtlasEntityType identifies the relationships to traverse for rename propagation (RenamePropagationTarget).
  • It also precomputes mappings (e.g., autoComputeFormatPathByRefTypeNameMap) to efficiently locate and update relevant segments of computed attributes at runtime.
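As a rough sketch of the precomputation idea, the format can be parsed once at type initialization and indexed by the referenced type name, so a rename only needs a map lookup at runtime. The `{typeName.attribute}` placeholder syntax, the `FormatPathIndexSketch` class, and the `self` token are assumptions for illustration; the actual structure of autoComputeFormatPathByRefTypeNameMap is not shown in this description.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration: placeholder syntax and class name are assumed.
final class FormatPathIndexSketch {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\.(\\w+)\\}");

    // Built once per entity type: referenced type name -> placeholders that read from it.
    static Map<String, List<String>> indexByRefType(String format) {
        Map<String, List<String>> index = new LinkedHashMap<>();
        Matcher m = PLACEHOLDER.matcher(format);
        while (m.find()) {
            index.computeIfAbsent(m.group(1), k -> new ArrayList<>()).add(m.group(0));
        }
        return index;
    }

    public static void main(String[] args) {
        // Assumed format for a column type; type names are illustrative.
        String format = "{trino_schema.name}.{trino_table.name}.{self.name}";
        Map<String, List<String>> index = indexByRefType(format);

        // When a trino_table is renamed, only these placeholders need new values.
        System.out.println(index.get("trino_table"));
    }
}
```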

Entity Update Flow (createOrUpdate / Partial Update)

  • AtlasEntityStoreV2 detects changes in qualifiedName by comparing existing graph state with the incoming request.
  • If a change is detected and propagation is configured, it invokes EntityRenameHandler.
  • EntityRenameHandler traverses the graph, updates only the affected portion of dependent attributes, recomputes the final value, and registers minimal updates in the EntityMutationContext.
  • All updates are persisted within the same transaction as the original entity change.
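The flow above can be sketched against a tiny in-memory graph. This is not the code in AtlasEntityStoreV2 or EntityRenameHandler: the `Node` class, the `propagate` method, and the dotted naming scheme are assumptions, and the prefix swap stands in for the autoComputeFormat-based recomputation the patch describes. The returned map plays the role of the updates registered in the mutation context, to be persisted by the caller in one transaction.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of detect -> traverse -> collect minimal updates.
final class RenameFlowSketch {
    // Minimal stand-in for an entity vertex.
    static final class Node {
        final String guid;
        final String qualifiedName;
        final List<Node> dependents = new ArrayList<>(); // reached via propagating relationships

        Node(String guid, String qualifiedName) {
            this.guid = guid;
            this.qualifiedName = qualifiedName;
        }
    }

    // Returns guid -> new qualifiedName for each dependent whose value must change.
    static Map<String, String> propagate(Node root, String newQualifiedName) {
        Map<String, String> updates = new LinkedHashMap<>();
        String oldQualifiedName = root.qualifiedName;
        if (oldQualifiedName.equals(newQualifiedName)) {
            return updates; // no rename detected, nothing to propagate
        }
        String oldPrefix = oldQualifiedName + ".";
        Set<String> seen = new HashSet<>();
        seen.add(root.guid);
        Deque<Node> queue = new ArrayDeque<>(root.dependents);
        while (!queue.isEmpty()) {
            Node n = queue.poll();
            if (!seen.add(n.guid)) {
                continue; // already visited, guards against cycles
            }
            if (n.qualifiedName.startsWith(oldPrefix)) {
                // Replace only the segment inherited from the renamed parent.
                updates.put(n.guid, newQualifiedName + "." + n.qualifiedName.substring(oldPrefix.length()));
            }
            queue.addAll(n.dependents);
        }
        return updates;
    }

    public static void main(String[] args) {
        Node table = new Node("t1", "sales.sales_data");
        table.dependents.add(new Node("c1", "sales.sales_data.id"));
        table.dependents.add(new Node("c2", "sales.sales_data.amount"));
        System.out.println(propagate(table, "sales.sales_data_latest"));
    }
}
```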

Addons & Rollout

  • Models such as Trino are updated to opt into this feature.
  • Patch JSON applies updates like SET_ATTRIBUTE_DEF_OVERRIDES and SET_PROPAGATE_RENAME, with endDefToken determining the applicable relationship end (endDef1 or endDef2).

Documentation

  • Detailed documentation is available in docs/src/documents/RenamePropagation.md.

Potential Use Cases and Extensions of This Solution

The rename propagation logic is intentionally designed to be generic and model-driven, rather than being hardcoded for specific entity types or hooks. Behavior is derived dynamically from Atlas type definitions and relationship metadata, making it extensible for future hooks and systems.

By using relationship metadata (propagateRename, propagateAttributes) and computed attribute definitions (autoComputeFormat), rename propagation can be enabled or modified through model changes without requiring Atlas core code changes. This provides a flexible foundation for consistent metadata updates across different ecosystems and entity hierarchies.

Key Use Cases Enabled

Lightweight Hooks with Reduced Event Overhead
Hooks no longer need to emit updates for every dependent entity. Atlas resolves and updates dependencies internally, reducing event volume and simplifying hook logic. (Performance/scalability trade-offs should be evaluated for this approach.)

Cross-System Rename Propagation
Enables model-driven propagation across different metadata systems and hooks (e.g., Hive → Trino). For example, renaming a Hive table can automatically propagate updates to corresponding Trino table entities and their dependent columns, ensuring consistent qualifiedName values across systems.

Standardized Unique Attribute Definition
Provides a model-driven way to define qualifiedName formats, removing the need for system-specific hardcoded logic in hooks and ensuring consistency across entity creation and rename flows.

Selective Propagation Control
Allows explicit control over which relationships participate in rename propagation, enabling exclusion of certain dependent entities where propagation is not required.

Attribute Inheritance with Entity-Level Overrides
Establishes a baseline where derived entities can inherit attributes defined at a parent type and override them as needed. For example, qualifiedName defined at a base type (e.g., Referenceable) can be overridden by derived entities such as Table using attributeDefOverrides, allowing entity-specific customization while preserving a consistent framework.
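The inherit-then-override lookup can be sketched as a walk up the type hierarchy that prefers an override on the nearest type. The `TypeDef` class, its fields, and the format strings are hypothetical; how attributeDefOverrides is actually resolved in the patch is not shown here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration of attribute inheritance with per-type overrides.
final class OverrideSketch {
    static final class TypeDef {
        final String name;
        final TypeDef superType; // null for a root type
        final Map<String, String> declared = new HashMap<>();  // attribute -> format declared on this type
        final Map<String, String> overrides = new HashMap<>(); // attribute -> format overriding an inherited one

        TypeDef(String name, TypeDef superType) {
            this.name = name;
            this.superType = superType;
        }
    }

    // The nearest type wins; within one type, an override beats a declaration.
    static Optional<String> resolveFormat(TypeDef type, String attribute) {
        for (TypeDef t = type; t != null; t = t.superType) {
            String format = t.overrides.get(attribute);
            if (format == null) {
                format = t.declared.get(attribute);
            }
            if (format != null) {
                return Optional.of(format);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        TypeDef referenceable = new TypeDef("Referenceable", null);
        referenceable.declared.put("qualifiedName", "{name}"); // assumed baseline format

        TypeDef table = new TypeDef("Table", referenceable);
        table.overrides.put("qualifiedName", "{schema}.{name}"); // assumed entity-specific format

        System.out.println(resolveFormat(table, "qualifiedName"));
        System.out.println(resolveFormat(referenceable, "qualifiedName"));
    }
}
```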

How was this patch tested?

Setup:

  • Configured Trino hook with Atlas running on Docker
  • Applied the patch changes on the Atlas setup

Use-case validation:

  • Rename Propagation Validation

    • Renamed Trino table (sales_data → sales_data_latest)
    • Verified that column unique attributes were updated with the new table name
  • Regression Testing (Open Source Trino Docker Setup)

    • Create Trino table

      • Verified creation of Trino and Hive table entities along with their association in Atlas
    • Add column to Trino table

      • Verified creation of the new column entity under the Trino table
    • Rename column in Trino table

      • Verified column rename update in Atlas
    • Drop column from Trino table

      • Verified deletion of the column entity from Atlas
    • Drop Trino table

      • Verified deletion of the Trino table entity from Atlas

@bhor-sanket bhor-sanket marked this pull request as draft April 29, 2026 07:14
sanket.bhor added 3 commits May 5, 2026 10:04
…lrconfig for newly added field and handled deleted column use-case
…lrconfig for newly added field and handled deleted column use-case
@bhor-sanket bhor-sanket marked this pull request as ready for review May 7, 2026 04:36