NIFI-15758: Add fragment attribute support to UnpackContent and remove fragment attributes from MergeContent in Defragment mode #11058
Open
Scrooge-McDucks wants to merge 1 commit into apache:main
Summary
This change adds optional fragment attribute support to `UnpackContent` so unpacked FlowFiles can be regrouped downstream using `MergeContent` in `Defragment` mode.
It also updates `MergeContent` to remove reassembly-related attributes from the merged FlowFile once defragmentation has completed successfully, including:
- `fragment.identifier`
- `fragment.index`
- `fragment.count`
- `segment.original.filename`
Motivation
A common dataflow pattern is:
- `UnpackContent` extracts individual FlowFiles
This works well conceptually, but today `UnpackContent` does not provide a built-in way to assign the fragment attributes needed for downstream reassembly across formats such as ZIP, TAR, and FlowFile Package.
Without those attributes, users need custom logic to preserve grouping and ordering, which adds complexity and can lead to inconsistent behaviour.
This change makes that workflow easier by allowing `UnpackContent` to optionally generate fragment attributes, while ensuring `MergeContent` removes the temporary reassembly metadata once the final merged FlowFile has been produced.
Changes Included
UnpackContent
Added optional support for assigning fragment attributes to unpacked FlowFiles.
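To make the attribute semantics concrete, here is a minimal sketch in plain Python (not NiFi's Java processor API) of how the four fragment attributes could be assigned to the entries of one archive. The function name, the entry names, the 1-based `fragment.index`, and the use of the archive filename for `segment.original.filename` are assumptions for illustration; this description only states that the index follows entry order and the count equals the number of unpacked entries.

```python
def assign_fragment_attributes(entry_names, identifier, original_filename, first_index=1):
    """Return one attribute dict per unpacked entry, in archive order.

    Hypothetical helper: attribute values are strings, as FlowFile
    attributes are. The starting index (1) is an assumption.
    """
    count = len(entry_names)
    fragments = []
    for offset, name in enumerate(entry_names):
        fragments.append({
            "filename": name,
            "fragment.identifier": identifier,                # shared by all entries
            "fragment.index": str(first_index + offset),      # entry order in the archive
            "fragment.count": str(count),                     # total unpacked entries
            "segment.original.filename": original_filename,   # assumed: archive filename
        })
    return fragments


# Hypothetical archive.zip with three entries; the identifier mimics ${filename}.
fragments = assign_fragment_attributes(
    ["a.txt", "b.txt", "c.txt"],
    identifier="archive.zip",
    original_filename="archive.zip",
)
for attrs in fragments:
    print(attrs["filename"], attrs["fragment.index"], "of", attrs["fragment.count"])
```

Because every entry shares one identifier and carries the total count, a downstream defragmenting step has what it needs to tell when a group is complete and how to order it.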
New Properties
Add Fragment Attributes
- `fragment.identifier`
- `fragment.index`
- `fragment.count`
- `segment.original.filename`
Fragment Identifier Value
- Attribute: `fragment.identifier`
- Default: `${UUID()}`
Examples:
- `${UUID()}` for a unique grouping per archive (default)
- `${filename}` for grouping based on the original filename
- `${archive.filename}` when an explicit archive attribute is available
Behaviour
When enabled:
- `fragment.identifier` is set from Fragment Identifier Value
- `fragment.index` is assigned based on entry order within the archive
- `fragment.count` is set to the total number of unpacked entries
When disabled:
- `UnpackContent` behaviour is unchanged
MergeContent
Updated `MergeContent` so that after a successful defragmentation, the merged FlowFile no longer retains temporary reassembly metadata.
When operating in `Defragment` mode, `MergeContent` now removes the following attributes from the merged FlowFile:
- `fragment.identifier`
- `fragment.index`
- `fragment.count`
- `segment.original.filename`
This ensures the final merged output reflects the completed repackaged artifact rather than the intermediate fragmentation state used to drive regrouping.
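The cleanup step can be sketched as follows. This is a minimal illustration in plain Python, not the actual `MergeContent` code: the helper name and the sample attribute values (including `mime.type`) are hypothetical, and only the four attribute names come from this description.

```python
# Attributes that only exist to drive reassembly in Defragment mode.
REASSEMBLY_ATTRIBUTES = (
    "fragment.identifier",
    "fragment.index",
    "fragment.count",
    "segment.original.filename",
)


def strip_reassembly_attributes(attributes):
    """Return a copy of the merged FlowFile's attributes without reassembly metadata."""
    return {k: v for k, v in attributes.items() if k not in REASSEMBLY_ATTRIBUTES}


# Hypothetical attributes on a merged FlowFile before cleanup.
merged = {
    "filename": "archive.zip",
    "mime.type": "application/zip",
    "fragment.identifier": "archive.zip",
    "fragment.index": "3",
    "fragment.count": "3",
    "segment.original.filename": "archive.zip",
}
cleaned = strip_reassembly_attributes(merged)
print(sorted(cleaned))  # ['filename', 'mime.type']
```

Returning a copy rather than mutating the input keeps the sketch free of side effects; all other attributes pass through untouched.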
Compatibility
- Fragment attribute support in `UnpackContent` is opt-in
- `MergeContent` cleanup only applies after successful defragmentation
Example
Input:
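- `archive.zip` containing 3 files
Unpack output when enabled with `${filename}` as the identifier: each unpacked FlowFile would carry `fragment.identifier` = `archive.zip`, `fragment.count` = `3`, and a `fragment.index` reflecting its entry order within the archive.
After processing and successful defragmentation in `MergeContent`, the merged FlowFile no longer retains:
- `fragment.identifier`
- `fragment.index`
- `fragment.count`
- `segment.original.filename`
Summary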
NIFI-15758
Tracking
Please complete the following tracking steps prior to pull request creation.
Issue Tracking
Pull Request Tracking
- `NIFI-00000`
- `NIFI-00000`
- `Verified` status
Pull Request Formatting
- `main` branch
Verification
Please indicate the verification steps performed prior to pull request creation.
Build
- `./mvnw clean install -P contrib-check`
Licensing
- `LICENSE` and `NOTICE` files
Documentation