NIFI-15754 Add Google Cloud Storage Provider for Iceberg #11052
Open
exceptionfactory wants to merge 2 commits into apache:main
- Added Iceberg GCS module and NAR
- Added Iceberg FileIO implementation using Java HttpClient for GCS REST operations
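For readers unfamiliar with calling GCS over plain HTTP, the following is a minimal sketch of how an object download request could be built with the Java HttpClient API and a Bearer token. It is not the code from this pull request: the `GcsRequests` class and `downloadRequest` method are hypothetical names, and the sketch assumes the public GCS JSON API endpoint with `alt=media` for object content.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of the pull request.
final class GcsRequests {
    // Assumed base URI of the GCS JSON API
    private static final String BASE_URI = "https://storage.googleapis.com/storage/v1";

    private GcsRequests() {
    }

    // Builds a GET request for the content of an object, authenticated with a Bearer token
    static HttpRequest downloadRequest(final String bucket, final String objectName, final String accessToken) {
        // The object name is a single path segment, so '/' must be percent-encoded.
        // URLEncoder produces form encoding, where a space becomes '+'; convert it to %20.
        final String encodedName = URLEncoder.encode(objectName, StandardCharsets.UTF_8).replace("+", "%20");
        final URI uri = URI.create(BASE_URI + "/b/" + bucket + "/o/" + encodedName + "?alt=media");
        return HttpRequest.newBuilder(uri)
                .header("Authorization", "Bearer " + accessToken)
                .GET()
                .build();
    }
}
```

The resulting request can be passed to `HttpClient.send` with a body handler such as `HttpResponse.BodyHandlers.ofInputStream()`.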
pvillard31
requested changes
Mar 29, 2026
...berg-gcs/src/main/java/org/apache/nifi/services/iceberg/gcs/GoogleCloudStorageInputFile.java
Outdated
...iceberg-gcs/src/main/java/org/apache/nifi/services/iceberg/gcs/GoogleCloudStorageFileIO.java
Outdated
Comment on lines +52 to +54
    try {
        return httpClient.send(request, bodyHandler);
    } catch (final InterruptedException e) {
Contributor
Should we implement retry with backoff for transient failures (429, 500, 503, ...)?
Contributor
Author
Yes, it looks like that would more closely align with the Google Cloud Storage library behavior. I will implement a basic retry and backoff strategy.
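To make the suggestion concrete, here is a minimal sketch of retry with exponential backoff around `HttpClient.send`. It is an illustration under stated assumptions, not the strategy that was pushed to this pull request: the `RetryingSender` class, the set of retryable status codes, the attempt limit, and the delay values are all hypothetical. Jitter is left out to keep the sketch deterministic, and I/O exceptions are not retried.

```java
import java.io.IOException;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.Set;

// Hypothetical helper for illustration only; not the implementation in the pull request.
final class RetryingSender {
    // Assumed set of transient status codes worth retrying
    private static final Set<Integer> RETRYABLE_STATUS = Set.of(429, 500, 502, 503, 504);
    private static final int MAX_ATTEMPTS = 4;
    private static final long BASE_DELAY_MS = 250;
    private static final long MAX_DELAY_MS = 5000;

    private RetryingSender() {
    }

    static boolean isRetryable(final int statusCode) {
        return RETRYABLE_STATUS.contains(statusCode);
    }

    // Delay doubles with each attempt (250 ms, 500 ms, 1000 ms, ...) up to MAX_DELAY_MS
    static Duration backoff(final int attempt) {
        final int shift = Math.min(Math.max(attempt - 1, 0), 20);
        return Duration.ofMillis(Math.min(MAX_DELAY_MS, BASE_DELAY_MS << shift));
    }

    // Sends the request, retrying on transient status codes until MAX_ATTEMPTS is reached.
    // The last response is returned as-is so the caller can report the final status.
    static <T> HttpResponse<T> send(final HttpClient client, final HttpRequest request,
                                    final HttpResponse.BodyHandler<T> handler) throws IOException {
        for (int attempt = 1; ; attempt++) {
            final HttpResponse<T> response;
            try {
                response = client.send(request, handler);
            } catch (final InterruptedException e) {
                // Restore the interrupt flag so callers can observe the interruption
                Thread.currentThread().interrupt();
                throw new IOException("Interrupted while sending request", e);
            }
            if (!isRetryable(response.statusCode()) || attempt >= MAX_ATTEMPTS) {
                return response;
            }
            pause(backoff(attempt));
        }
    }

    private static void pause(final Duration delay) throws IOException {
        try {
            Thread.sleep(delay.toMillis());
        } catch (final InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException("Interrupted while waiting to retry", e);
        }
    }
}
```

A fuller version would likely add random jitter to the delay and close any streamed response body before retrying.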
...erg-gcs/src/main/java/org/apache/nifi/services/iceberg/gcs/GoogleCloudStorageProperties.java
Outdated
24f2a0f to eb2a5c5
eb2a5c5 to 80926bb
Contributor
Author
Thanks for the review @pvillard31. I pushed an update to address the feedback and added a simple retry strategy for HTTP requests.
Summary
NIFI-15754 Adds Google Cloud Storage support to Iceberg modules with a GCSIcebergFileIOProvider Controller Service implementation.
The new Controller Service is packaged in a separate nifi-iceberg-gcs module and bundled in nifi-iceberg-gcs-nar, following the pattern of the AWS S3 and Azure Data Lake Storage implementations.
Although the iceberg-gcp module from the Apache Iceberg project includes a GCSFileIO implementation, the class depends on the google-cloud-storage library, which has dozens of direct and transitive dependencies.
Instead of using the GCSFileIO class, the Controller Service package includes a direct implementation of the Iceberg FileIO interface named GoogleCloudStorageFileIO. The implementation uses the Java HttpClient for REST operations with the GCS API. Supporting Bearer Token authentication with vended credentials from an Iceberg REST Catalog, the FileIO implementation avoids multiple layers of dependencies. The new implementation reads the same properties defined in the Iceberg GCPProperties to support compatibility with Iceberg Catalogs.
The InputStream and OutputStream implementations handle direct interaction with Google Cloud Storage, including support for resumable uploads.
Tests for the FileIO implementation include interaction with the OkHttp MockWebServer to verify expected HTTP requests and responses.
Tracking
Please complete the following tracking steps prior to pull request creation.
Issue Tracking
Pull Request Tracking
NIFI-00000, NIFI-00000, Verified status
Pull Request Formatting
main branch
Verification
Please indicate the verification steps performed prior to pull request creation.
Build
./mvnw clean install -P contrib-check
Licensing
LICENSE and NOTICE files
Documentation