PLUGIN-1413 Added argument for BQ temporary staging bucket names #1152

fernst · 2022-09-29T01:28:05Z

No description provided.

albertshau · 2022-09-29T16:09:01Z

src/main/java/io/cdap/plugin/gcp/bigquery/util/BigQueryUtil.java


  private static final Map<Schema.Type, Set<LegacySQLTypeName>> TYPE_MAP = ImmutableMap.<Schema.Type,
-    Set<LegacySQLTypeName>>builder()
+      Set<LegacySQLTypeName>>builder()


this change isn't needed

albertshau · 2022-09-29T16:09:14Z

src/main/java/io/cdap/plugin/gcp/bigquery/util/BigQueryUtil.java

        if (Integer.parseInt(chunkSize) % MediaHttpUploader.MINIMUM_CHUNK_SIZE != 0) {
          collector.addFailure(
-            String.format("Value must be a multiple of %s.", MediaHttpUploader.MINIMUM_CHUNK_SIZE), null)
+              String.format("Value must be a multiple of %s.", MediaHttpUploader.MINIMUM_CHUNK_SIZE), null)


indentation

reverted this change

albertshau · 2022-09-29T16:11:02Z

src/main/java/io/cdap/plugin/gcp/bigquery/util/BigQueryUtil.java

  private static final Logger LOG = LoggerFactory.getLogger(BigQueryUtil.class);

  private static final String DEFAULT_PARTITION_COLUMN_NAME = "_PARTITIONTIME";
+  private static final String BIGQUERY_BUCKET_PREFIX_PROPERTY_NAME = "io.cdap.plugin.bigquery.bucket.prefix";


We do the same thing for CMEK, where the argument key is 'gcp.cmek.key.name' (see CmekUtils). Let's follow that pattern and name it something like 'gcp.bigquery.bucket.prefix'.
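To make the naming suggestion concrete, here is a minimal sketch of reading such an argument from a map of runtime arguments. The class name, the method name, and the blank-value handling are assumptions for illustration and are not taken from the PR; only the key string comes from the comment above.

```java
import java.util.HashMap;
import java.util.Map;

final class BucketPrefixArgument {
  // Key suggested in the review comment, mirroring 'gcp.cmek.key.name'.
  static final String BUCKET_PREFIX_ARG = "gcp.bigquery.bucket.prefix";

  /** Returns the trimmed prefix, or null when the argument is absent or blank. */
  static String getBucketPrefix(Map<String, String> arguments) {
    String prefix = arguments.get(BUCKET_PREFIX_ARG);
    if (prefix == null || prefix.trim().isEmpty()) {
      return null;
    }
    return prefix.trim();
  }

  public static void main(String[] args) {
    Map<String, String> arguments = new HashMap<>();
    arguments.put(BUCKET_PREFIX_ARG, "my-staging");
    System.out.println(getBucketPrefix(arguments));
  }
}
```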

albertshau · 2022-09-29T16:15:02Z

src/main/java/io/cdap/plugin/gcp/bigquery/util/BigQueryUtil.java

+   * We use this to ensure location name length is constant (only 8 characters).
+   *
+   * @param location location to checksum
+   * @return checksum value as an 8 character string (hex).
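
For reference, a fixed-length location checksum of the kind this Javadoc describes could look like the sketch below. It assumes CRC32 from `java.util.zip` and UTF-8 encoding. The class and method names are hypothetical, and the PR's actual implementation may differ.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

final class LocationChecksum {

  /**
   * Returns the CRC32 checksum of the location as a zero-padded,
   * 8 character lowercase hex string.
   */
  static String crc32Hex(String location) {
    CRC32 crc32 = new CRC32();
    crc32.update(location.getBytes(StandardCharsets.UTF_8));
    // getValue() holds a 32-bit checksum in a long, so %08x always yields 8 digits.
    return String.format("%08x", crc32.getValue());
  }

  public static void main(String[] args) {
    // Locations of different lengths produce strings of the same length.
    System.out.println(crc32Hex("us"));
    System.out.println(crc32Hex("northamerica-northeast1"));
  }
}
```

A fixed-width value like this keeps a generated name's length predictable regardless of how long the location string is, which appears to be the point of the Javadoc above.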


albertshau · 2022-09-29T16:16:07Z

src/test/java/io/cdap/plugin/gcp/bigquery/util/BigQueryUtilTest.java

+      hashValues.add(hash);
+    }
+
+    System.out.println(hashValues);


don't need this

albertshau · 2022-09-29T16:19:24Z

src/main/java/io/cdap/plugin/gcp/bigquery/sink/AbstractBigQuerySink.java

+    if (bucketName == null && bucketPrefix != null) {
+      // Check if the destination dataset exists.
+      DatasetId datasetId = DatasetId.of(config.getDatasetProject(), config.getDataset());
+      Dataset dataset = bigQuery.getDataset(datasetId);


It seems like we should already be making this call as part of the existing validation/automatic bucket creation. If so, we should do some refactoring to avoid duplicate calls. It's become kind of a mess now, but ideally we do all the I/O in a single place and pass the return objects around.

I don't see an easy way to refactor this without making 2 calls to get the dataset.

The problem is that the bucket name is needed even in preview. This makes it difficult to refactor the code without changing the entire method signature of the abstract BigQuery sink and of other callers (such as the BigQuery Pushdown implementation).

I have refactored the code a bit, hope this helps
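
The idea from the first comment in this thread (do the I/O once and pass the returned object around) can be sketched as follows. Everything here is hypothetical: `DatasetInfo` stands in for the BigQuery `Dataset`, the `Supplier` stands in for the remote call, and the "prefix-location" bucket format is illustrative only. It does not address the preview constraint raised in the reply.

```java
import java.util.function.Supplier;

final class SingleFetchSketch {
  /** Hypothetical stand-in for the dataset metadata returned by the remote call. */
  static final class DatasetInfo {
    final String location;

    DatasetInfo(String location) {
      this.location = location;
    }
  }

  /** Validation step that receives the already-fetched dataset instead of fetching it. */
  static boolean datasetExists(DatasetInfo dataset) {
    return dataset != null;
  }

  /** Naming step that reuses the same object; the format is not the plugin's real scheme. */
  static String stagingBucketName(String bucketName, String bucketPrefix, DatasetInfo dataset) {
    if (bucketName == null && bucketPrefix != null && dataset != null) {
      return bucketPrefix + "-" + dataset.location;
    }
    return bucketName;
  }

  /** Performs the single remote call, then passes the result to both steps. */
  static String prepare(Supplier<DatasetInfo> fetcher, String bucketName, String bucketPrefix) {
    DatasetInfo dataset = fetcher.get(); // the only I/O in this flow
    if (!datasetExists(dataset)) {
      throw new IllegalStateException("Dataset does not exist");
    }
    return stagingBucketName(bucketName, bucketPrefix, dataset);
  }

  public static void main(String[] args) {
    int[] calls = {0};
    String bucket = prepare(() -> {
      calls[0]++;
      return new DatasetInfo("us");
    }, null, "my-staging");
    System.out.println(bucket + " after " + calls[0] + " fetch(es)");
  }
}
```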

fernst added the build Trigger unit test build label Sep 29, 2022

fernst requested review from albertshau, chtyim and masoud-io September 29, 2022 01:28

albertshau reviewed Sep 29, 2022

fernst requested a review from albertshau September 29, 2022 22:18

albertshau approved these changes Sep 30, 2022

PLUGIN-1413 Added argument for BQ temporary staging bucket names

3adb4b6

fernst force-pushed the PLUGIN-1413 branch from 27a5c24 to 3adb4b6 Compare September 30, 2022 18:02

fernst merged commit c048522 into develop Sep 30, 2022

itsankit-google mentioned this pull request Nov 3, 2022

PLUGIN-1418: add argument for BQ temp staging bucket names #1177

Merged

fernst added the bq-pushdown label Jan 20, 2025

