Skip to content

Conversation

@cjac
Copy link
Contributor

@cjac cjac commented Oct 1, 2025

This commit addresses issues in the create-dpgce script related to service account handling, particularly for domain-scoped projects.

  • lib/env.sh:

  • lib/shared-functions.sh:

    • Replaced gcloud iam service-accounts describe with list --filter for a more reliable existence check, as describe was failing to find existing service accounts.
    • Added a retry loop for gcloud projects add-iam-policy-binding to handle potential IAM propagation delays after service account creation.
    • Ensured the script exits if role bindings fail after multiple retries.
    • Cleaned up role binding logic into a loop.

These changes ensure the script can reliably create and configure the necessary service account and its IAM roles, unblocking cluster creation.

@gemini-code-assist
Copy link

Summary of Changes

Hello @cjac, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the reliability and correctness of service account management within the create-dpgce script. It specifically addresses issues related to service account email formatting for domain-scoped projects and improves the robustness of service account creation and IAM role assignment. By implementing more reliable existence checks and retry mechanisms for IAM policy bindings, these changes ensure that the script can consistently and successfully configure the necessary Google Cloud resources, thereby unblocking cluster creation workflows.

Highlights

  • Service Account Email Formatting: The script now correctly formats Google Service Account (GSA) emails for domain-scoped projects (e.g., c9h.org:project-id) by including the domain in the email address, resolving previous issues with incorrect GSA identification.
  • Improved Service Account Existence Check: The method for checking if a service account already exists has been made more robust. It now uses gcloud iam service-accounts list --filter instead of describe, which was failing to find existing service accounts reliably.
  • Robust IAM Role Binding with Retries: A retry loop has been implemented for gcloud projects add-iam-policy-binding commands. This addresses potential IAM propagation delays after service account creation, ensuring roles are successfully bound. The script will now exit if role bindings fail after multiple retries.
  • Streamlined Role Assignment: The logic for assigning IAM roles has been refactored into a loop, making it cleaner and easier to manage. Additionally, the roles/bigquery.dataEditor role has been added to the default set of roles assigned to the service account.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@cjac
Copy link
Contributor Author

cjac commented Oct 1, 2025

/gemini review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request makes significant improvements to the service account creation and configuration process, particularly for domain-scoped projects. The changes to correctly format the service account email and replace gcloud ... describe with gcloud ... list are great for reliability. The introduction of a retry loop for IAM role binding is an excellent addition to handle propagation delays.

My review includes a few suggestions:

  • There are a couple of critical issues where the script doesn't exit on failure as intended. The exit_handler is called, but it doesn't terminate the script. This should be changed to exit 1.
  • I've also included some suggestions to improve shell script style and performance by using built-in parameter expansion and arithmetic commands.

--description="Service account for use with cluster ${CLUSTER_NAME}" \
--display-name="${SA_NAME}"; then
echo "ERROR: Failed to create service account ${SA_NAME}."
exit_handler

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

Calling exit_handler here does not terminate the script. Since trap exit_handler EXIT is set, you should call exit 1 to terminate the script on error. The trap will automatically invoke exit_handler.

Suggested change
exit_handler
exit 1

sleep ${SLEEP_TIME}
else
echo "Failed to bind ${role} to ${GSA} after ${MAX_RETRIES} attempts."
exit_handler

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

Calling exit_handler here does not terminate the script. Since trap exit_handler EXIT is set, you should call exit 1 to terminate the script on error. The trap will automatically invoke exit_handler.

Suggested change
exit_handler
exit 1

Comment on lines +63 to +64
DOMAIN=$(echo "${PROJECT_ID}" | cut -d':' -f1)
PROJECT_NAME=$(echo "${PROJECT_ID}" | cut -d':' -f2)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For better performance and to avoid forking external processes, you can use shell parameter expansion instead of echo and cut to parse the PROJECT_ID.

Suggested change
DOMAIN=$(echo "${PROJECT_ID}" | cut -d':' -f1)
PROJECT_NAME=$(echo "${PROJECT_ID}" | cut -d':' -f2)
DOMAIN="${PROJECT_ID%:*}"
PROJECT_NAME="${PROJECT_ID#*:}"

gcloud projects add-iam-policy-binding "${PROJECT_ID}" \
--member="serviceAccount:${GSA}" \
--role=roles/compute.viewer
RETRY_COUNT=$((RETRY_COUNT + 1))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

You can use the more concise C-style arithmetic command ((RETRY_COUNT++)) to increment the counter.

Suggested change
RETRY_COUNT=$((RETRY_COUNT + 1))
((RETRY_COUNT++))

This commit addresses issues in the create-dpgce script related to
service account handling, particularly for domain-scoped projects.

- lib/env.sh:
  - Correctly formats the service account email (GSA) for domain-scoped
    projects (e.g., c9h.org:project-id) by including the domain
    in the email address (e.g., sa-name@project-id.c9h.org.iam.gserviceaccount.com).

- lib/shared-functions.sh:
  - Replaced `gcloud iam service-accounts describe` with `list --filter`
    for a more reliable existence check, as describe was failing to find
    existing service accounts.
  - Added a retry loop for `gcloud projects add-iam-policy-binding`
    to handle potential IAM propagation delays after service account
    creation.
  - Ensured the script exits if role bindings fail after multiple retries.
  - Cleaned up role binding logic into a loop.

These changes ensure the script can reliably create and configure the
necessary service account and its IAM roles, unblocking cluster creation.
@cjac cjac force-pushed the scoped-service-accounts-2025-09-30 branch from 4b65f6d to bf02e6a Compare October 1, 2025 02:47
Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces valuable improvements to the service account creation process, particularly for domain-scoped projects and by adding robust retry logic for IAM role bindings. The code is cleaner and more resilient. However, I've identified a critical issue in the new error handling logic where the script would not exit upon failure, potentially leading to an inconsistent state. I've also included a suggestion to improve script performance and readability by using shell built-ins.

--description="Service account for use with cluster ${CLUSTER_NAME}" \
--display-name="${SA_NAME}"; then
echo "ERROR: Failed to create service account ${SA_NAME}."
exit 1

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

Calling exit_handler directly here will only clean up temporary files but will not terminate the script, causing it to proceed even after the service account creation failed. The script should exit with a non-zero status code. The trap exit_handler EXIT at the top of the script will ensure exit_handler is called upon exit.

Suggested change
exit 1
exit 1

sleep ${SLEEP_TIME}
else
echo "Failed to bind ${role} to ${GSA} after ${MAX_RETRIES} attempts."
exit 1

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

Similar to the service account creation failure, calling exit_handler directly here will not terminate the script. The script should exit with a non-zero status code to indicate failure after all retries for role binding have been exhausted. The configured trap will handle the cleanup.

Suggested change
exit 1
exit 1

Comment on lines +63 to +64
DOMAIN=$(echo "${PROJECT_ID}" | cut -d':' -f1)
PROJECT_NAME=$(echo "${PROJECT_ID}" | cut -d':' -f2)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For better performance and to avoid forking external processes, you can use shell parameter expansion to parse the PROJECT_ID instead of echo and cut.

Suggested change
DOMAIN=$(echo "${PROJECT_ID}" | cut -d':' -f1)
PROJECT_NAME=$(echo "${PROJECT_ID}" | cut -d':' -f2)
DOMAIN="${PROJECT_ID%%:*}"
PROJECT_NAME="${PROJECT_ID#*:}"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant