
Conversation

@tim-u tim-u commented Dec 16, 2025

Fixes #142

@gemini-code-assist

Summary of Changes

Hello @tim-u, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the robustness and user-friendliness of the Dataproc Spark Connect client by introducing specific error handling for cases where Google Cloud default credentials are not configured. By catching DefaultCredentialsError and re-raising it with a more descriptive message, users will receive clearer guidance when encountering authentication issues, streamlining the debugging process.

Highlights

  • Improved Error Handling: The system now specifically catches DefaultCredentialsError during Dataproc Spark session creation.
  • Clearer Error Message: A more informative DataprocSparkConnectException is now raised when Google Cloud default credentials are not found, aiding in quicker debugging.
  • New Unit Test: A new unit test has been added to validate the improved error handling for missing application default credentials.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request improves the error handling for missing user default credentials by adding a specific except block for DefaultCredentialsError. This provides a clearer error message to the user. A corresponding unit test has been added to verify this behavior.

My review includes two main points:

  1. In google/cloud/dataproc_spark_connect/session.py, there's an opportunity to refactor the exception handling logic to reduce code duplication across except blocks.
  2. In tests/unit/test_session.py, the new test case can be written more idiomatically using the assertRaises context manager for better readability.

Overall, the changes are good and improve user experience. Addressing the feedback will enhance code maintainability and test clarity.

Comment on lines +461 to +465
stop_create_session_pbar_event.set()
if create_session_pbar_thread.is_alive():
    create_session_pbar_thread.join()
DataprocSparkSession._active_s8s_session_id = None
DataprocSparkSession._active_session_uses_custom_id = False

medium

This cleanup logic is duplicated in three different except blocks (this one, the one for InvalidArgument/PermissionDenied, and the generic Exception one). To improve maintainability and reduce redundancy, consider factoring this logic out. You could, for example, use a single except Exception as e: block that contains the cleanup logic and then uses isinstance checks to raise the appropriate specific exception.
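The suggested consolidation could be sketched roughly as below. All names here are stand-ins for the real session.py code, and the cleanup is reduced to a callback for illustration; the actual logic is more involved:

```python
# Illustrative sketch of the suggested refactor: a single except block runs
# the shared cleanup once, then isinstance checks decide which specific
# exception to raise. Class and function names are hypothetical stand-ins.

class DataprocSparkConnectException(Exception):
    pass


class DefaultCredentialsError(Exception):
    pass


def create_session(do_create, cleanup):
    try:
        return do_create()
    except Exception as e:
        # Shared cleanup that was previously duplicated per except block.
        cleanup()
        if isinstance(e, DefaultCredentialsError):
            raise DataprocSparkConnectException(
                f"Your default credentials were not found: {e}"
            ) from e
        raise DataprocSparkConnectException(
            f"Error while creating Dataproc Session: {e}"
        ) from e
```

The trade-off is that a single handler catches everything, so the isinstance ladder must mirror the original except ordering and must not swallow exceptions the current code intentionally lets propagate.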

Comment on lines +1493 to +1498
try:
    DataprocSparkSession.builder.location("test-region").projectId(
        "test-project"
    ).getOrCreate()
except DataprocSparkConnectException as e:
    self.assertIn("Your default credentials were not found", str(e))

medium

For better readability and to follow common testing patterns, consider using the assertRaises context manager to test for exceptions. This makes the intent of the test clearer.

        with self.assertRaises(DataprocSparkConnectException) as e:
            DataprocSparkSession.builder.location("test-region").projectId(
                "test-project"
            ).getOrCreate()
        self.assertIn("Your default credentials were not found", str(e.exception))

@tim-u tim-u marked this pull request as ready for review December 16, 2025 17:28
@tim-u tim-u requested a review from medb December 16, 2025 17:29
@medb medb changed the title Improve the error message when User Default Credentials are not provided fix: Improve the error message when User Default Credentials are not provided Dec 18, 2025
DataprocSparkSession._active_s8s_session_id = None
DataprocSparkSession._active_session_uses_custom_id = False
raise DataprocSparkConnectException(
    f"Error while creating Dataproc Session: {e}"

We probably need to add a link to the ADC docs to the error message so users can fix this more easily: https://docs.cloud.google.com/docs/authentication/provide-credentials-adc
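A minimal sketch of what folding the docs link into the error message might look like. The URL is the one from this comment, and the helper name and wording are illustrative, not the library's actual API:

```python
# Hypothetical helper: builds the credentials error message with a pointer
# to the ADC setup docs. The constant mirrors the URL suggested above.
ADC_DOCS_URL = (
    "https://docs.cloud.google.com/docs/authentication/provide-credentials-adc"
)


def adc_error_message(cause):
    """Return a user-facing message for a missing-ADC failure."""
    return (
        f"Your default credentials were not found: {cause}. "
        f"To configure Application Default Credentials, see {ADC_DOCS_URL}"
    )
```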

    raise DataprocSparkConnectException(
        f"Error while creating Dataproc Session: {e.message}"
    )
except DefaultCredentialsError as e:

Did we check whether this will also fix issue #142?

@medb medb changed the title fix: Improve the error message when User Default Credentials are not provided fix: Improve the error message when Application Default Credentials are not configured Dec 18, 2025


Development

Successfully merging this pull request may close these issues.

Can't connect to Spark Serverless Session from Vertex AI Pipeline Custom Job
