Given the following Kubeflow component, executed as a Vertex AI custom job triggered by a Vertex AI pipeline:
from kfp.dsl import component

@component(base_image="python:3.11", packages_to_install=["dataproc_spark_connect==0.9.0"])
def trigger_colab_notebook_execution():
    from google.cloud.dataproc_spark_connect import DataprocSparkSession
    # Spark Connect imports differ: use pyspark.sql.connect.functions
    # instead of pyspark.sql.functions.
    import pyspark.sql.connect.functions as F
    from datetime import timedelta

    # SA and PROJECT_ID stand in for the real service account and project ID.
    spark = (
        DataprocSparkSession.builder
        .appName("MySparkApp")
        .idleTtl(timedelta(minutes=30))
        .runtimeVersion("2.3")
        .config("spark.dynamicAllocation.maxExecutors", "4")
        .serviceAccount(SA)
        .projectId(PROJECT_ID)
        .location("europe-west1")
        .getOrCreate()
    )

    from pyspark.sql import Row
    df = spark.createDataFrame([
        Row(col1=1, col2="a"),
        Row(col1=2, col2="b"),
        Row(col1=3, col2="c"),
        Row(col1=4, col2="d"),
        Row(col1=5, col2="e"),
    ])
    df.show()
    spark.stop()

The job fails because the server rejects the Spark client's WebSocket connection with HTTP 401:
ERROR 2025-09-24T14:00:47.393154648Z [resource.labels.taskName: workerpool0-0] Exception in thread Thread-52 (forward_connection):
ERROR 2025-09-24T14:00:47.393178226Z [resource.labels.taskName: workerpool0-0] Traceback (most recent call last):
ERROR 2025-09-24T14:00:47.393192105Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
ERROR 2025-09-24T14:00:47.395184066Z [resource.labels.taskName: workerpool0-0] self.run()
ERROR 2025-09-24T14:00:47.395209026Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/threading.py", line 982, in run
ERROR 2025-09-24T14:00:47.399372305Z [resource.labels.taskName: workerpool0-0] self._target(*self._args, **self._kwargs)
ERROR 2025-09-24T14:00:47.399390990Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/google/cloud/dataproc_spark_connect/client/proxy.py", line 170, in forward_connection
ERROR 2025-09-24T14:00:47.399411026Z [resource.labels.taskName: workerpool0-0] with connect_tcp_bridge(target_host) as websocket_conn:
ERROR 2025-09-24T14:00:47.399447296Z [resource.labels.taskName: workerpool0-0] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 2025-09-24T14:00:47.399457044Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/google/cloud/dataproc_spark_connect/client/proxy.py", line 80, in connect_tcp_bridge
ERROR 2025-09-24T14:00:47.399465970Z [resource.labels.taskName: workerpool0-0] return websocketclient.connect(
ERROR 2025-09-24T14:00:47.399474408Z [resource.labels.taskName: workerpool0-0] ^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 2025-09-24T14:00:47.399482154Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/websockets/sync/client.py", line 378, in connect
ERROR 2025-09-24T14:00:47.399490165Z [resource.labels.taskName: workerpool0-0] connection.handshake(
ERROR 2025-09-24T14:00:47.399500407Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/websockets/sync/client.py", line 101, in handshake
ERROR 2025-09-24T14:00:47.399507644Z [resource.labels.taskName: workerpool0-0] raise self.protocol.handshake_exc
ERROR 2025-09-24T14:00:47.399516155Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/websockets/client.py", line 325, in parse
ERROR 2025-09-24T14:00:47.399530099Z [resource.labels.taskName: workerpool0-0] self.process_response(response)
ERROR 2025-09-24T14:00:47.399538529Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/websockets/client.py", line 142, in process_response
ERROR 2025-09-24T14:00:47.399545526Z [resource.labels.taskName: workerpool0-0] raise InvalidStatus(response)
ERROR 2025-09-24T14:00:47.399560160Z [resource.labels.taskName: workerpool0-0] websockets.exceptions.InvalidStatus: server rejected WebSocket connection: HTTP 401
ERROR 2025-09-24T14:00:48.191835742Z [resource.labels.taskName: workerpool0-0] Exception in thread Thread-57 (forward_connection):
ERROR 2025-09-24T14:00:48.191869173Z [resource.labels.taskName: workerpool0-0] Traceback (most recent call last):
ERROR 2025-09-24T14:00:48.191900398Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
ERROR 2025-09-24T14:00:48.201565366Z [resource.labels.taskName: workerpool0-0] self.run()
ERROR 2025-09-24T14:00:48.201586845Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/threading.py", line 982, in run
ERROR 2025-09-24T14:00:48.204504021Z [resource.labels.taskName: workerpool0-0] self._target(*self._args, **self._kwargs)
ERROR 2025-09-24T14:00:48.204522326Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/google/cloud/dataproc_spark_connect/client/proxy.py", line 170, in forward_connection
ERROR 2025-09-24T14:00:48.204532576Z [resource.labels.taskName: workerpool0-0] with connect_tcp_bridge(target_host) as websocket_conn:
ERROR 2025-09-24T14:00:48.204548739Z [resource.labels.taskName: workerpool0-0] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 2025-09-24T14:00:48.204558147Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/google/cloud/dataproc_spark_connect/client/proxy.py", line 80, in connect_tcp_bridge
ERROR 2025-09-24T14:00:48.204566088Z [resource.labels.taskName: workerpool0-0] return websocketclient.connect(
ERROR 2025-09-24T14:00:48.204573019Z [resource.labels.taskName: workerpool0-0] ^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 2025-09-24T14:00:48.204581387Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/websockets/sync/client.py", line 378, in connect
ERROR 2025-09-24T14:00:48.204589438Z [resource.labels.taskName: workerpool0-0] connection.handshake(
ERROR 2025-09-24T14:00:48.204600501Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/websockets/sync/client.py", line 101, in handshake
ERROR 2025-09-24T14:00:48.204607705Z [resource.labels.taskName: workerpool0-0] raise self.protocol.handshake_exc
ERROR 2025-09-24T14:00:48.204615559Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/websockets/client.py", line 325, in parse
ERROR 2025-09-24T14:00:48.204623113Z [resource.labels.taskName: workerpool0-0] self.process_response(response)
ERROR 2025-09-24T14:00:48.204633713Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/websockets/client.py", line 142, in process_response
ERROR 2025-09-24T14:00:48.204640672Z [resource.labels.taskName: workerpool0-0] raise InvalidStatus(response)
ERROR 2025-09-24T14:00:48.204647519Z [resource.labels.taskName: workerpool0-0] websockets.exceptions.InvalidStatus: server rejected WebSocket connection: HTTP 401
ERROR 2025-09-24T14:00:49.806677478Z [resource.labels.taskName: workerpool0-0] Exception in thread Thread-60 (forward_connection):
ERROR 2025-09-24T14:00:49.806748694Z [resource.labels.taskName: workerpool0-0] Traceback (most recent call last):
ERROR 2025-09-24T14:00:49.806769320Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
ERROR 2025-09-24T14:00:49.810217855Z [resource.labels.taskName: workerpool0-0] self.run()
ERROR 2025-09-24T14:00:49.810239118Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/threading.py", line 982, in run
ERROR 2025-09-24T14:00:49.810248788Z [resource.labels.taskName: workerpool0-0] self._target(*self._args, **self._kwargs)
ERROR 2025-09-24T14:00:49.810257010Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/google/cloud/dataproc_spark_connect/client/proxy.py", line 170, in forward_connection
ERROR 2025-09-24T14:00:49.810265090Z [resource.labels.taskName: workerpool0-0] with connect_tcp_bridge(target_host) as websocket_conn:
ERROR 2025-09-24T14:00:49.810272722Z [resource.labels.taskName: workerpool0-0] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 2025-09-24T14:00:49.810279805Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/google/cloud/dataproc_spark_connect/client/proxy.py", line 80, in connect_tcp_bridge
ERROR 2025-09-24T14:00:49.810286699Z [resource.labels.taskName: workerpool0-0] return websocketclient.connect(
ERROR 2025-09-24T14:00:49.812685391Z [resource.labels.taskName: workerpool0-0] ^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 2025-09-24T14:00:49.812703957Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/websockets/sync/client.py", line 378, in connect
ERROR 2025-09-24T14:00:49.812713510Z [resource.labels.taskName: workerpool0-0] connection.handshake(
ERROR 2025-09-24T14:00:49.812724854Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/websockets/sync/client.py", line 101, in handshake
ERROR 2025-09-24T14:00:49.812733386Z [resource.labels.taskName: workerpool0-0] raise self.protocol.handshake_exc
ERROR 2025-09-24T14:00:49.812740943Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/websockets/client.py", line 325, in parse
ERROR 2025-09-24T14:00:49.812749711Z [resource.labels.taskName: workerpool0-0] self.process_response(response)
ERROR 2025-09-24T14:00:49.812757457Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/websockets/client.py", line 142, in process_response
ERROR 2025-09-24T14:00:49.812764270Z [resource.labels.taskName: workerpool0-0] raise InvalidStatus(response)
ERROR 2025-09-24T14:00:49.812771434Z [resource.labels.taskName: workerpool0-0] websockets.exceptions.InvalidStatus: server rejected WebSocket connection: HTTP 401
ERROR 2025-09-24T14:00:52.260109270Z [resource.labels.taskName: workerpool0-0] Exception in thread Thread-62 (forward_connection):
ERROR 2025-09-24T14:00:52.260127771Z [resource.labels.taskName: workerpool0-0] Traceback (most recent call last):
ERROR 2025-09-24T14:00:52.260137252Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
ERROR 2025-09-24T14:00:52.262516295Z [resource.labels.taskName: workerpool0-0] self.run()
ERROR 2025-09-24T14:00:52.262541409Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/threading.py", line 982, in run
ERROR 2025-09-24T14:00:52.262551174Z [resource.labels.taskName: workerpool0-0] self._target(*self._args, **self._kwargs)
ERROR 2025-09-24T14:00:52.262560088Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/google/cloud/dataproc_spark_connect/client/proxy.py", line 170, in forward_connection
ERROR 2025-09-24T14:00:52.266560118Z [resource.labels.taskName: workerpool0-0] with connect_tcp_bridge(target_host) as websocket_conn:
ERROR 2025-09-24T14:00:52.266578378Z [resource.labels.taskName: workerpool0-0] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 2025-09-24T14:00:52.266588197Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/google/cloud/dataproc_spark_connect/client/proxy.py", line 80, in connect_tcp_bridge
ERROR 2025-09-24T14:00:52.266595922Z [resource.labels.taskName: workerpool0-0] return websocketclient.connect(
ERROR 2025-09-24T14:00:52.266603804Z [resource.labels.taskName: workerpool0-0] ^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 2025-09-24T14:00:52.266612139Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/websockets/sync/client.py", line 378, in connect
ERROR 2025-09-24T14:00:52.266619685Z [resource.labels.taskName: workerpool0-0] connection.handshake(
ERROR 2025-09-24T14:00:52.266627693Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/websockets/sync/client.py", line 101, in handshake
ERROR 2025-09-24T14:00:52.266634857Z [resource.labels.taskName: workerpool0-0] raise self.protocol.handshake_exc
ERROR 2025-09-24T14:00:52.266641544Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/websockets/client.py", line 325, in parse
ERROR 2025-09-24T14:00:52.266649389Z [resource.labels.taskName: workerpool0-0] self.process_response(response)
ERROR 2025-09-24T14:00:52.266656983Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/websockets/client.py", line 142, in process_response
ERROR 2025-09-24T14:00:52.266663260Z [resource.labels.taskName: workerpool0-0] raise InvalidStatus(response)
ERROR 2025-09-24T14:00:52.266670292Z [resource.labels.taskName: workerpool0-0] websockets.exceptions.InvalidStatus: server rejected WebSocket connection: HTTP 401
ERROR 2025-09-24T14:00:56.804468120Z [resource.labels.taskName: workerpool0-0] Exception in thread Thread-65 (forward_connection):
ERROR 2025-09-24T14:00:56.804498760Z [resource.labels.taskName: workerpool0-0] Traceback (most recent call last):
ERROR 2025-09-24T14:00:56.804514203Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
ERROR 2025-09-24T14:00:56.807048963Z [resource.labels.taskName: workerpool0-0] self.run()
ERROR 2025-09-24T14:00:56.807069225Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/threading.py", line 982, in run
ERROR 2025-09-24T14:00:56.807078643Z [resource.labels.taskName: workerpool0-0] self._target(*self._args, **self._kwargs)
ERROR 2025-09-24T14:00:56.807086882Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/google/cloud/dataproc_spark_connect/client/proxy.py", line 170, in forward_connection
ERROR 2025-09-24T14:00:56.807094684Z [resource.labels.taskName: workerpool0-0] with connect_tcp_bridge(target_host) as websocket_conn:
ERROR 2025-09-24T14:00:56.811039076Z [resource.labels.taskName: workerpool0-0] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 2025-09-24T14:00:56.811057745Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/google/cloud/dataproc_spark_connect/client/proxy.py", line 80, in connect_tcp_bridge
ERROR 2025-09-24T14:00:56.811066949Z [resource.labels.taskName: workerpool0-0] return websocketclient.connect(
ERROR 2025-09-24T14:00:56.811074847Z [resource.labels.taskName: workerpool0-0] ^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 2025-09-24T14:00:56.811083936Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/websockets/sync/client.py", line 378, in connect
ERROR 2025-09-24T14:00:56.811091932Z [resource.labels.taskName: workerpool0-0] connection.handshake(
ERROR 2025-09-24T14:00:56.811099650Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/websockets/sync/client.py", line 101, in handshake
ERROR 2025-09-24T14:00:56.811106763Z [resource.labels.taskName: workerpool0-0] raise self.protocol.handshake_exc
ERROR 2025-09-24T14:00:56.811122233Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/websockets/client.py", line 325, in parse
ERROR 2025-09-24T14:00:56.811140845Z [resource.labels.taskName: workerpool0-0] self.process_response(response)
ERROR 2025-09-24T14:00:56.811149317Z [resource.labels.taskName: workerpool0-0] File "/usr/local/lib/python3.11/site-packages/websockets/client.py", line 142, in process_response
ERROR 2025-09-24T14:00:56.811156368Z [resource.labels.taskName: workerpool0-0] raise InvalidStatus(response)
ERROR 2025-09-24T14:00:56.811163736Z [resource.labels.taskName: workerpool0-0] websockets.exceptions.InvalidStatus: server rejected WebSocket connection: HTTP 401
Is this a bug or did I forget something?
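Since HTTP 401 generally means the request carried no valid credentials, one thing that might help narrow this down is checking which identity the Application Default Credentials resolve to inside the same custom job. The snippet below is only a sketch under assumptions: `describe_credentials` and `check_adc` are hypothetical helper names (not part of `dataproc_spark_connect`), and it assumes `google-auth` and `requests` are installed in the container, which should be the case as dependencies of the Google Cloud client libraries.

```python
def describe_credentials(creds) -> dict:
    """Summarise a credentials object without exposing the token itself."""
    return {
        "type": type(creds).__name__,
        # Only some credential types expose service_account_email.
        "service_account_email": getattr(creds, "service_account_email", None),
        "valid": bool(getattr(creds, "valid", False)),
        "has_token": bool(getattr(creds, "token", None)),
    }


def check_adc() -> dict:
    """Resolve Application Default Credentials and refresh them once."""
    # Imported lazily so describe_credentials works without google-auth.
    import google.auth
    from google.auth.transport.requests import Request

    creds, project = google.auth.default(
        scopes=["https://www.googleapis.com/auth/cloud-platform"]
    )
    creds.refresh(Request())  # raises if no access token can be obtained
    info = describe_credentials(creds)
    info["project"] = project
    return info
```

Calling `print(check_adc())` at the top of the component, before building the session, would show whether a token can be obtained at all and for which service account. It does not prove that the same token is what the WebSocket proxy sends.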