Skip to content

Commit 964185b

Browse files
sd-dbjprakash-db
andauthored
Backport: Add _retry_server_directed_only mode for Retry-After header compliance (#757)
* Add _retry_server_directed_only mode for Retry-After header compliance When enabled, the connector only retries on 429/503 if the server includes a Retry-After header in the response. This prevents duplicate side effects for non-idempotent ExecuteStatement operations where the server has not explicitly signaled that retry is safe. The new opt-in parameter `_retry_server_directed_only` threads through ClientContext, all three DatabricksRetryPolicy construction sites (Thrift, SEA, UnifiedHttpClient), and the retry policy's should_retry/is_retry methods. Default behavior (retry without requiring the header) is unchanged. Signed-off-by: Shubham Dhal <shubham.dhal@databricks.com> * Remove unnecessary _retry_server_directed_only instance variables Inline kwargs.get() at the single point of use in ThriftDatabricksClient and SeaHttpClient instead of storing as dead instance state. Signed-off-by: Shubham Dhal <shubham.dhal@databricks.com> * Address PR feedback: rename and clean up retry-after parameter - Rename server_directed_only to respect_server_retry_after_header throughout for clarity - Store _respect_server_retry_after_header as instance variable in Thrift/SEA backends to match existing kwargs extraction pattern - Replace duplicate test fixture with _make_retry_policy(**overrides) helper for flexible policy construction in tests Signed-off-by: Shubham Dhal <shubham.dhal@databricks.com> * Fix Black formatting for retry.py and utils.py Signed-off-by: Shubham Dhal <shubham.dhal@databricks.com> * Apply CI workflow and test fixes from PR #745 - Update actions/checkout@v2 → v4 and actions/setup-python@v2 → v5 in run-unit-tests-with-arrow job - Pin Poetry version to 2.2.1 across all workflow files - Reduce long-running query test min duration from 3 to 2 minutes Co-authored-by: Isaac * Removed unnecessary tests * Fixed something * fixed something --------- Signed-off-by: Shubham Dhal <shubham.dhal@databricks.com> Co-authored-by: Jothi Prakash <jothi.prakash@databricks.com>
1 parent 30286ad commit 964185b

File tree

13 files changed

+137
-18
lines changed

13 files changed

+137
-18
lines changed

.github/workflows/code-quality-checks.yml

Lines changed: 6 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -35,6 +35,7 @@ jobs:
3535
- name: Install Poetry
3636
uses: snok/install-poetry@v1
3737
with:
38+
version: "2.2.1"
3839
virtualenvs-create: true
3940
virtualenvs-in-project: true
4041
installer-parallel: true
@@ -106,10 +107,10 @@ jobs:
106107
# check-out repo and set-up python
107108
#----------------------------------------------
108109
- name: Check out repository
109-
uses: actions/checkout@v2
110+
uses: actions/checkout@v4
110111
- name: Set up python ${{ matrix.python-version }}
111112
id: setup-python
112-
uses: actions/setup-python@v2
113+
uses: actions/setup-python@v5
113114
with:
114115
python-version: ${{ matrix.python-version }}
115116
#----------------------------------------------
@@ -118,6 +119,7 @@ jobs:
118119
- name: Install Poetry
119120
uses: snok/install-poetry@v1
120121
with:
122+
version: "2.2.1"
121123
virtualenvs-create: true
122124
virtualenvs-in-project: true
123125
installer-parallel: true
@@ -191,6 +193,7 @@ jobs:
191193
- name: Install Poetry
192194
uses: snok/install-poetry@v1
193195
with:
196+
version: "2.2.1"
194197
virtualenvs-create: true
195198
virtualenvs-in-project: true
196199
installer-parallel: true
@@ -243,6 +246,7 @@ jobs:
243246
- name: Install Poetry
244247
uses: snok/install-poetry@v1
245248
with:
249+
version: "2.2.1"
246250
virtualenvs-create: true
247251
virtualenvs-in-project: true
248252
installer-parallel: true

.github/workflows/integration.yml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -33,6 +33,7 @@ jobs:
3333
- name: Install Poetry
3434
uses: snok/install-poetry@v1
3535
with:
36+
version: "2.2.1"
3637
virtualenvs-create: true
3738
virtualenvs-in-project: true
3839
installer-parallel: true

.github/workflows/publish-test.yml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -21,6 +21,7 @@ jobs:
2121
- name: Install Poetry
2222
uses: snok/install-poetry@v1
2323
with:
24+
version: "2.2.1"
2425
virtualenvs-create: true
2526
virtualenvs-in-project: true
2627
installer-parallel: true

.github/workflows/publish.yml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -23,6 +23,7 @@ jobs:
2323
- name: Install Poetry
2424
uses: snok/install-poetry@v1
2525
with:
26+
version: "2.2.1"
2627
virtualenvs-create: true
2728
virtualenvs-in-project: true
2829
installer-parallel: true

src/databricks/sql/auth/common.py

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -47,6 +47,7 @@ def __init__(
4747
retry_stop_after_attempts_duration: Optional[float] = None,
4848
retry_delay_default: Optional[float] = None,
4949
retry_dangerous_codes: Optional[List[int]] = None,
50+
respect_server_retry_after_header: Optional[bool] = None,
5051
proxy_auth_method: Optional[str] = None,
5152
pool_connections: Optional[int] = None,
5253
pool_maxsize: Optional[int] = None,
@@ -79,6 +80,7 @@ def __init__(
7980
)
8081
self.retry_delay_default = retry_delay_default or 5.0
8182
self.retry_dangerous_codes = retry_dangerous_codes or []
83+
self.respect_server_retry_after_header = bool(respect_server_retry_after_header)
8284
self.proxy_auth_method = proxy_auth_method
8385
self.pool_connections = pool_connections or 10
8486
self.pool_maxsize = pool_maxsize or 20

src/databricks/sql/auth/retry.py

Lines changed: 16 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -94,6 +94,7 @@ def __init__(
9494
stop_after_attempts_duration: float,
9595
delay_default: float,
9696
force_dangerous_codes: List[int],
97+
respect_server_retry_after_header: bool = False,
9798
urllib3_kwargs: dict = {},
9899
):
99100
# These values do not change from one command to the next
@@ -103,6 +104,7 @@ def __init__(
103104
self.stop_after_attempts_duration = stop_after_attempts_duration
104105
self._delay_default = delay_default
105106
self.force_dangerous_codes = force_dangerous_codes
107+
self.respect_server_retry_after_header = respect_server_retry_after_header
106108

107109
# the urllib3 kwargs are a mix of configuration (some of which we override)
108110
# and counters like `total` or `connect` which may change between successive retries
@@ -202,6 +204,7 @@ def new(
202204
stop_after_attempts_duration=self.stop_after_attempts_duration,
203205
delay_default=self.delay_default,
204206
force_dangerous_codes=self.force_dangerous_codes,
207+
respect_server_retry_after_header=self.respect_server_retry_after_header,
205208
urllib3_kwargs={},
206209
)
207210

@@ -323,7 +326,9 @@ def get_backoff_time(self) -> float:
323326

324327
return proposed_backoff
325328

326-
def should_retry(self, method: str, status_code: int) -> Tuple[bool, str]:
329+
def should_retry(
330+
self, method: str, status_code: int, has_retry_after: bool = False
331+
) -> Tuple[bool, str]:
327332
"""This method encapsulates the connector's approach to retries.
328333
329334
We always retry a request unless one of these conditions is met:
@@ -381,6 +386,15 @@ def should_retry(self, method: str, status_code: int) -> Tuple[bool, str]:
381386
if not self._is_method_retryable(method):
382387
return False, "Only POST requests are retried"
383388

389+
# When respect_server_retry_after_header is enabled, only retry when the
390+
# server explicitly signals it's safe via a Retry-After header. This prevents
391+
# duplicate side effects for non-idempotent operations.
392+
if self.respect_server_retry_after_header and not has_retry_after:
393+
return (
394+
False,
395+
"respect_server_retry_after_header mode: no Retry-After header present",
396+
)
397+
384398
# Request failed with 404 and was a GetOperationStatus. This is not recoverable. Don't retry.
385399
if status_code == 404 and self.command_type == CommandType.GET_OPERATION_STATUS:
386400
return (
@@ -450,7 +464,7 @@ def is_retry(
450464
Logs a debug message if the request will be retried
451465
"""
452466

453-
should_retry, msg = self.should_retry(method, status_code)
467+
should_retry, msg = self.should_retry(method, status_code, has_retry_after)
454468

455469
if should_retry:
456470
logger.debug(msg)

src/databricks/sql/backend/sea/utils/http_client.py

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -90,6 +90,9 @@ def __init__(
9090
)
9191
self._retry_delay_default = kwargs.get("_retry_delay_default", 5.0)
9292
self.force_dangerous_codes = kwargs.get("_retry_dangerous_codes", [])
93+
self._respect_server_retry_after_header = kwargs.get(
94+
"_respect_server_retry_after_header", False
95+
)
9396

9497
# Connection pooling settings
9598
self.max_connections = kwargs.get("max_connections", 10)
@@ -114,6 +117,7 @@ def __init__(
114117
stop_after_attempts_duration=self._retry_stop_after_attempts_duration,
115118
delay_default=self._retry_delay_default,
116119
force_dangerous_codes=self.force_dangerous_codes,
120+
respect_server_retry_after_header=self._respect_server_retry_after_header,
117121
urllib3_kwargs=urllib3_kwargs,
118122
)
119123
else:

src/databricks/sql/backend/thrift_backend.py

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -189,6 +189,9 @@ def __init__(
189189
" This behaviour is deprecated and will be removed in a future release."
190190
)
191191
self.force_dangerous_codes = kwargs.get("_retry_dangerous_codes", [])
192+
self._respect_server_retry_after_header = kwargs.get(
193+
"_respect_server_retry_after_header", False
194+
)
192195

193196
additional_transport_args = {}
194197

@@ -215,6 +218,7 @@ def __init__(
215218
stop_after_attempts_duration=self._retry_stop_after_attempts_duration,
216219
delay_default=self._retry_delay_default,
217220
force_dangerous_codes=self.force_dangerous_codes,
221+
respect_server_retry_after_header=self._respect_server_retry_after_header,
218222
urllib3_kwargs=urllib3_kwargs,
219223
)
220224

src/databricks/sql/common/unified_http_client.py

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -99,6 +99,7 @@ def _setup_pool_managers(self):
9999
stop_after_attempts_duration=self.config.retry_stop_after_attempts_duration,
100100
delay_default=self.config.retry_delay_default,
101101
force_dangerous_codes=self.config.retry_dangerous_codes,
102+
respect_server_retry_after_header=self.config.respect_server_retry_after_header,
102103
)
103104

104105
# Initialize the required attributes that DatabricksRetryPolicy expects

src/databricks/sql/utils.py

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -919,6 +919,9 @@ def build_client_context(server_hostname: str, version: str, **kwargs):
919919
),
920920
retry_delay_default=kwargs.get("_retry_delay_default"),
921921
retry_dangerous_codes=kwargs.get("_retry_dangerous_codes"),
922+
respect_server_retry_after_header=kwargs.get(
923+
"_respect_server_retry_after_header"
924+
),
922925
proxy_auth_method=kwargs.get("_proxy_auth_method"),
923926
pool_connections=kwargs.get("_pool_connections"),
924927
pool_maxsize=kwargs.get("_pool_maxsize"),

0 commit comments

Comments
 (0)