fix(cdk): propagate api_budget to CustomRequester subclasses of HttpRequester#993
fix(cdk): propagate api_budget to CustomRequester subclasses of HttpRequester#993devin-ai-integration[bot] wants to merge 4 commits intomainfrom
Conversation
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
👋 Greetings, Airbyte Team Member!Here are some helpful tips and reminders for your convenience. 💡 Show Tips and TricksTesting This CDK VersionYou can test this version of the CDK using the following: # Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@devin/1776463195-fix-api-budget-custom-requester#egg=airbyte-python-cdk[dev]' --help
# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch devin/1776463195-fix-api-budget-custom-requesterPR Slash CommandsAirbyte Maintainers can execute the following slash commands on your PR:
|
There was a problem hiding this comment.
Pull request overview
Fixes a CDK bug where manifest-level api_budget wasn’t being propagated when a connector uses a CustomRequester (Python subclass of HttpRequester), causing rate-limit policies to be silently dropped at runtime.
Changes:
- Inject
self._api_budgetinto custom components that areHttpRequestersubclasses when no explicitapi_budgetis provided. - Add a post-construction sync helper to align an injected
api_budgetwith a custom requester’s active_http_clientin cases where the requester replaces the client in__post_init__. - Add unit tests and testing-only requester classes covering propagation and the client-replacement pattern.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py |
Implements api_budget injection for HttpRequester subclasses and adds a helper intended to sync the injected budget with the active HttpClient. |
unit_tests/sources/declarative/parsers/testing_components.py |
Adds test-only HttpRequester subclasses to simulate replacing _http_client with and without a custom budget. |
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py |
Adds unit tests validating propagation to CustomRequester and guarding non-HttpRequester custom components. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| return | ||
|
|
||
| if len(getattr(http_client_api_budget, "_policies", [])) == 0: | ||
| http_client._api_budget = injected_api_budget |
There was a problem hiding this comment.
_sync_injected_api_budget_with_http_client reassigns http_client._api_budget, but HttpClient passes the budget into LimiterSession/CachedLimiterSession at construction time and the session keeps its own _api_budget reference. Updating only HttpClient._api_budget won’t affect the active session’s limiter, so requests can still run with the original empty budget. Consider also updating http_client._session._api_budget (when present) or mutating the existing budget object instead of swapping the reference so the session sees the injected policies.
| http_client._api_budget = injected_api_budget | |
| http_client._api_budget = injected_api_budget | |
| http_client_session = getattr(http_client, "_session", None) | |
| if getattr(http_client_session, "_api_budget", None) is not None: | |
| http_client_session._api_budget = injected_api_budget |
There was a problem hiding this comment.
Good catch — verified this against HttpClient.__init__ (sets self._api_budget then passes self._api_budget into LimiterSession/CachedLimiterSession at line 176/180) and LimiterMixin.__init__ which stores it as self._api_budget = api_budget and calls it directly in send(). Without syncing the session, requests would keep going through the original (empty) budget.
Addressed in 1b448ba. _sync_injected_api_budget_with_http_client now also updates http_client._session._api_budget when the session is still holding the same reference as the client's replaced budget:
http_client._api_budget = injected_api_budget
http_client_session = getattr(http_client, "_session", None)
if (
http_client_session is not None
and getattr(http_client_session, "_api_budget", None) is http_client_api_budget
):
http_client_session._api_budget = injected_api_budgetKept the existing ref-swap approach rather than mutating the budget in place, to stay consistent with how HttpClient itself wires the reference at construction time.
| ): | ||
| return | ||
|
|
||
| if len(getattr(http_client_api_budget, "_policies", [])) == 0: |
There was a problem hiding this comment.
The emptiness check len(getattr(http_client_api_budget, "_policies", [])) == 0 will treat any non-APIBudget implementation (or any budget that doesn’t expose _policies) as empty and overwrite it. If a custom requester installs a different AbstractAPIBudget implementation on its replacement client, this sync step could incorrectly clobber it. Consider guarding with hasattr(http_client_api_budget, "_policies") / isinstance(http_client_api_budget, APIBudget) before using this heuristic, and otherwise leaving the client budget untouched.
| if len(getattr(http_client_api_budget, "_policies", [])) == 0: | |
| if isinstance(http_client_api_budget, APIBudget) and len(http_client_api_budget._policies) == 0: |
There was a problem hiding this comment.
Agreed — a custom AbstractAPIBudget implementation that doesn't expose _policies would have been silently treated as empty and clobbered. Addressed in 1b448ba:
if (
isinstance(http_client_api_budget, APIBudget)
and len(http_client_api_budget._policies) == 0
):
...Non-APIBudget implementations now fall through untouched. APIBudget is already imported at the top of the module so no new import was needed.
| assert isinstance(custom_requester, HttpRequester) | ||
| assert custom_requester.api_budget is not None | ||
| assert custom_requester._http_client._api_budget is custom_requester.api_budget |
There was a problem hiding this comment.
These tests assert custom_requester._http_client._api_budget is updated, but HttpClient enforces rate limiting via its internal session (LimiterSession/CachedLimiterSession) which stores its own _api_budget reference at construction time. To catch regressions where the session is still using the default empty budget, consider asserting on custom_requester._http_client._session._api_budget (or exercising a request path) rather than only the client field.
There was a problem hiding this comment.
Good point — asserting only on _http_client._api_budget would have kept passing even if the session still held the original empty budget (which is what actually enforces rate limiting on send()). Tightened all three tests in 1b448ba to additionally assert on custom_requester._http_client._session._api_budget:
test_api_budget_propagated_to_custom_requester_subclass_of_http_requester: asserts the session's budget is the injected one.test_api_budget_propagated_to_custom_requester_that_replaces_http_client: asserts the session's budget was synced to the injected one (this test would now fail without the session-sync change in the_sync_injected_api_budget_with_http_clientfix).test_api_budget_not_overwriting_non_empty_budget_on_replaced_http_client: asserts the session's budget remains the requester's own intentionally-installed budget (regression guard that we never silently swap a live session's budget).
Pytest: 120 passed locally.
… APIBudget isinstance Co-Authored-By: bot_apk <apk@cognition.ai>
Co-Authored-By: alfredo.garcia@airbyte.io <freddy.garcia7.fg@gmail.com>
| return custom_component | ||
|
|
||
| @staticmethod | ||
| def _sync_injected_api_budget_with_http_client(custom_requester: HttpRequester) -> None: |
There was a problem hiding this comment.
I am not sure we need this logic here in the cdk. CDK can not know about whether api budget override was done on purpose or not. CDK should be responsible on passing this param to custom component, but not overriding it. If we currently have connectors with empty api budget in their custom requesters but it's expected to be non-empty budget since its defined in manifest, custom requester should be updated then in the source.
There was a problem hiding this comment.
Thanks Daryna Ishchenko (@darynaishchenko) — that's a fair architectural point and worth surfacing to Alfredo García (@alfredogarcia) (who authored commit 0787f79, adding this sync block).
To ground the discussion in what the original oncall actually needs:
source-youtube-analytics'ContentOwnerRequester/JobRequestersimply subclassHttpRequesterand overrideget_request_params/send_request— they do not override__post_init__or replace_http_client. So the reported bug (airbytehq/oncall#12011) is fully fixed by the first commit (f5377a8) alone, which just forwardsself._api_budgetinto the dataclass kwargs.- The second commit (0787f79), which adds
_sync_injected_api_budget_with_http_client, handles a different, speculative pattern: connectors whose custom requester replaces_http_clientin__post_init__without forwardingapi_budget. As you note, the CDK can't really distinguish "accidentally dropped the budget" from "intentionally installed an empty one" on such a replacement client — theisinstance(APIBudget) and len(_policies) == 0heuristic is a best-effort guess.
Given that, I agree the cleaner separation of concerns is the one you describe: CDK forwards api_budget to the custom component (commit 1), and any connector whose custom requester replaces _http_client is responsible for forwarding self.api_budget into that replacement. If we want to revert the sync block and keep only the kwarg injection, I'm happy to do that — but I'd like Alfredo García (@alfredogarcia) to weigh in first since they added it.
Session URL: https://app.devin.ai/sessions/36dd4bce63bb414d8e4cbb61935d6f1e
Summary
ModelToComponentFactory.create_custom_componentconstructs a custom component frommodel.dict()plus incoming kwargs, but it does not forward the manifest-levelself._api_budgetthe waycreate_http_requesterdoes. As a result, connectors that define a top-levelapi_budgetand use aCustomRequester(i.e., a Python subclass ofHttpRequester) silently lose the configured rate-limit policies at runtime — the custom requester is constructed withapi_budget=None, which in turn builds anHttpClientwith an emptyAPIBudget(policies=[]).This was hit by
source-youtube-analytics(see airbytehq/oncall#12011 and the parent oncall#11978), whoseContentOwnerRequester/JobRequestersubclassHttpRequesterand receiveapi_budget=Noneeven though the manifest declares aHTTPAPIBudgetwith aMovingWindowCallRatePolicy.Fix
Two coordinated changes in
create_custom_component:Inject
api_budgeton the dataclass. When the targetcustom_component_classis a subclass ofHttpRequesterand the manifest/caller has not explicitly provided anapi_budget, injectself._api_budget. The guard only fires when:self._api_budget is not None(a top-level budget was declared and registered viaset_api_budget),api_budgetvia its type hints,kwargs.get("api_budget") is None(we never override an explicit manifest or caller value),custom_component_classis atypethat is a subclass ofHttpRequester.Sync the injected budget with the active
HttpClient. Some custom requesters callsuper().__post_init__()and then replace_http_clientwith their ownHttpClientimplementation without forwardingapi_budget. In that case, the requester'sapi_budgetattribute is set correctly, but the active_http_client._api_budgetis still the default emptyAPIBudget(policies=[]). The new helper_sync_injected_api_budget_with_http_clientdetects this (replacement client with an empty budget) and points the active client at the injected budget. If a custom requester has intentionally installed its own non-empty budget on the replacement client, it is left untouched.This mirrors the pattern established by #828 (propagating
_api_budgetto substream parent-stream factories) and matches howcreate_http_requesterreadsself._api_budgetbefore constructing theHttpRequester, while additionally tolerating the__post_init__HttpClientoverride pattern thatsource-youtube-analytics' custom requesters use.Once a CDK release containing this fix is adopted, connectors like
source-youtube-analyticswill pick up their manifest-level rate-limit policies on theCustomRequesterpath automatically — no connector-side change required.Declarative-First Evaluation
Not applicable: this is a CDK bug fix inside
ModelToComponentFactory, not a manifest/YAML change in a connector. The goal is to make the existing declarativeapi_budgetfeature work forCustomRequester— which is itself the escape hatch used when a behavior is not expressible declaratively.Test Coverage
Added four unit tests in
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py, backed by three new helper classes inunit_tests/sources/declarative/parsers/testing_components.py(TestingRequester,TestingRequesterWithReplacedHttpClient,TestingRequesterWithReplacedHttpClientAndOwnBudget):test_api_budget_propagated_to_custom_requester_subclass_of_http_requester— a plainHttpRequestersubclass receives the manifest-levelHTTPAPIBudget/MovingWindowCallRatePolicy, and the underlyingHttpClientshares the same budget instance. Fails onmainwithout the fix (custom_requester.api_budget is None).test_api_budget_propagated_to_custom_requester_that_replaces_http_client— covers thesource-youtube-analyticspattern where the requester replaces_http_clientin__post_init__without forwardingapi_budget. Asserts the active_http_client._api_budgetis the injected budget after the sync step.test_api_budget_not_overwriting_non_empty_budget_on_replaced_http_client— regression guard: when a requester intentionally installs its own non-empty budget on the replacementHttpClient, the sync step must leave it alone.test_api_budget_not_propagated_to_non_http_requester_custom_components— regression guard: custom components that are notHttpRequestersubclasses (e.g., aCustomErrorHandler) must not receive anapi_budgetkwarg, so their constructors don't fail withTypeError: got an unexpected keyword argument 'api_budget'.Ran locally on the initial commit:
poetry run pytest unit_tests/sources/declarative/parsers/test_model_to_component_factory.py— 118 passed.poetry run ruff check .andpoetry run ruff format --check .— clean.poetry run mypy --config-file mypy.ini airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py— no issues.CI is green on the latest commit (all Ruff/MyPy/Pytest jobs passing across Python 3.10–3.13).
Breaking Change Evaluation
Not breaking. No schema, spec, state, or stream changes. This is a bug fix that makes a previously-ignored
api_budgetmanifest field take effect on theCustomRequesterpath. Existing behavior is preserved when:api_budgetis declared (self._api_budget is None), orHttpRequester, orapi_budget(not overridden), orHttpClient.Resolves https://github.com/airbytehq/oncall/issues/12011:
Review & Testing Checklist for Human
create_custom_component(only inject when the target class is anHttpRequestersubclass and noapi_budgetwas already provided) is acceptable, or whether a broader"api_budget" in component_fieldscheck would be preferred._sync_injected_api_budget_with_http_client's heuristic for detecting a "replacement"HttpClient: it only overwriteshttp_client._api_budgetwhen the active budget has zero policies. Confirm this is a safe proxy for "the requester forgot to forwardapi_budget" vs. "the requester intentionally installed an empty budget".HttpRequestersubclass (e.g.,source-youtube-analytics) that theapi_budgetnow takes effect at runtime after bumping to a CDK build containing this fix.api_budgetbeyondHttpRequester(out of scope here — this PR is deliberately narrow to match the reported bug).Notes
feat: share API budget with parent streams) and fix: propagate api_budget to parent streams in substream partition routers #826. Neither covered theCustomRequesterpath, and neither handled the__post_init__HttpClientoverride case added here.source-youtube-analyticsdoes not need to change; the configuredapi_budgetwill take effect automatically once the connector's CDK version is bumped to a release containing this fix.Link to Devin session: https://app.devin.ai/sessions/36dd4bce63bb414d8e4cbb61935d6f1e