Skip to content

fix(cdk): propagate api_budget to CustomRequester subclasses of HttpRequester#993

Open
devin-ai-integration[bot] wants to merge 4 commits intomainfrom
devin/1776463195-fix-api-budget-custom-requester
Open

fix(cdk): propagate api_budget to CustomRequester subclasses of HttpRequester#993
devin-ai-integration[bot] wants to merge 4 commits intomainfrom
devin/1776463195-fix-api-budget-custom-requester

Conversation

@devin-ai-integration
Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot commented Apr 17, 2026

Summary

ModelToComponentFactory.create_custom_component constructs a custom component from model.dict() plus incoming kwargs, but it does not forward the manifest-level self._api_budget the way create_http_requester does. As a result, connectors that define a top-level api_budget and use a CustomRequester (i.e., a Python subclass of HttpRequester) silently lose the configured rate-limit policies at runtime — the custom requester is constructed with api_budget=None, which in turn builds an HttpClient with an empty APIBudget(policies=[]).

This was hit by source-youtube-analytics (see airbytehq/oncall#12011 and the parent oncall#11978), whose ContentOwnerRequester/JobRequester subclass HttpRequester and receive api_budget=None even though the manifest declares a HTTPAPIBudget with a MovingWindowCallRatePolicy.

Fix

Two coordinated changes in create_custom_component:

  1. Inject api_budget on the dataclass. When the target custom_component_class is a subclass of HttpRequester and the manifest/caller has not explicitly provided an api_budget, inject self._api_budget. The guard only fires when:

    • self._api_budget is not None (a top-level budget was declared and registered via set_api_budget),
    • the class actually accepts api_budget via its type hints,
    • kwargs.get("api_budget") is None (we never override an explicit manifest or caller value),
    • and custom_component_class is a type that is a subclass of HttpRequester.
  2. Sync the injected budget with the active HttpClient. Some custom requesters call super().__post_init__() and then replace _http_client with their own HttpClient implementation without forwarding api_budget. In that case, the requester's api_budget attribute is set correctly, but the active _http_client._api_budget is still the default empty APIBudget(policies=[]). The new helper _sync_injected_api_budget_with_http_client detects this (replacement client with an empty budget) and points the active client at the injected budget. If a custom requester has intentionally installed its own non-empty budget on the replacement client, it is left untouched.

This mirrors the pattern established by #828 (propagating _api_budget to substream parent-stream factories) and matches how create_http_requester reads self._api_budget before constructing the HttpRequester, while additionally tolerating the __post_init__ HttpClient override pattern that source-youtube-analytics' custom requesters use.

Once a CDK release containing this fix is adopted, connectors like source-youtube-analytics will pick up their manifest-level rate-limit policies on the CustomRequester path automatically — no connector-side change required.

Declarative-First Evaluation

Not applicable: this is a CDK bug fix inside ModelToComponentFactory, not a manifest/YAML change in a connector. The goal is to make the existing declarative api_budget feature work for CustomRequester — which is itself the escape hatch used when a behavior is not expressible declaratively.

Test Coverage

Added four unit tests in unit_tests/sources/declarative/parsers/test_model_to_component_factory.py, backed by three new helper classes in unit_tests/sources/declarative/parsers/testing_components.py (TestingRequester, TestingRequesterWithReplacedHttpClient, TestingRequesterWithReplacedHttpClientAndOwnBudget):

  1. test_api_budget_propagated_to_custom_requester_subclass_of_http_requester — a plain HttpRequester subclass receives the manifest-level HTTPAPIBudget/MovingWindowCallRatePolicy, and the underlying HttpClient shares the same budget instance. Fails on main without the fix (custom_requester.api_budget is None).
  2. test_api_budget_propagated_to_custom_requester_that_replaces_http_client — covers the source-youtube-analytics pattern where the requester replaces _http_client in __post_init__ without forwarding api_budget. Asserts the active _http_client._api_budget is the injected budget after the sync step.
  3. test_api_budget_not_overwriting_non_empty_budget_on_replaced_http_client — regression guard: when a requester intentionally installs its own non-empty budget on the replacement HttpClient, the sync step must leave it alone.
  4. test_api_budget_not_propagated_to_non_http_requester_custom_components — regression guard: custom components that are not HttpRequester subclasses (e.g., a CustomErrorHandler) must not receive an api_budget kwarg, so their constructors don't fail with TypeError: got an unexpected keyword argument 'api_budget'.

Ran locally on the initial commit:

  • poetry run pytest unit_tests/sources/declarative/parsers/test_model_to_component_factory.py — 118 passed.
  • poetry run ruff check . and poetry run ruff format --check . — clean.
  • poetry run mypy --config-file mypy.ini airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py — no issues.

CI is green on the latest commit (all Ruff/MyPy/Pytest jobs passing across Python 3.10–3.13).

Breaking Change Evaluation

Not breaking. No schema, spec, state, or stream changes. This is a bug fix that makes a previously-ignored api_budget manifest field take effect on the CustomRequester path. Existing behavior is preserved when:

  • no top-level api_budget is declared (self._api_budget is None), or
  • the custom component does not subclass HttpRequester, or
  • the manifest/caller already provided an explicit api_budget (not overridden), or
  • a custom requester installed its own non-empty budget on its replacement HttpClient.

Resolves https://github.com/airbytehq/oncall/issues/12011:

Review & Testing Checklist for Human

  • Confirm the guard in create_custom_component (only inject when the target class is an HttpRequester subclass and no api_budget was already provided) is acceptable, or whether a broader "api_budget" in component_fields check would be preferred.
  • Review _sync_injected_api_budget_with_http_client's heuristic for detecting a "replacement" HttpClient: it only overwrites http_client._api_budget when the active budget has zero policies. Confirm this is a safe proxy for "the requester forgot to forward api_budget" vs. "the requester intentionally installed an empty budget".
  • Sanity-check on a manifest-only / low-code connector that uses a custom HttpRequester subclass (e.g., source-youtube-analytics) that the api_budget now takes effect at runtime after bumping to a CDK build containing this fix.
  • Decide whether an analogous propagation is worth extending to any other custom component types that accept api_budget beyond HttpRequester (out of scope here — this PR is deliberately narrow to match the reported bug).

Notes

Link to Devin session: https://app.devin.ai/sessions/36dd4bce63bb414d8e4cbb61935d6f1e


Open in Devin Review

@devin-ai-integration
Copy link
Copy Markdown
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@github-actions
Copy link
Copy Markdown

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

💡 Show Tips and Tricks

Testing This CDK Version

You can test this version of the CDK using the following:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@devin/1776463195-fix-api-budget-custom-requester#egg=airbyte-python-cdk[dev]' --help

# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch devin/1776463195-fix-api-budget-custom-requester

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /autofix - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test - Runs connector tests with the updated CDK
  • /prerelease - Triggers a prerelease publish with default arguments
  • /poe build - Regenerate git-committed build artifacts, such as the pydantic models which are generated from the manifest JSON schema in YAML.
  • /poe <command> - Runs any poe command in the CDK environment
📚 Show Repo Guidance

Helpful Resources

📝 Edit this welcome message.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 17, 2026

PyTest Results (Fast)

4 026 tests  +8   4 015 ✅ +8   7m 32s ⏱️ -8s
    1 suites ±0      11 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit fb8bff3. ± Comparison against base commit 1256a1f.

♻️ This comment has been updated with latest results.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 17, 2026

PyTest Results (Full)

4 029 tests  +8   4 017 ✅ +8   10m 53s ⏱️ -40s
    1 suites ±0      12 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit fb8bff3. ± Comparison against base commit 1256a1f.

♻️ This comment has been updated with latest results.

Copy link
Copy Markdown
Contributor Author

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 3 additional findings.

Open in Devin Review

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a CDK bug where manifest-level api_budget wasn’t being propagated when a connector uses a CustomRequester (Python subclass of HttpRequester), causing rate-limit policies to be silently dropped at runtime.

Changes:

  • Inject self._api_budget into custom components that are HttpRequester subclasses when no explicit api_budget is provided.
  • Add a post-construction sync helper to align an injected api_budget with a custom requester’s active _http_client in cases where the requester replaces the client in __post_init__.
  • Add unit tests and testing-only requester classes covering propagation and the client-replacement pattern.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py Implements api_budget injection for HttpRequester subclasses and adds a helper intended to sync the injected budget with the active HttpClient.
unit_tests/sources/declarative/parsers/testing_components.py Adds test-only HttpRequester subclasses to simulate replacing _http_client with and without a custom budget.
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py Adds unit tests validating propagation to CustomRequester and guarding non-HttpRequester custom components.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

return

if len(getattr(http_client_api_budget, "_policies", [])) == 0:
http_client._api_budget = injected_api_budget
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_sync_injected_api_budget_with_http_client reassigns http_client._api_budget, but HttpClient passes the budget into LimiterSession/CachedLimiterSession at construction time and the session keeps its own _api_budget reference. Updating only HttpClient._api_budget won’t affect the active session’s limiter, so requests can still run with the original empty budget. Consider also updating http_client._session._api_budget (when present) or mutating the existing budget object instead of swapping the reference so the session sees the injected policies.

Suggested change
http_client._api_budget = injected_api_budget
http_client._api_budget = injected_api_budget
http_client_session = getattr(http_client, "_session", None)
if getattr(http_client_session, "_api_budget", None) is not None:
http_client_session._api_budget = injected_api_budget

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — verified this against HttpClient.__init__ (sets self._api_budget then passes self._api_budget into LimiterSession/CachedLimiterSession at line 176/180) and LimiterMixin.__init__ which stores it as self._api_budget = api_budget and calls it directly in send(). Without syncing the session, requests would keep going through the original (empty) budget.

Addressed in 1b448ba. _sync_injected_api_budget_with_http_client now also updates http_client._session._api_budget when the session is still holding the same reference as the client's replaced budget:

http_client._api_budget = injected_api_budget
http_client_session = getattr(http_client, "_session", None)
if (
    http_client_session is not None
    and getattr(http_client_session, "_api_budget", None) is http_client_api_budget
):
    http_client_session._api_budget = injected_api_budget

Kept the existing ref-swap approach rather than mutating the budget in place, to stay consistent with how HttpClient itself wires the reference at construction time.


Devin session

):
return

if len(getattr(http_client_api_budget, "_policies", [])) == 0:
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The emptiness check len(getattr(http_client_api_budget, "_policies", [])) == 0 will treat any non-APIBudget implementation (or any budget that doesn’t expose _policies) as empty and overwrite it. If a custom requester installs a different AbstractAPIBudget implementation on its replacement client, this sync step could incorrectly clobber it. Consider guarding with hasattr(http_client_api_budget, "_policies") / isinstance(http_client_api_budget, APIBudget) before using this heuristic, and otherwise leaving the client budget untouched.

Suggested change
if len(getattr(http_client_api_budget, "_policies", [])) == 0:
if isinstance(http_client_api_budget, APIBudget) and len(http_client_api_budget._policies) == 0:

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed — a custom AbstractAPIBudget implementation that doesn't expose _policies would have been silently treated as empty and clobbered. Addressed in 1b448ba:

if (
    isinstance(http_client_api_budget, APIBudget)
    and len(http_client_api_budget._policies) == 0
):
    ...

Non-APIBudget implementations now fall through untouched. APIBudget is already imported at the top of the module so no new import was needed.


Devin session

Comment on lines +4414 to +4416
assert isinstance(custom_requester, HttpRequester)
assert custom_requester.api_budget is not None
assert custom_requester._http_client._api_budget is custom_requester.api_budget
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests assert custom_requester._http_client._api_budget is updated, but HttpClient enforces rate limiting via its internal session (LimiterSession/CachedLimiterSession) which stores its own _api_budget reference at construction time. To catch regressions where the session is still using the default empty budget, consider asserting on custom_requester._http_client._session._api_budget (or exercising a request path) rather than only the client field.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point — asserting only on _http_client._api_budget would have kept passing even if the session still held the original empty budget (which is what actually enforces rate limiting on send()). Tightened all three tests in 1b448ba to additionally assert on custom_requester._http_client._session._api_budget:

  • test_api_budget_propagated_to_custom_requester_subclass_of_http_requester: asserts the session's budget is the injected one.
  • test_api_budget_propagated_to_custom_requester_that_replaces_http_client: asserts the session's budget was synced to the injected one (this test would now fail without the session-sync change in the _sync_injected_api_budget_with_http_client fix).
  • test_api_budget_not_overwriting_non_empty_budget_on_replaced_http_client: asserts the session's budget remains the requester's own intentionally-installed budget (regression guard that we never silently swap a live session's budget).

Pytest: 120 passed locally.


Devin session

devin-ai-integration Bot and others added 2 commits April 20, 2026 22:37
… APIBudget isinstance

Co-Authored-By: bot_apk <apk@cognition.ai>
Co-Authored-By: alfredo.garcia@airbyte.io <freddy.garcia7.fg@gmail.com>
return custom_component

@staticmethod
def _sync_injected_api_budget_with_http_client(custom_requester: HttpRequester) -> None:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure we need this logic here in the cdk. CDK can not know about whether api budget override was done on purpose or not. CDK should be responsible on passing this param to custom component, but not overriding it. If we currently have connectors with empty api budget in their custom requesters but it's expected to be non-empty budget since its defined in manifest, custom requester should be updated then in the source.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Daryna Ishchenko (@darynaishchenko) — that's a fair architectural point and worth surfacing to Alfredo García (@alfredogarcia) (who authored commit 0787f79, adding this sync block).

To ground the discussion in what the original oncall actually needs:

  • source-youtube-analytics' ContentOwnerRequester / JobRequester simply subclass HttpRequester and override get_request_params / send_request — they do not override __post_init__ or replace _http_client. So the reported bug (airbytehq/oncall#12011) is fully fixed by the first commit (f5377a8) alone, which just forwards self._api_budget into the dataclass kwargs.
  • The second commit (0787f79), which adds _sync_injected_api_budget_with_http_client, handles a different, speculative pattern: connectors whose custom requester replaces _http_client in __post_init__ without forwarding api_budget. As you note, the CDK can't really distinguish "accidentally dropped the budget" from "intentionally installed an empty one" on such a replacement client — the isinstance(APIBudget) and len(_policies) == 0 heuristic is a best-effort guess.

Given that, I agree the cleaner separation of concerns is the one you describe: CDK forwards api_budget to the custom component (commit 1), and any connector whose custom requester replaces _http_client is responsible for forwarding self.api_budget into that replacement. If we want to revert the sync block and keep only the kwarg injection, I'm happy to do that — but I'd like Alfredo García (@alfredogarcia) to weigh in first since they added it.

Session URL: https://app.devin.ai/sessions/36dd4bce63bb414d8e4cbb61935d6f1e

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants