-
-
Notifications
You must be signed in to change notification settings - Fork 51
feat(chutes): add quota tracking with daily credit limits #79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
Conversation
|
I'm starting my review of the Chutes quota tracking implementation. It looks like a solid addition with the new mixin pattern. Diving into the details now! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall Assessment
This is a well-structured PR that implements quota tracking for the Chutes provider using a clean mixin pattern. The addition of the background refresh job and the integration with UsageManager via a virtual model are solid architectural choices.
However, there are a few points regarding performance and a potentially critical missing piece (quota grouping) that should be addressed to ensure the tracking works correctly across all models.
Architectural Feedback
The use of a mixin (ChutesQuotaTracker) is excellent for keeping the main provider class clean. The primary concern is the missing quota grouping. Without implementing get_model_quota_group in ChutesProvider, the UsageManager will treat each model as having its own independent quota, which contradicts the "credential-level" nature of Chutes quotas.
Key Suggestions
- Implement Quota Grouping: Add
get_model_quota_grouptoChutesProviderto ensure all models share the same usage stats. - Parallelize Background Job: Use
asyncio.gatherinrun_background_jobto refresh multiple credentials efficiently. - Client Reuse: Allow passing an existing
httpx.AsyncClienttofetch_quota_usageto avoid repeated overhead.
Questions for the Author
- Is the authentication difference between
get_models(Bearer) andfetch_quota_usage(Raw) intentional? - Does the Chutes API return any reset-related headers that we could use instead of the local 00:00 UTC calculation?
This review was generated by an AI assistant.
src/rotator_library/providers/utilities/chutes_quota_tracker.py
Outdated
Show resolved
Hide resolved
src/rotator_library/providers/utilities/chutes_quota_tracker.py
Outdated
Show resolved
Hide resolved
Implement quota tracking for Chutes provider using a simple standalone mixin pattern: Core Implementation: - ChutesQuotaTracker: Standalone mixin (no complex base class inheritance) - Tracks credential-level quota (1 request = 1 credit consumed) - Daily quota reset at 00:00 UTC - Automatic tier detection (Legacy=200, Base=300, Plus=2000, Pro=5000) Features: - fetch_quota_usage(): Queries Chutes API for quota/used credits - Background refresh job: Parallel quota updates via asyncio.gather - Integration with UsageManager using virtual model 'chutes/_quota' - get_model_quota_group(): Returns 'chutes_global' for shared quota pool - Configurable refresh interval via CHUTES_QUOTA_REFRESH_INTERVAL env var Performance: - Shared httpx.AsyncClient for connection reuse - Parallel credential fetching with semaphore (max 5 concurrent) Files: - NEW: chutes_quota_tracker.py - Standalone quota mixin - MODIFIED: chutes_provider.py - Add quota tracking + background job - MODIFIED: usage_manager.py - Add 'chutes' to _REQUEST_COUNT_PROVIDERS API Integration: - Endpoint: GET https://api.chutes.ai/users/me/quota_usage/me - Returns: { quota: int, used: float } - Auth: Bearer token in Authorization header Architecture: Simple standalone mixin for credential-level quota tracking. No OAuth complexity, no cost learning algorithms, no per-model quotas.
556658e to
8105c20
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Follow-up Review
The author has done an excellent job of addressing the feedback from the previous review. The architecture is now more robust and efficient.
Assessment of New Changes:
- Quota Grouping: Successfully implemented
get_model_quota_group, ensuring shared usage tracking across all models. - Performance: Parallelized the background quota refresh job and implemented HTTP client reuse, which will reduce latency and overhead.
- Consistency: Unified the authentication scheme to use
Bearertokens across all endpoints. - Integration: Correctly integrated 'chutes' into the
UsageManager's request-count tracking.
Overall Status:
The PR is now in a high-quality state and ready for merging.
This review was generated by an AI assistant.
| os.environ.get("CHUTES_QUOTA_REFRESH_INTERVAL", "300") | ||
| ) | ||
|
|
||
| def get_model_quota_group(self, model: str) -> Optional[str]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix verified: Implementing get_model_quota_group correctly ensures that all Chutes models share the same credential-level quota pool in the UsageManager. This addresses the grouping issue raised in the previous review.
| tasks = [ | ||
| refresh_single_credential(api_key, client) for api_key in credentials | ||
| ] | ||
| await asyncio.gather(*tasks, return_exceptions=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix verified: The background job now correctly parallelizes quota fetches using asyncio.gather and a shared httpx.AsyncClient. This significantly improves efficiency when managing multiple credentials, as suggested.
Implement quota tracking for Chutes provider using a simple standalone mixin pattern:
Core Implementation:
Features:
Files:
API Integration:
Architecture: Simple standalone mixin for credential-level quota tracking.
Important
Adds quota tracking for Chutes provider using a mixin pattern, with daily resets and tier detection, integrated into
UsageManager.ChutesQuotaTrackermixin for quota tracking inchutes_quota_tracker.py.UsageManagerusing virtual modelchutes/_quota.fetch_quota_usage(): Fetches quota usage from Chutes API.chutes_provider.py.CHUTES_QUOTA_REFRESH_INTERVAL.chutes_quota_tracker.pyfor quota tracking mixin.chutes_provider.pyto include quota tracking and background job.usage_manager.pyto add 'chutes' to_REQUEST_COUNT_PROVIDERS.GET https://api.chutes.ai/users/me/quota_usage/me.This description was created by
for 556658e. You can customize this summary. It will automatically update as commits are pushed.