Summary
GitHub Copilot's OpenAI-compatible completions API does not return x-ratelimit-* response headers, making it impossible for API consumers, custom clients, and AI agent frameworks to track token usage, quota consumption, and rate-limit state at runtime.
Current Behavior
When calling the GitHub Copilot completions API (OpenAI-compatible endpoint), the response contains no x-ratelimit-* headers:
x-ratelimit-limit-requests — absent
x-ratelimit-remaining-requests — absent
x-ratelimit-reset-requests — absent
x-ratelimit-limit-tokens — absent
x-ratelimit-remaining-tokens — absent
This has been confirmed both via direct API testing and by inspecting session logs from clients that parse these headers. There is also no REST endpoint to poll current quota usage (e.g. "how many requests do I have left this hour?"). By contrast, OpenAI exposes /dashboard/billing/usage, and providers like Nous Portal and OpenRouter return these headers natively.
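The check described above can be reproduced with a small helper. This is a minimal sketch, not part of any client: the function name `missing_ratelimit_headers` is hypothetical, and it assumes the response headers have already been collected into a plain mapping by whatever HTTP client is in use.

```python
from typing import Iterable, List

# The five headers this issue reports as absent.
EXPECTED_HEADERS = (
    "x-ratelimit-limit-requests",
    "x-ratelimit-remaining-requests",
    "x-ratelimit-reset-requests",
    "x-ratelimit-limit-tokens",
    "x-ratelimit-remaining-tokens",
)


def missing_ratelimit_headers(header_names: Iterable[str]) -> List[str]:
    """Return the expected rate-limit headers that are not in header_names.

    HTTP header names are case-insensitive, so the comparison is done in lowercase.
    """
    present = {name.lower() for name in header_names}
    return [name for name in EXPECTED_HEADERS if name not in present]
```

Passing the header names of a Copilot completions response should, per the behaviour reported here, return all five names.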
Expected Behavior
The completions API should return standard x-ratelimit-* headers with every response, consistent with other OpenAI-compatible providers:
x-ratelimit-limit-requests: 60
x-ratelimit-remaining-requests: 45
x-ratelimit-reset-requests: 23s
x-ratelimit-limit-tokens: 128000
x-ratelimit-remaining-tokens: 70432
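To illustrate how a client might consume these headers, here is a rough sketch of a parser. The names `RateLimitState`, `parse_reset`, and `parse_rate_limit_headers` are hypothetical, and the reset format is an assumption: it is treated as a duration string such as `23s` or `6m0s`. The function returns `None` when no x-ratelimit-* header is present, which is the state clients see for Copilot sessions today.

```python
import re
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumed duration units for the reset header (e.g. "23s", "6m0s", "20ms").
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}
_PART_RE = re.compile(r"(\d+(?:\.\d+)?)(ms|h|m|s)")
_FULL_RE = re.compile(r"(?:\d+(?:\.\d+)?(?:ms|h|m|s))+")


@dataclass
class RateLimitState:
    limit_requests: Optional[int]
    remaining_requests: Optional[int]
    reset_requests_seconds: Optional[float]
    limit_tokens: Optional[int]
    remaining_tokens: Optional[int]


def parse_reset(value: Optional[str]) -> Optional[float]:
    """Convert a duration string such as '6m0s' to seconds, or None if unparseable."""
    if value is None or not _FULL_RE.fullmatch(value.strip()):
        return None
    parts = _PART_RE.findall(value.strip())
    return sum(float(number) * _UNIT_SECONDS[unit] for number, unit in parts)


def _to_int(value: Optional[str]) -> Optional[int]:
    try:
        return int(value)  # type: ignore[arg-type]
    except (TypeError, ValueError):
        return None


def parse_rate_limit_headers(headers: Mapping[str, str]) -> Optional[RateLimitState]:
    """Build a RateLimitState from response headers; None if no such header exists."""
    lower = {name.lower(): value for name, value in headers.items()}
    if not any(name.startswith("x-ratelimit-") for name in lower):
        return None
    return RateLimitState(
        limit_requests=_to_int(lower.get("x-ratelimit-limit-requests")),
        remaining_requests=_to_int(lower.get("x-ratelimit-remaining-requests")),
        reset_requests_seconds=parse_reset(lower.get("x-ratelimit-reset-requests")),
        limit_tokens=_to_int(lower.get("x-ratelimit-limit-tokens")),
        remaining_tokens=_to_int(lower.get("x-ratelimit-remaining-tokens")),
    )
```

Missing or malformed individual values become `None` fields rather than raising, so a status display can still show whatever subset the server does provide.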
Motivation
- AI agent frameworks parse x-ratelimit-* headers to display real-time quota and rate-limit info in status bars and UIs. Without these headers, rate-limit state is always None for Copilot sessions, degrading the user experience compared to other providers.
- Usage transparency: developers building on the Copilot API have no programmatic way to know how close they are to their quota limit mid-session. The information is simply not exposed at runtime.
- OpenAI compatibility: the API surface implies OpenAI-compatible behaviour. Returning these headers would align Copilot with the de facto convention that other OpenAI-compatible providers follow.
Workaround
None exists for quota transparency at runtime. Switching to Nous Portal, OpenRouter, or OpenAI as the provider restores header-based quota tracking immediately.
Additional Context
- GitHub Copilot enforces quota server-side based on subscription plan.
- This affects any third-party tool, IDE extension, or agent framework that relies on x-ratelimit-* headers for adaptive rate limiting or status display (RPM remaining, token budget remaining, reset time).
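As an illustration of the adaptive rate limiting mentioned above, a client could pace its requests from the remaining-request count and the reset time. This is only a sketch of one possible policy; the function name `suggested_delay` and the even-spacing strategy are assumptions, not something any particular framework is known to implement.

```python
from typing import Optional


def suggested_delay(remaining_requests: Optional[int],
                    reset_seconds: Optional[float]) -> float:
    """Return how many seconds to wait before the next request.

    - Unknown state (headers absent, as with Copilot today): no pacing is possible, so 0.
    - Budget exhausted: wait for the full reset window.
    - Otherwise: spread the remaining requests evenly over the reset window.
    """
    if remaining_requests is None or reset_seconds is None:
        return 0.0
    if remaining_requests <= 0:
        return max(reset_seconds, 0.0)
    return max(reset_seconds, 0.0) / remaining_requests
```

With the headers absent, the first branch is the only one ever taken, so a client cannot slow down before hitting the server-side limit.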