Feature Request: Return x-ratelimit-* headers in GitHub Copilot completions API responses #1187

@renne

Summary

GitHub Copilot's OpenAI-compatible completions API does not return x-ratelimit-* response headers, making it impossible for API consumers, custom clients, and AI agent frameworks to track token usage, quota consumption, and rate-limit state at runtime.

Current Behavior

When calling the GitHub Copilot completions API (OpenAI-compatible endpoint), the response contains no x-ratelimit-* headers:

  • x-ratelimit-limit-requests — absent
  • x-ratelimit-remaining-requests — absent
  • x-ratelimit-reset-requests — absent
  • x-ratelimit-limit-tokens — absent
  • x-ratelimit-remaining-tokens — absent

This has been confirmed both via direct API testing and by inspecting session logs from clients that parse these headers. There is also no REST endpoint to poll current quota usage (e.g. "how many requests do I have left this hour?"). By contrast, OpenAI offers /dashboard/billing/usage, and providers such as Nous Portal and OpenRouter return these headers natively.
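One way such a check could be scripted is sketched below. It takes any mapping of response headers (for example, the `headers` attribute of a response object from an HTTP client) and reports which of the five headers listed above are missing. The function name `missing_ratelimit_headers` is hypothetical, and the sketch makes no network call itself.

```python
from typing import Mapping

# Header names taken from the list in "Current Behavior".
EXPECTED_HEADERS = (
    "x-ratelimit-limit-requests",
    "x-ratelimit-remaining-requests",
    "x-ratelimit-reset-requests",
    "x-ratelimit-limit-tokens",
    "x-ratelimit-remaining-tokens",
)


def missing_ratelimit_headers(headers: Mapping[str, str]) -> list[str]:
    """Return the expected rate-limit headers that are absent from `headers`."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    present = {name.lower() for name in headers}
    return [name for name in EXPECTED_HEADERS if name not in present]
```

Against a response that carries none of these headers, the function returns all five names; against a provider that sends them, it returns an empty list.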

Expected Behavior

The completions API should return the standard x-ratelimit-* headers with every response, consistent with other OpenAI-compatible providers:

x-ratelimit-limit-requests: 60
x-ratelimit-remaining-requests: 45
x-ratelimit-reset-requests: 23s
x-ratelimit-limit-tokens: 128000
x-ratelimit-remaining-tokens: 70432
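For illustration, a client might turn header values like the ones above into a typed structure roughly as follows. This is a minimal sketch, not any particular framework's parser: `RateLimitState`, `parse_duration`, and `parse_ratelimit_headers` are hypothetical names, and the assumption that reset values are durations built from `h`, `m`, `s`, and `ms` units (such as `23s` or `1m30s`) is based only on the example value shown here.

```python
import re
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumed duration format: one or more <number><unit> parts, e.g. "23s", "1m30s".
_DURATION_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|h|m|s)")
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}


def parse_duration(value: str) -> Optional[float]:
    """Convert a duration string to seconds; return None if it cannot be parsed."""
    pos, total = 0, 0.0
    for match in _DURATION_PART.finditer(value):
        if match.start() != pos:  # unexpected characters between parts
            return None
        total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    return total if pos == len(value) and pos > 0 else None


@dataclass
class RateLimitState:
    limit_requests: Optional[int] = None
    remaining_requests: Optional[int] = None
    reset_requests_seconds: Optional[float] = None
    limit_tokens: Optional[int] = None
    remaining_tokens: Optional[int] = None


def _to_int(value: Optional[str]) -> Optional[int]:
    """Parse an integer header value; missing or malformed values become None."""
    try:
        return int(value) if value is not None else None
    except ValueError:
        return None


def parse_ratelimit_headers(headers: Mapping[str, str]) -> RateLimitState:
    """Build a RateLimitState from response headers; absent headers stay None."""
    lower = {k.lower(): v for k, v in headers.items()}
    reset = lower.get("x-ratelimit-reset-requests")
    return RateLimitState(
        limit_requests=_to_int(lower.get("x-ratelimit-limit-requests")),
        remaining_requests=_to_int(lower.get("x-ratelimit-remaining-requests")),
        reset_requests_seconds=parse_duration(reset) if reset is not None else None,
        limit_tokens=_to_int(lower.get("x-ratelimit-limit-tokens")),
        remaining_tokens=_to_int(lower.get("x-ratelimit-remaining-tokens")),
    )
```

With no headers present, every field stays `None`, which matches the behaviour currently seen for Copilot sessions.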

Motivation

  1. AI agent frameworks parse x-ratelimit-* headers to display real-time quota/rate-limit info in status bars and UIs. Without these headers, rate-limit state is always None for Copilot sessions, degrading the user experience compared to other providers.

  2. Usage transparency: developers building on the Copilot API have no programmatic way to know how close they are to their quota limit mid-session. The information is simply not exposed at runtime.

  3. OpenAI-compatibility: the API surface implies OpenAI-compatible behaviour. Returning these headers would align Copilot with the de-facto standard followed by other OpenAI-compatible providers such as Nous Portal and OpenRouter.

Workaround

None exists for quota transparency at runtime. Switching to Nous Portal, OpenRouter, or OpenAI as the provider restores header-based quota tracking immediately.

Additional Context

  • GitHub Copilot enforces quota server-side based on subscription plan.
  • This affects any third-party tool, IDE extension, or agent framework that relies on x-ratelimit-* headers for adaptive rate limiting or status display (RPM remaining, token budget remaining, reset time).
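As a rough illustration of the adaptive rate limiting mentioned above, a client could pace itself from the remaining-request count and the reset time. The sketch below shows one simple policy (spread the remaining requests evenly over the time left in the window); it is an assumption for illustration, not how any specific tool behaves, and `suggested_delay` is a hypothetical name.

```python
from typing import Optional


def suggested_delay(
    remaining_requests: Optional[int],
    reset_seconds: Optional[float],
) -> float:
    """Seconds to wait before the next request under an even-spacing policy."""
    if remaining_requests is None or reset_seconds is None:
        # No rate-limit headers were returned, so there is nothing to adapt to.
        return 0.0
    window = max(reset_seconds, 0.0)
    if remaining_requests <= 0:
        # Budget exhausted: wait for the window to reset.
        return window
    # Spread the remaining requests evenly across the rest of the window.
    return window / remaining_requests
```

When the headers are absent, the function has no information and returns `0.0`, so the client can only react after a request has already been rejected.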
