Feature Request: Return x-ratelimit-* headers in GitHub Copilot completions API responses #1187

## Summary

GitHub Copilot's OpenAI-compatible completions API does not return `x-ratelimit-*` response headers, making it impossible for API consumers, custom clients, and AI agent frameworks to track token usage, quota consumption, and rate-limit state at runtime.

## Current Behavior

When calling the GitHub Copilot completions API (OpenAI-compatible endpoint), the response contains **no** `x-ratelimit-*` headers:
- `x-ratelimit-limit-requests` — absent
- `x-ratelimit-remaining-requests` — absent
- `x-ratelimit-reset-requests` — absent
- `x-ratelimit-limit-tokens` — absent
- `x-ratelimit-remaining-tokens` — absent

This has been confirmed both via direct API testing and by inspecting session logs from clients that parse these headers. There is also **no REST endpoint** for polling current quota usage (e.g. "how many requests do I have left this hour?"). By contrast, OpenAI exposes `/dashboard/billing/usage`, and providers such as Nous Portal and OpenRouter return these headers natively.

## Expected Behavior

The completions API should return standard `x-ratelimit-*` headers with every response, consistent with other OpenAI-compatible providers. For example (values are illustrative):

```
x-ratelimit-limit-requests: 60
x-ratelimit-remaining-requests: 45
x-ratelimit-reset-requests: 23s
x-ratelimit-limit-tokens: 128000
x-ratelimit-remaining-tokens: 70432
```
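For illustration, here is a minimal sketch of how a client might consume these headers if they were returned. The `RateLimitState` class and both function names are hypothetical, and the duration format (`23s`, and compound forms like `1m30s`) is an assumption based on the example values above, not a documented Copilot format. When any header is missing, as is the case for Copilot today, the parser returns `None`.

```python
import re
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumed duration units; compound values such as "1m30s" are an assumption.
_UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}
_DURATION = re.compile(r"(?:\d+(?:\.\d+)?(?:ms|s|m|h))+")
_DURATION_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|s|m|h)")


@dataclass
class RateLimitState:
    """Hypothetical container for the five headers requested in this issue."""
    limit_requests: int
    remaining_requests: int
    reset_requests_seconds: float
    limit_tokens: int
    remaining_tokens: int


def parse_duration(value: str) -> float:
    """Convert a duration string such as '23s' or '1m30s' to seconds."""
    value = value.strip()
    if not _DURATION.fullmatch(value):
        raise ValueError(f"unrecognized duration: {value!r}")
    return sum(float(n) * _UNIT_SECONDS[u] for n, u in _DURATION_PART.findall(value))


def parse_rate_limit_headers(headers: Mapping[str, str]) -> Optional[RateLimitState]:
    """Return the parsed state, or None if any header is absent or malformed."""
    lowered = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    try:
        return RateLimitState(
            limit_requests=int(lowered["x-ratelimit-limit-requests"]),
            remaining_requests=int(lowered["x-ratelimit-remaining-requests"]),
            reset_requests_seconds=parse_duration(lowered["x-ratelimit-reset-requests"]),
            limit_tokens=int(lowered["x-ratelimit-limit-tokens"]),
            remaining_tokens=int(lowered["x-ratelimit-remaining-tokens"]),
        )
    except (KeyError, ValueError):
        return None
```

Passing the response headers of any HTTP client (as a mapping of names to string values) is enough. A client can then show a status line when the result is a `RateLimitState` and hide it when the result is `None`.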

## Motivation

1. **AI agent frameworks** parse `x-ratelimit-*` headers to display real-time quota/rate-limit info in status bars and UIs. Without these headers, rate-limit state is always `None` for Copilot sessions, degrading the user experience compared to other providers.

2. **Usage transparency**: developers building on the Copilot API have no programmatic way to know how close they are to their quota limit mid-session. The quota is enforced server-side, but its state is never exposed to the client at runtime.

3. **OpenAI-compatibility**: the API surface implies OpenAI-compatible behaviour. Returning these headers would align Copilot with the de-facto convention followed by other OpenAI-compatible providers such as Nous Portal and OpenRouter.

## Workaround

None exists for quota transparency at runtime. Switching to Nous Portal, OpenRouter, or OpenAI as the provider restores header-based quota tracking immediately.

## Additional Context

- GitHub Copilot enforces quota server-side based on subscription plan.
- This affects any third-party tool, IDE extension, or agent framework that relies on `x-ratelimit-*` headers for adaptive rate limiting or status display (RPM remaining, token budget remaining, reset time).
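As a rough sketch of the adaptive rate limiting mentioned above, a client could spread its remaining requests evenly over the time left until the reset. The function name `suggested_delay` and the 1-second fallback are hypothetical choices for illustration. Without the headers, the client has no data to adapt to and can only use the fixed fallback.

```python
from typing import Optional


def suggested_delay(
    remaining_requests: Optional[int],
    reset_seconds: Optional[float],
    fallback: float = 1.0,
) -> float:
    """Seconds to wait before the next request.

    Spreads the remaining request budget evenly across the reset window.
    Falls back to a fixed delay when no rate-limit data is available.
    """
    if remaining_requests is None or reset_seconds is None:
        return fallback  # no headers: nothing to adapt to
    window = max(reset_seconds, 0.0)
    if remaining_requests <= 0:
        return window  # budget exhausted: wait for the reset
    return window / remaining_requests
```

With 10 requests remaining and 20 seconds until reset, this yields a 2-second pause between requests. With the budget exhausted, it waits out the full window.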
