feat(gemini): ✨ implement native client fingerprinting and endpoint fallbacks #77

Mirrowel · 2026-01-16T00:49:37Z

Update request headers, user-agent, and client metadata to exactly match the official Gemini CLI (v0.26.0) fingerprint.
Implement persistent session_id and unique user_prompt_id generation to replicate native conversation tracking.
Add automatic endpoint fallback strategy (Sandbox Daily → Production) for both streaming generation and token counting.
Enhance error handling to trigger failovers on 5xx server errors and connection timeouts, while preserving explicit 429 rate limit handling.
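
A minimal sketch of the fallback order described above, not the provider's actual code. The endpoint URLs are placeholders, and `UpstreamError`, `call_with_fallback`, and the `send` callable are hypothetical names used only for illustration; the real list lives in GEMINI_CLI_ENDPOINT_FALLBACKS.

```python
from typing import Callable, Sequence

# Placeholder URLs (assumption); the provider's real list is
# GEMINI_CLI_ENDPOINT_FALLBACKS in gemini_shared_utils.py.
ENDPOINT_FALLBACKS = (
    "https://sandbox-daily.example.invalid",
    "https://production.example.invalid",
)


class UpstreamError(Exception):
    """Illustrative error type carrying an HTTP status code."""

    def __init__(self, status: int):
        super().__init__(f"upstream returned {status}")
        self.status = status


def call_with_fallback(
    send: Callable[[str], str],
    endpoints: Sequence[str] = ENDPOINT_FALLBACKS,
) -> str:
    """Try each endpoint in order; fail over on 5xx and timeouts only."""
    for index, endpoint in enumerate(endpoints):
        is_last = index == len(endpoints) - 1
        try:
            return send(endpoint)
        except UpstreamError as exc:
            # 429 and other non-5xx statuses go straight to the caller so
            # the existing rate-limit handling still sees them.
            if exc.status < 500 or is_last:
                raise
        except TimeoutError:
            if is_last:
                raise
    # Only reachable when the endpoint list is empty.
    raise ValueError("endpoints must not be empty")
```

Because the last iteration re-raises, nothing after the loop runs once at least one endpoint has been tried.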

Important

Implement native client fingerprinting and endpoint fallbacks for Gemini CLI, updating headers, session management, and error handling.

Behavior:
- Update request headers in gemini_auth_base.py and gemini_cli_provider.py to match Gemini CLI v0.26.0.
- Implement persistent session_id and unique user_prompt_id generation in gemini_cli_provider.py.
- Add endpoint fallback strategy in gemini_cli_provider.py using GEMINI_CLI_ENDPOINT_FALLBACKS.
- Enhance error handling in gemini_cli_provider.py to trigger failovers on 5xx errors and timeouts, while preserving 429 handling.
Utilities:
- Add GEMINI_CLI_ENDPOINT_FALLBACKS to gemini_shared_utils.py for endpoint management.
Misc:
- Comment out unused headers in gemini_auth_base.py and gemini_cli_provider.py for potential SDK mimicry.

This description was created by ellipsis-dev for 5f604d3. You can customize this summary. It will automatically update as commits are pushed.

…allbacks
- Update request headers, user-agent, and client metadata to exactly match the official Gemini CLI (v0.26.0) fingerprint.
- Implement persistent `session_id` and unique `user_prompt_id` generation to replicate native conversation tracking.
- Add automatic endpoint fallback strategy (Sandbox Daily → Production) for both streaming generation and token counting.
- Enhance error handling to trigger failovers on 5xx server errors and connection timeouts, while preserving explicit 429 rate limit handling.

Mirrowel

Importantly, this seems to fix 429 rate limits.

The native `gemini-cli` does not send `X-Goog-Api-Client` or `Client-Metadata` as HTTP headers during the OAuth/Code Assist flow. These headers are specific to the SDK path or sent within the request body for management endpoints.
- Remove `X-Goog-Api-Client` and `Client-Metadata` from HTTP headers to strictly mimic the official CLI fingerprint.
- Remove explicit `Accept` header as it is not sent by the native client.
- Update internal documentation and source code references to reflect deep analysis of `gemini-cli` internals.
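
A rough sketch of the reduced header set this commit describes. The function name `build_code_assist_headers` is hypothetical, and keeping `User-Agent` is an assumption based on the PR description rather than on this commit message.

```python
def build_code_assist_headers(access_token: str, user_agent: str) -> dict[str, str]:
    """Return the trimmed header set; illustrative only."""
    # Deliberately absent: X-Goog-Api-Client, Client-Metadata, Accept.
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {access_token}",
        "User-Agent": user_agent,  # assumption: still sent after this commit
    }
```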

mirrobot-agent · 2026-01-16T01:15:04Z

I'm starting my review of the Gemini CLI alignment features. I'll be looking closely at the new client fingerprinting and fallback strategies. Stand by for the full report!

mirrobot-agent

Overall Assessment

This PR is a solid enhancement to the Gemini provider. The alignment with the native Gemini CLI v0.26.0 fingerprinting (headers, session management, and payload structure) is meticulous and well-researched. The implementation of the endpoint fallback strategy (Sandbox Daily → Production) is a great addition for improving reliability and potentially navigating rate limits more effectively.

Architectural Feedback

The decision to persist the session_id within the GeminiCliProvider instance is correct, as it ensures conversation continuity across requests as long as the provider instance lives in the RotatingClient cache. The use of secrets for user_prompt_id is also a good security-conscious choice for randomness.

Key Suggestions

Clean up redundant logic: I found a small section of unreachable code in the endpoint fallback loop that can be removed.

Questions for the Author

None. The implementation details and source code references in the comments are very helpful!

This review was generated by an AI assistant.

mirrobot-agent · 2026-01-16T01:16:25Z

src/rotator_library/providers/gemini_cli_provider.py

+                        raise
+
+                # If we get here, all endpoints failed (shouldn't happen due to raise in loop)
+                if last_endpoint_error:


This block appears to be unreachable because the fallback loop (lines 1478-1626) already raises the final exception when the GEMINI_CLI_ENDPOINT_FALLBACKS list is exhausted. Consider removing it to simplify the code.

Mirrowel · 2026-01-16T01:38:09Z

2 research docs generated, for later reference if needed:

Gemini CLI Alignment Plan

This plan outlines the exact technical requirements to align gemini_cli_provider.py with the official gemini-cli client. By mirroring these request "fingerprints," the proxy will appear as the legitimate client to Google's backend, potentially accessing more favorable rate-limit buckets.

Research Sources

Primary Sources (Official gemini-cli Repository)

File	Lines	Purpose
`stuff/gemini-cli/packages/core/package.json`	3, 28	Version (`0.26.0`), SDK version (`@google/genai: 1.30.0`)
`stuff/gemini-cli/packages/core/src/code_assist/server.ts`	60-70, 72-120	Endpoint URL, API version, request flow
`stuff/gemini-cli/packages/core/src/code_assist/converter.ts`	31-48, 119-162	Exact payload structure (`CAGenerateContentRequest`)
`stuff/gemini-cli/packages/core/src/code_assist/experiments/client_metadata.ts`	19-39, 46-57	Platform enum mapping, `ClientMetadata` structure
`stuff/gemini-cli/packages/core/src/utils/version.ts`	14-17	Version string generation
`stuff/gemini-cli/packages/core/src/utils/channel.ts`	25-43	Release channel detection (`stable`/`nightly`/`preview`)
`stuff/gemini-cli/packages/cli/src/gemini.tsx`	668	`user_prompt_id` generation logic
`stuff/gemini-cli/packages/core/src/utils/googleQuotaErrors.ts`	100-180	Error classification and retry delay extraction
`stuff/gemini-cli/packages/core/src/utils/googleErrors.ts`	131-222	Deep JSON error parsing
`stuff/gemini-cli/packages/core/src/code_assist/oauth2.ts`	69-85	OAuth Client ID/Secret (verified match)

Current Implementation Files

File	Lines	Current State
`src/rotator_library/providers/gemini_cli_provider.py`	1343-1354	Payload construction
`src/rotator_library/providers/gemini_cli_provider.py`	1405-1413	Header construction
`src/rotator_library/providers/gemini_auth_base.py`	19-23, 36-44	Auth headers, OAuth credentials

1. Header Alignment

The official client constructs headers dynamically based on runtime environment.

1.1 `User-Agent` Header

Native Format:

GeminiCLI/${version}/${model} (${platform}; ${arch})

Examples:

GeminiCLI/0.26.0/gemini-2.5-pro (win32; x64)
GeminiCLI/0.26.0/gemini-3-flash-preview (darwin; arm64)

Current (WRONG):

google-api-nodejs-client/9.15.1

Implementation:

def _get_aligned_user_agent(self, model: str) -> str:
    import platform
    import sys
    plat = sys.platform  # 'win32', 'darwin', 'linux'
    arch = platform.machine()  # 'x86_64', 'arm64', 'AMD64', 'aarch64'
    # Normalize to the names Node.js reports ('x64', 'arm64')
    if arch in ('x86_64', 'AMD64'):
        arch = 'x64'
    elif arch == 'aarch64':
        arch = 'arm64'
    model_name = model.split("/")[-1]  # drop any provider prefix
    return f"GeminiCLI/0.26.0/{model_name} ({plat}; {arch})"

1.2 `X-Goog-Api-Client` Header

Native Format:

gl-node/${node_version} gdcl/${sdk_version}

Correct Value:

gl-node/22.17.0 gdcl/1.30.0

Current (INCOMPLETE):

gl-node/22.17.0

Note: gdcl refers to the @google/genai SDK version from package.json line 28.

1.3 `Client-Metadata` Header

Native Structure (from client_metadata.ts lines 46-54):

{
  ideType: 'IDE_UNSPECIFIED',
  pluginType: 'GEMINI',
  ideVersion: '0.26.0',           // CLI version
  platform: 'WINDOWS_AMD64',      // See platform mapping below
  updateChannel: 'stable'         // 'stable' | 'nightly' | 'preview'
}

Serialized Format:

ideType=IDE_UNSPECIFIED,pluginType=GEMINI,ideVersion=0.26.0,platform=WINDOWS_AMD64,updateChannel=stable

Current (INCOMPLETE):

ideType=IDE_UNSPECIFIED,platform=PLATFORM_UNSPECIFIED,pluginType=GEMINI

Missing Fields:

ideVersion (required)
updateChannel (required)
Dynamic platform (currently hardcoded to PLATFORM_UNSPECIFIED)
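
A small sketch of how the serialized string above could be assembled once the missing fields are filled in. `serialize_client_metadata` is a hypothetical helper name; the field order simply follows the serialized format shown in this section.

```python
def serialize_client_metadata(
    platform_enum: str,
    version: str = "0.26.0",
    channel: str = "stable",
) -> str:
    """Build the comma-separated key=value string for Client-Metadata."""
    fields = {
        "ideType": "IDE_UNSPECIFIED",
        "pluginType": "GEMINI",
        "ideVersion": version,
        "platform": platform_enum,
        "updateChannel": channel,
    }
    # dicts preserve insertion order, so the output order is stable.
    return ",".join(f"{key}={value}" for key, value in fields.items())
```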

1.4 Platform Enum Mapping

Source: client_metadata.ts lines 19-39

`sys.platform`	`platform.machine()`	Enum Value
`darwin`	`x86_64` / `x64`	`DARWIN_AMD64`
`darwin`	`arm64`	`DARWIN_ARM64`
`linux`	`x86_64` / `x64`	`LINUX_AMD64`
`linux`	`aarch64` / `arm64`	`LINUX_ARM64`
`win32`	`AMD64` / `x86_64`	`WINDOWS_AMD64`
(other)	(any)	`PLATFORM_UNSPECIFIED`

Implementation:

def _get_platform_enum(self) -> str:
    import platform
    import sys
    plat = sys.platform
    arch = platform.machine().lower()
    
    if plat == 'darwin':
        if arch in ('x86_64', 'amd64'):
            return 'DARWIN_AMD64'
        if arch == 'arm64':
            return 'DARWIN_ARM64'
    elif plat == 'linux':
        if arch in ('x86_64', 'amd64'):
            return 'LINUX_AMD64'
        if arch in ('aarch64', 'arm64'):
            return 'LINUX_ARM64'
    elif plat == 'win32':
        if arch in ('amd64', 'x86_64'):
            return 'WINDOWS_AMD64'
    
    return 'PLATFORM_UNSPECIFIED'

1.5 `X-Goog-User-Project` Header

Native Behavior: Sent with the project_id for quota attribution.

Value: The GCP project ID associated with the credential (already in self.project_id_cache).

2. Request Payload Alignment

2.1 Correct Payload Structure

Source: converter.ts lines 31-48, 119-131

Native TypeScript Interface:

interface CAGenerateContentRequest {
  model: string;                    // Top level
  project?: string;                 // Top level
  user_prompt_id?: string;          // Top level - MISSING in current impl
  request: {
    contents: Content[];
    systemInstruction?: Content;
    tools?: ToolListUnion;
    toolConfig?: ToolConfig;
    labels?: Record<string, string>;
    safetySettings?: SafetySetting[];
    generationConfig?: VertexGenerationConfig;
    session_id?: string;            // INSIDE request - MISSING in current impl
  }
}

Correct JSON:

{
  "model": "gemini-2.5-pro",
  "project": "your-project-id",
  "user_prompt_id": "a1b2c3d4e5f6a7",
  "request": {
    "contents": [...],
    "systemInstruction": {...},
    "generationConfig": {...},
    "safetySettings": [...],
    "session_id": "550e8400-e29b-41d4-a716-446655440000"
  }
}

Current Implementation Issues (lines 1343-1354):

Missing user_prompt_id at top level
Missing session_id inside request
Has extra fields not in native CLI:
- requestType: "agent" - REMOVE
- requestId: "agent-{uuid}" - REMOVE (replace with user_prompt_id)

2.2 `user_prompt_id` Generation

Source: gemini.tsx line 668

Native JavaScript:

const prompt_id = Math.random().toString(16).slice(2);

This typically generates a 13-14 character hexadecimal string (e.g., "a1b2c3d4e5f6a7").

Python Equivalent:

import secrets

def _generate_user_prompt_id(self) -> str:
    """Generate a unique prompt ID matching native gemini-cli format."""
    # secrets.token_hex(7) produces 14 hex chars
    return secrets.token_hex(7)

Lifecycle: Generate a new user_prompt_id for every request.

2.3 `session_id` Management

Source: server.ts lines 64-70, converter.ts line 160

Native Behavior:

session_id is a persistent UUID for the duration of a conversation
Passed to CodeAssistServer constructor, then placed inside request

Implementation:

import uuid

def __init__(self):
    # ... existing init ...
    self._session_id = str(uuid.uuid4())  # Persistent per provider instance

# In request construction:
request_payload["request"]["session_id"] = self._session_id

Note: The session_id should persist across multiple requests within the same "session" (provider instance lifetime).

3. Error Metadata Alignment

3.1 Priority for Retry Delay Extraction

Source: googleQuotaErrors.ts lines 100-180, googleErrors.ts lines 131-222

Native Priority Order:

ErrorInfo.metadata.quotaResetDelay (e.g., "539.477544ms")
RetryInfo.retryDelay (e.g., "0.539477544s")
Message regex: /Please retry in ([0-9.]+(?:ms|s))/
Default: 10 seconds for RATE_LIMIT_EXCEEDED

Current Implementation (lines 289-302):
Already checks quotaResetDelay and RetryInfo - this is correct.

Improvement: Support millisecond precision (current _parse_duration returns int, should return float).
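
A possible shape for a float-returning parser, sketched under the assumption that only the `ms` and `s` suffixes shown above need handling. `parse_duration_seconds` and `retry_delay_from_message` are hypothetical names, not the provider's existing `_parse_duration`.

```python
import re
from typing import Optional

_DURATION_RE = re.compile(r"^([0-9]+(?:\.[0-9]+)?)(ms|s)$")
# Same pattern as the native message regex quoted above.
_MESSAGE_RE = re.compile(r"Please retry in ([0-9.]+(?:ms|s))")


def parse_duration_seconds(value: str) -> Optional[float]:
    """Convert '539.477544ms' or '0.539477544s' to seconds as a float."""
    match = _DURATION_RE.match(value.strip())
    if match is None:
        return None
    number, unit = float(match.group(1)), match.group(2)
    return number / 1000.0 if unit == "ms" else number


def retry_delay_from_message(message: str) -> Optional[float]:
    """Fallback: pull the delay out of the human-readable error message."""
    match = _MESSAGE_RE.search(message)
    return parse_duration_seconds(match.group(1)) if match else None
```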

3.2 Error Classification

Source: googleQuotaErrors.ts lines 50-100

Condition	Classification	Action
`reason == "QUOTA_EXHAUSTED"`	Terminal	Fallback to different model
`reason == "RATE_LIMIT_EXCEEDED"`	Retryable	Wait and retry
`retry_delay > 300s`	Terminal	Fallback to different model
`quotaId` contains `PerDay`	Terminal	Fallback to different model
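
The table can be read as a small decision function. This is a sketch only: the function name, the string return values, and the order in which the rows are checked are assumptions, as is returning "unknown" for reasons the table does not list.

```python
from typing import Optional

TERMINAL_DELAY_SECONDS = 300.0  # threshold from the table above


def classify_quota_error(
    reason: Optional[str],
    retry_delay: Optional[float] = None,
    quota_id: Optional[str] = None,
) -> str:
    """Return 'terminal', 'retryable', or 'unknown' for a quota error."""
    if reason == "QUOTA_EXHAUSTED":
        return "terminal"
    if quota_id is not None and "PerDay" in quota_id:
        return "terminal"
    if retry_delay is not None and retry_delay > TERMINAL_DELAY_SECONDS:
        return "terminal"
    if reason == "RATE_LIMIT_EXCEEDED":
        return "retryable"
    # Assumption: anything else is left for the caller to decide.
    return "unknown"
```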

4. OAuth Credentials (Already Aligned)

Source: oauth2.ts lines 69-78, gemini_auth_base.py lines 36-44

Field	Native Value	Our Value	Status
Client ID	`681255809395-oo8ft2oprdrnp9e3aqf6av3hmdib135j.apps.googleusercontent.com`	Same	✅
Client Secret	`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`	Same	✅
Scopes	`cloud-platform`, `userinfo.email`, `userinfo.profile`	Same	✅

5. Implementation Checklist

Phase 1: Header Updates (`gemini_cli_provider.py` lines 1405-1413)

Replace static User-Agent with dynamic _get_aligned_user_agent(model)
Update X-Goog-Api-Client to gl-node/22.17.0 gdcl/1.30.0
Add ideVersion=0.26.0 to Client-Metadata
Add updateChannel=stable to Client-Metadata
Replace platform=PLATFORM_UNSPECIFIED with dynamic _get_platform_enum()
Add X-Goog-User-Project: {project_id} header

Phase 2: Payload Updates (`gemini_cli_provider.py` lines 1343-1380)

Add user_prompt_id at top level (use _generate_user_prompt_id())
Add session_id inside request object
Remove requestType field (not in native)
Remove requestId field (replaced by user_prompt_id)

Phase 3: Provider Initialization

Generate persistent _session_id in __init__
Add helper methods: _get_aligned_user_agent(), _get_platform_enum(), _generate_user_prompt_id()

Phase 4: Error Parsing (Optional Enhancement)

Update _parse_duration to return float instead of int
Add classification for QUOTA_EXHAUSTED vs RATE_LIMIT_EXCEEDED

6. Expected Final Request

# Headers
{
    "Authorization": "Bearer {access_token}",
    "User-Agent": "GeminiCLI/0.26.0/gemini-2.5-pro (win32; x64)",
    "X-Goog-Api-Client": "gl-node/22.17.0 gdcl/1.30.0",
    "X-Goog-User-Project": "my-gcp-project-123",
    "Client-Metadata": "ideType=IDE_UNSPECIFIED,pluginType=GEMINI,ideVersion=0.26.0,platform=WINDOWS_AMD64,updateChannel=stable",
    "Accept": "application/json",
    "Content-Type": "application/json",
}

# Payload
{
    "model": "gemini-2.5-pro",
    "project": "my-gcp-project-123",
    "user_prompt_id": "a1b2c3d4e5f6a7",
    "request": {
        "contents": [...],
        "systemInstruction": {...},
        "generationConfig": {...},
        "safetySettings": [...],
        "tools": [...],
        "toolConfig": {...},
        "session_id": "550e8400-e29b-41d4-a716-446655440000"
    }
}

7. Verification

After implementation, verify alignment by:

Capture native request: Run gemini-cli with debug logging enabled
Compare headers: Ensure all header values match exactly
Compare payload: Ensure structure and field names match
Monitor rate limits: Check if 429 frequency decreases
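
For the header comparison step, a small helper like the following could list mismatches between a captured native request and the provider's output. `diff_headers` is a hypothetical name, and the captured headers are assumed to be available as plain dicts.

```python
def diff_headers(native: dict[str, str], ours: dict[str, str]) -> dict[str, tuple]:
    """Map each mismatching header name to (native_value, our_value)."""
    # HTTP header names are case-insensitive, so compare on lowercase keys.
    native_l = {k.lower(): v for k, v in native.items()}
    ours_l = {k.lower(): v for k, v in ours.items()}
    return {
        name: (native_l.get(name), ours_l.get(name))
        for name in sorted(native_l.keys() | ours_l.keys())
        if native_l.get(name) != ours_l.get(name)
    }
```

`Authorization` will normally differ between captures because tokens rotate, so it is worth dropping from both dicts before comparing.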

Appendix: Version Constants

Keep these updated when the official client releases new versions:

# From packages/core/package.json
GEMINI_CLI_VERSION = "0.26.0"
GENAI_SDK_VERSION = "1.30.0"
NODE_VERSION = "22.17.0"
UPDATE_CHANNEL = "stable"  # or "nightly" for development

Gemini CLI Payload Construction Research Report

Date: January 16, 2026
Scope: OAuth/Code Assist Path Only
Purpose: Document differences between native gemini-cli and gemini_cli_provider.py for alignment

Executive Summary

This document details the findings from analyzing the native Gemini CLI source code (stuff/gemini-cli/) to understand how it constructs API requests for the Code Assist endpoint. The goal is to ensure gemini_cli_provider.py accurately replicates the official client behavior.

Key Findings:

The native CLI does NOT explicitly set X-Goog-Api-Client or Client-Metadata headers in its Code Assist HTTP calls - these appear to be artifacts from mimicking the @google/genai SDK
The labels field is supported but not actively used in the CLI's request construction
The native CLI has specific logic for handling "thought" parts that the Python provider lacks
The thinkingConfig supports multiple configuration styles (budget vs. level) depending on model family

1. Authentication Flow

1.1 OAuth Client Configuration

The native CLI uses Google's OAuth2 flow with specific client credentials.

Source: stuff/gemini-cli/packages/core/src/code_assist/oauth2.ts

// Lines 69-78
const OAUTH_CLIENT_ID =
  '681255809395-oo8ft2oprdrnp9e3aqf6av3hmdib135j.apps.googleusercontent.com';

const OAUTH_CLIENT_SECRET = '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';

const OAUTH_SCOPE = [
  'https://www.googleapis.com/auth/cloud-platform',
  'https://www.googleapis.com/auth/userinfo.email',
  'https://www.googleapis.com/auth/userinfo.profile',
];

Current Provider Status: Matches - uses identical OAuth credentials.

1.2 Token Refresh

The CLI uses google-auth-library OAuth2Client which automatically handles token refresh via the tokens event.

Source: stuff/gemini-cli/packages/core/src/code_assist/oauth2.ts:154-162

2. HTTP Headers Analysis

2.1 Headers Set by Native CLI

The Code Assist server (CodeAssistServer) sets minimal headers for its API calls.

Source: stuff/gemini-cli/packages/core/src/code_assist/server.ts

// Lines 284-290 (streamGenerateContent)
headers: {
  'Content-Type': 'application/json',
  Authorization: `Bearer ${await this.getAccessToken()}`,
},

// Lines 302-308 (countTokens)
headers: {
  'Content-Type': 'application/json',
  Authorization: `Bearer ${await this.getAccessToken()}`,
},

// Lines 331-337 (retrieveUserQuota)
headers: {
  'Content-Type': 'application/json',
  Authorization: `Bearer ${await this.getAccessToken()}`,
},

Key Observation: The native CLI Code Assist path sets only:

Content-Type: application/json
Authorization: Bearer <token>

2.2 User-Agent Header

The User-Agent is constructed dynamically in the content generator.

Source: stuff/gemini-cli/packages/core/src/core/contentGenerator.ts

// Lines 151-167
const httpOptions = { headers: baseHeaders };
// ...
let headers: Record<string, string> = { ...baseHeaders };

The base headers include the User-Agent pattern:

GeminiCLI/${version}/${model} (${platform}; ${arch})

Version Source: stuff/gemini-cli/package.json

"version": "0.26.0-nightly.20260115.6cb3ae4e0"

2.3 X-Goog-Api-Client Header

Finding: This header is NOT explicitly set in the Code Assist server calls.

Search Result: grep -ri "X-Goog-Api-Client" stuff/gemini-cli returned NO matches in the core Code Assist path.

Implication: The X-Goog-Api-Client: gl-node/22.17.0 gdcl/1.30.0 header in gemini_cli_provider.py is mimicking what the @google/genai SDK would add, not what the native CLI adds.

SDK Version Reference: stuff/gemini-cli/packages/core/package.json

"@google/genai": "^1.30.0"

2.4 Client-Metadata Header

Finding: This header is NOT used as an HTTP header in the native CLI.

Search Result: grep -ri "Client-Metadata" stuff/gemini-cli returned NO matches.

The ClientMetadata structure exists but is used differently:

Source: stuff/gemini-cli/packages/core/src/code_assist/experiments/client_metadata.ts

// Lines 7-14
export interface ClientMetadata {
  ideType: string;
  ideVersion: string;
  pluginType: string;
  pluginVersion: string;
  platform: string;
  updateChannel: string;
}

This metadata is sent in the request body for the listExperiments call, not as a header:

Source: stuff/gemini-cli/packages/core/src/code_assist/experiments/client_metadata.ts:23-35

export async function listExperiments(
  // ...
): Promise<ExperimentResult> {
  // Metadata is included in the POST body, not headers
}

2.5 X-Goog-User-Project Header

Finding: Used in MCP (Model Context Protocol) authentication, not in Code Assist calls.

Source: stuff/gemini-cli/packages/core/src/mcp/google-auth-provider.ts:138-148

const headers: Record<string, string> = {};
// ...
if (!Object.keys(headers).some(
  (key) => key.toLowerCase() === 'x-goog-user-project',
)) {
  headers['X-Goog-User-Project'] = quotaProjectId;
}

Provider Status: Correctly excluded - causes 403 errors in Code Assist path.

2.6 Header Comparison Table

Header	Native CLI (Code Assist)	`gemini_cli_provider.py`	Action
`Content-Type`	`application/json`	`application/json`	Keep
`Authorization`	`Bearer <token>`	`Bearer <token>`	Keep
`User-Agent`	Dynamic with version	Hardcoded `0.26.0`	Update version
`X-Goog-Api-Client`	Not set	`gl-node/22.17.0 gdcl/1.30.0`	Keep for SDK mimicry
`Client-Metadata`	Not set as header	Serialized string	Keep for fingerprinting
`X-Goog-User-Project`	Not set	Not set	Exclude
`Accept`	Not set	`application/json`	Optional

3. Request Body Structure

3.1 Top-Level Wrapper (CAGenerateContentRequest)

The Code Assist endpoint expects a specific wrapper structure.

Source: stuff/gemini-cli/packages/core/src/code_assist/converter.ts

// Lines 35-44
export interface CAGenerateContentRequest {
  model: string;
  project: string;
  user_prompt_id: string;
  request: VertexGenerateContentRequest;
}

// Lines 259-266
export function toGenerateContentRequest(
  model: string,
  config: GenerateContentConfig,
  params: GenerateContentParameters,
  projectId: string,
  sessionId: string,
  userPromptId: string,
): CAGenerateContentRequest {

Current Provider Status: Correctly implemented.

3.2 User Prompt ID Generation

Source: stuff/gemini-cli/packages/core/src/code_assist/codeAssist.ts

The user_prompt_id is generated as a random hex string:

Math.random().toString(16).slice(2)

Current Provider: Uses secrets.token_hex(7), which always produces a 14-character hex string in the same format (the native value can be slightly shorter).

3.3 VertexGenerateContentRequest Structure

Source: stuff/gemini-cli/packages/core/src/code_assist/converter.ts:46-89

export interface VertexGenerateContentRequest {
  contents: VertexContent[];
  systemInstruction?: VertexContent;
  tools?: VertexTool[];
  toolConfig?: VertexToolConfig;
  generationConfig?: VertexGenerationConfig;
  safetySettings?: SafetySetting[];
  cachedContent?: string;
  labels?: Record<string, string>;  // LINE 56
  session_id?: string;
}

3.4 Labels Field

Source: stuff/gemini-cli/packages/core/src/code_assist/converter.ts:157

labels: req.config?.labels,

Finding: The labels field is passed through from the config but is typically undefined in normal CLI usage.

Test Evidence: stuff/gemini-cli/packages/core/src/code_assist/converter.test.ts:47,76,105

labels: undefined,

Current Provider Status: Missing. Should be added for potential metadata tagging.

4. Generation Configuration

4.1 VertexGenerationConfig Structure

Source: stuff/gemini-cli/packages/core/src/code_assist/converter.ts:59-89

export interface VertexGenerationConfig {
  stopSequences?: string[];
  responseMimeType?: string;
  responseSchema?: unknown;
  candidateCount?: number;
  maxOutputTokens?: number;
  temperature?: number;
  topP?: number;
  topK?: number;
  presencePenalty?: number;
  frequencyPenalty?: number;
  seed?: number;
  responseLogprobs?: boolean;
  logprobs?: number;
  audioTimestamp?: boolean;
  thinkingConfig?: ThinkingConfig;
  responseModalities?: string[];
  mediaResolution?: string;
  speechConfig?: SpeechConfigUnion;
  routingConfig?: GenerationConfigRoutingConfig;
}

4.2 ThinkingConfig Variations

The CLI supports multiple thinking configuration styles based on model family.

Source: stuff/gemini-cli/packages/core/src/config/defaultModelConfigs.ts

// Lines 28-51 - Gemini 2.5 models use thinkingBudget
'chat-base-2.5': {
  extends: 'chat-base',
  modelConfig: {
    generateContentConfig: {
      thinkingConfig: {
        thinkingBudget: 8192
      }
    }
  }
}

// Lines 51-60 - Gemini 3 models use thinkingLevel
'chat-base-3': {
  extends: 'chat-base',
  modelConfig: {
    generateContentConfig: {
      thinkingConfig: {
        thinkingLevel: 'HIGH'
      }
    }
  }
}

Current Provider Status: Only supports thinkingBudget. Missing thinkingLevel support.
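
One way the provider could pick between the two styles, sketched with the defaults listed in this report. Matching on the `gemini-3` and `gemini-2.5` name prefixes is an assumption, and `default_thinking_config` is a hypothetical helper name.

```python
def default_thinking_config(model: str) -> dict:
    """Pick a thinkingConfig style from the model family (illustrative)."""
    name = model.split("/")[-1]  # drop any provider prefix
    if name.startswith("gemini-3"):
        return {"thinkingLevel": "HIGH"}
    if name.startswith("gemini-2.5"):
        return {"thinkingBudget": 8192}
    # Unknown family: send no thinkingConfig rather than guess.
    return {}
```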

4.3 Default Model Configurations

Source: stuff/gemini-cli/schemas/settings.schema.json:459-569

Model	thinkingConfig
`gemini-2.5-pro`	`{ thinkingBudget: 8192 }`
`gemini-2.5-flash`	`{ thinkingBudget: 8192 }`
`gemini-3-pro-preview`	`{ thinkingLevel: 'HIGH' }`
`gemini-3-flash-preview`	`{ thinkingLevel: 'HIGH' }`

4.4 Safety Settings

Source: stuff/gemini-cli/packages/core/src/code_assist/converter.ts:158

safetySettings: req.config?.safetySettings?.map(toSafetySetting),

Safety settings are placed at the root of VertexGenerateContentRequest, not inside generationConfig.

Current Provider Status: Places safety settings correctly.

5. Message Transformation

5.1 Thought Part Handling

The native CLI has specific logic for handling "thought" parts in message content.

Source: stuff/gemini-cli/packages/core/src/utils/apiConversionUtils.ts

// Function: partToCountTokensPart
// Converts thought parts to text blocks for token counting

Source: stuff/gemini-cli/packages/core/src/code_assist/converter.ts:224-242

export function toVertexPart(part: Part): VertexPart {
  if ('thought' in part && part.thought) {
    return {
      text: part.text,
      thought: true,
    };
  }
  // ... other part conversions
}

5.2 History Thought Stripping

Source: stuff/gemini-cli/packages/core/src/core/client.ts:235-236

stripThoughtsFromHistory() {
  this.getChat().stripThoughtsFromHistory();
}

This is called when switching between Vertex and GenAI auth to prevent thoughtSignature compatibility issues.

Current Provider Status: Does not preserve reasoning_content as thought parts when reconstructing history.
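
A sketch of what preserving reasoning could look like when rebuilding history. It assumes OpenAI-style assistant messages with optional `reasoning_content` and `content` string fields; `assistant_message_to_content` is a hypothetical name.

```python
def assistant_message_to_content(message: dict) -> dict:
    """Rebuild a model turn, keeping reasoning as a thought part."""
    parts = []
    reasoning = message.get("reasoning_content")
    if reasoning:
        # Mirrors the shape produced by toVertexPart for thought parts.
        parts.append({"text": reasoning, "thought": True})
    text = message.get("content")
    if text:
        parts.append({"text": text})
    return {"role": "model", "parts": parts}
```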

6. Token Counting Considerations

6.1 CountTokens Safety Transformation

Source: stuff/gemini-cli/packages/core/src/utils/apiConversionUtils.ts

The CLI transforms thought parts before sending to countTokens API to prevent validation errors:

// Thought parts are converted to: [Thought: <text>]

Current Provider Status: Missing this safety transformation.
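
A minimal Python version of that transformation, assuming parts are plain dicts with `text` and an optional `thought` flag. `part_for_count_tokens` is a hypothetical name chosen to echo the native `partToCountTokensPart`.

```python
def part_for_count_tokens(part: dict) -> dict:
    """Rewrite thought parts as plain text before calling countTokens."""
    if part.get("thought"):
        return {"text": f"[Thought: {part.get('text', '')}]"}
    # Non-thought parts pass through unchanged.
    return part
```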

6.2 CountTokens Endpoint

Source: stuff/gemini-cli/packages/core/src/code_assist/server.ts:296-316

async countTokens(
  req: CountTokensRequest,
): Promise<CountTokensResponse> {
  const url = `${this.baseUrl}/v1/projects/${this.projectId}/locations/us-central1:countTokens`;
  // ...
}

7. Recommendations

7.1 Headers

Action	Header	Rationale
Update	`User-Agent`	Use version `0.26.0-nightly.20260115.6cb3ae4e0`
Keep	`X-Goog-Api-Client`	Mimics SDK behavior for backend compatibility
Keep	`Client-Metadata`	Essential fingerprint even though CLI sends in body
Exclude	`X-Goog-User-Project`	Causes 403 errors

7.2 Request Body

Action	Field	Location
Add	`labels`	`request.labels` - for metadata tagging
Enhance	`thinkingConfig`	Add `thinkingLevel` support for Gemini 3
Verify	`safetySettings`	Ensure at root of `request`, not in `generationConfig`

7.3 Message Handling

Action	Feature	Impact
Implement	Thought preservation	Maintain `reasoning_content` as `thought: true` parts
Implement	CountTokens safety	Strip/convert thought parts before token counting

8. Source File Index

Core Code Assist Implementation

File	Purpose
`stuff/gemini-cli/packages/core/src/code_assist/server.ts`	HTTP client for Code Assist API
`stuff/gemini-cli/packages/core/src/code_assist/converter.ts`	Request/response type conversions
`stuff/gemini-cli/packages/core/src/code_assist/codeAssist.ts`	High-level Code Assist orchestration
`stuff/gemini-cli/packages/core/src/code_assist/oauth2.ts`	OAuth authentication flow
`stuff/gemini-cli/packages/core/src/code_assist/setup.ts`	Client metadata initialization
`stuff/gemini-cli/packages/core/src/code_assist/types.ts`	TypeScript type definitions

Configuration

File	Purpose
`stuff/gemini-cli/packages/core/src/config/defaultModelConfigs.ts`	Default model generation configs
`stuff/gemini-cli/schemas/settings.schema.json`	Full settings schema with defaults
`stuff/gemini-cli/packages/core/package.json`	SDK version dependencies
`stuff/gemini-cli/package.json`	CLI version number

Utilities

File	Purpose
`stuff/gemini-cli/packages/core/src/utils/apiConversionUtils.ts`	API type conversion helpers
`stuff/gemini-cli/packages/core/src/core/contentGenerator.ts`	Content generation with headers
`stuff/gemini-cli/packages/core/src/core/client.ts`	Gemini client with history management

Experiments/Metadata

File	Purpose
`stuff/gemini-cli/packages/core/src/code_assist/experiments/client_metadata.ts`	ClientMetadata structure
`stuff/gemini-cli/packages/core/src/code_assist/experiments/experiments.ts`	Experiment flag fetching

Test Files (for reference)

File	Purpose
`stuff/gemini-cli/packages/core/src/code_assist/converter.test.ts`	Converter unit tests showing expected values
`stuff/gemini-cli/packages/core/src/code_assist/server.test.ts`	Server test fixtures

Appendix A: Current Provider Headers vs Native CLI

gemini_cli_provider.py (Lines 504-514)

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {access_token}",
    "User-Agent": f"GeminiCLI/0.26.0/{model_name_for_header} (win32; x64)",
    "Client-Metadata": (
        "ideType=IDE_UNSPECIFIED,"
        "pluginType=GEMINI,ideVersion=0.26.0,platform=WINDOWS_AMD64,updateChannel=stable"
    ),
    "X-Goog-Api-Client": x_goog_api_client,
    "Accept": "application/json",
}

Native CLI server.ts (Lines 284-290)

headers: {
  'Content-Type': 'application/json',
  Authorization: `Bearer ${await this.getAccessToken()}`,
},

Difference: The native CLI sends fewer headers. The additional headers in the provider (Client-Metadata, X-Goog-Api-Client, User-Agent) are included to mimic the SDK and ensure the request appears to come from an official client.

Appendix B: Request Payload Structure Comparison

Native CLI (toGenerateContentRequest output)

{
  "model": "gemini-2.5-pro",
  "project": "user-project-id",
  "user_prompt_id": "a1b2c3d4e5f6",
  "request": {
    "contents": [...],
    "systemInstruction": {...},
    "tools": [...],
    "toolConfig": {...},
    "generationConfig": {
      "temperature": 1,
      "topP": 0.95,
      "topK": 64,
      "thinkingConfig": {
        "includeThoughts": true,
        "thinkingBudget": 8192
      }
    },
    "safetySettings": [...],
    "labels": undefined,
    "session_id": "session-uuid"
  }
}

gemini_cli_provider.py (Current)

{
  "model": "gemini-2.5-pro",
  "project": "user-project-id",
  "user_prompt_id": "a1b2c3d4e5f6",
  "request": {
    "contents": [...],
    "systemInstruction": {...},
    "tools": [...],
    "toolConfig": {...},
    "generationConfig": {
      "temperature": 1,
      "topP": 0.95,
      "topK": 64,
      "thinkingConfig": {
        "includeThoughts": true,
        "thinkingBudget": 8192
      }
    },
    "safetySettings": [...],
    "session_id": "session-uuid"
  }
}

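Difference: The provider is missing the labels field (even if typically undefined). Because JSON.stringify drops properties whose value is undefined, the serialized payloads are identical whenever labels is unset.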

End of Research Report

This change updates the session identification logic to be deterministic and stateless, addressing isolation issues in multi-user environments.
- Replaces the static, instance-level `_session_id` with a dynamic generation method based on the SHA256 hash of the first user message.
- Ensures consistent session IDs for the same conversation context, maintaining continuity across server restarts.
- Decouples session identity from the provider instance lifecycle to prevent context bleeding between requests.
- Aligns ID formatting with native client behavior using UUID-formatted hashes.
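
A rough sketch of the approach this commit describes, not the merged code. `derive_session_id` is a hypothetical name, messages are assumed to be OpenAI-style dicts, and taking the first 16 digest bytes as the UUID is an assumption about how the hash is formatted.

```python
import hashlib
import json
import uuid


def derive_session_id(messages: list) -> str:
    """Derive a stable, UUID-formatted session ID from the first user message."""
    first_user = next(
        (m.get("content") for m in messages if m.get("role") == "user"),
        "",
    )
    # Multi-part content (lists/dicts) is serialized canonically before hashing.
    text = first_user if isinstance(first_user, str) else json.dumps(
        first_user, sort_keys=True, default=str
    )
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    # Use the first 16 bytes of the digest as the UUID's raw bytes.
    return str(uuid.UUID(bytes=digest[:16]))
```

Two conversations that open with the same user message map to the same ID under this scheme.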

Mirrowel · 2026-01-16T01:45:03Z

huh? how tf

Mirrowel · 2026-01-16T01:45:37Z

fuck, cause it is my PR code owners does not apply. ofc. shitass github

Mirrowel commented Jan 16, 2026

Mirrowel marked this pull request as ready for review January 16, 2026 01:14

Mirrowel self-assigned this Jan 16, 2026

Mirrowel added the enhancement New feature or request label Jan 16, 2026

mirrobot-agent bot previously approved these changes Jan 16, 2026

Mirrowel dismissed mirrobot-agent[bot]’s stale review via 5f604d3 January 16, 2026 01:40

This comment was marked as abuse.

mirrobot-agent bot merged commit b3637a9 into dev Jan 16, 2026
2 of 3 checks passed

mirrobot-agent bot deleted the feature/gemini_cli-alignment branch January 16, 2026 01:43

Repository owner deleted a comment from mirrobot-agent bot Jan 16, 2026

Mirrowel restored the feature/gemini_cli-alignment branch January 16, 2026 01:52

Repository owner deleted a comment from mirrobot-agent bot Jan 16, 2026

Repository owner deleted a comment from Tarquinen Jan 16, 2026

Repository owner locked as spam and limited conversation to collaborators Jan 16, 2026

Repository owner unlocked this conversation Jan 16, 2026

Mirrowel deleted the feature/gemini_cli-alignment branch January 16, 2026 02:34
