@Mirrowel
Owner

@Mirrowel Mirrowel commented Jan 16, 2026

  • Update request headers, user-agent, and client metadata to exactly match the official Gemini CLI (v0.26.0) fingerprint.
  • Implement persistent session_id and unique user_prompt_id generation to replicate native conversation tracking.
  • Add automatic endpoint fallback strategy (Sandbox Daily → Production) for both streaming generation and token counting.
  • Enhance error handling to trigger failovers on 5xx server errors and connection timeouts, while preserving explicit 429 rate limit handling.

Important

Implement native client fingerprinting and endpoint fallbacks for Gemini CLI, updating headers, session management, and error handling.

  • Behavior:
    • Update request headers in gemini_auth_base.py and gemini_cli_provider.py to match Gemini CLI v0.26.0.
    • Implement persistent session_id and unique user_prompt_id generation in gemini_cli_provider.py.
    • Add endpoint fallback strategy in gemini_cli_provider.py using GEMINI_CLI_ENDPOINT_FALLBACKS.
    • Enhance error handling in gemini_cli_provider.py to trigger failovers on 5xx errors and timeouts, while preserving 429 handling.
  • Utilities:
    • Add GEMINI_CLI_ENDPOINT_FALLBACKS to gemini_shared_utils.py for endpoint management.
  • Misc:
    • Comment out unused headers in gemini_auth_base.py and gemini_cli_provider.py for potential SDK mimicry.

This description was created by Ellipsis for 5f604d3. You can customize this summary. It will automatically update as commits are pushed.

…allbacks

- Update request headers, user-agent, and client metadata to exactly match the official Gemini CLI (v0.26.0) fingerprint.
- Implement persistent `session_id` and unique `user_prompt_id` generation to replicate native conversation tracking.
- Add automatic endpoint fallback strategy (Sandbox Daily → Production) for both streaming generation and token counting.
- Enhance error handling to trigger failovers on 5xx server errors and connection timeouts, while preserving explicit 429 rate limit handling.
Owner Author

@Mirrowel Mirrowel left a comment

Importantly, this seems to fix 429 rate limits.

The native `gemini-cli` does not send `X-Goog-Api-Client` or `Client-Metadata` as HTTP headers during the OAuth/Code Assist flow. These headers are specific to the SDK path or sent within the request body for management endpoints.

- Remove `X-Goog-Api-Client` and `Client-Metadata` from HTTP headers to strictly mimic the official CLI fingerprint.
- Remove explicit `Accept` header as it is not sent by the native client.
- Update internal documentation and source code references to reflect deep analysis of `gemini-cli` internals.
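
A minimal sketch of the reduced header set described above. It assumes `User-Agent` is still sent alongside the two headers the native Code Assist path sets; `build_code_assist_headers` is a hypothetical helper name, not a function from the provider.

```python
from typing import Dict


def build_code_assist_headers(access_token: str, user_agent: str) -> Dict[str, str]:
    """Return only the headers kept after this change (hypothetical helper).

    X-Goog-Api-Client, Client-Metadata and Accept are intentionally absent.
    Keeping User-Agent is an assumption made for this sketch.
    """
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {access_token}",
        "User-Agent": user_agent,
    }


headers = build_code_assist_headers(
    "example-token", "GeminiCLI/0.26.0/gemini-2.5-pro (win32; x64)"
)
```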
@Mirrowel Mirrowel marked this pull request as ready for review January 16, 2026 01:14
@Mirrowel Mirrowel self-assigned this Jan 16, 2026
@Mirrowel Mirrowel added the enhancement New feature or request label Jan 16, 2026
@mirrobot-agent
Contributor

I'm starting my review of the Gemini CLI alignment features. I'll be looking closely at the new client fingerprinting and fallback strategies. Stand by for the full report!

mirrobot-agent[bot]
mirrobot-agent bot previously approved these changes Jan 16, 2026
Contributor

@mirrobot-agent mirrobot-agent bot left a comment

Overall Assessment

This PR is a solid enhancement to the Gemini provider. The alignment with the native Gemini CLI v0.26.0 fingerprinting (headers, session management, and payload structure) is meticulous and well-researched. The implementation of the endpoint fallback strategy (Sandbox Daily → Production) is a great addition for improving reliability and potentially navigating rate limits more effectively.

Architectural Feedback

The decision to persist the session_id within the GeminiCliProvider instance is correct, as it ensures conversation continuity across requests as long as the provider instance lives in the RotatingClient cache. The use of secrets for user_prompt_id is also a good security-conscious choice for randomness.

Key Suggestions

  • Clean up redundant logic: I found a small section of unreachable code in the endpoint fallback loop that can be removed.

Questions for the Author

None. The implementation details and source code references in the comments are very helpful!

This review was generated by an AI assistant.

raise

# If we get here, all endpoints failed (shouldn't happen due to raise in loop)
if last_endpoint_error:
Contributor

This block appears to be unreachable because the fallback loop (lines 1478-1626) already raises the final exception when the GEMINI_CLI_ENDPOINT_FALLBACKS list is exhausted. Consider removing it to simplify the code.
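
For illustration, one way to shape a fallback loop so that no trailing "all endpoints failed" block is needed: try every endpoint except the last inside the loop, then let the final endpoint's errors propagate on their own. This is a sketch under assumptions, not the PR's code; `HttpStatusError`, `call_with_fallback`, and the `send` callable are hypothetical names.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


class HttpStatusError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status: int) -> None:
        super().__init__(f"HTTP {status}")
        self.status = status


def call_with_fallback(endpoints: Sequence[str], send: Callable[[str], T]) -> T:
    """Try endpoints in order, failing over on 5xx responses and timeouts."""
    if not endpoints:
        raise ValueError("at least one endpoint is required")
    for endpoint in endpoints[:-1]:
        try:
            return send(endpoint)
        except HttpStatusError as exc:
            # Only 5xx triggers failover; 429 and other statuses propagate
            # so their dedicated handling still applies.
            if not 500 <= exc.status < 600:
                raise
        except TimeoutError:
            # Connection timeout: move on to the next endpoint.
            continue
    # Last endpoint: any error it raises is the final error.
    return send(endpoints[-1])
```

Because the last endpoint is called outside the loop, there is no code path that reaches the end of the function without either returning or raising.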

@Mirrowel
Owner Author

2 research docs generated, for later reference if needed:

Gemini CLI Alignment Plan

This plan outlines the exact technical requirements to align gemini_cli_provider.py with the official gemini-cli client. By mirroring these request "fingerprints," the proxy will appear as the legitimate client to Google's backend, potentially accessing more favorable rate-limit buckets.


Research Sources

Primary Sources (Official gemini-cli Repository)

| File | Lines | Purpose |
|------|-------|---------|
| stuff/gemini-cli/packages/core/package.json | 3, 28 | Version (0.26.0), SDK version (@google/genai: 1.30.0) |
| stuff/gemini-cli/packages/core/src/code_assist/server.ts | 60-70, 72-120 | Endpoint URL, API version, request flow |
| stuff/gemini-cli/packages/core/src/code_assist/converter.ts | 31-48, 119-162 | Exact payload structure (CAGenerateContentRequest) |
| stuff/gemini-cli/packages/core/src/code_assist/experiments/client_metadata.ts | 19-39, 46-57 | Platform enum mapping, ClientMetadata structure |
| stuff/gemini-cli/packages/core/src/utils/version.ts | 14-17 | Version string generation |
| stuff/gemini-cli/packages/core/src/utils/channel.ts | 25-43 | Release channel detection (stable/nightly/preview) |
| stuff/gemini-cli/packages/cli/src/gemini.tsx | 668 | user_prompt_id generation logic |
| stuff/gemini-cli/packages/core/src/utils/googleQuotaErrors.ts | 100-180 | Error classification and retry delay extraction |
| stuff/gemini-cli/packages/core/src/utils/googleErrors.ts | 131-222 | Deep JSON error parsing |
| stuff/gemini-cli/packages/core/src/code_assist/oauth2.ts | 69-85 | OAuth Client ID/Secret (verified match) |

Current Implementation Files

| File | Lines | Current State |
|------|-------|---------------|
| src/rotator_library/providers/gemini_cli_provider.py | 1343-1354 | Payload construction |
| src/rotator_library/providers/gemini_cli_provider.py | 1405-1413 | Header construction |
| src/rotator_library/providers/gemini_auth_base.py | 19-23, 36-44 | Auth headers, OAuth credentials |

1. Header Alignment

The official client constructs headers dynamically based on runtime environment.

1.1 User-Agent Header

Native Format:

GeminiCLI/${version}/${model} (${platform}; ${arch})

Examples:

  • GeminiCLI/0.26.0/gemini-2.5-pro (win32; x64)
  • GeminiCLI/0.26.0/gemini-3-flash-preview (darwin; arm64)

Current (WRONG):

google-api-nodejs-client/9.15.1

Implementation:

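def _get_aligned_user_agent(self, model: str) -> str:
    """Build a User-Agent string in the native GeminiCLI format."""
    import platform
    import sys

    plat = sys.platform  # 'win32', 'darwin', 'linux'
    arch = platform.machine()  # 'x86_64', 'arm64', 'AMD64', 'aarch64'
    # Normalize to the Node-style names the native client reports
    if arch in ('x86_64', 'AMD64'):
        arch = 'x64'
    elif arch == 'aarch64':
        arch = 'arm64'
    # Keep only the bare model name if a provider prefix is present
    model_name = model.split("/")[-1]
    return f"GeminiCLI/0.26.0/{model_name} ({plat}; {arch})"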

1.2 X-Goog-Api-Client Header

Native Format:

gl-node/${node_version} gdcl/${sdk_version}

Correct Value:

gl-node/22.17.0 gdcl/1.30.0

Current (INCOMPLETE):

gl-node/22.17.0

Note: gdcl refers to the @google/genai SDK version from package.json line 28.

1.3 Client-Metadata Header

Native Structure (from client_metadata.ts lines 46-54):

{
  ideName: 'IDE_UNSPECIFIED',
  pluginType: 'GEMINI',
  ideVersion: '0.26.0',           // CLI version
  platform: 'WINDOWS_AMD64',      // See platform mapping below
  updateChannel: 'stable'         // 'stable' | 'nightly' | 'preview'
}

Serialized Format:

ideType=IDE_UNSPECIFIED,pluginType=GEMINI,ideVersion=0.26.0,platform=WINDOWS_AMD64,updateChannel=stable

Current (INCOMPLETE):

ideType=IDE_UNSPECIFIED,platform=PLATFORM_UNSPECIFIED,pluginType=GEMINI

Missing Fields:

  • ideVersion (required)
  • updateChannel (required)
  • Dynamic platform (currently hardcoded to PLATFORM_UNSPECIFIED)
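
As a sketch of the serialized format above, the metadata fields can be joined as comma-separated `key=value` pairs in insertion order. `serialize_client_metadata` is a hypothetical helper, and this only covers the string format; whether the value should be sent as an HTTP header at all is a separate question.

```python
from typing import Dict


def serialize_client_metadata(fields: Dict[str, str]) -> str:
    """Join fields as 'key=value' pairs separated by commas (hypothetical helper)."""
    return ",".join(f"{key}={value}" for key, value in fields.items())


metadata = {
    "ideType": "IDE_UNSPECIFIED",
    "pluginType": "GEMINI",
    "ideVersion": "0.26.0",
    "platform": "WINDOWS_AMD64",
    "updateChannel": "stable",
}
header_value = serialize_client_metadata(metadata)
```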

1.4 Platform Enum Mapping

Source: client_metadata.ts lines 19-39

| sys.platform | platform.machine() | Enum Value |
|--------------|--------------------|------------|
| darwin | x86_64 / x64 | DARWIN_AMD64 |
| darwin | arm64 | DARWIN_ARM64 |
| linux | x86_64 / x64 | LINUX_AMD64 |
| linux | aarch64 / arm64 | LINUX_ARM64 |
| win32 | AMD64 / x86_64 | WINDOWS_AMD64 |
| (other) | (any) | PLATFORM_UNSPECIFIED |

Implementation:

def _get_platform_enum(self) -> str:
    import platform
    import sys

    plat = sys.platform
    arch = platform.machine().lower()
    
    if plat == 'darwin':
        if arch in ('x86_64', 'amd64'):
            return 'DARWIN_AMD64'
        if arch == 'arm64':
            return 'DARWIN_ARM64'
    elif plat == 'linux':
        if arch in ('x86_64', 'amd64'):
            return 'LINUX_AMD64'
        if arch in ('aarch64', 'arm64'):
            return 'LINUX_ARM64'
    elif plat == 'win32':
        if arch in ('amd64', 'x86_64'):
            return 'WINDOWS_AMD64'
    
    return 'PLATFORM_UNSPECIFIED'

1.5 X-Goog-User-Project Header

Native Behavior: Sent with the project_id for quota attribution.

Value: The GCP project ID associated with the credential (already in self.project_id_cache).


2. Request Payload Alignment

2.1 Correct Payload Structure

Source: converter.ts lines 31-48, 119-131

Native TypeScript Interface:

interface CAGenerateContentRequest {
  model: string;                    // Top level
  project?: string;                 // Top level
  user_prompt_id?: string;          // Top level - MISSING in current impl
  request: {
    contents: Content[];
    systemInstruction?: Content;
    tools?: ToolListUnion;
    toolConfig?: ToolConfig;
    labels?: Record<string, string>;
    safetySettings?: SafetySetting[];
    generationConfig?: VertexGenerationConfig;
    session_id?: string;            // INSIDE request - MISSING in current impl
  }
}

Correct JSON:

{
  "model": "gemini-2.5-pro",
  "project": "your-project-id",
  "user_prompt_id": "a1b2c3d4e5f6a7",
  "request": {
    "contents": [...],
    "systemInstruction": {...},
    "generationConfig": {...},
    "safetySettings": [...],
    "session_id": "550e8400-e29b-41d4-a716-446655440000"
  }
}

Current Implementation Issues (lines 1343-1354):

  1. Missing user_prompt_id at top level
  2. Missing session_id inside request
  3. Has extra fields not in native CLI:
    • requestType: "agent" - REMOVE
    • requestId: "agent-{uuid}" - REMOVE (replace with user_prompt_id)
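
A minimal sketch of the corrected payload shape, with `user_prompt_id` at the top level, `session_id` inside `request`, and no `requestType` or `requestId`. `build_payload` and its parameters are hypothetical names chosen for illustration, not the provider's actual code.

```python
import secrets
from typing import Any, Dict


def build_payload(
    model: str,
    project_id: str,
    request_body: Dict[str, Any],
    session_id: str,
) -> Dict[str, Any]:
    """Wrap a request body in the Code Assist envelope (hypothetical helper)."""
    # Copy so the caller's dict is not mutated when session_id is added.
    request = dict(request_body)
    request["session_id"] = session_id
    return {
        "model": model,
        "project": project_id,
        "user_prompt_id": secrets.token_hex(7),  # fresh ID per request
        "request": request,
    }


payload = build_payload(
    "gemini-2.5-pro",
    "your-project-id",
    {"contents": []},
    "550e8400-e29b-41d4-a716-446655440000",
)
```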

2.2 user_prompt_id Generation

Source: gemini.tsx line 668

Native JavaScript:

const prompt_id = Math.random().toString(16).slice(2);

This generates a hexadecimal string that is typically 13-14 characters long (e.g., "a1b2c3d4e5f6a7").

Python Equivalent:

import secrets

def _generate_user_prompt_id(self) -> str:
    """Generate a unique prompt ID matching native gemini-cli format."""
    # secrets.token_hex(7) produces 14 hex chars
    return secrets.token_hex(7)

Lifecycle: Generate a new user_prompt_id for every request.

2.3 session_id Management

Source: server.ts lines 64-70, converter.ts line 160

Native Behavior:

  • session_id is a persistent UUID for the duration of a conversation
  • Passed to CodeAssistServer constructor, then placed inside request

Implementation:

import uuid

def __init__(self):
    # ... existing init ...
    self._session_id = str(uuid.uuid4())  # Persistent per provider instance

# In request construction:
request_payload["request"]["session_id"] = self._session_id

Note: The session_id should persist across multiple requests within the same "session" (provider instance lifetime).


3. Error Metadata Alignment

3.1 Priority for Retry Delay Extraction

Source: googleQuotaErrors.ts lines 100-180, googleErrors.ts lines 131-222

Native Priority Order:

  1. ErrorInfo.metadata.quotaResetDelay (e.g., "539.477544ms")
  2. RetryInfo.retryDelay (e.g., "0.539477544s")
  3. Message regex: /Please retry in ([0-9.]+(?:ms|s))/
  4. Default: 10 seconds for RATE_LIMIT_EXCEEDED

Current Implementation (lines 289-302):
Already checks quotaResetDelay and RetryInfo - this is correct.

Improvement: Support millisecond precision (current _parse_duration returns int, should return float).
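
The priority order and the float-returning parser could be sketched as follows. This is an illustration under assumptions: `parse_duration` and `extract_retry_delay` are hypothetical standalone functions (the provider's `_parse_duration` may differ), and the 10-second default is applied unconditionally here rather than only for RATE_LIMIT_EXCEEDED.

```python
import re
from typing import Optional

_DURATION_RE = re.compile(r"^([0-9]+(?:\.[0-9]+)?)(ms|s)$")
_MESSAGE_RE = re.compile(r"Please retry in ([0-9.]+(?:ms|s))")


def parse_duration(value: str) -> Optional[float]:
    """Parse '539.477544ms' or '0.539477544s' into seconds as a float."""
    match = _DURATION_RE.match(value.strip())
    if match is None:
        return None
    amount = float(match.group(1))
    return amount / 1000.0 if match.group(2) == "ms" else amount


def extract_retry_delay(
    quota_reset_delay: Optional[str],
    retry_info_delay: Optional[str],
    message: Optional[str],
    default: float = 10.0,
) -> float:
    """Return the first usable delay: quotaResetDelay, retryDelay, message, default."""
    for candidate in (quota_reset_delay, retry_info_delay):
        if candidate:
            parsed = parse_duration(candidate)
            if parsed is not None:
                return parsed
    found = _MESSAGE_RE.search(message or "")
    if found:
        parsed = parse_duration(found.group(1))
        if parsed is not None:
            return parsed
    return default
```

Returning a float keeps sub-second delays such as 539 ms from being truncated to zero.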

3.2 Error Classification

Source: googleQuotaErrors.ts lines 50-100

| Condition | Classification | Action |
|-----------|----------------|--------|
| reason == "QUOTA_EXHAUSTED" | Terminal | Fallback to different model |
| reason == "RATE_LIMIT_EXCEEDED" | Retryable | Wait and retry |
| retry_delay > 300s | Terminal | Fallback to different model |
| quotaId contains PerDay | Terminal | Fallback to different model |
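
The classification rules above might be expressed like this. The order in which the conditions are checked, the `"unknown"` fallback, and the name `classify_quota_error` are assumptions of this sketch; the table does not specify them.

```python
from typing import Optional

TERMINAL = "terminal"    # fall back to a different model
RETRYABLE = "retryable"  # wait and retry
UNKNOWN = "unknown"      # not covered by the rules (assumption of this sketch)


def classify_quota_error(
    reason: Optional[str],
    retry_delay_seconds: Optional[float],
    quota_id: Optional[str],
) -> str:
    """Classify a 429-style error; terminal conditions are checked first."""
    if reason == "QUOTA_EXHAUSTED":
        return TERMINAL
    if quota_id and "PerDay" in quota_id:
        return TERMINAL
    if retry_delay_seconds is not None and retry_delay_seconds > 300:
        return TERMINAL
    if reason == "RATE_LIMIT_EXCEEDED":
        return RETRYABLE
    return UNKNOWN
```

Checking the terminal conditions first means a RATE_LIMIT_EXCEEDED error with a very long delay or a per-day quota is still treated as terminal.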

4. OAuth Credentials (Already Aligned)

Source: oauth2.ts lines 69-78, gemini_auth_base.py lines 36-44

Field Native Value Our Value Status
Client ID 681255809395-oo8ft2oprdrnp9e3aqf6av3hmdib135j.apps.googleusercontent.com Same
Client Secret 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 Same
Scopes cloud-platform, userinfo.email, userinfo.profile Same

5. Implementation Checklist

Phase 1: Header Updates (gemini_cli_provider.py lines 1405-1413)

  • Replace static User-Agent with dynamic _get_aligned_user_agent(model)
  • Update X-Goog-Api-Client to gl-node/22.17.0 gdcl/1.30.0
  • Add ideVersion=0.26.0 to Client-Metadata
  • Add updateChannel=stable to Client-Metadata
  • Replace platform=PLATFORM_UNSPECIFIED with dynamic _get_platform_enum()
  • Add X-Goog-User-Project: {project_id} header

Phase 2: Payload Updates (gemini_cli_provider.py lines 1343-1380)

  • Add user_prompt_id at top level (use _generate_user_prompt_id())
  • Add session_id inside request object
  • Remove requestType field (not in native)
  • Remove requestId field (replaced by user_prompt_id)

Phase 3: Provider Initialization

  • Generate persistent _session_id in __init__
  • Add helper methods: _get_aligned_user_agent(), _get_platform_enum(), _generate_user_prompt_id()

Phase 4: Error Parsing (Optional Enhancement)

  • Update _parse_duration to return float instead of int
  • Add classification for QUOTA_EXHAUSTED vs RATE_LIMIT_EXCEEDED

6. Expected Final Request

# Headers
{
    "Authorization": "Bearer {access_token}",
    "User-Agent": "GeminiCLI/0.26.0/gemini-2.5-pro (win32; x64)",
    "X-Goog-Api-Client": "gl-node/22.17.0 gdcl/1.30.0",
    "X-Goog-User-Project": "my-gcp-project-123",
    "Client-Metadata": "ideType=IDE_UNSPECIFIED,pluginType=GEMINI,ideVersion=0.26.0,platform=WINDOWS_AMD64,updateChannel=stable",
    "Accept": "application/json",
    "Content-Type": "application/json",
}

# Payload
{
    "model": "gemini-2.5-pro",
    "project": "my-gcp-project-123",
    "user_prompt_id": "a1b2c3d4e5f6a7",
    "request": {
        "contents": [...],
        "systemInstruction": {...},
        "generationConfig": {...},
        "safetySettings": [...],
        "tools": [...],
        "toolConfig": {...},
        "session_id": "550e8400-e29b-41d4-a716-446655440000"
    }
}

7. Verification

After implementation, verify alignment by:

  1. Capture native request: Run gemini-cli with debug logging enabled
  2. Compare headers: Ensure all header values match exactly
  3. Compare payload: Ensure structure and field names match
  4. Monitor rate limits: Check if 429 frequency decreases

Appendix: Version Constants

Keep these updated when the official client releases new versions:

# From packages/core/package.json
GEMINI_CLI_VERSION = "0.26.0"
GENAI_SDK_VERSION = "1.30.0"
NODE_VERSION = "22.17.0"
UPDATE_CHANNEL = "stable"  # or "nightly" for development

Gemini CLI Payload Construction Research Report

Date: January 16, 2026
Scope: OAuth/Code Assist Path Only
Purpose: Document differences between native gemini-cli and gemini_cli_provider.py for alignment


Executive Summary

This document details the findings from analyzing the native Gemini CLI source code (stuff/gemini-cli/) to understand how it constructs API requests for the Code Assist endpoint. The goal is to ensure gemini_cli_provider.py accurately replicates the official client behavior.

Key Findings:

  1. The native CLI does NOT explicitly set X-Goog-Api-Client or Client-Metadata headers in its Code Assist HTTP calls - these appear to be artifacts from mimicking the @google/genai SDK
  2. The labels field is supported but not actively used in the CLI's request construction
  3. The native CLI has specific logic for handling "thought" parts that the Python provider lacks
  4. The thinkingConfig supports multiple configuration styles (budget vs. level) depending on model family

Table of Contents

  1. Authentication Flow
  2. HTTP Headers Analysis
  3. Request Body Structure
  4. Generation Configuration
  5. Message Transformation
  6. Token Counting Considerations
  7. Recommendations
  8. Source File Index

1. Authentication Flow

1.1 OAuth Client Configuration

The native CLI uses Google's OAuth2 flow with specific client credentials.

Source: stuff/gemini-cli/packages/core/src/code_assist/oauth2.ts

// Lines 69-78
const OAUTH_CLIENT_ID =
  '681255809395-oo8ft2oprdrnp9e3aqf6av3hmdib135j.apps.googleusercontent.com';

const OAUTH_CLIENT_SECRET = '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';

const OAUTH_SCOPE = [
  'https://www.googleapis.com/auth/cloud-platform',
  'https://www.googleapis.com/auth/userinfo.email',
  'https://www.googleapis.com/auth/userinfo.profile',
];

Current Provider Status: Matches - uses identical OAuth credentials.

1.2 Token Refresh

The CLI uses google-auth-library OAuth2Client which automatically handles token refresh via the tokens event.

Source: stuff/gemini-cli/packages/core/src/code_assist/oauth2.ts:154-162


2. HTTP Headers Analysis

2.1 Headers Set by Native CLI

The Code Assist server (CodeAssistServer) sets minimal headers for its API calls.

Source: stuff/gemini-cli/packages/core/src/code_assist/server.ts

// Lines 284-290 (streamGenerateContent)
headers: {
  'Content-Type': 'application/json',
  Authorization: `Bearer ${await this.getAccessToken()}`,
},

// Lines 302-308 (countTokens)
headers: {
  'Content-Type': 'application/json',
  Authorization: `Bearer ${await this.getAccessToken()}`,
},

// Lines 331-337 (retrieveUserQuota)
headers: {
  'Content-Type': 'application/json',
  Authorization: `Bearer ${await this.getAccessToken()}`,
},

Key Observation: The native CLI Code Assist path sets only:

  • Content-Type: application/json
  • Authorization: Bearer <token>

2.2 User-Agent Header

The User-Agent is constructed dynamically in the content generator.

Source: stuff/gemini-cli/packages/core/src/core/contentGenerator.ts

// Lines 151-167
const httpOptions = { headers: baseHeaders };
// ...
let headers: Record<string, string> = { ...baseHeaders };

The base headers include the User-Agent pattern:

GeminiCLI/${version}/${model} (${platform}; ${arch})

Version Source: stuff/gemini-cli/package.json

"version": "0.26.0-nightly.20260115.6cb3ae4e0"

2.3 X-Goog-Api-Client Header

Finding: This header is NOT explicitly set in the Code Assist server calls.

Search Result: grep -ri "X-Goog-Api-Client" stuff/gemini-cli returned NO matches in the core Code Assist path.

Implication: The X-Goog-Api-Client: gl-node/22.17.0 gdcl/1.30.0 header in gemini_cli_provider.py is mimicking what the @google/genai SDK would add, not what the native CLI adds.

SDK Version Reference: stuff/gemini-cli/packages/core/package.json

"@google/genai": "^1.30.0"

2.4 Client-Metadata Header

Finding: This header is NOT used as an HTTP header in the native CLI.

Search Result: grep -ri "Client-Metadata" stuff/gemini-cli returned NO matches.

The ClientMetadata structure exists but is used differently:

Source: stuff/gemini-cli/packages/core/src/code_assist/experiments/client_metadata.ts

// Lines 7-14
export interface ClientMetadata {
  ideType: string;
  ideVersion: string;
  pluginType: string;
  pluginVersion: string;
  platform: string;
  updateChannel: string;
}

This metadata is sent in the request body for the listExperiments call, not as a header:

Source: stuff/gemini-cli/packages/core/src/code_assist/experiments/client_metadata.ts:23-35

export async function listExperiments(
  // ...
): Promise<ExperimentResult> {
  // Metadata is included in the POST body, not headers
}

2.5 X-Goog-User-Project Header

Finding: Used in MCP (Model Context Protocol) authentication, not in Code Assist calls.

Source: stuff/gemini-cli/packages/core/src/mcp/google-auth-provider.ts:138-148

const headers: Record<string, string> = {};
// ...
if (!Object.keys(headers).some(
  (key) => key.toLowerCase() === 'x-goog-user-project',
)) {
  headers['X-Goog-User-Project'] = quotaProjectId;
}

Provider Status: Correctly excluded - causes 403 errors in Code Assist path.

2.6 Header Comparison Table

| Header | Native CLI (Code Assist) | gemini_cli_provider.py | Action |
|--------|--------------------------|------------------------|--------|
| Content-Type | application/json | application/json | Keep |
| Authorization | Bearer <token> | Bearer <token> | Keep |
| User-Agent | Dynamic with version | Hardcoded 0.26.0 | Update version |
| X-Goog-Api-Client | Not set | gl-node/22.17.0 gdcl/1.30.0 | Keep for SDK mimicry |
| Client-Metadata | Not set as header | Serialized string | Keep for fingerprinting |
| X-Goog-User-Project | Not set | Not set | Exclude |
| Accept | Not set | application/json | Optional |

3. Request Body Structure

3.1 Top-Level Wrapper (CAGenerateContentRequest)

The Code Assist endpoint expects a specific wrapper structure.

Source: stuff/gemini-cli/packages/core/src/code_assist/converter.ts

// Lines 35-44
export interface CAGenerateContentRequest {
  model: string;
  project: string;
  user_prompt_id: string;
  request: VertexGenerateContentRequest;
}

// Lines 259-266
export function toGenerateContentRequest(
  model: string,
  config: GenerateContentConfig,
  params: GenerateContentParameters,
  projectId: string,
  sessionId: string,
  userPromptId: string,
): CAGenerateContentRequest {

Current Provider Status: Correctly implemented.

3.2 User Prompt ID Generation

Source: stuff/gemini-cli/packages/core/src/code_assist/codeAssist.ts

The user_prompt_id is generated as a random hex string:

Math.random().toString(16).slice(2)

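Current Provider: Uses secrets.token_hex(7), which always produces 14 hex characters. The native output varies in length (typically 13-14 characters), so the format is comparable rather than identical.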

3.3 VertexGenerateContentRequest Structure

Source: stuff/gemini-cli/packages/core/src/code_assist/converter.ts:46-89

export interface VertexGenerateContentRequest {
  contents: VertexContent[];
  systemInstruction?: VertexContent;
  tools?: VertexTool[];
  toolConfig?: VertexToolConfig;
  generationConfig?: VertexGenerationConfig;
  safetySettings?: SafetySetting[];
  cachedContent?: string;
  labels?: Record<string, string>;  // LINE 56
  session_id?: string;
}

3.4 Labels Field

Source: stuff/gemini-cli/packages/core/src/code_assist/converter.ts:157

labels: req.config?.labels,

Finding: The labels field is passed through from the config but is typically undefined in normal CLI usage.

Test Evidence: stuff/gemini-cli/packages/core/src/code_assist/converter.test.ts:47,76,105

labels: undefined,

Current Provider Status: Missing. Should be added for potential metadata tagging.


4. Generation Configuration

4.1 VertexGenerationConfig Structure

Source: stuff/gemini-cli/packages/core/src/code_assist/converter.ts:59-89

export interface VertexGenerationConfig {
  stopSequences?: string[];
  responseMimeType?: string;
  responseSchema?: unknown;
  candidateCount?: number;
  maxOutputTokens?: number;
  temperature?: number;
  topP?: number;
  topK?: number;
  presencePenalty?: number;
  frequencyPenalty?: number;
  seed?: number;
  responseLogprobs?: boolean;
  logprobs?: number;
  audioTimestamp?: boolean;
  thinkingConfig?: ThinkingConfig;
  responseModalities?: string[];
  mediaResolution?: string;
  speechConfig?: SpeechConfigUnion;
  routingConfig?: GenerationConfigRoutingConfig;
}

4.2 ThinkingConfig Variations

The CLI supports multiple thinking configuration styles based on model family.

Source: stuff/gemini-cli/packages/core/src/config/defaultModelConfigs.ts

// Lines 28-51 - Gemini 2.5 models use thinkingBudget
'chat-base-2.5': {
  extends: 'chat-base',
  modelConfig: {
    generateContentConfig: {
      thinkingConfig: {
        thinkingBudget: 8192
      }
    }
  }
}

// Lines 51-60 - Gemini 3 models use thinkingLevel
'chat-base-3': {
  extends: 'chat-base',
  modelConfig: {
    generateContentConfig: {
      thinkingConfig: {
        thinkingLevel: 'HIGH'
      }
    }
  }
}

Current Provider Status: Only supports thinkingBudget. Missing thinkingLevel support.

4.3 Default Model Configurations

Source: stuff/gemini-cli/schemas/settings.schema.json:459-569

| Model | thinkingConfig |
|-------|----------------|
| gemini-2.5-pro | { thinkingBudget: 8192 } |
| gemini-2.5-flash | { thinkingBudget: 8192 } |
| gemini-3-pro-preview | { thinkingLevel: 'HIGH' } |
| gemini-3-flash-preview | { thinkingLevel: 'HIGH' } |
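
A rough sketch of how a provider could pick the thinking style per model family using the defaults listed above. Matching on the model-name prefix and the name `default_thinking_config` are assumptions of this sketch; the native CLI resolves these through its model config inheritance instead.

```python
from typing import Any, Dict


def default_thinking_config(model: str) -> Dict[str, Any]:
    """Return a default thinkingConfig for a model name (prefix match is an assumption)."""
    name = model.split("/")[-1]  # ignore any provider prefix
    if name.startswith("gemini-3"):
        return {"thinkingLevel": "HIGH"}
    if name.startswith("gemini-2.5"):
        return {"thinkingBudget": 8192}
    # Unknown family: send no thinkingConfig rather than guess.
    return {}
```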

4.4 Safety Settings

Source: stuff/gemini-cli/packages/core/src/code_assist/converter.ts:158

safetySettings: req.config?.safetySettings?.map(toSafetySetting),

Safety settings are placed at the root of VertexGenerateContentRequest, not inside generationConfig.

Current Provider Status: Places safety settings correctly.


5. Message Transformation

5.1 Thought Part Handling

The native CLI has specific logic for handling "thought" parts in message content.

Source: stuff/gemini-cli/packages/core/src/utils/apiConversionUtils.ts

// Function: partToCountTokensPart
// Converts thought parts to text blocks for token counting

Source: stuff/gemini-cli/packages/core/src/code_assist/converter.ts:224-242

export function toVertexPart(part: Part): VertexPart {
  if ('thought' in part && part.thought) {
    return {
      text: part.text,
      thought: true,
    };
  }
  // ... other part conversions
}

5.2 History Thought Stripping

Source: stuff/gemini-cli/packages/core/src/core/client.ts:235-236

stripThoughtsFromHistory() {
  this.getChat().stripThoughtsFromHistory();
}

This is called when switching between Vertex and GenAI auth to prevent thoughtSignature compatibility issues.

Current Provider Status: Does not preserve reasoning_content as thought parts when reconstructing history.


6. Token Counting Considerations

6.1 CountTokens Safety Transformation

Source: stuff/gemini-cli/packages/core/src/utils/apiConversionUtils.ts

The CLI transforms thought parts before sending to countTokens API to prevent validation errors:

// Thought parts are converted to: [Thought: <text>]

Current Provider Status: Missing this safety transformation.
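
On the Python side, the transformation could look roughly like this. The sketch assumes parts are plain dicts with optional `thought` and `text` keys; `to_count_tokens_part` is a hypothetical name, and the native function may handle more part types than shown.

```python
from typing import Any, Dict


def to_count_tokens_part(part: Dict[str, Any]) -> Dict[str, Any]:
    """Convert a thought part into a plain text part before countTokens."""
    if part.get("thought"):
        # Thought parts become ordinary text in the form: [Thought: <text>]
        return {"text": f"[Thought: {part.get('text', '')}]"}
    # Non-thought parts pass through unchanged (shallow copy).
    return dict(part)


parts = [
    {"text": "Let me check the file first.", "thought": True},
    {"text": "Here is the answer."},
]
safe_parts = [to_count_tokens_part(p) for p in parts]
```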

6.2 CountTokens Endpoint

Source: stuff/gemini-cli/packages/core/src/code_assist/server.ts:296-316

async countTokens(
  req: CountTokensRequest,
): Promise<CountTokensResponse> {
  const url = `${this.baseUrl}/v1/projects/${this.projectId}/locations/us-central1:countTokens`;
  // ...
}

7. Recommendations

7.1 Headers

| Action | Header | Rationale |
|--------|--------|-----------|
| Update | User-Agent | Use version 0.26.0-nightly.20260115.6cb3ae4e0 |
| Keep | X-Goog-Api-Client | Mimics SDK behavior for backend compatibility |
| Keep | Client-Metadata | Essential fingerprint even though CLI sends in body |
| Exclude | X-Goog-User-Project | Causes 403 errors |

7.2 Request Body

| Action | Field | Location |
|--------|-------|----------|
| Add | labels | request.labels - for metadata tagging |
| Enhance | thinkingConfig | Add thinkingLevel support for Gemini 3 |
| Verify | safetySettings | Ensure at root of request, not in generationConfig |

7.3 Message Handling

| Action | Feature | Impact |
|--------|---------|--------|
| Implement | Thought preservation | Maintain reasoning_content as thought: true parts |
| Implement | CountTokens safety | Strip/convert thought parts before token counting |

8. Source File Index

Core Code Assist Implementation

| File | Purpose |
|------|---------|
| stuff/gemini-cli/packages/core/src/code_assist/server.ts | HTTP client for Code Assist API |
| stuff/gemini-cli/packages/core/src/code_assist/converter.ts | Request/response type conversions |
| stuff/gemini-cli/packages/core/src/code_assist/codeAssist.ts | High-level Code Assist orchestration |
| stuff/gemini-cli/packages/core/src/code_assist/oauth2.ts | OAuth authentication flow |
| stuff/gemini-cli/packages/core/src/code_assist/setup.ts | Client metadata initialization |
| stuff/gemini-cli/packages/core/src/code_assist/types.ts | TypeScript type definitions |

Configuration

| File | Purpose |
|------|---------|
| stuff/gemini-cli/packages/core/src/config/defaultModelConfigs.ts | Default model generation configs |
| stuff/gemini-cli/schemas/settings.schema.json | Full settings schema with defaults |
| stuff/gemini-cli/packages/core/package.json | SDK version dependencies |
| stuff/gemini-cli/package.json | CLI version number |

Utilities

| File | Purpose |
|------|---------|
| stuff/gemini-cli/packages/core/src/utils/apiConversionUtils.ts | API type conversion helpers |
| stuff/gemini-cli/packages/core/src/core/contentGenerator.ts | Content generation with headers |
| stuff/gemini-cli/packages/core/src/core/client.ts | Gemini client with history management |

Experiments/Metadata

| File | Purpose |
|------|---------|
| stuff/gemini-cli/packages/core/src/code_assist/experiments/client_metadata.ts | ClientMetadata structure |
| stuff/gemini-cli/packages/core/src/code_assist/experiments/experiments.ts | Experiment flag fetching |

Test Files (for reference)

| File | Purpose |
|------|---------|
| stuff/gemini-cli/packages/core/src/code_assist/converter.test.ts | Converter unit tests showing expected values |
| stuff/gemini-cli/packages/core/src/code_assist/server.test.ts | Server test fixtures |

Appendix A: Current Provider Headers vs Native CLI

gemini_cli_provider.py (Lines 504-514)

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {access_token}",
    "User-Agent": f"GeminiCLI/0.26.0/{model_name_for_header} (win32; x64)",
    "Client-Metadata": (
        "ideType=IDE_UNSPECIFIED,"
        "pluginType=GEMINI,ideVersion=0.26.0,platform=WINDOWS_AMD64,updateChannel=stable"
    ),
    "X-Goog-Api-Client": x_goog_api_client,
    "Accept": "application/json",
}

Native CLI server.ts (Lines 284-290)

headers: {
  'Content-Type': 'application/json',
  Authorization: `Bearer ${await this.getAccessToken()}`,
},

Difference: The native CLI sends fewer headers. The additional headers in the provider (Client-Metadata, X-Goog-Api-Client, User-Agent) are included to mimic the SDK and ensure the request appears to come from an official client.


Appendix B: Request Payload Structure Comparison

Native CLI (toGenerateContentRequest output)

{
  "model": "gemini-2.5-pro",
  "project": "user-project-id",
  "user_prompt_id": "a1b2c3d4e5f6",
  "request": {
    "contents": [...],
    "systemInstruction": {...},
    "tools": [...],
    "toolConfig": {...},
    "generationConfig": {
      "temperature": 1,
      "topP": 0.95,
      "topK": 64,
      "thinkingConfig": {
        "includeThoughts": true,
        "thinkingBudget": 8192
      }
    },
    "safetySettings": [...],
    "labels": undefined,
    "session_id": "session-uuid"
  }
}

gemini_cli_provider.py (Current)

{
  "model": "gemini-2.5-pro",
  "project": "user-project-id",
  "user_prompt_id": "a1b2c3d4e5f6",
  "request": {
    "contents": [...],
    "systemInstruction": {...},
    "tools": [...],
    "toolConfig": {...},
    "generationConfig": {
      "temperature": 1,
      "topP": 0.95,
      "topK": 64,
      "thinkingConfig": {
        "includeThoughts": true,
        "thinkingBudget": 8192
      }
    },
    "safetySettings": [...],
    "session_id": "session-uuid"
  }
}

Difference: The provider is missing the labels field (even if typically undefined).


End of Research Report

This change updates the session identification logic to be deterministic and stateless, addressing isolation issues in multi-user environments.

- Replaces the static, instance-level `_session_id` with a dynamic generation method based on the SHA256 hash of the first user message.
- Ensures consistent session IDs for the same conversation context, maintaining continuity across server restarts.
- Decouples session identity from the provider instance lifecycle to prevent context bleeding between requests.
- Aligns ID formatting with native client behavior using UUID-formatted hashes.
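
A minimal sketch of the deterministic session ID described above: hash the first user message with SHA256 and format the first 128 bits of the digest as a UUID string. The truncation to 32 hex characters and the name `derive_session_id` are assumptions of this sketch; the actual commit may format the hash differently.

```python
import hashlib
import uuid


def derive_session_id(first_user_message: str) -> str:
    """Derive a stable, UUID-formatted session ID from the first user message."""
    digest = hashlib.sha256(first_user_message.encode("utf-8")).hexdigest()
    # Use the first 32 hex chars (128 bits) so the result has UUID layout.
    return str(uuid.UUID(hex=digest[:32]))


session_a = derive_session_id("Refactor the auth module")
session_b = derive_session_id("Refactor the auth module")
session_c = derive_session_id("Write release notes")
```

Because the ID depends only on the message text, it survives restarts and is independent of the provider instance. The trade-off is that two unrelated conversations that open with the same first message would share a session ID.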
mirrobot-agent[bot]

This comment was marked as abuse.

@mirrobot-agent mirrobot-agent bot merged commit b3637a9 into dev Jan 16, 2026
2 of 3 checks passed
@mirrobot-agent mirrobot-agent bot deleted the feature/gemini_cli-alignment branch January 16, 2026 01:43
@Mirrowel
Owner Author

huh? how tf

@Mirrowel
Owner Author

fuck, cause it is my PR code owners does not apply. ofc. shitass github

Repository owner deleted a comment from mirrobot-agent bot Jan 16, 2026
Repository owner deleted a comment from mirrobot-agent bot Jan 16, 2026
Repository owner deleted a comment from mirrobot-agent bot Jan 16, 2026
@Mirrowel Mirrowel restored the feature/gemini_cli-alignment branch January 16, 2026 01:52
Repository owner deleted a comment from mirrobot-agent bot Jan 16, 2026
Repository owner deleted a comment from mirrobot-agent bot Jan 16, 2026
Repository owner deleted a comment from Tarquinen Jan 16, 2026
Repository owner deleted a comment from Tarquinen Jan 16, 2026
Repository owner deleted a comment from Tarquinen Jan 16, 2026
Repository owner locked as spam and limited conversation to collaborators Jan 16, 2026
Repository owner unlocked this conversation Jan 16, 2026
@Mirrowel Mirrowel deleted the feature/gemini_cli-alignment branch January 16, 2026 02:34
Labels

enhancement New feature or request

2 participants