
Conversation

@minpeter (Owner) commented Jan 6, 2026

  • feat: add context management with compaction and middleware support
    - Add context tracker for monitoring token usage
    - Implement auto-compaction when context threshold reached
    - Add middleware for trimming leading newlines in AI responses
    - Add /help command and model switching capabilities
    - Add includeUsage flag to friendliai client
    - Add debug logging for context usage
  • Update default model from LGAI-EXAONE/K-EXAONE-236B-A23B to zai-org/GLM-4.6 and add support for aborting ongoing conversations via ESC key
    - Replace default model in documentation and code
    - Add abort functionality to Agent class with AbortController (see the sketch below)
    - Modify chat method to return aborted status
    - Update command handler to support abort signals in streaming
    - Add ESC key interrupt support in input handling
    - Implement /context and /compact commands for monitoring and managing context usage
  • feat: context management follow-ups
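For readers skimming the commit list, here is a minimal sketch of how the abort wiring could look. The Agent shape, the echo response, and the delay stand-in are illustrative assumptions, not the code in this PR; the real chat method would pass the signal through to the streaming model call.

import { emitKeypressEvents } from "node:readline";
import { setTimeout as delay } from "node:timers/promises";

// Hypothetical Agent shape: abort() cancels the in-flight chat via AbortController,
// and chat() resolves with an `aborted` flag instead of throwing on cancel.
class Agent {
  private abortController: AbortController | null = null;

  abort(): void {
    this.abortController?.abort();
  }

  async chat(prompt: string): Promise<{ text: string; aborted: boolean }> {
    this.abortController = new AbortController();
    const { signal } = this.abortController;
    try {
      // Stand-in for the streaming model call; the real code would forward `signal`.
      await delay(5_000, undefined, { signal });
      return { text: `echo: ${prompt}`, aborted: false };
    } catch (err) {
      if (signal.aborted) return { text: "", aborted: true };
      throw err;
    } finally {
      this.abortController = null;
    }
  }
}

const agent = new Agent();

// ESC interrupts the in-flight response; keypress events need stdin in raw mode.
emitKeypressEvents(process.stdin);
if (process.stdin.isTTY) process.stdin.setRawMode(true);
process.stdin.on("keypress", (_str: string, key: { name?: string } | undefined) => {
  if (key?.name === "escape") agent.abort();
});

const result = await agent.chat("hello");
console.log(result.aborted ? "(aborted)" : result.text);
process.exit(0);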
@gemini-code-assist

Summary of Changes

Hello @minpeter, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the agent's ability to manage conversation context, ensuring more stable and efficient long-running interactions. It introduces automatic and manual mechanisms for keeping the conversation within token limits, provides visibility into context usage, and improves user control by allowing the interruption of AI responses. These changes aim to make the agent more robust and user-friendly for extended coding sessions.

Highlights

  • Context Management System: Introduced a comprehensive context management system with ContextTracker and ContextCompactor utilities. This system monitors token usage, automatically compacts conversation history when a defined threshold is met, and provides debug logging for context usage.
  • Auto-Compaction: The agent now automatically summarizes older parts of the conversation to stay within the model's context window, preventing context-overflow errors and improving efficiency for long interactions (a simplified loop sketch follows this list).
  • New Commands for Context Control: Added /context to display current token usage and compaction status, and /compact to manually trigger conversation summarization.
  • Conversation Abort Functionality: Users can now press the ESC key to interrupt an ongoing AI response, providing more control over the interaction.
  • Default Model Update: The default AI model has been updated from LGAI-EXAONE/K-EXAONE-236B-A23B to zai-org/GLM-4.6.
  • FriendliAI Client Enhancements: The FriendliAI client now includes an includeUsage flag, enabling more accurate token usage tracking directly from the API.
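To make the auto-compaction flow above concrete, here is a simplified, self-contained sketch of the loop. MiniTracker, callModel, and summarizeHistory are invented stand-ins for this PR's ContextTracker, model call, and summarizer, and the threshold values are examples only.

type Msg = { role: "system" | "user" | "assistant"; content: string };

// Trimmed stand-in for the tracker: the last request's input tokens approximate
// the current context size; compaction triggers past a fractional threshold.
class MiniTracker {
  inputTokens = 0;
  constructor(
    private maxContextTokens = 128_000,
    private compactionThreshold = 0.8,
  ) {}
  update(inputTokens: number): void {
    this.inputTokens = inputTokens;
  }
  shouldCompact(): boolean {
    return this.inputTokens / this.maxContextTokens >= this.compactionThreshold;
  }
}

// Placeholder "model": reports a token count roughly proportional to prompt length.
async function callModel(messages: Msg[]): Promise<{ text: string; inputTokens: number }> {
  const chars = messages.reduce((n, m) => n + m.content.length, 0);
  return { text: "ok", inputTokens: Math.ceil(chars / 4) };
}

// Placeholder compaction: keep the system prompt (assumed to be messages[0])
// and collapse everything else into one summary message.
async function summarizeHistory(messages: Msg[]): Promise<Msg[]> {
  const system = messages[0] ?? { role: "system", content: "" };
  return [system, { role: "assistant", content: `Summary of ${messages.length - 1} earlier messages.` }];
}

async function chatTurn(tracker: MiniTracker, history: Msg[], user: string): Promise<Msg[]> {
  const userMsg: Msg = { role: "user", content: user };
  let messages = [...history, userMsg];
  const { text, inputTokens } = await callModel(messages);
  tracker.update(inputTokens);
  messages.push({ role: "assistant", content: text });
  if (tracker.shouldCompact()) {
    messages = await summarizeHistory(messages); // auto-compaction kicks in past the threshold
  }
  return messages;
}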
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

  • /gemini review (Code Review): Performs a code review for the current pull request in its current state.
  • /gemini summary (Pull Request Summary): Provides a summary of the current pull request in its current state.
  • @gemini-code-assist (Comment): Responds in comments when explicitly tagged, both in pull request comments and review comments.
  • /gemini help (Help): Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a comprehensive context management system for the agent, including token tracking, automatic context compaction, and manual controls via new commands. It also adds the ability to abort ongoing conversations with the ESC key and updates the default model. The implementation is robust, with good error handling and asynchronous control flow. I've identified a couple of areas for improvement: one to fix the logic for fallback context size estimation, and another to refactor a function for better readability. Overall, this is an excellent and significant feature addition.

Comment on lines +181 to +197
function shouldContinueAfterTools(messages: ModelMessage[]): boolean {
  let lastToolIndex = -1;
  for (let i = 0; i < messages.length; i += 1) {
    if (messages[i]?.role === "tool") {
      lastToolIndex = i;
    }
  }
  if (lastToolIndex === -1) {
    return false;
  }
  for (let i = lastToolIndex + 1; i < messages.length; i += 1) {
    if (assistantMessageHasText(messages[i])) {
      return false;
    }
  }
  return true;
}


medium

The shouldContinueAfterTools function can be simplified for better readability and maintainability by using modern array methods like findLastIndex and some. The current implementation with for loops is correct but more verbose than necessary.

function shouldContinueAfterTools(messages: ModelMessage[]): boolean {
  const lastToolIndex = messages.findLastIndex((msg) => msg.role === "tool");

  if (lastToolIndex === -1) {
    return false;
  }

  // Check if there is any assistant message with text after the last tool message.
  const subsequentMessages = messages.slice(lastToolIndex + 1);
  return !subsequentMessages.some(assistantMessageHasText);
}
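One caveat with this suggestion: Array.prototype.findLastIndex is an ES2023 addition, so it assumes a Node.js 18+ runtime and a TypeScript lib/target that includes it (e.g. ES2023); on older targets the loop-based original still works.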

Comment on lines +22 to +118
export class ContextTracker {
  private readonly config: ContextConfig;
  private totalInputTokens = 0;
  private totalOutputTokens = 0;
  private stepCount = 0;
  private currentContextTokens: number | null = null;

  constructor(config: Partial<ContextConfig> = {}) {
    this.config = { ...DEFAULT_CONFIG, ...config };
  }

  setMaxContextTokens(tokens: number): void {
    this.config.maxContextTokens = tokens;
  }

  setCompactionThreshold(threshold: number): void {
    if (threshold < 0 || threshold > 1) {
      throw new Error("Compaction threshold must be between 0 and 1");
    }
    this.config.compactionThreshold = threshold;
  }

  updateUsage(usage: LanguageModelUsage): void {
    this.totalInputTokens += usage.inputTokens ?? 0;
    this.totalOutputTokens += usage.outputTokens ?? 0;
    this.stepCount++;
  }

  /**
   * Set the exact current context token count.
   */
  setContextTokens(tokens: number): void {
    this.currentContextTokens = Math.max(0, Math.round(tokens));
  }

  /**
   * Set total usage directly (useful after compaction or when loading state)
   */
  setTotalUsage(inputTokens: number, outputTokens: number): void {
    this.totalInputTokens = inputTokens;
    this.totalOutputTokens = outputTokens;
  }

  /**
   * Get estimated current context size
   * Note: This is an approximation based on accumulated usage
   */
  getEstimatedContextTokens(): number {
    // The input tokens from the last request roughly represents
    // the current context size (system prompt + conversation history)
    return this.totalInputTokens > 0
      ? Math.round(this.totalInputTokens / Math.max(this.stepCount, 1))
      : 0;
  }

  getStats(): ContextStats {
    const totalTokens =
      this.currentContextTokens ?? this.getEstimatedContextTokens();
    const usagePercentage = totalTokens / this.config.maxContextTokens;
    const shouldCompact = usagePercentage >= this.config.compactionThreshold;

    return {
      totalTokens,
      inputTokens: this.totalInputTokens,
      outputTokens: this.totalOutputTokens,
      maxContextTokens: this.config.maxContextTokens,
      usagePercentage,
      shouldCompact,
    };
  }

  shouldCompact(): boolean {
    return this.getStats().shouldCompact;
  }

  reset(): void {
    this.totalInputTokens = 0;
    this.totalOutputTokens = 0;
    this.stepCount = 0;
    this.currentContextTokens = 0;
  }

  /**
   * Called after compaction to adjust token counts
   * @param newInputTokens The token count of the compacted context
   */
  afterCompaction(newInputTokens: number): void {
    this.totalInputTokens = newInputTokens;
    this.totalOutputTokens = 0;
    this.stepCount = 1;
    this.currentContextTokens = Math.max(0, Math.round(newInputTokens));
  }

  getConfig(): ContextConfig {
    return { ...this.config };
  }
}


medium

The current implementation of getEstimatedContextTokens calculates the average input tokens per step, which doesn't accurately reflect the current context size as the conversation grows. The comment for the function correctly states that the input tokens from the last request are a better estimate. This can be fixed by storing the last input token count in the tracker. This change improves the accuracy of the fallback context size estimation, making the tracker more robust when the precise token measurement API fails.

export class ContextTracker {
  private readonly config: ContextConfig;
  private totalInputTokens = 0;
  private totalOutputTokens = 0;
  private stepCount = 0;
  private currentContextTokens: number | null = null;
  private lastInputTokens = 0;

  constructor(config: Partial<ContextConfig> = {}) {
    this.config = { ...DEFAULT_CONFIG, ...config };
  }

  setMaxContextTokens(tokens: number): void {
    this.config.maxContextTokens = tokens;
  }

  setCompactionThreshold(threshold: number): void {
    if (threshold < 0 || threshold > 1) {
      throw new Error("Compaction threshold must be between 0 and 1");
    }
    this.config.compactionThreshold = threshold;
  }

  updateUsage(usage: LanguageModelUsage): void {
    this.totalInputTokens += usage.inputTokens ?? 0;
    this.totalOutputTokens += usage.outputTokens ?? 0;
    this.stepCount++;
    this.lastInputTokens = usage.inputTokens ?? 0;
  }

  /**
   * Set the exact current context token count.
   */
  setContextTokens(tokens: number): void {
    this.currentContextTokens = Math.max(0, Math.round(tokens));
  }

  /**
   * Set total usage directly (useful after compaction or when loading state)
   */
  setTotalUsage(inputTokens: number, outputTokens: number): void {
    this.totalInputTokens = inputTokens;
    this.totalOutputTokens = outputTokens;
  }

  /**
   * Get estimated current context size
   * Note: This is an approximation based on accumulated usage
   */
  getEstimatedContextTokens(): number {
    // The input tokens from the last request roughly represents
    // the current context size (system prompt + conversation history)
    return this.lastInputTokens;
  }

  getStats(): ContextStats {
    const totalTokens =
      this.currentContextTokens ?? this.getEstimatedContextTokens();
    const usagePercentage = totalTokens / this.config.maxContextTokens;
    const shouldCompact = usagePercentage >= this.config.compactionThreshold;

    return {
      totalTokens,
      inputTokens: this.totalInputTokens,
      outputTokens: this.totalOutputTokens,
      maxContextTokens: this.config.maxContextTokens,
      usagePercentage,
      shouldCompact,
    };
  }

  shouldCompact(): boolean {
    return this.getStats().shouldCompact;
  }

  reset(): void {
    this.totalInputTokens = 0;
    this.totalOutputTokens = 0;
    this.stepCount = 0;
    this.currentContextTokens = 0;
    this.lastInputTokens = 0;
  }

  /**
   * Called after compaction to adjust token counts
   * @param newInputTokens The token count of the compacted context
   */
  afterCompaction(newInputTokens: number): void {
    this.totalInputTokens = newInputTokens;
    this.totalOutputTokens = 0;
    this.stepCount = 1;
    this.currentContextTokens = Math.max(0, Math.round(newInputTokens));
    this.lastInputTokens = newInputTokens;
  }

  getConfig(): ContextConfig {
    return { ...this.config };
  }
}
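For reference, a minimal usage sketch of the tracker, assuming the class above is in scope; the config values and token counts are made up for illustration, and the usage object follows AI SDK v5's LanguageModelUsage shape (inputTokens/outputTokens/totalTokens).

// Hypothetical config values, not the repository's DEFAULT_CONFIG.
const tracker = new ContextTracker({
  maxContextTokens: 128_000,
  compactionThreshold: 0.8,
});

// After each model step, feed the reported usage in.
tracker.updateUsage({ inputTokens: 9_500, outputTokens: 350, totalTokens: 9_850 });

// Prefer the exact context size when the provider reports it.
tracker.setContextTokens(9_850);

const stats = tracker.getStats();
console.log(`${Math.round(stats.usagePercentage * 100)}% of the context window used`);

if (tracker.shouldCompact()) {
  // ...summarize the history, then record the compacted size:
  tracker.afterCompaction(2_000);
}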

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6476bf824d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +53 to 56
const onLine = (line: string) => {
  rl.removeListener("close", onClose);
  rl.pause();
  resolve(line);


P2: Resume stdin before streaming to enable ESC abort

The new ESC abort handler depends on emitKeypressEvents(process.stdin), but readline.pause() also calls process.stdin.pause(), which stops data events (and thus keypress). Because stdin is paused after each line and never resumed before agent.chat, pressing ESC during streaming won't trigger agent.abort(), so the interrupt feature silently fails. Consider resuming stdin (or skipping the pause) before starting the streaming chat loop.
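One possible shape for that fix, sketched under the assumption that the input loop owns stdin; runTurn and the agent parameter are illustrative names, not the PR's code.

import { emitKeypressEvents } from "node:readline";

emitKeypressEvents(process.stdin);

async function runTurn(
  agent: { chat(prompt: string): Promise<unknown>; abort(): void },
  prompt: string,
): Promise<void> {
  // rl.pause() also paused process.stdin after the previous line was read;
  // resume it so "keypress" events (and therefore ESC) flow during streaming.
  if (process.stdin.isPaused()) process.stdin.resume();
  if (process.stdin.isTTY) process.stdin.setRawMode(true);

  const onKeypress = (_str: string, key: { name?: string } | undefined) => {
    if (key?.name === "escape") agent.abort();
  };
  process.stdin.on("keypress", onKeypress);
  try {
    await agent.chat(prompt);
  } finally {
    process.stdin.off("keypress", onKeypress);
  }
}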

Useful? React with 👍 / 👎.
