feat: add headless LLM service and migrate ghost cell suggestions by nstrayer · Pull Request #13730 · posit-dev/positron

nstrayer · 2026-05-22T20:46:52Z

Summary

Adds IPositronLMService, a workbench service that streams LLM responses directly from the shared process via ai-provider-bridge, bypassing the extension host and vscode.lm. Migrates ghost cell suggestions to use the new service (behind the existing useHeadlessService feature flag), with a dedicated model picker, context builder, and streaming generator.

Motivation: The previous path routed ghost cell requests through the assistant extension, leaving model selection opaque to the workbench and duplicating credential management. The headless service gives the workbench explicit control over model resolution and credentials, with lower latency as a bonus.

Related:

posit-dev/ai-provider-bridge -- standalone provider bridge package
assistant#1427 -- extracts provider-bridge from assistant monorepo

Note: The ai-provider-bridge dependency is currently resolved via "file:..." link. Once the package is published, the resolution needs to be updated to point at a GitHub release (or npm package).

Release Notes

New Features

Added IPositronLMService for direct LLM streaming from the workbench, with tier-based/pattern-based/exact model selection and eager model caching
Added "Select Ghost Cell Model" command with a grouped model picker (per-provider)
Ghost cell suggestions can now use the headless service path for lower latency (opt-in via positron.assistant.notebook.ghostCellSuggestions.useHeadlessService)

Bug Fixes

N/A

Validation Steps

@:positron-notebooks @:assistant

Enable positron.assistant.notebook.ghostCellSuggestions.useHeadlessService in settings
Open a Python or R notebook with at least one executed cell
Verify ghost cell suggestions stream in after a brief delay
Open the command palette and run "Select Ghost Cell Model"
Verify the picker shows models grouped by provider; pin a model
Trigger another ghost cell suggestion and verify it uses the pinned model
Select "Use Default" in the picker and verify it reverts to tier-based auto-selection
Navigate away mid-generation and verify streaming cancels cleanly (no errors in console)
Sign out of all LLM auth providers and verify ghost cells degrade gracefully (no crash, suggestion simply doesn't appear)

Unit tests:

npx vitest run src/vs/workbench/services/positronLM/test/
npx vitest run src/vs/workbench/contrib/positronNotebook/test/common/notebookLMContext.vitest.ts

github-actions · 2026-05-22T20:47:13Z

E2E Tests 🚀
This PR will run tests tagged with: @:critical @:positron-notebooks @:assistant

^readme ^{valid tags}

Introduces IPositronLMService -- a workbench service that streams LLM responses via @assistant/provider-bridge in the shared process, bypassing vscode.lm entirely. Ghost cell suggestions can optionally use this path via a feature flag (positron.assistant.notebook.ghostCellSuggestions.useHeadlessService). Architecture: - Shared process: stateless ProviderRegistry (lazy-loaded), 16ms batched streaming with in-band sentinels, cancellation via EventDispose - Renderer: per-request credential resolution (5s cache), racing model fetch across providers, AsyncIterableSource bridging - Generator: single deep module the controller calls with one function New files: - src/vs/workbench/services/positronLM/ (common, browser, node, test) - src/vs/workbench/contrib/positronNotebook/.../generation/generator.ts

- Move browser/ to electron-browser/ (ISharedProcessService requires electron-browser layer) - Replace `in` operator with typeof check for stream event sentinel - Fix double-quoted strings in lexer (use escaped single quotes) - Add eslint-disable for IServerChannel any signatures and dynamic import - Add eslint-disable for sharedProcessMain workbench import (to be moved to vs/platform/ in a follow-up)

- Use discriminated IPC stream events ({type:'data'|'end'|'error'}) instead of plain string sentinels to prevent text chunks containing "end" from prematurely closing the stream - Fix wrong config key for maxVariables (was positronNotebook.ghostCell. maxVariables, should be the registered constant) - Include error/stderr output in ghost cell context so the model can suggest fixes based on actual exception details - Add 60s outer timeout around headless generation to prevent indefinite loading state if model resolution or streaming stalls

The timeout promise now cancels the CancellationTokenSource so the underlying LM stream stops, preventing stale progress callbacks from updating the UI after a timeout error. The timeout handle is also cleared on normal completion to avoid leaking the timer.

Track a `timedOut` flag so the catch block can differentiate between user-initiated cancellation (silently hide) and timeout-initiated cancellation (show error state). Previously the timeout called cts.cancel() then rejected, but the catch block suppressed all errors when token.isCancellationRequested was true, causing the ghost cell to stay stuck in loading/streaming state on timeout.

Replace all console.log/error/warn calls with _logService at appropriate levels (debug/trace/error). Remove the ping/pong IPC health-check that was used during development.

…ovider-bridge The package was renamed as part of extraction to its own GitHub repo (posit-dev/llm-provider-bridge). Update the file: link and dynamic imports to use the new unscoped package name.

…ority Introduces IPositronLMService -- a workbench service that streams LLM responses via ai-provider-bridge in the shared process, bypassing vscode.lm entirely. Ghost cell suggestions use this path when the useHeadlessService feature flag is enabled. Architecture: - Shared process: PositronLMNode with lazy-loaded ProviderRegistry, 16ms batched streaming, cancellation via last-listener dispose - Renderer: eager model cache warmed at activation, session-change refresh, provider-priority matching, per-request credential re-resolve - Generator: single deep module parsing structured XML via StreamingTagLexer - Context: shared buildNotebookLMContext helper with variable priority Also extracts notebook LM context building into a shared helper with full test coverage, and adds vitest tests for provider-priority matching and the streaming tag lexer.

The headless LM service was tightly coupled to ghost cell configuration and only worked on desktop. This refactor: - Introduces ModelSelection discriminated union (tier/id/patterns) so callers express intent without the service knowing who's calling - Returns StreamResult with failure reason instead of null, enabling informative user-facing messages - Adds availableModels + onDidChangeAvailableModels for picker UIs - Extracts logic into AbstractPositronLMService in common/, with thin platform subclasses for desktop (electron-browser/) and server-hosted (browser/) deployments - Registers tier settings (positronLM.models.fastcheap) owned by the service rather than ghost-cell-specific config - Ghost cells now forward their own config as { patterns } to the service

Standalone function that shows a QuickPick grouped by provider, returns the user's selection without persisting it. Consumers own their own storage. "Use Default" option shown only when a model is currently pinned.

Register selectGhostCellModel command in the workbench that uses IPositronLMService.availableModels with the reusable showModelPicker utility. This shows models from the same catalog the service uses for generation, fixing the mismatch where the extension picker showed vscode.lm models (only Bedrock) while generation used the headless service (all providers).

- electron-browser: always use shared process channel (remote server does not register positronLM channel yet) - Model picker: localize all user-facing strings - Ghost cell action: localize picker title - config.ts: update command link from old extension command - README: add missing hasKey import to usage example

- Store the onCancellationRequested disposable so it's cleaned up when the stream ends normally (prevents leak on long-lived tokens) - Inline toProviderCredentials() -- it was a no-op identity function. Comment left at the site for future extraction if field stripping becomes necessary. - Add rationale comment above PROVIDER_PRIORITY ordering

The command was a NotebookAction2, which silently no-ops when no notebook is active. The settings markdown description links to this command, so clicking "Select a model" from Settings would do nothing. Converting to a plain Action2 removes the notebook precondition -- the picker only needs IPositronLMService and IConfigurationService.

Provider display names now go through nls.localize() for i18n. Fixed deepseek auth provider ID ('deepseek' -> 'deepseek-api') caught by new sync test that asserts AUTH_PROVIDER_MAP stays consistent with the bridge's PROVIDER_MAP.

Avoids layering lint warnings (code-import-patterns, code-amd-node-module) by importing at runtime inside the test body.

Replace the local file: link with the GitHub release tarball so CI and other developers can resolve the dependency. Install all required peer deps (@ai-sdk/*, @aws-sdk/*, @github/copilot-sdk, etc.) that the providers entrypoint needs at runtime.

nstrayer · 2026-05-27T18:58:27Z

+/*---------------------------------------------------------------------------------------------
+ *  Copyright (C) 2026 Posit Software, PBC. All rights reserved.
+ *  Licensed under the Elastic License 2.0. See LICENSE.txt for license information.
+ *--------------------------------------------------------------------------------------------*/
+
+/**
+ * A chunk representing a tag boundary (open or close).
+ */
+export interface TagChunk {
+	type: 'tag';
+	name: string;
+	kind: 'open' | 'close';
+}


This is fully copied from the positron-assistant extension. Same code exactly.

nstrayer added 18 commits May 27, 2026 14:17

fix: remove debug console logs and ping/pong from headless LM service

fe02ce4

Replace all console.log/error/warn calls with _logService at appropriate levels (debug/trace/error). Remove the ping/pong IPC health-check that was used during development.

refactor: rename dependency from @assistant/provider-bridge to llm-pr…

aeb1216

…ovider-bridge The package was renamed as part of extraction to its own GitHub repo (posit-dev/llm-provider-bridge). Update the file: link and dynamic imports to use the new unscoped package name.

fix: replace in operator with hasKey utility per codebase lint rules

16e692b

feat: add reusable model picker utility for LM service consumers

2ae39f6

Standalone function that shows a QuickPick grouped by provider, returns the user's selection without persisting it. Consumers own their own storage. "Use Default" option shown only when a model is currently pinned.

fix: use dynamic import for ai-provider-bridge in sync test

e6c1bd6

Avoids layering lint warnings (code-import-patterns, code-amd-node-module) by importing at runtime inside the test body.

nstrayer force-pushed the feature/headless-lm-service branch from abb8d39 to 4aebce4 Compare May 27, 2026 18:39

nstrayer commented May 27, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add headless LLM service and migrate ghost cell suggestions#13730

feat: add headless LLM service and migrate ghost cell suggestions#13730
nstrayer wants to merge 18 commits into
mainfrom
feature/headless-lm-service

nstrayer commented May 22, 2026

Uh oh!

github-actions Bot commented May 22, 2026 •

edited

Loading

Uh oh!

nstrayer May 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

nstrayer commented May 22, 2026

Summary

Release Notes

New Features

Bug Fixes

Validation Steps

Uh oh!

github-actions Bot commented May 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

nstrayer May 27, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

github-actions Bot commented May 22, 2026 •

edited

Loading