feat: add headless LLM service and migrate ghost cell suggestions#13730
Open
nstrayer wants to merge 18 commits into
Open
feat: add headless LLM service and migrate ghost cell suggestions#13730nstrayer wants to merge 18 commits into
nstrayer wants to merge 18 commits into
Conversation
|
E2E Tests 🚀 |
Introduces IPositronLMService -- a workbench service that streams LLM responses via @assistant/provider-bridge in the shared process, bypassing vscode.lm entirely. Ghost cell suggestions can optionally use this path via a feature flag (positron.assistant.notebook.ghostCellSuggestions.useHeadlessService). Architecture: - Shared process: stateless ProviderRegistry (lazy-loaded), 16ms batched streaming with in-band sentinels, cancellation via EventDispose - Renderer: per-request credential resolution (5s cache), racing model fetch across providers, AsyncIterableSource bridging - Generator: single deep module the controller calls with one function New files: - src/vs/workbench/services/positronLM/ (common, browser, node, test) - src/vs/workbench/contrib/positronNotebook/.../generation/generator.ts
- Move browser/ to electron-browser/ (ISharedProcessService requires electron-browser layer) - Replace `in` operator with typeof check for stream event sentinel - Fix double-quoted strings in lexer (use escaped single quotes) - Add eslint-disable for IServerChannel any signatures and dynamic import - Add eslint-disable for sharedProcessMain workbench import (to be moved to vs/platform/ in a follow-up)
- Use discriminated IPC stream events ({type:'data'|'end'|'error'})
instead of plain string sentinels to prevent text chunks containing
"end" from prematurely closing the stream
- Fix wrong config key for maxVariables (was positronNotebook.ghostCell.
maxVariables, should be the registered constant)
- Include error/stderr output in ghost cell context so the model can
suggest fixes based on actual exception details
- Add 60s outer timeout around headless generation to prevent indefinite
loading state if model resolution or streaming stalls
The timeout promise now cancels the CancellationTokenSource so the underlying LM stream stops, preventing stale progress callbacks from updating the UI after a timeout error. The timeout handle is also cleared on normal completion to avoid leaking the timer.
Track a `timedOut` flag so the catch block can differentiate between user-initiated cancellation (silently hide) and timeout-initiated cancellation (show error state). Previously the timeout called cts.cancel() then rejected, but the catch block suppressed all errors when token.isCancellationRequested was true, causing the ghost cell to stay stuck in loading/streaming state on timeout.
Replace all console.log/error/warn calls with _logService at appropriate levels (debug/trace/error). Remove the ping/pong IPC health-check that was used during development.
…ovider-bridge The package was renamed as part of extraction to its own GitHub repo (posit-dev/llm-provider-bridge). Update the file: link and dynamic imports to use the new unscoped package name.
…ority Introduces IPositronLMService -- a workbench service that streams LLM responses via ai-provider-bridge in the shared process, bypassing vscode.lm entirely. Ghost cell suggestions use this path when the useHeadlessService feature flag is enabled. Architecture: - Shared process: PositronLMNode with lazy-loaded ProviderRegistry, 16ms batched streaming, cancellation via last-listener dispose - Renderer: eager model cache warmed at activation, session-change refresh, provider-priority matching, per-request credential re-resolve - Generator: single deep module parsing structured XML via StreamingTagLexer - Context: shared buildNotebookLMContext helper with variable priority Also extracts notebook LM context building into a shared helper with full test coverage, and adds vitest tests for provider-priority matching and the streaming tag lexer.
The headless LM service was tightly coupled to ghost cell configuration
and only worked on desktop. This refactor:
- Introduces ModelSelection discriminated union (tier/id/patterns) so
callers express intent without the service knowing who's calling
- Returns StreamResult with failure reason instead of null, enabling
informative user-facing messages
- Adds availableModels + onDidChangeAvailableModels for picker UIs
- Extracts logic into AbstractPositronLMService in common/, with thin
platform subclasses for desktop (electron-browser/) and server-hosted
(browser/) deployments
- Registers tier settings (positronLM.models.fastcheap) owned by the
service rather than ghost-cell-specific config
- Ghost cells now forward their own config as { patterns } to the service
Standalone function that shows a QuickPick grouped by provider, returns the user's selection without persisting it. Consumers own their own storage. "Use Default" option shown only when a model is currently pinned.
Register selectGhostCellModel command in the workbench that uses IPositronLMService.availableModels with the reusable showModelPicker utility. This shows models from the same catalog the service uses for generation, fixing the mismatch where the extension picker showed vscode.lm models (only Bedrock) while generation used the headless service (all providers).
- electron-browser: always use shared process channel (remote server does not register positronLM channel yet) - Model picker: localize all user-facing strings - Ghost cell action: localize picker title - config.ts: update command link from old extension command - README: add missing hasKey import to usage example
- Store the onCancellationRequested disposable so it's cleaned up when the stream ends normally (prevents leak on long-lived tokens) - Inline toProviderCredentials() -- it was a no-op identity function. Comment left at the site for future extraction if field stripping becomes necessary. - Add rationale comment above PROVIDER_PRIORITY ordering
The command was a NotebookAction2, which silently no-ops when no notebook is active. The settings markdown description links to this command, so clicking "Select a model" from Settings would do nothing. Converting to a plain Action2 removes the notebook precondition -- the picker only needs IPositronLMService and IConfigurationService.
Provider display names now go through nls.localize() for i18n. Fixed
deepseek auth provider ID ('deepseek' -> 'deepseek-api') caught by new
sync test that asserts AUTH_PROVIDER_MAP stays consistent with the
bridge's PROVIDER_MAP.
Avoids layering lint warnings (code-import-patterns, code-amd-node-module) by importing at runtime inside the test body.
Replace the local file: link with the GitHub release tarball so CI and other developers can resolve the dependency. Install all required peer deps (@ai-sdk/*, @aws-sdk/*, @github/copilot-sdk, etc.) that the providers entrypoint needs at runtime.
abb8d39 to
4aebce4
Compare
nstrayer
commented
May 27, 2026
Comment on lines
+1
to
+13
| /*--------------------------------------------------------------------------------------------- | ||
| * Copyright (C) 2026 Posit Software, PBC. All rights reserved. | ||
| * Licensed under the Elastic License 2.0. See LICENSE.txt for license information. | ||
| *--------------------------------------------------------------------------------------------*/ | ||
|
|
||
| /** | ||
| * A chunk representing a tag boundary (open or close). | ||
| */ | ||
| export interface TagChunk { | ||
| type: 'tag'; | ||
| name: string; | ||
| kind: 'open' | 'close'; | ||
| } |
Contributor
Author
There was a problem hiding this comment.
This is fully copied from the positron-assistant extension. Same code exactly.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #13728
Part of #13729
Summary
Adds
IPositronLMService, a workbench service that streams LLM responses directly from the shared process viaai-provider-bridge, bypassing the extension host andvscode.lm. Migrates ghost cell suggestions to use the new service (behind the existinguseHeadlessServicefeature flag), with a dedicated model picker, context builder, and streaming generator.Motivation: The previous path routed ghost cell requests through the assistant extension, leaving model selection opaque to the workbench and duplicating credential management. The headless service gives the workbench explicit control over model resolution and credentials, with lower latency as a bonus.
Related:
Note: The
ai-provider-bridgedependency is currently resolved via"file:..."link. Once the package is published, the resolution needs to be updated to point at a GitHub release (or npm package).Release Notes
New Features
IPositronLMServicefor direct LLM streaming from the workbench, with tier-based/pattern-based/exact model selection and eager model cachingpositron.assistant.notebook.ghostCellSuggestions.useHeadlessService)Bug Fixes
Validation Steps
@:positron-notebooks @:assistant
positron.assistant.notebook.ghostCellSuggestions.useHeadlessServicein settingsUnit tests: