
feat: tab completion for citation keys#110

Merged
4ndrelim merged 53 commits into staging from feat/tab-completion on Feb 17, 2026

Conversation

@wjiayis (Member) commented Feb 4, 2026

#33

In short, it'll

  1. [Frontend] Recognize that the user is trying to add a citation (trigger text is \cite{); see the sketch after this list
  2. [Frontend] Temporarily suppress default Overleaf dropdown suggestions
  3. [Frontend] Get the last sentence as context for LLM
  4. [Backend] Fetch bibliography in .bib files as raw text, and remove irrelevant fields to save tokens
  5. [Backend] Call XtraMCP to get the paper abstract, using the paper title as the key
  6. [Backend] Query a fast LLM (hardcoded to gpt-5.2 for now) to get at most 3 citation keys
  7. [Frontend] Suppress default Overleaf tab-completion to allow users to accept citation suggestions
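
To make steps 1-3 concrete, here is a minimal sketch of the frontend trigger handling. The regex, the sentence splitting, and the request/response shapes are illustrative assumptions, not the PR's actual code; only the endpoint path comes from the generated grpc-gateway bindings in this PR.

```typescript
// Hypothetical sketch of steps 1-3; not the PR's actual implementation.
const CITE_TRIGGER = /\\cite\{([^}]*)$/; // an unclosed \cite{ before the cursor

function onEditorChange(textBeforeCursor: string): void {
  const match = CITE_TRIGGER.exec(textBeforeCursor);
  if (!match) return; // not inside an open \cite{...}

  // Step 3: take the last sentence before the trigger as context for the LLM.
  const beforeCite = textBeforeCursor.slice(0, match.index);
  const sentences = beforeCite.split(/(?<=[.!?])\s+/);
  const context = sentences[sentences.length - 1] ?? "";

  void fetchCitationKeys(context, match[1]);
}

async function fetchCitationKeys(context: string, prefix: string): Promise<string[]> {
  // Endpoint path from the generated gateway bindings; the body and
  // response shapes here are assumptions.
  const res = await fetch("/_pd/api/v2/chats/citation-keys", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ context, prefix }),
  });
  const { keys } = (await res.json()) as { keys: string[] };
  return keys;
}
```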

wjiayis mentioned this pull request Feb 4, 2026
@Junyi-99 (Member) commented Feb 5, 2026

Hi @wjiayis, thanks for the update. I’ve created a new issue to address the token expiration problem.

Regarding the latency issue, do we have visibility into which part of the pipeline contributes most to the high latency? For example, a rough breakdown across:

frontend → backend → LLM provider (reasoning + response)

I’ll take a look at this PR later this evening as well.
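
One hypothetical way to get such a breakdown from the frontend side (the helper name and stage label below are assumptions, not existing instrumentation in this PR):

```typescript
// Minimal timing wrapper to attribute latency per pipeline stage.
async function timed<T>(label: string, fn: () => Promise<T>): Promise<T> {
  const start = performance.now();
  try {
    return await fn();
  } finally {
    console.log(`${label}: ${(performance.now() - start).toFixed(0)} ms`);
  }
}

// Example: wrap the whole round trip, then narrow down with server-side logs.
// const keys = await timed("frontend->LLM round trip", () => fetchCitationKeys(context, ""));
```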

Junyi-99 linked an issue Feb 5, 2026 that may be closed by this pull request
@wjiayis (Member, Author) commented Feb 5, 2026

@Junyi-99 I haven't gotten to the latency breakdown yet, but I've settled everything else and I'm gonna work on this next. Thanks for helping to review when convenient; I'll update my findings when I have them too!

@Junyi-99 (Member) commented Feb 5, 2026

@wjiayis Got it, thanks for the update. Looking forward to your findings.

@wjiayis (Member, Author) commented Feb 5, 2026

Root Cause

There's a ~20s latency in the inline-suggestion loop, and >99% of it comes from waiting for the LLM to start responding. This issue arises because I'm passing in a large (but realistic) bibliography (the bibliography of PaperDebugger: A Plugin-Based Multi-Agent System for In-Editor Academic Writing, Review, and Editing itself), and gpt-5-nano takes a while to parse it.

Solution

I think it's reasonable to expect that a regular user's max latency tolerance is ~2s. I'll implement the following 3 solutions to achieve that.

Model Selection

gpt-5-nano takes a long time to process the long bibliography. Just swapping it out for gpt-5.2 brings latency down to 2-4s. But gpt-5.2 is expensive to call. I'll improve latency and cost with the next 2 solutions.

Prompt Caching

Since the bibliography remains generally constant and takes up the bulk of the prompt, I'll use OpenAI's prompt caching - advertised to reduce latency by up to 80% and input token costs by up to 90%. A sketch follows the list below.

  1. Place bibliography at the start of the prompt (prompt caching uses exact prefix match)
  2. Run a "no-reply" LLM query at the start of each session and whenever the database reloads, and configure it to cache for 24h
  3. Each time \cite{ is triggered, the cached bibliography is used -> lower latency
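
A minimal sketch of that flow with the OpenAI Node SDK (the backend here is Go; TypeScript is used only for illustration, and the 24h cache retention from step 2 is a configuration detail not shown):

```typescript
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Step 1: the bibliography goes first so every request shares the same
// prefix, which is what OpenAI's exact-prefix prompt caching keys on.
function buildMessages(bibliography: string, context: string) {
  return [
    { role: "system" as const, content: `Bibliography:\n${bibliography}` },
    { role: "user" as const, content: `Suggest at most 3 citation keys for: ${context}` },
  ];
}

// Step 2: a "no-reply" warm-up query at session start / database reload;
// the answer is discarded, since the point is to populate the prefix cache.
export async function warmCache(bibliography: string): Promise<void> {
  await client.chat.completions.create({
    model: "gpt-5.2", // model per the discussion above
    messages: buildMessages(bibliography, "warm-up; reply with OK"),
    max_completion_tokens: 1,
  });
}
```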

Prompt Refinement

I'll remove info-sparse fields (e.g. doi, url, pages) and retain only info-rich fields (e.g. title, booktitle), to reduce the total size of the bibliography by (hopefully) at least 40%.
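
Illustratively, the stripping could be a crude regex pass like the one below. The drop list and the regex are assumptions (real .bib values can contain commas and nested braces, so the actual implementation in get_citation_keys.go would need a proper parser):

```typescript
// Drop info-sparse BibTeX fields; the list is illustrative, mirroring
// the examples above.
const DROPPED_FIELDS = ["doi", "url", "pages", "month", "publisher"];

function stripBibFields(bib: string): string {
  // Matches simple one-line fields like `  doi = {10.1234/abc},`.
  const pattern = new RegExp(
    `^\\s*(${DROPPED_FIELDS.join("|")})\\s*=\\s*[^,]*,?\\s*$`,
    "gim",
  );
  return bib.replace(pattern, "");
}
```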

cc: @Junyi-99

kah-seng mentioned this pull request Feb 5, 2026
Copilot AI (Contributor) left a comment

Pull request overview

Implements a first version of citation-aware tab completion by detecting \cite{ in the editor, fetching suggested BibTeX keys from a new backend endpoint, and suppressing Overleaf’s default autocomplete/tab behavior to allow accepting inline suggestions.

Changes:

  • Frontend: Adds a “citation suggestions” beta setting, detects \cite{ triggers, fetches citation keys, and intercepts Tab to accept inline suggestions.
  • Backend: Adds GetCitationKeys RPC/HTTP endpoint, extracts and token-reduces .bib content, and queries an LLM for up to ~3 relevant citation keys.
  • Plumbing: Updates generated proto clients/servers and ignores CLAUDE.md.

Reviewed changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated 10 comments.

Summary per file:

  • webapp/_webapp/src/views/settings/sections/beta-feature-settings.tsx - Renames the completion toggle UI to “citation suggestions”.
  • webapp/_webapp/src/views/settings/index.tsx - Renders the beta features section in Settings.
  • webapp/_webapp/src/query/api.ts - Adds getCitationKeys() API wrapper for the new v2 endpoint.
  • webapp/_webapp/src/pkg/gen/apiclient/chat/v2/chat_pb.ts - Generated TS client types/service updated for GetCitationKeys.
  • webapp/_webapp/src/libs/inline-suggestion.ts - Adds \cite{ trigger handling and suppresses Overleaf autocomplete/tab behavior when suggestions are active.
  • proto/chat/v2/chat.proto - Adds GetCitationKeys RPC + request/response messages.
  • pkg/gen/api/chat/v2/chat_grpc.pb.go - Generated gRPC server/client updated with GetCitationKeys.
  • pkg/gen/api/chat/v2/chat.pb.gw.go - Generated grpc-gateway bindings for /_pd/api/v2/chats/citation-keys.
  • pkg/gen/api/chat/v2/chat.pb.go - Generated Go proto types updated with new messages and RPC.
  • internal/services/toolkit/client/get_citation_keys.go - Implements bibliography extraction and LLM prompt to suggest citation keys.
  • internal/api/chat/get_citation_keys.go - Implements the ChatServerV2 handler for GetCitationKeys.
  • .gitignore - Ignores CLAUDE.md.


wjiayis changed the title from "feat: v0 tab completion" to "feat: tab completion for citation keys" on Feb 13, 2026
@Junyi-99 (Member) left a comment

@wjiayis Hi Jiayi, thanks for your patience. I’ve left some comments; once those are addressed, it can be merged.

@4ndrelim (Member) left a comment

Thanks for the hard work, Jiayi! Just some minor comments on top of Jun Yi's. Once these are resolved, let's merge to staging and do a test on the staging env endpoint (together if you'd like, and I can run you through our deployment pipeline too). If all works as expected, we can immediately raise to main.

@wjiayis (Member, Author) commented Feb 16, 2026

All comments are resolved! I'll work with @4ndrelim tomorrow to test and merge the PR.
cc: @Junyi-99

wjiayis requested review from 4ndrelim and Junyi-99 on February 16, 2026 at 15:51
@Junyi-99 (Member) left a comment

LGTM! Thanks for your patience and for iterating on this!

@4ndrelim (Member) left a comment

LGTM. Thanks for the revision!

4ndrelim merged commit 6ef12a3 into staging on Feb 17, 2026
1 check passed
4ndrelim deleted the feat/tab-completion branch on February 17, 2026 at 07:33
wjiayis restored the feat/tab-completion branch on February 17, 2026 at 08:33

Labels

enhancement New feature or request

Development

Successfully merging this pull request may close these issues.

[Feature Request] Tab-Completion
