
feat: add /api/retrieve endpoint for pure vector search #496

Open
MuLeiSY2021 wants to merge 1 commit into AsyncFuncAI:main from MuLeiSY2021:feat/api-retrieve-endpoint

Conversation

@MuLeiSY2021

Summary

  • Add POST /api/retrieve endpoint that performs pure FAISS vector similarity search over indexed repository code chunks without calling any LLM
  • Enables external tools (MCP servers, IDE plugins, CLI tools) to leverage deepwiki-open's RAG index as a code search backend
  • No LLM API key required — only uses the existing embedding model for query vectorization

API

POST /api/retrieve
{
  "repo_url": "https://github.com/user/repo",
  "query": "authentication middleware",
  "type": "github",
  "top_k": 5
}

Response:

{
  "query": "authentication middleware",
  "total_chunks": 500,
  "results": [
    {
      "text": "func AuthMiddleware(...) { ... }",
      "file_path": "middleware/auth.go",
      "is_code": true,
      "token_count": 350
    }
  ]
}
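As a usage sketch, an external tool (MCP server, CLI, IDE plugin) could call the endpoint like this. The host and port are assumptions for a local deployment; adjust them to wherever your deepwiki-open API server runs:

```python
import json
import urllib.request

# Hypothetical local deployment URL; point this at your deepwiki-open server.
API_URL = "http://localhost:8001/api/retrieve"

payload = {
    "repo_url": "https://github.com/user/repo",
    "query": "authentication middleware",
    "type": "github",
    "top_k": 5,
}

def retrieve(url: str = API_URL, body: dict = payload) -> dict:
    """POST the retrieval request and return the parsed JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

No LLM API key appears anywhere in the request; only the server-side embedding model is exercised.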

Motivation

Currently deepwiki-open's RAG retrieval is tightly coupled with LLM generation in /chat/completions/stream. This PR separates the retrieval step into its own endpoint, which:

  1. Enables MCP integration — external tools like deepwiki-mcp can provide deepwiki's code search as context to Claude Code, Cursor, etc.
  2. Reduces cost — retrieval-only queries don't consume LLM tokens
  3. Enables new use cases — code search APIs, IDE extensions, CI/CD integrations

Test plan

  • Tested locally with indexed repositories
  • Verified embedding validation filters inconsistent vector dimensions
  • Verified Ollama embedder compatibility path
  • Unit tests (TODO)

🤖 Generated with Claude Code

Add a new POST /api/retrieve endpoint that performs semantic search
over a repository's indexed code chunks using FAISS, returning the
most relevant source code snippets without calling any LLM.

This enables external tools (e.g. MCP servers, IDE plugins) to
leverage deepwiki-open's RAG vector index as a code search backend,
without requiring LLM API keys or incurring generation costs.

Request: { repo_url, query, type?, token?, top_k? }
Response: { query, total_chunks, results: [{ text, file_path, is_code, token_count }] }

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed.

This pull request significantly enhances the deepwiki-open project by introducing a dedicated API endpoint for direct vector-based code retrieval. This change broadens the utility of the existing RAG index, allowing external applications to perform efficient code searches independently of LLM generation, thereby expanding integration possibilities and optimizing resource usage.

Highlights

  • New API Endpoint: Introduced a POST /api/retrieve endpoint for pure FAISS vector similarity search over indexed repository code chunks, without involving any LLM.
  • External Tool Integration: Enabled external tools, such as MCP servers, IDE plugins, and CLI tools, to leverage deepwiki-open's RAG index as a code search backend.
  • Cost Efficiency: The new retrieval-only endpoint does not require an LLM API key, reducing operational costs by not consuming LLM tokens.
  • Decoupling Retrieval: Decoupled the retrieval step from LLM generation, allowing for more flexible and independent use of the RAG capabilities.


@gemini-code-assist left a comment


Code Review

This pull request introduces a new /api/retrieve endpoint for pure RAG retrieval, which processes repository data, generates embeddings, and uses FAISS to retrieve relevant code chunks. The review identifies several areas for improvement, including a potential security risk by exposing raw exception details in 500 errors, the need to move local imports to the top of the file for better readability and consistency, refactoring a method that uses RAG.__new__(RAG) into a static method, simplifying the query_embedder function by removing unused weakref and redundant variables, and improving API design by using a Pydantic BaseModel for the response.

    raise HTTPException(status_code=400, detail=str(e))
except Exception as e:
    logger.error(f"Retrieve error: {str(e)}")
    raise HTTPException(status_code=500, detail=str(e))

Severity: high

Exposing raw exception details (str(e)) to the client in a 500 error can be a security risk, as it might leak internal implementation details, file paths, or library information that could be exploited. It's safer to return a generic error message for 500-level errors.

Suggested change
raise HTTPException(status_code=500, detail=str(e))
raise HTTPException(status_code=500, detail="An internal server error occurred during retrieval.")
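The pattern behind this suggestion is to log full details server-side while returning a generic message to the client. A minimal stdlib-only sketch of that mapping (the function name and signature are illustrative, not from the PR):

```python
import logging

logger = logging.getLogger("api.retrieve")

def error_response(exc: Exception) -> tuple[int, str]:
    """Map an exception to an HTTP (status_code, detail) pair.

    Client-input errors (ValueError) may safely echo their message;
    anything else is logged with its traceback server-side and returns
    a generic detail string, so internal paths and library details are
    never leaked to the caller.
    """
    if isinstance(exc, ValueError):
        return 400, str(exc)
    logger.exception("Retrieve error", exc_info=exc)
    return 500, "An internal server error occurred during retrieval."
```

In the endpoint, the `except Exception` branch would call `logger.exception(...)` and raise `HTTPException` with the generic detail.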

Comment on lines +551 to +555
from api.rag import RAG
from api.data_pipeline import DatabaseManager
from api.tools.embedder import get_embedder
from api.config import configs, get_embedder_type
from adalflow.components.retriever.faiss_retriever import FAISSRetriever

Severity: medium

These imports are local to the retrieve function. It's generally better to place imports at the top of the file. This improves readability, makes dependencies clear, and allows for import errors to be caught at application startup rather than at runtime when the endpoint is first called. Other endpoints in this file follow this convention, so for consistency it would be good to move these.

Comment on lines +570 to +571
rag_instance = RAG.__new__(RAG)
valid_docs = rag_instance._validate_and_filter_embeddings(transformed_docs)

Severity: medium

Using RAG.__new__(RAG) to create an uninitialized instance just to call _validate_and_filter_embeddings is unconventional and can be confusing. The _validate_and_filter_embeddings method in the RAG class does not use any instance-specific state, so it could be refactored into a @staticmethod.

This would allow you to call it directly on the class (RAG._validate_and_filter_embeddings(...)), which is cleaner and avoids misusing __new__. This would require a change in api/rag.py.
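A minimal sketch of the suggested refactor. The filtering logic shown here (keep only documents whose embedding matches the majority dimension) is a hypothetical abbreviation; the real `_validate_and_filter_embeddings` in api/rag.py does more:

```python
from collections import Counter

class RAG:
    @staticmethod
    def _validate_and_filter_embeddings(documents):
        """Drop documents whose embedding dimension differs from the majority.

        Abbreviated sketch: the real method also handles missing vectors
        and logs what it filters out.
        """
        dims = Counter(
            len(doc.vector) for doc in documents if getattr(doc, "vector", None)
        )
        if not dims:
            return []
        target_dim = dims.most_common(1)[0][0]
        return [
            doc for doc in documents
            if getattr(doc, "vector", None) and len(doc.vector) == target_dim
        ]
```

The call site in the endpoint then becomes `valid_docs = RAG._validate_and_filter_embeddings(transformed_docs)`, with no `RAG.__new__(RAG)`.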

Comment on lines +578 to +583
import weakref
embedder_ref = embedder
def query_embedder(query):
    if isinstance(query, list):
        query = query[0]
    return embedder_ref(input=query)

Severity: medium

The weakref module is imported but not used, and embedder_ref is a redundant variable. The query_embedder function can be simplified by removing them and using embedder directly from the outer scope.

            def query_embedder(query):
                if isinstance(query, list):
                    query = query[0]
                return embedder(input=query)

Comment on lines +600 to +612
return {
    "query": request.query,
    "total_chunks": len(valid_docs),
    "results": [
        {
            "text": doc.text,
            "file_path": doc.meta_data.get("file_path", ""),
            "is_code": doc.meta_data.get("is_code", False),
            "token_count": doc.meta_data.get("token_count", 0),
        }
        for doc in docs
    ]
}

Severity: medium

This endpoint returns a raw dictionary. For better type safety, automatic serialization, and API documentation, it's a good practice in FastAPI to define a Pydantic BaseModel for the response and use it in the endpoint decorator with response_model.

You could define models like this above the endpoint definition:

from typing import List

class RetrieveResult(BaseModel):
    text: str
    file_path: str
    is_code: bool
    token_count: int

class RetrieveResponse(BaseModel):
    query: str
    total_chunks: int
    results: List[RetrieveResult]

And then use it in the endpoint: @app.post("/api/retrieve", response_model=RetrieveResponse). The return statement would then need to return an instance of RetrieveResponse.
