A Retrieval-Augmented Generation (RAG) pipeline that answers technical support questions about a biomedical platform using only local resources. Documents are ingested, chunked, and embedded into a ChromaDB vector store; user queries are matched against this store and answered by a local LLM served through LMStudio.
```
┌──────────────┐     ┌────────────────┐     ┌────────────┐     ┌───────────┐
│  Documents   │────▶│   Ingest &     │────▶│  ChromaDB  │────▶│  Gradio   │
│ (.pdf .docx  │     │  Chunk & Embed │     │   Vector   │     │  Chat UI  │
│  .txt .md)   │     │  (ingest.py)   │     │   Store    │     │ (app.py)  │
└──────────────┘     └────────────────┘     └────────────┘     └─────┬─────┘
                                                                     │
                    ┌────────────────────────────────────────────────┘
                    │ query
                    ▼
             ┌───────────┐     ┌─────────────┐
             │ Retriever │────▶│  LMStudio   │
             │  (top-K)  │     │  local LLM  │
             └───────────┘     └─────────────┘
```
Key design goal: after the first run (which downloads the embedding model), the system operates fully offline — no calls to HuggingFace Hub or any external service.
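The offline guarantee is usually enforced through environment variables that the HuggingFace stack honors. As a sketch of what `src/embeddings.py` likely sets up (the exact variable handling in the project may differ), assuming the cache directory from `HF_CACHE_DIR`:

```python
import os

def enable_offline_mode(cache_dir: str = "./models") -> None:
    """Point the HuggingFace stack at the local cache and forbid network access.

    Must run before importing sentence-transformers/transformers, because
    those libraries read these variables at import time.
    """
    os.environ["HF_HOME"] = cache_dir           # where cached weights live
    os.environ["HF_HUB_OFFLINE"] = "1"          # no calls to HuggingFace Hub
    os.environ["TRANSFORMERS_OFFLINE"] = "1"    # transformers: cache only

enable_offline_mode()
```

With these set, a missing model fails fast with a clear error instead of silently attempting a download.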
```
BioRAG-from-scratch/
├── .env                    # Configuration (ports, model names, paths)
├── requirements.txt        # Python dependencies
├── run_ingest.py           # Entry point: ingest documents into vector store
├── run_test_retriever.py   # Entry point: test retriever with a sample query
├── data/
│   └── documentation/      # Place your source documents here
├── models/                 # Auto-populated: cached embedding weights
├── vector_store/           # Auto-populated: ChromaDB persistence
├── src/
│   ├── __init__.py
│   ├── app.py              # Gradio chat interface (entry point)
│   ├── embeddings.py       # Shared embedding loader + offline mode
│   ├── ingest.py           # Document loading, chunking, indexing
│   ├── llm_client.py       # LMStudio / OpenAI-compatible LLM client
│   └── retriever.py        # Vector similarity search
└── tests/
    ├── __init__.py
    └── validate.py         # Import & configuration smoke tests
```
- Python 3.10 – 3.12 (tested; 3.13+ may work but is not verified)
- LMStudio running locally with a loaded model (default endpoint: `http://localhost:1234/v1`)
- Internet access only for the first run (to download the `all-MiniLM-L6-v2` embedding model)
1. Clone or unzip the project and `cd` into it:

   ```
   cd BioRAG-from-scratch
   ```

2. Create and activate a virtual environment (recommended):

   ```
   python -m venv .venv
   source .venv/bin/activate   # Linux / macOS
   .venv\Scripts\activate      # Windows
   ```

3. Install dependencies:

   ```
   pip install -r requirements.txt
   ```

4. Review `.env` and adjust if needed:

   ```
   LMSTUDIO_BASE_URL=http://localhost:1234/v1
   LMSTUDIO_MODEL=local-model
   EMBEDDING_MODEL=all-MiniLM-L6-v2
   CHROMA_PERSIST_DIR=./vector_store
   HF_CACHE_DIR=./models
   CHUNK_SIZE=800
   CHUNK_OVERLAP=200
   TOP_K=4
   ```
| Variable | Purpose |
|---|---|
| `LMSTUDIO_BASE_URL` | LMStudio server address |
| `LMSTUDIO_MODEL` | Model identifier loaded in LMStudio |
| `EMBEDDING_MODEL` | Sentence-transformers model for embeddings |
| `HF_CACHE_DIR` | Local directory for cached model weights |
| `CHROMA_PERSIST_DIR` | Local directory for the vector store |
| `CHUNK_SIZE` / `CHUNK_OVERLAP` | Text chunking parameters (characters) |
| `TOP_K` | Number of chunks retrieved per query |
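The project presumably reads these values with a dotenv-style loader. To illustrate the precedence (value from `.env` if present, otherwise the documented default), here is a minimal stdlib-only reader — `load_env` is a hypothetical helper for illustration, not the project's actual code:

```python
from pathlib import Path

def load_env(path: str = ".env") -> dict:
    """Minimal .env reader: KEY=VALUE lines, '#' starts a comment."""
    values = {}
    p = Path(path)
    if p.exists():
        for line in p.read_text().splitlines():
            line = line.split("#", 1)[0].strip()
            if "=" in line:
                key, _, val = line.partition("=")
                values[key.strip()] = val.strip()
    return values

cfg = load_env()
chunk_size = int(cfg.get("CHUNK_SIZE", 800))  # falls back to documented default
top_k = int(cfg.get("TOP_K", 4))
```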
Place your PDF, DOCX, TXT, and/or Markdown files into `data/documentation/`. Subdirectories are supported.
Run the ingestion pipeline to load, chunk, embed, and persist your documents:
```
python run_ingest.py
```

On the first run this downloads the embedding model (~80 MB) into `./models/`. Every subsequent run loads the model from disk with no network access.
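The `CHUNK_SIZE` / `CHUNK_OVERLAP` settings describe a sliding character window: each chunk shares its last `CHUNK_OVERLAP` characters with the next one, so no sentence is lost at a boundary. The project uses LangChain's `RecursiveCharacterTextSplitter` for this (see `tests/validate.py`), but the basic idea can be sketched in a few lines:

```python
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 200) -> list[str]:
    """Character-window chunking: each chunk starts (chunk_size - overlap)
    characters after the previous one, so neighbours share `overlap` chars."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - overlap, 1), step)]

# A 2000-character document with the default settings yields 3 chunks
# (starting at offsets 0, 600, and 1200).
chunks = chunk_text("x" * 2000, chunk_size=800, overlap=200)
```

The real splitter is smarter: it prefers to break on paragraph and sentence boundaries before falling back to raw character positions.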
Open LMStudio, load a model (e.g. Mistral, Llama, Phi), and start the local server on port 1234 (default).
```
python -m src.app
```

Open the URL printed in the terminal (usually http://127.0.0.1:7860).
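Behind the chat UI, `src/llm_client.py` talks to LMStudio's OpenAI-compatible API, which boils down to a JSON POST to `/chat/completions`. A stdlib-only sketch of that exchange (the function names and prompt format here are assumptions, not the project's actual code):

```python
import json
from urllib import request

def build_payload(question: str, context: str, model: str = "local-model") -> dict:
    """OpenAI-style chat payload: retrieved chunks go into the system prompt."""
    return {
        "model": model,
        "messages": [
            {"role": "system",
             "content": "Answer using only the provided context.\n\n" + context},
            {"role": "user", "content": question},
        ],
        "temperature": 0.2,  # low temperature for factual support answers
    }

def ask_llm(question: str, context: str,
            base_url: str = "http://localhost:1234/v1") -> str:
    """POST to LMStudio's OpenAI-compatible endpoint and return the answer."""
    req = request.Request(
        base_url + "/chat/completions",
        data=json.dumps(build_payload(question, context)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Because the endpoint is OpenAI-compatible, the same request works unchanged against any server that implements that API.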
Verify that all dependencies are installed correctly and the project modules load without errors:
```
python -m tests.validate
```

Expected output:
```
1. Checking core imports …
   ✔ langchain_community.document_loaders
   ✔ langchain_community.embeddings.HuggingFaceEmbeddings
   ✔ langchain_community.vectorstores.Chroma
   ✔ langchain_text_splitters.RecursiveCharacterTextSplitter
   ...
All checks passed ✓
```
After ingesting documents you can test the retriever directly:
```
python run_test_retriever.py
```

After the first successful run, confirm that no network calls are made by disconnecting from the network (disable Wi-Fi / unplug Ethernet) and running the same command again:

```
python run_test_retriever.py
```

It should load the model and return results exactly as before.
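Under the hood, retrieval is a nearest-neighbour search over the stored embeddings. ChromaDB handles indexing and search for you, but the core ranking step — score every stored vector against the query by cosine similarity and keep the top K — can be sketched as:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec: list[float], doc_vecs: list[list[float]], k: int = 4) -> list[int]:
    """Indices of the k stored vectors most similar to the query."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]

# Toy example: vector 2 points exactly along the query, vector 0 nearly so.
hits = top_k([1.0, 0.0], [[1.0, 0.1], [0.0, 1.0], [0.9, 0.0]], k=2)
```

In practice ChromaDB uses an approximate index rather than this exhaustive scan, so search stays fast as the document set grows; `TOP_K` in `.env` controls the `k` used per query.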
| Symptom | Likely cause | Fix |
|---|---|---|
| `ModuleNotFoundError: No module named 'langchain'` | Old dependency installed | Run `pip install -r requirements.txt` in a clean venv |
| `ModuleNotFoundError: No module named 'docx2txt'` | Wrong docx package | `pip install docx2txt` (not `python-docx`) |
| Connection refused on launch | LMStudio not running | Start LMStudio and load a model |
| `OSError: [Errno 28] No space left on device` | Disk full during model download | Free space; `./models/` needs ~80 MB |
| Model re-downloads every time | `HF_CACHE_DIR` not set or pointing to empty dir | Check `.env`: `HF_CACHE_DIR=./models` |
This project is provided as-is for educational purposes.