lmLite

Browser-native LLM orchestration in JupyterLite - run language models entirely in your browser using WebAssembly.

Features

Browser-native: Run LLMs entirely in the browser, no server required
Zero setup: No installation, no API keys, works offline
Privacy-first: All computation happens locally in your browser
Multiple models: Support for GPT-2, DistilGPT-2, and more via Transformers.js
Embeddings: Generate text embeddings for semantic search
Similarity search: Built-in cosine similarity for RAG applications
Model caching: Download once, then reuse from the browser cache, including offline
Pythonic API: Clean, async Python interface
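
As a rough illustration of the similarity-search feature, the sketch below ranks documents by cosine similarity between embedding vectors. It is written in plain Python and is not lmLite's internal implementation; the helper names `cosine_similarity` and `rank_by_similarity` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In practice the vectors would come from an embedding model such as all-MiniLM-L6-v2, and the highest-scoring documents would be passed on as context.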

Usage

from lmlite import LLM

# Create LLM instance (downloads model on first run)
llm = await LLM.create(generator_model="gpt2")

# Generate text
text = await llm.generate("Python is a great language because")
print(text)

# Generate embeddings
embedding = await llm.embed("Hello world")
print(embedding[:5])  # First 5 dimensions

# Similarity search
docs = [
    "JupyterLite runs entirely in the browser.",
    "Python is widely used for machine learning.",
    "TypeScript is great for frontend applications.",
]

results = await llm.similarity_search(
    "Where does JupyterLite run?",
    docs
)

for doc, score in results:
    print(f"{score:.3f} -> {doc}")
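
For a RAG-style workflow, the (document, score) pairs returned by the similarity search can be turned into a prompt for the generator. The helper below is a minimal sketch: `build_rag_prompt` and its prompt layout are assumptions made for illustration, not part of the lmLite API.

```python
def build_rag_prompt(question, results, top_n=2):
    """Prefix the question with the top_n highest-scoring documents.

    `results` is a list of (document, score) pairs, the same shape the
    similarity search loop above iterates over.
    """
    best = sorted(results, key=lambda pair: pair[1], reverse=True)[:top_n]
    context = "\n".join(f"- {doc}" for doc, _ in best)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
```

The resulting string can then be passed to the generator. Small models such as GPT-2 are not instruction-tuned, so the quality of answers produced this way may be limited.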

Configuration Options

llm = await LLM.create(
    generator_model="gpt2",           # or "distilgpt2", etc.
    embedding_model="all-MiniLM-L6-v2",
    max_new_tokens=50,
    temperature=0.7,
    top_k=50,
    do_sample=True,
    use_local_models=False,           # Auto-detect local models
    local_models_path="/drive/models"
)
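
To make the sampling options more concrete, the sketch below shows how `temperature`, `top_k`, and `do_sample` are conventionally interpreted when choosing the next token from a list of logits. It assumes lmLite passes these values through to the underlying generator unchanged, which the source does not confirm; `pick_next_token` is a hypothetical helper, not an lmLite function.

```python
import math
import random


def pick_next_token(logits, temperature=0.7, top_k=50, do_sample=True, rng=None):
    """Choose a token index from raw logits."""
    if not do_sample:
        # Greedy decoding: always take the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    if temperature <= 0 or top_k < 1:
        raise ValueError("temperature must be > 0 and top_k must be >= 1")
    rng = rng or random.Random()
    # Keep only the top_k highest-scoring candidates.
    k = min(top_k, len(logits))
    candidates = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Lower temperature sharpens the distribution, higher temperature flattens it.
    scaled = [logits[i] / temperature for i in candidates]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]  # unnormalized softmax
    return rng.choices(candidates, weights=weights, k=1)[0]
```

With `top_k=1` or `do_sample=False` the choice is deterministic. Larger `top_k` and higher `temperature` values generally produce more varied text.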

Export Models for Offline Use

After using models in the browser, you can export them for offline use:

# Export to zip file (default)
await llm.export_model_files("gpt2")

# Export to directory
await llm.export_model_files("gpt2", as_zip=False)

The exported files will be saved to /drive/models/ and can be downloaded from JupyterLite.
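
To check what an export contains, a zip archive can be inspected with Python's standard `zipfile` module. This is a minimal sketch; `summarize_archive` is a hypothetical helper, and the name of the exported archive is not stated here, so the path must be supplied by the caller.

```python
import zipfile


def summarize_archive(path_or_file):
    """Return (name, size_in_bytes) for every file in a zip archive."""
    with zipfile.ZipFile(path_or_file) as archive:
        return [
            (info.filename, info.file_size)
            for info in archive.infolist()
            if not info.is_dir()
        ]
```

Calling it on an archive under /drive/models/ should list the exported model files together with their uncompressed sizes.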

Development

Prerequisites

pixi - Package manager (required)
micromamba - Conda package manager (required for JupyterLite builds)

Supported Platforms: macOS (Intel/Apple Silicon) and Linux. Windows is not currently supported for development.

Setup

# Clone the repository
git clone https://github.com/decitre/lmlite.git
cd lmlite
pixi install

Development Tasks

# Run tests
pixi run test

# Quick tests (skip notebook tests)
pixi run quick-test

# Run linter
pixi run lint

# Check linting without fixing
pixi run lint-check

# Run tests with coverage
pixi run coverage

# Build wheel
pixi run wheel

Build JupyterLite Demo

pixi run lite-build

Follow the instructions provided by the command.

Run Tests in Different Python Versions

pixi run --environment py311 test
pixi run --environment py312 test
pixi run --environment py313 test

How It Works

LMLite bridges Python (via xeus-python or pyodide) and JavaScript (via Transformers.js):

JavaScript Layer: Uses @huggingface/transformers to run ONNX models in the browser
Python Bridge: Exposes JavaScript functionality through a Pythonic async API
Kernel Support:
  - xeus-python (recommended): Full CPython in WebAssembly via emscripten
  - pyodide: Alternative WebAssembly Python runtime
Model Loading:
  - First run: Downloads models from the Hugging Face CDN
  - Cached: Uses the browser's Cache API or IndexedDB
  - Local: Reads from the virtual filesystem if available
Execution: Models run entirely in-browser using WebAssembly (WASM)
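
The model-loading behavior above can be pictured as a fallback chain. The sketch below assumes the order is local files first, then the browser cache, then a download; the source lists the three sources but does not state their precedence. `resolve_model` and the three callables are hypothetical stand-ins for the real lookups.

```python
def resolve_model(name, local_lookup, cache_lookup, download):
    """Try local files, then the cache, then the network.

    Each lookup takes a model name and returns the model data, or None
    when the model is not available from that source.
    Returns a (source, payload) pair.
    """
    for source, lookup in (("local", local_lookup), ("cache", cache_lookup)):
        payload = lookup(name)
        if payload is not None:
            return source, payload
    # Nothing found locally or in the cache: fetch it.
    return "download", download(name)
```

Under this assumed order, a model placed in the virtual filesystem would be used without any network request, and a download would only happen on a cold start.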

Supported Models

Text Generation

gpt2 - GPT-2 (124M parameters)
distilgpt2 - Smaller, faster GPT-2 variant

Embeddings

all-MiniLM-L6-v2 - Sentence embeddings (384 dimensions)

For other models, check Xenova's model list.

Architecture

┌─────────────────────────────────────┐
│   Python (JupyterLite)              │
│   ├─ xeus-python (recommended)      │
│   └─ pyodide (alternative)          │
│                                     │
│   from lmlite import LLM            │
│   llm = await LLM.create()          │
│   text = await llm.generate(...)    │
└─────────────┬───────────────────────┘
              │ Bridge
              │ (pyodide.ffi / pyjs)
┌─────────────▼───────────────────────┐
│   JavaScript (Browser)              │
│                                     │
│   Transformers.js                   │
│   ├─ Model loading                  │
│   ├─ ONNX Runtime (WASM)            │
│   └─ WebGPU (optional)              │
└─────────────────────────────────────┘

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Dependencies:

Transformers.js - Apache 2.0

Note: Individual model licenses may vary. Check the model card on Hugging Face before use.

Acknowledgments

Troubleshooting

Models downloading every time?

Check the browser console for Cache API availability. Some privacy settings may disable caching.

Out of memory errors?

Try smaller models like distilgpt2 or reduce max_new_tokens.
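
A back-of-the-envelope estimate shows why model size matters. The sketch below computes the memory needed for the weights alone, using the 124M parameter count listed for gpt2. Which numeric precision is actually loaded is an assumption here, and runtime overhead (activations, caches, the WASM heap) comes on top.

```python
def weight_memory_mb(num_parameters, bytes_per_parameter):
    """Approximate size of the model weights in megabytes (10**6 bytes)."""
    return num_parameters * bytes_per_parameter / 1_000_000


gpt2_parameters = 124_000_000

# 32-bit floats use 4 bytes per weight; 8-bit quantized weights use 1 byte.
fp32_mb = weight_memory_mb(gpt2_parameters, 4)
int8_mb = weight_memory_mb(gpt2_parameters, 1)
print(f"fp32: {fp32_mb:.0f} MB, int8: {int8_mb:.0f} MB")
```

Under these assumptions the weights alone need roughly 496 MB at 32-bit precision and 124 MB at 8-bit precision, which is why a smaller model such as distilgpt2 can help on memory-constrained devices.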

CORS errors?

Ensure you're running from http://localhost or a proper HTTPS domain, not file://.

Citation

If you use LMLite in your research, please cite:

@software{lmlite2026,
  author = {Decitre, Emmanuel},
  title = {LMLite: Browser-native LLM orchestration in JupyterLite},
  year = {2026},
  url = {https://github.com/decitre/lmlite}
}
