A lightweight, file-based semantic router for agent / model / workflow dispatch. Routes input text to a labeled route by comparing embeddings against a curated set of examples. Zero default dependencies beyond serde, serde_json, toml, and thiserror. Bundle a local embedder via the fastembed feature, or bring your own. No LLM in the hot path. Sub-millisecond routing.
If you're building an AI agent, voice assistant, or workflow system, you need to dispatch user input to one of N specialized handlers. The naive options, keyword matching (brittle) and LLM classifier (slow, expensive, cloud round-trip), both have real costs. semrouter splits the difference: a tiny local embedding model gives you semantic understanding, and a flat file of labeled examples gives you a router you can edit and version-control.
semrouter is a pure classifier. Risk classification, confirmation prompts, and dispatch live in your application; they don't belong in the router. This separation keeps risk policies next to the code that actually runs the dangerous thing.
Default — lean library (bring your own EmbeddingProvider):
[dependencies]
semrouter = "0.1"Compiles against ~12 transitive crates (serde, serde_json, toml, thiserror and their support deps). No embedder, no async runtime, no HTTP client. You implement EmbeddingProvider with whatever backend fits.
Opt-in — bundled fastembed local embedder:
[dependencies]
semrouter = { version = "0.1", features = ["fastembed"] }Adds the FastEmbedEmbedder type backed by fastembed (local ONNX MiniLM, ~210 transitive crates dominated by the ONNX runtime). Pick this when you want batteries included.
CLI install:
cargo install semrouter --features cli,fastembedThe binary requires both features — cli for the clap-derived argument parser and fastembed because the binary needs a usable embedder out of the box.
routes.jsonl:
{"id":"r1","route":"time","text":"what time is it","tags":["time"],"risk":"low"}
{"id":"r2","route":"time","text":"tell me the current time","tags":["time"],"risk":"low"}
{"id":"r3","route":"weather","text":"is it going to rain","tags":["weather"],"risk":"low"}
{"id":"r4","route":"weather","text":"give me the forecast","tags":["weather"],"risk":"low"}router.toml:
[router]
name = "demo"
version = "0.1.0"
embedding_model = "fastembed/AllMiniLML6V2"
vector_dimension = 384
top_k = 3
minimum_score = 0.22
minimum_margin = 0.005
fallback_route = "needs_review"
[storage]
routes_file = "routes.jsonl"
hard_negatives_file = "hard_negatives.jsonl"
feedback_file = "feedback.jsonl"
decision_log_file = "decisions.jsonl"
index_dir = "index"use semrouter::{SemanticRouter, config::RouterConfig, embedding::FastEmbedEmbedder};
use std::path::Path;
fn main() -> Result<(), Box<dyn std::error::Error>> {
let config = RouterConfig::load(Path::new("router.toml"))?;
let embedder = Box::new(FastEmbedEmbedder::new()?);
let router = SemanticRouter::load(config, Path::new("routes.jsonl"), embedder)?;
let decision = router.route("got the time")?;
println!("{:#?}", decision.selected_route);
// → Some("time")
Ok(())
}See examples/quickstart.rs for a runnable version.
The EmbeddingProvider trait is one method:
pub trait EmbeddingProvider: Send + Sync {
fn embed(&self, text: &str) -> Result<Vec<f32>, RouterError>;
}A custom HTTP-backed provider with ureq (~5 transitive crates):
use semrouter::embedding::EmbeddingProvider;
use semrouter::error::RouterError;
struct OpenAIEmbedder { api_key: String }
impl EmbeddingProvider for OpenAIEmbedder {
fn embed(&self, text: &str) -> Result<Vec<f32>, RouterError> {
let resp: serde_json::Value = ureq::post("https://api.openai.com/v1/embeddings")
.set("Authorization", &format!("Bearer {}", self.api_key))
.send_json(serde_json::json!({
"input": text,
"model": "text-embedding-3-small"
}))
.map_err(|e| RouterError::Embedding(e.to_string()))?
.into_json()
.map_err(|e| RouterError::Embedding(e.to_string()))?;
Ok(resp["data"][0]["embedding"]
.as_array().unwrap()
.iter().map(|v| v.as_f64().unwrap() as f32).collect())
}
}That's the full HTTP-embedder surface. semrouter doesn't ship one because every consumer wants different things from their HTTP client (retry, batching, observability); pick yours.
{
"input": "got the time",
"selected_route": "time",
"status": "accepted",
"confidence": { "top_score": 0.591, "second_score": 0.241, "margin": 0.350 },
"candidates": [
{ "route": "time", "score": 0.591, "matched_examples": ["r1", "r2"] },
{ "route": "weather", "score": 0.241, "matched_examples": ["r3"] }
]
}Status is one of: accepted, ambiguous, below_threshold, needs_review.
Each consumer keeps its own corpus + threshold floors and asserts quality in cargo test:
use semrouter::testing::EvalSuite;
#[test]
fn my_corpus_meets_quality_bar() {
EvalSuite::from_dir("tests/fixtures/voice-assistant")
.unwrap()
.assert_passes();
}tests/fixtures/voice-assistant/thresholds.toml:
min_accuracy = 0.85
min_top2_accuracy = 0.90
min_per_route_f1 = 0.50
max_p95_ms = 25.0
max_load_ms = 15000.0If your corpus regresses (accuracy drops, latency spikes), CI fails.
The cli feature (default-on) provides the semrouter binary:
semrouter --config router.toml --routes routes.jsonl --embedder fastembed route "what time is it"
semrouter --config router.toml --routes routes.jsonl --embedder fastembed eval --eval-file eval.jsonl
semrouter --config router.toml --routes routes.jsonl --embedder fastembed eval \
--eval-file eval.jsonl --thresholds thresholds.toml
# → exit 0 = passed, exit 1 = threshold breached, exit 2 = config/parse errorReal numbers from the bundled voice-assistant fixture (6 routes, 35 examples, 22 eval cases) using fastembed/AllMiniLML6V2 on a M-series Mac:
| Metric | Value |
|---|---|
| Accuracy | 90.9% |
| Top-2 accuracy | 100.0% |
| p50 latency | 1.24 ms |
| p95 latency | 2.82 ms |
| p99 latency | 3.74 ms |
| Cold-start (model load) | ~50-70 ms |
The 9.1% "incorrect" cases are correctly routed to the direct_llm fallback intent (where they belong). For the 5 first-class intents, F1 is 1.000 across the board.
semrouter is pre-1.0. The public API surface is unstable; minor version bumps may include breaking changes. Pin to a specific version (semrouter = "=0.1.1") for exact reproducibility.
v1.0.0 will freeze the API.
- v0.2.0: Configurable embedder. Pick any fastembed-supported model from
router.toml(fastembed/AllMiniLML6V2,fastembed/BGESmallENV15,fastembed/MiniLML12V2, etc.) with a tradeoff guide in docs. - v0.3.0: Closed-loop learning.
semrouter tag(interactive CLI to mark recent decisions correct/wrong) +semrouter promote(ingest tagged feedback as new routing examples + runEvalSuiteto gate regression). The router gets better the more you use it. - v1.0.0: API freeze + crates.io 1.0.
MIT. See LICENSE.
