Building production-grade tooling for LLM agents.
A suite of four composable libraries for building reliable LLM agent systems.
┌─────────────────────────────────────────────────────────────┐
│                  production-agent-toolkit                   │
│                                                             │
│  ┌───────────┐    ┌──────────────────┐    ┌─────────────┐   │
│  │ llm-router│───▶│react-guard-patt..│───▶│agent-memory │   │
│  │           │    │                  │    │             │   │
│  │  Route    │    │  Guard during    │    │  Remember   │   │
│  │  first    │    │  execution       │    │  always     │   │
│  └───────────┘    └──────────────────┘    └──────┬──────┘   │
│                                                  │          │
│                                          ┌───────▼───────┐  │
│                                          │  agent-evals  │  │
│                                          │               │  │
│                                          │   Evaluate    │  │
│                                          │   after       │  │
│                                          └───────────────┘  │
└─────────────────────────────────────────────────────────────┘
Route first. Guard during. Remember always. Evaluate after.
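
The four-stage flow can be sketched in a few lines of Python. This is only an illustration of the ordering (route, guard, remember, evaluate). Every name below (`route`, `guarded_loop`, `Memory`, `evaluate`, `run`) is a hypothetical stand-in and not the actual API of any of the four libraries, and the routing and scoring rules are deliberately toy-sized.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Optional

# NOTE: every name in this sketch is a hypothetical stand-in for illustration.
# None of it is the real API of llm-router, react-guard-patterns,
# agent-memory, or agent-evals.


@dataclass
class Memory:
    """Remember always: keep a log of (task, output) pairs."""
    episodes: list[tuple[str, str]] = field(default_factory=list)

    def remember(self, task: str, output: str) -> None:
        self.episodes.append((task, output))


def route(task: str) -> str:
    """Route first: pick a provider label using a toy keyword rule."""
    return "code-model" if "code" in task.lower() else "general-model"


def guarded_loop(step: Callable[[int], Optional[str]], max_steps: int = 5) -> str:
    """Guard during: stop once a step yields an answer or the step budget runs out."""
    for i in range(max_steps):
        answer = step(i)
        if answer is not None:
            return answer
    return "stopped: step budget exhausted"


def evaluate(output: str) -> dict[str, bool]:
    """Evaluate after: run toy checks on the final output."""
    return {
        "non_empty": bool(output.strip()),
        "completed": not output.startswith("stopped:"),
    }


def run(task: str, memory: Memory) -> dict[str, object]:
    provider = route(task)
    # Fake agent step that "finishes" on its third iteration (i == 2).
    output = guarded_loop(lambda i: f"[{provider}] done: {task}" if i == 2 else None)
    memory.remember(task, output)
    return {"provider": provider, "output": output, "scores": evaluate(output)}
```

Calling `run("Write code to parse CSV", Memory())` walks the task through all four stages in order. The stop condition lives in `guarded_loop` rather than in the agent step itself, so a step that never produces an answer still terminates once the step budget is spent.
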
| Library | What it does | Tests |
|---|---|---|
| agent-evals | 3D evaluation framework for LLM agent outputs | 45 |
| react-guard-patterns | Stop-condition guards for agentic loops | 14 |
| llm-router | Semantic task routing across LLM providers | 10 |
| agent-memory | Short-term + episodic memory for agents | 13 |
82 tests total across the toolkit.

production-agent-toolkit
@thedarshanjoshi on X