One eval tool instead of twenty. Sandboxed JS syntax your LLM already knows.
- LLMs compute instead of reasoning — "how many R's in strawberry?" becomes
.split("").filter(c => c == "r").length - ~95% less tool context — one
evaltool replaces dozens of MCP tool schemas; the LLM prompt stays constant as you add capabilities - Polyglot composition — chain tools from Python, Go, TypeScript MCP servers with pipes in a single
evalcall - KV cache friendly — system prompt doesn't change when tools change;
help()discovers capabilities at runtime - No double-encoding corruption — raw template strings (
r\...`) let LLMs write backslashes, quotes, and code exactly as-is. No more\\` spirals when editing files with Windows paths or regex patterns
Total stdlib context cost: 6.3KB / 1770 tokens (full) · 2.8KB / 769 tokens (compact) — cl100k_base
Build and run as an MCP server:
git clone https://github.com/IodeSystems/tshell.git && cd tshell
./gradlew :tshell-cli:installDistAdd to your MCP client (Claude Desktop, Cursor, etc.):
{
"mcpServers": {
"tshell": {
"command": "./tshell-cli/build/install/tshell-cli/bin/tshell-cli"
}
}
}Your LLM now has eval and help. It writes JS-syntax code; tshell executes it.
// LLM can compute instead of reasoning:
"strawberry".split("").filter(c => c == "r").length // → 3
// Chain operations in a single tool call:
let data = [{name: "Alice", score: 85}, {name: "Bob", score: 92}]
data |> filter(d => d.score > 80) |> map(d => d.name) |> join(", ") // → "Alice, Bob"Connect external MCP servers — their tools become tshell commands, composed with pipes.
Use --mcp with the standard JSON config format, or --connect for shorthand:
# Standard MCP JSON config — same format as Claude Desktop, Cursor, etc.
tshell-cli --mcp '{"mcpServers": {"browser": {"command": "npx", "args": ["@anthropic/mcp-playwright"]}}}'
# Or point to a config file
tshell-cli --mcp mcp-config.json
# Shorthand for simple cases
tshell-cli --connect "app=python my_tools.py" --connect "data=npx data-server"LLM
│
▼
tshell-cli (single MCP server)
├── eval("app.users() |> filter(u => u.active) |> sort(\"name\")")
│ ├── tshell core (pipes, filter, sort, strings, math)
│ ├── app.* → Python MCP server (your app code)
│ └── data.* → Go MCP server (your data pipeline)
└── help() → all commands from all servers
Context reduction: each MCP tool inflates the LLM's prompt with schema and
descriptions. A Playwright MCP server adds ~8KB of tool definitions. tshell
replaces all of it with one eval tool — commands are discovered via help()
at runtime, not baked into the prompt. The KV cache survives tool set changes.
See tshell-cli/README.md for CLI flags, Claude Desktop config, and JDBC driver setup.
LLMs are bad at computation. They can't reliably count letters, do arithmetic on large numbers, or sort data by reasoning alone. But they're excellent at writing code — especially JavaScript, which dominates their training data.
tshell gives your LLM a single eval tool with JS syntax. Instead of defining dozens of
tools (each requiring schema, validation, and prompt engineering), you get one tool that
covers computation, string processing, data transformation, and more. Capabilities are
discovered at runtime via help(), so the system prompt stays small and the KV cache
survives tool set changes.
LLMs famously fail at tasks like "how many R's in strawberry?" because they try to reason about it instead of computing it. With tshell, they write code instead.
Count letters — LLMs get this wrong by reasoning, right by computing:
"strawberry".split("").filter(c => c == "r").length
// → 3Word frequency — multi-step analysis in one expression:
let words = "the quick brown fox jumps over the lazy dog the fox"
let counts = words |> split(" ") |> countBy(w => w)
entries(counts) |> filter(e => e[1] > 1) |> map(e => `${e[0]}=${e[1]}`) |> join(", ")
// → the=3, fox=2Without tshell, an LLM needs separate tools for search, math, string processing, data
transformation — each requiring a round-trip. With tshell, complex multi-step operations
happen in a single eval call.
Filter, aggregate, and format — three tool calls collapsed into one:
let data = [{name: "Alice", score: 85}, {name: "Bob", score: 92}, {name: "Carol", score: 78}]
let avg = data |> map(d => d.score) |> reduce((sum, s) => sum + s, 0) |> (total => total / (data |> len()))
let above = data |> filter(d => d.score >= avg)
above |> map(d => `${d.name}: ${d.score}`) |> join(", ")
// → Alice: 85, Bob: 92Parse and restructure — no special date-parsing tool needed:
"2024-03-14T10:30:00" |> split("T") |> (parts => {
let [date, time] = parts
let [y, m, d] = date |> split("-")
{year: num(y), month: num(m), day: num(d), time: time}
})
// → {year: 2024, month: 3, day: 14, time: "10:30:00"}tshell is a JS subset — let, function, if/else, for/while, arrow functions,
destructuring, spread, template strings, ?., ??, ternary all work as expected.
These are the additions and differences.
The pipe operator |> passes the left-hand value as the first argument to the right-hand
function. This replaces method chaining and is the primary way to compose operations.
[3, 1, 2] |> sort
// → [1, 2, 3][1, 2, 3] |> map(x => x * 10)
// → [10, 20, 30]Chained pipes:
"hello world" |> split(" ") |> map(w => w |> upper()) |> join(" ")
// → HELLO WORLDPipe into arrow function:
5 |> (x => x * x)
// → 25|* normalizes the left side to an array, then maps the right-side function over each
element in parallel.
[1, 2, 3] |* (x => x * 10)
// → [10, 20, 30]null becomes empty array:
null |* (x => x * 2)
// → []non-array wrapped then mapped:
5 |* (x => x * 2)
// → [10]Standard operations are pipe-friendly commands instead of methods. The LLM discovers
them via help(). Common commands:
[1, 2, 3] |> filter(x => x > 1) |> reduce((sum, x) => sum + x, 0)
// → 5Comparator sort:
[3, 1, 2] |> sort((a, b) => b - a)
// → [3, 2, 1]Descending sort:
[3, 1, 2] |> sort("desc")
// → [3, 2, 1]{a: 1, b: 2} |> entries()
// → [["a", 1], ["b", 2]]range(1, 5)
// → [1, 2, 3, 4]help() returns all available commands:
typeof(help())
// → stringCommands also work as methods. receiver.command(args) is sugar for command(receiver, args).
JS method names are aliased automatically — toUpperCase() maps to upper(),
toLowerCase() to lower(), includes() to contains(), etc. Mutating array methods
(push, pop, shift, unshift, splice) modify the array in-place. pop() and
shift() return the removed element (JS semantics).
JS-style method chaining:
"hello world".split(" ").map(w => w.toUpperCase()).join(" ")
// → HELLO WORLDMutating methods modify the array:
let arr = [3, 1, 2];
arr.push(4);
arr.sort()
// → [1, 2, 3, 4]pop returns the removed element:
let arr = [10, 20, 30];
let removed = arr.pop();
{removed: removed, arr: arr}
// → {removed: 30, arr: [10, 20]}Method chaining for anagram check:
"listen".split("").sort().join("") == "silent".split("").sort().join("")
// → trueState is discarded after each eval call unless explicitly exported. Use export to
persist values across tool calls in multi-turn LLM conversations.
Without export, x is gone after this eval:
let x = 42
// → 42chain sequences functions. all runs in parallel and collects results.
race returns the first success.
chain(() => [1, 2, 3], arr => arr |> map(x => x * 2), arr => arr |> join(", "))
// → 2, 4, 6all(() => 1 + 2, () => 3 * 4)
// → [3, 12]race(() => fail("down"), () => "fallback")
// → fallbackWhat about double-encoding corruption? LLM writes \\n, transport eats the backslash,
interpreter gets a newline. Or the LLM spirals through \\\\, \\\\\\\\, never landing on the
right number of backslashes. Raw template strings fix this: prefix any template literal
with r to disable backslash escape processing. What you type is what you get.
${expr} interpolation still works — Kotlin-style: raw content, live substitutions.
Backslashes are literal:
r`C:\Users\foo`
// → C:\Users\fooNo escape processing — \n stays as two characters:
r`line1\nline2`
// → line1\nline2Interpolation still works:
let name = "world"; r`Hello ${name}, path=C:\Users`
// → Hello world, path=C:\UsersRegular template for comparison — \t becomes a tab:
`tab\there`
// → tab hereMultiline template strings:
`line1
line2
line3`
// → line1
line2
line3Raw interpolation with literal backslash — \n stays as two characters:
let x = 42; r`result: ${x}\nend`
// → result: 42\nendNo class, this, new, async/await, import/require, prototypes,
generators, or Symbol. Type annotations are accepted but ignored.
Use help() to discover what's available.
Published to Maven Central as com.iodesystems.tshell:tshell.
// build.gradle.kts
dependencies {
implementation("com.iodesystems.tshell:tshell:0.1.0")
}val shell = TShell()
CoreToolkit.install(shell)
val result = shell.eval("[1, 2, 3] |> map(x => x * 2)")
println(result.toDisplayString()) // "[2, 4, 6]"Register Kotlin functions as tshell commands. They appear in help()
automatically — the LLM discovers them without prompt changes.
shell.register(
"greet", "name: string",
"returns a greeting for the given name",
listOf("greet(\"world\")")
) { args ->
val name = (args[0] as TShellValue.TString).value
TShellValue.TString("hello $name")
}val shell = TShell(
maxSteps = 1_000_000, // max AST node evaluations
maxCallDepth = 256, // max recursion depth
timeoutMs = 30_000 // wall-clock timeout
)| Module | Artifact | Provides |
|---|---|---|
| Core | tshell |
Language runtime, CoreToolkit (pipes, arrays, strings, math, JSON, composition), MathToolkit, WebToolkit, FileToolkit |
| Graph | tshell-graph |
Graph database toolkit: nodes, edges, traversal, schema validation. See tshell-graph/README.md |
| MCP | tshell-mcp |
MCP server + client library. See tshell-mcp/README.md |
| CLI | tshell-cli |
Standalone MCP server binary with toolkits, external server composition, JDBC drivers. See tshell-cli/README.md |
| Browser | tshell-playwright |
Lean Playwright automation (12 commands, ~800 chars context vs ~8KB for @playwright/mcp). See tshell-playwright/README.md |
| SQL | tshell-sql |
JDBC toolkit: db.query, db.tables, db.schema. Read-only by default |
| Sample | tshell-sample-koog |
CLI chat agent + benchmarks (33 challenges). See tshell-sample-koog/README.md |
How much prompt space does tshell cost? Measured with cl100k_base (GPT-4 tokenizer), computed at build time from actual prompt content.
| Component | Full | Compact | What it contains |
|---|---|---|---|
| Tool description | 112 | 112 | One-line summary in the tool schema |
| Syntax reference | 447 | 447 | JS subset, pipes, raw templates, composition |
| Command signatures | 1183 | 182 | All stdlib commands (compact: names only, use help()) |
| Total | 1770 | 769 |
Compact mode lists command names only — the LLM calls help("name") for signatures and help("namespace") for namespace overviews.
For comparison: a typical MCP tool server exposes 10-20 tools at ~200 tokens each (2000-4000 tokens).
tshell replaces them all with one eval tool. Additional commands added via toolkits
(FileToolkit, SqlToolkit, etc.) grow only the command signatures section.
See benchmark results for how this prompt performs across 33 challenges.
This README is generated from
TShellLiterateTest.kt.
Every code example above is executed as part of the test — if an example
is wrong, the build fails.