CheapestInference Examples

Code examples for CheapestInference — a flat-rate LLM inference API. No per-token billing.

What is CheapestInference?

One API for every open-source model (DeepSeek, Qwen, Llama, Kimi, Gemma). Fixed monthly price. Uses the standard OpenAI and Anthropic SDKs — just change the base URL. Pay with card or USDC on Base L2.

See current plans and pricing.

Quick start

Python

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.cheapestinference.com/v1"
)

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

Node.js

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://api.cheapestinference.com/v1",
});

const response = await client.chat.completions.create({
  model: "deepseek-chat",
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.choices[0].message.content);

cURL

curl https://api.cheapestinference.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "deepseek-chat", "messages": [{"role": "user", "content": "Hello!"}]}'

Examples

Example	Description
python/chat.py	Basic chat completion
python/streaming.py	Streaming responses with SSE
python/anthropic_sdk.py	Using the Anthropic SDK
python/tool_calling.py	Function/tool calling
node/chat.mjs	Basic chat completion (Node.js)
node/streaming.mjs	Streaming responses (Node.js)
agent-x402/autonomous_agent.py	Agent that self-subscribes via x402 + USDC
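
For orientation, an OpenAI-compatible streaming response normally arrives as server-sent events: lines of the form `data: {json}`, ending with a literal `data: [DONE]`. The sketch below is not taken from python/streaming.py. It assumes CheapestInference follows that OpenAI wire format, and the function name `iter_stream_text` is illustrative only.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text deltas from OpenAI-style SSE lines (assumed wire format)."""
    for raw in lines:
        line = raw.strip()
        # Skip blank separator lines and anything that is not a data field.
        if not line.startswith("data:"):
            continue
        data = line[len("data:"):].strip()
        # The OpenAI streaming format ends with a literal [DONE] sentinel.
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            # Some chunks carry only a role or a finish reason, no content.
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
```

In practice the OpenAI SDK does this parsing for you when `stream=True` is passed to `client.chat.completions.create`; the sketch only shows what happens underneath.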

x402: Agents that pay for themselves

CheapestInference supports the x402 protocol. AI agents can discover the API, subscribe with USDC on Base, and start making requests — no human setup:

Agent → POST /v1/chat/completions (no key)
     ← 402 Payment Required + product catalog
Agent → POST /api/billing/checkout (selects plan, pays USDC)
     ← { apiKey: "sk-..." }
Agent → POST /v1/chat/completions (with key)
     ← 200 OK

See the full x402 example in agent-x402/autonomous_agent.py.
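
The three steps above can be sketched as a small control loop. This is a minimal sketch under stated assumptions, not the code from the repository: the names `subscribe_and_chat`, `send`, and `choose_plan` are hypothetical, the shape of the product catalog and of the checkout body (including how the USDC payment is attached) is not specified here and is left to the caller, and chat requests are sent as POST to match the Supported endpoints table. Only the paths, the 402 status, and the `apiKey` field come from the flow described above.

```python
from typing import Any, Callable, Dict, Optional, Tuple

# A transport takes (method, path, headers, body) and returns
# (status code, parsed JSON). Supplying it keeps this sketch offline.
Transport = Callable[[str, str, Dict[str, str], Optional[dict]], Tuple[int, Any]]

CHAT_PATH = "/v1/chat/completions"
CHECKOUT_PATH = "/api/billing/checkout"


def subscribe_and_chat(send: Transport, chat_body: dict,
                       choose_plan: Callable[[Any], dict]) -> Any:
    """Walk the flow: 402 -> checkout -> authenticated request."""
    # Step 1: call without a key; the service is expected to answer 402
    # and include its product catalog in the response body.
    status, catalog = send("POST", CHAT_PATH, {}, chat_body)
    if status != 402:
        raise RuntimeError(f"expected 402 Payment Required, got {status}")

    # Step 2: check out. choose_plan builds the checkout body because
    # its real schema (plan id, payment proof) is an assumption here.
    status, result = send("POST", CHECKOUT_PATH, {}, choose_plan(catalog))
    if status >= 400:
        raise RuntimeError(f"checkout failed with status {status}")
    api_key = result["apiKey"]

    # Step 3: repeat the original request with the newly issued key.
    headers = {"Authorization": f"Bearer {api_key}"}
    status, response = send("POST", CHAT_PATH, headers, chat_body)
    if status != 200:
        raise RuntimeError(f"chat request failed with status {status}")
    return response
```

Passing the transport in as a function keeps the payment and HTTP details swappable, so the same loop can be exercised against a fake server before any USDC is spent.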

Supported endpoints

Endpoint	Description
`POST /v1/chat/completions`	OpenAI-compatible chat
`POST /anthropic/v1/messages`	Anthropic-compatible messages
`POST /v1/embeddings`	Text embeddings
`GET /v1/models`	List available models
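
The two endpoints not covered in the quick start can be called the same way. The sketch below uses only the Python standard library and builds the requests without sending them. It assumes the same Bearer authentication as the cURL example and an OpenAI-style embeddings body (`model` plus `input`); `build_request` is an illustrative helper and `YOUR_EMBEDDING_MODEL` is a placeholder, not a real model name.

```python
import json
import urllib.request
from typing import Optional

API_ROOT = "https://api.cheapestinference.com"


def build_request(method: str, path: str, api_key: str,
                  payload: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) an authenticated request for one endpoint."""
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if payload is not None:
        headers["Content-Type"] = "application/json"
        data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(API_ROOT + path, data=data,
                                  headers=headers, method=method)


# GET /v1/models: list the models available to this key.
list_models = build_request("GET", "/v1/models", "YOUR_API_KEY")

# POST /v1/embeddings: the body shape is assumed to match OpenAI's.
embed = build_request(
    "POST", "/v1/embeddings", "YOUR_API_KEY",
    # Placeholder model name; pick a real one from GET /v1/models.
    {"model": "YOUR_EMBEDDING_MODEL", "input": "Hello!"},
)
```

To send either request, pass it to `urllib.request.urlopen` and decode the response body with `json.load`.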

Links

Dashboard — Sign up, manage keys, view usage
Docs — Full API reference
Pricing — All models with per-token costs and request estimates
