Rohas supports multiple LLM providers, allowing you to use different AI models based on your needs.
- OpenAI - GPT-4, GPT-3.5, and other OpenAI models
- Anthropic - Claude 3 Opus, Sonnet, Haiku
- Ollama - Local models via Ollama
- Local - Custom local model servers
To use OpenAI:

- Get an API key from OpenAI
- Use the `--provider openai` flag with your API key:

```bash
rohas run script.ro --provider openai --api-key YOUR_API_KEY
```

A prompt in your script can then set OpenAI options:

```
prompt "Hello, how are you?"
  model: "gpt-4"
  temperature: 0.7
  maxTokens: 100
```
Available models:

- `gpt-4` - Most capable model
- `gpt-4-turbo` - Faster GPT-4
- `gpt-3.5-turbo` - Fast and cost-effective
- `gpt-4o` - Latest GPT-4 variant
prompt "Explain quantum computing"
model: "gpt-4"
temperature: 0.7
maxTokens: 500
To use Anthropic:

- Get an API key from Anthropic
- Use the `--provider anthropic` flag with your API key:

```bash
rohas run script.ro --provider anthropic --api-key YOUR_API_KEY
```

In your script:

```
prompt "Hello, how are you?"
  model: "claude-3-opus-20240229"
  temperature: 0.7
  maxTokens: 100
```
Available models:

- `claude-3-opus-20240229` - Most capable
- `claude-3-sonnet-20240229` - Balanced performance
- `claude-3-haiku-20240307` - Fast and efficient
prompt "Write a short story"
model: "claude-3-opus-20240229"
temperature: 0.8
maxTokens: 1000
Ollama allows you to run models locally on your machine.
- Install Ollama
- Pull a model: `ollama pull llama2`
- Start the Ollama server (it usually runs on `http://localhost:11434`; see below if you need to start it manually)
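If the server isn't already running as a background service, it can be started with Ollama's own CLI:

```bash
# Start the Ollama server in the foreground (listens on port 11434 by default)
ollama serve
```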
Then point Rohas at it:

```bash
rohas run script.ro --provider ollama --base-url http://localhost:11434 --model llama2
```

```
prompt "Hello, how are you?"
  model: "llama2"
  temperature: 0.7
  maxTokens: 100
```
Any model available in Ollama:

- `llama2`
- `mistral`
- `codellama`
- `phi`
- And many more...
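Pull any model before referencing it in a script, and confirm it's available:

```bash
# Pull a model and confirm it appears in the local list
ollama pull mistral
ollama list
```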
prompt "Explain Rust programming"
model: "llama2"
temperature: 0.5
maxTokens: 300
The `local` provider is for custom model servers you run yourself (e.g., vLLM, text-generation-inference).
- Set up your local model server (see the vLLM sketch below for one way)
- Ensure it's compatible with the OpenAI API format (or use an adapter)
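As one example of step 1 (this is generic vLLM usage, not a Rohas requirement, and the model name is only a placeholder):

```bash
# Serve a model behind an OpenAI-compatible endpoint with vLLM
# (placeholder model name; run in a separate terminal)
vllm serve mistralai/Mistral-7B-Instruct-v0.2 --port 8000

# Verify the server is responding
curl http://localhost:8000/v1/models
```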
```bash
rohas run script.ro --provider local --base-url http://localhost:8000 --model your-model
```

```
prompt "Hello, how are you?"
  model: "your-model"
  temperature: 0.7
  maxTokens: 100
```
prompt "Generate code"
model: "local-model"
temperature: 0.3
maxTokens: 500
Command-line configuration for each provider:

```bash
# OpenAI
rohas run script.ro --provider openai --api-key KEY --model gpt-4

# Anthropic
rohas run script.ro --provider anthropic --api-key KEY --model claude-3-opus-20240229

# Ollama
rohas run script.ro --provider ollama --base-url http://localhost:11434 --model llama2

# Local
rohas run script.ro --provider local --base-url http://localhost:8000 --model custom-model
```

You can also set provider configuration via environment variables:
```bash
export ROHAS_PROVIDER=openai
export ROHAS_API_KEY=your_key
export ROHAS_MODEL=gpt-4

rohas run script.ro
```

If no provider is specified, Rohas will attempt to use environment variables or default to a local provider if available.
You can specify different models for different prompts:
prompt "Simple question"
model: "gpt-3.5-turbo" # Use cheaper model
prompt "Complex analysis"
model: "gpt-4" # Use more capable model
Set a default model via the command line:

```bash
rohas run script.ro --provider openai --api-key KEY --model gpt-4
```

All prompts will use `gpt-4` unless overridden:

```
prompt "Uses gpt-4 (default)"
  # model not specified, uses default

prompt "Uses gpt-3.5-turbo"
  model: "gpt-3.5-turbo"  # Override default
```
All providers support these prompt options:
`temperature` controls randomness (0.0 to 2.0):

```
prompt "Creative writing"
  temperature: 0.9  # More creative

prompt "Factual answer"
  temperature: 0.1  # More focused
```
`maxTokens` sets the maximum number of tokens in the response:

```
prompt "Short answer"
  maxTokens: 50

prompt "Long explanation"
  maxTokens: 1000
```
`stream` streams responses as they are generated (if the provider supports it):

```
prompt "Long response"
  stream: true
  maxTokens: 1000
```
Provider capabilities at a glance:

Hosted providers (OpenAI, Anthropic):

- Function calling / tool use
- Streaming
- Multiple model variants
- Long context windows
- Structured outputs
- System prompts

Ollama:

- Local execution
- No API costs
- Custom model support

Local:

- Full control
- Custom APIs
- Private deployment
- Choose the right model - Use cheaper models for simple tasks, powerful models for complex ones
- Set appropriate temperature - Lower for factual tasks, higher for creative tasks
- Limit max tokens - Set reasonable limits to control costs
- Use streaming - For long responses, use streaming for better UX
- Cache responses - Consider caching for repeated queries
- Handle rate limits - Implement retry logic for production use (see the sketch below)
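For example, a minimal shell-level retry sketch, assuming `rohas run` exits with a non-zero status when a request fails (an assumption, not documented here):

```bash
# Hypothetical retry wrapper; assumes a non-zero exit code on failure
for attempt in 1 2 3; do
  rohas run script.ro --provider openai --api-key "$ROHAS_API_KEY" && break
  echo "Attempt $attempt failed; retrying..." >&2
  sleep $((2 ** attempt))  # exponential backoff: 2s, 4s, 8s
done
```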
If authentication fails:

```bash
# Check API key is set
echo $ROHAS_API_KEY

# Or pass explicitly
rohas run script.ro --provider openai --api-key YOUR_KEY
```

If the provider is unreachable:

```bash
# Check base URL for local/Ollama
rohas run script.ro --provider ollama --base-url http://localhost:11434

# Test connection
curl http://localhost:11434/api/tags  # For Ollama
```

If a model is not found:

```bash
# List available models (Ollama)
ollama list

# Use correct model name
rohas run script.ro --provider ollama --model llama2
```

See the examples/ directory for provider-specific examples:
- `hello-world.ro` - Basic prompt
- `template-literal-example.ro` - Template literals with prompts
- `functions-and-await.ro` - Async prompts
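For instance, the basic example can be run against any of the providers above; here against a local Ollama model:

```bash
# Run the hello-world example with Ollama (flags as documented above)
rohas run examples/hello-world.ro --provider ollama --model llama2
```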
Costs vary by provider:

- OpenAI: Pay per token (varies by model)
- Anthropic: Pay per token (varies by model)
- Ollama: Free (runs locally)
- Local: Free (your infrastructure)
Choose based on your needs:
- Development/testing: Ollama or local models
- Production: OpenAI or Anthropic for reliability
- High volume: Consider local models or caching (see the sketch below)
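For repeated queries at high volume, a minimal file-based cache sketch, assuming `rohas run` writes the response to stdout (an assumption about its output behavior):

```bash
# Hypothetical cache keyed by the script's content hash;
# assumes `rohas run` prints the response to stdout.
key=$(sha256sum script.ro | cut -d' ' -f1)
cache="$HOME/.cache/rohas/$key"
mkdir -p "$(dirname "$cache")"
if [ -f "$cache" ]; then
  cat "$cache"   # reuse the cached response
else
  rohas run script.ro --provider ollama --model llama2 | tee "$cache"
fi
```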