Call your OpenClaw over the phone using the Deepgram Voice Agent API.
| ElevenLabs | Deepgram | |
|---|---|---|
| Turn detection | VAD-based | Semantic (Flux) |
| TTS latency | ~200ms TTFB | 90ms TTFB |
| TTS price | $0.050/1K chars | $0.030/1K chars |
| Barge-in | Basic VAD | Native StartOfTurn |
Deepgram Flux understands when you're done talking semantically and acoustically—not just when you stop making noise. This means fewer awkward interruptions and faster responses.
| Feature | Twilio | Telnyx |
|---|---|---|
| Setup Complexity | Moderate | Easy |
| Phone Number Cost | ~$1/month | ~$0.50-$2/month |
| Call Pricing | $0.085/min | $0.005-$0.025/min |
| Media Streaming | WebSocket + TwiML | WebSocket + REST API |
| Authentication | Account SID + Auth Token | API Key + Public Key |
| Documentation | Extensive | Growing |
| Global Coverage | Excellent | Excellent |
Recommendation:
- Twilio: Better for production apps with extensive docs and ecosystem
- Telnyx: More cost-effective, simpler API, better for experimenting
deepclaw uses the Deepgram Voice Agent API—a single WebSocket that handles STT, TTS, turn-taking, and barge-in together.
Phone Call → Twilio/Telnyx → deepclaw ←──WebSocket──→ Deepgram Voice Agent API
│ (Flux STT + Aura-2 TTS)
│
↓
OpenClaw (LLM)
- You call your phone number
- Twilio/Telnyx streams audio to deepclaw via WebSocket
- deepclaw forwards audio to Deepgram Voice Agent API
- Flux transcribes with semantic turn detection
- Deepgram calls your LLM endpoint (OpenClaw via deepclaw proxy)
- Aura-2 speaks the response, streamed back through your phone provider
Barge-in support: Start talking while the assistant is speaking and it stops immediately—handled natively by the Voice Agent API.
The easiest way to set up deepclaw is to let your OpenClaw do it for you:
# Copy the skill to your OpenClaw
cp -r skills/deepclaw-voice ~/.openclaw/skills/Then tell your OpenClaw: "I want to call you on the phone"
OpenClaw will walk you through:
- Creating a Deepgram account (free $200 credit)
- Setting up a Twilio phone number (~$1/month)
- Configuring everything automatically
- Python 3.10+
- Deepgram account (free tier available, $200 credit)
- Phone Provider (choose one):
- Twilio account with a phone number (~$1/month)
- Telnyx account with a phone number (~$0.50-$2/month)
- OpenClaw running locally
- ngrok for exposing your local server
git clone https://github.com/deepgram/deepclaw.git
cd deepclaw
pip install -e .cp .env.example .envEdit .env with your credentials:
For Twilio (default):
DEEPGRAM_API_KEY=your_deepgram_api_key
VOICE_PROVIDER=twilio
TWILIO_ACCOUNT_SID=your_twilio_account_sid
TWILIO_AUTH_TOKEN=your_twilio_auth_token
OPENCLAW_GATEWAY_URL=http://127.0.0.1:18789
OPENCLAW_GATEWAY_TOKEN=your_openclaw_gateway_tokenFor Telnyx:
DEEPGRAM_API_KEY=your_deepgram_api_key
VOICE_PROVIDER=telnyx
TELNYX_API_KEY=your_telnyx_api_key
TELNYX_PUBLIC_KEY=your_telnyx_public_key
OPENCLAW_GATEWAY_URL=http://127.0.0.1:18789
OPENCLAW_GATEWAY_TOKEN=your_openclaw_gateway_tokenEnable the chat completions endpoint in your openclaw.json:
openclaw config set gateway.http.endpoints.chatCompletions.enabled truedeepclaw automatically creates a dedicated voice agent in OpenClaw that uses a fast model (Claude Haiku 4.5) for low-latency phone conversations. This happens on first python -m deepclaw launch. To customize the model:
# Use a different model for voice (optional)
export OPENCLAW_VOICE_MODEL=anthropic/claude-sonnet-4-5-20250929Or create the agent manually:
openclaw agents add voice --model anthropic/claude-haiku-4-5-20251001ngrok http 8000Note your ngrok URL (e.g., https://abc123.ngrok-free.app).
- Go to your Twilio Console
- Navigate to Phone Numbers → Manage → Active Numbers
- Click your number
- Under "Voice Configuration":
- Set "A Call Comes In" to Webhook
- URL:
https://your-ngrok-url.ngrok-free.app/twilio/incoming - Method: POST
- Save
- Go to your Telnyx Mission Control Portal
- Navigate to Voice → Programmable Voice
- Create a new Voice API Application:
- Application Name:
deepclaw-voice - Webhook URL:
https://your-ngrok-url.ngrok-free.app/telnyx/webhook - Webhook API Version:
API v2(recommended) - Webhook Failover URL: (optional) same as webhook URL
- Application Name:
- Click Create
- Go to Numbers → My Numbers
- Click your phone number
- Under Voice Settings:
- Connection: Select your
deepclaw-voiceapplication
- Connection: Select your
- Save configuration
- In Mission Control Portal, go to API Keys
- Create a new API Key or copy existing one
- For Public Key: Go to Account → Public Key and copy the key
python -m deepclawPick up the phone and talk to your OpenClaw!
┌─────────────┐ ┌──────────────────────────────────────────────────────┐
│ Caller │ │ Your Machine │
│ (Phone) │ │ │
└──────┬──────┘ │ ┌───────────┐ ┌───────────┐ ┌───────────────┐ │
│ │ │ Twilio or │ │ deepclaw │ │ OpenClaw │ │
│ PSTN │ │ Telnyx │──▶│ Server │──▶│ Gateway │ │
│ │ │ Webhook │ └─────┬─────┘ └───────────────┘ │
▼ │ └───────────┘ │ │
┌──────────────┐ │ │ WebSocket │
│ Twilio/ │◀───┼────────────────────────┤ │
│ Telnyx │ │ ▼ │
│ (SIP/Media) │ │ ┌───────────────────┐ │
└──────────────┘ │ │ Deepgram Voice │ │
│ │ │ Agent API │ │
│ Audio │ │ • Flux (STT) │ │
└────────────┼─────────────▶│ • Aura-2 (TTS) │ │
│ │ • Turn detection │ │
│ │ • Barge-in │ │
│ └───────────────────┘ │
└──────────────────────────────────────────────────────┘
The Voice Agent API handles the entire speech pipeline in a single WebSocket connection. deepclaw bridges Twilio's media stream to the Voice Agent API and proxies LLM requests to your local OpenClaw.
deepclaw uses Deepgram Aura-2 TTS with 80+ voices in 7 languages. Edit voice_agent_server.py:
"speak": {
"provider": {
"type": "deepgram",
"model": "aura-2-orion-en", # Change voice here
},
},Popular voices:
| Voice | Style |
|---|---|
aura-2-thalia-en |
Feminine, American (default) |
aura-2-orion-en |
Masculine, American |
aura-2-draco-en |
Masculine, British |
aura-2-estrella-es |
Feminine, Mexican Spanish |
aura-2-fabian-de |
Masculine, German |
See skills/deepclaw-voice/SKILL.md for the complete voice list (80+ voices in 7 languages), or test voices at https://playground.deepgram.com/
Be aware of these security considerations when using OpenClaw and deepclaw. Like the rest of OpenClaw, use at your own risk.
1. LLM proxy endpoint has no authentication
- The
/v1/chat/completionsendpoint is unauthenticated - Anyone who discovers your ngrok URL can use your OpenClaw/Anthropic API credits
- Mitigation: Keep your ngrok URL private. Consider using a fixed ngrok domain.
2. No Twilio signature validation
- Incoming webhook requests are not verified as coming from Twilio
- Mitigation: For production, add Twilio request validation
3. Credentials in .env file
- API keys and tokens are stored in plaintext
- Mitigation: The file is gitignored. Set restrictive permissions:
chmod 600 .env
4. ngrok exposes your local machine
- Your server is accessible from the internet while running
- Mitigation: Only run when needed. Use ngrok's IP allowlist on paid plans.
For production deployments, consider:
- Adding Twilio signature validation
- Running behind a reverse proxy with rate limiting
- Using a dedicated server instead of ngrok
- Implementing proper authentication on the LLM proxy
OpenClaw streaming latency: OpenClaw's /v1/chat/completions endpoint currently buffers responses for ~5 seconds before streaming begins. This adds latency to LLM responses regardless of which voice provider you use (Deepgram, ElevenLabs, etc.).
The initial greeting is instant (generated by Deepgram), but subsequent responses wait for OpenClaw's buffer.
This is an upstream limitation. OpenClaw's native WebSocket agent endpoint streams properly, but external voice APIs require the OpenAI-compatible chat completions endpoint.
- Local wake-word mode — Talk to OpenClaw hands-free at your desk, no phone needed
- One-click desktop installer — No terminal required
- Native OpenClaw plugin — Install with one command
If you run into issues:
- Check existing issues: Search GitHub Issues to see if your problem has been reported
- Open a new issue: Include:
- What you were trying to do
- What happened instead
- Server logs (redact any API keys)
- Your environment (OS, Python version, OpenClaw version)
- Deepgram support: For Deepgram-specific issues, visit Deepgram's Community
Contributions are welcome! Here's how to help:
Open an issue with:
- Clear description of the bug
- Steps to reproduce
- Expected vs actual behavior
- Logs and environment info
Open an issue describing:
- The problem you're trying to solve
- Your proposed solution
- Any alternatives you've considered
- Fork the repo
- Create a feature branch (
git checkout -b feature/amazing-feature) - Make your changes
- Test thoroughly
- Commit with clear messages
- Push to your fork
- Open a PR against
main
Note: The main branch is protected. All changes require a pull request and review.
- Follow existing code patterns
- Add comments for complex logic
- Update documentation for user-facing changes
MIT
Built with:
- Deepgram Voice Agent API — Real-time conversational AI pipeline
- Deepgram Flux — Semantic speech recognition
- Deepgram Aura-2 — Low-latency text-to-speech
- OpenClaw — Open-source AI assistant
- Twilio — Phone infrastructure