# good-listener

Real-time audio transcription, screen capture, and AI-assisted conversation analysis.
## Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                     Go Platform (port 8000)                     │
│  ┌──────────────┐  ┌──────────────┐  ┌───────────────────────┐  │
│  │ Audio Capture│  │ Screen Grab  │  │ WebSocket/HTTP Server │  │
│  │   (malgo)    │  │ (screenshot) │  │   (coder/websocket)   │  │
│  └──────┬───────┘  └──────┬───────┘  └───────────┬───────────┘  │
│         │                 │                      │              │
│         ▼                 ▼                      │              │
│  ┌──────────────────────────────────────┐        │              │
│  │ Orchestrator (channels, backpressure)│◄───────┘              │
│  └──────────────────┬───────────────────┘                       │
└─────────────────────┼───────────────────────────────────────────┘
                      │ gRPC (port 50051)
                      ▼
┌─────────────────────────────────────────────────────────────────┐
│                    Python Inference Services                    │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
│  │ Whisper STT │  │ OCR Service │  │ LLM Service (streaming) │  │
│  │ + Silero VAD│  │  (RapidOCR) │  │     (Gemini/Ollama)     │  │
│  └─────────────┘  └─────────────┘  └─────────────────────────┘  │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │                  Memory Service (ChromaDB)                  │ │
│ └─────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
```
## Project Structure

```
good-listener/
├── proto/                 # Protobuf definitions
│   └── cognition.proto
├── inference/             # Python ML services
│   ├── app/
│   │   ├── core/          # Logging, utilities
│   │   ├── services/      # Transcription, VAD, OCR, LLM, Memory
│   │   ├── pb/            # Generated protobuf code
│   │   └── grpc_server.py # gRPC server entry point
│   ├── tests/
│   └── requirements.txt
├── platform/              # Go orchestration layer
│   ├── cmd/server/        # Main entry point
│   ├── internal/
│   │   ├── audio/         # Audio capture with backpressure
│   │   ├── screen/        # Screen capture
│   │   ├── orchestrator/  # Service coordination
│   │   ├── server/        # HTTP/WebSocket handlers
│   │   ├── grpcclient/    # gRPC client to Python
│   │   └── config/        # Configuration
│   ├── pkg/pb/            # Generated protobuf code
│   └── go.mod
├── frontend/              # Electron + React UI
└── Makefile
```
## Prerequisites

- Go 1.22+
- Python 3.11+
- Node.js 18+
- protoc (Protocol Buffers compiler)
## Quick Start

```bash
# Install all dependencies
make install

# Generate protobuf files (requires protoc)
make proto

# Start all services (inference + platform + frontend)
make dev

# Or start individually:
make inference  # Python gRPC server on :50051
make platform   # Go server on :8000
make frontend   # React dev server
```

## Configuration

Create a `.env` file in the project root:
```
# LLM Configuration
GOOGLE_API_KEY=your-api-key
LLM_PROVIDER=gemini
LLM_MODEL=gemini-2.0-flash

# Platform Configuration
HTTP_ADDR=:8000
INFERENCE_ADDR=localhost:50051
SAMPLE_RATE=16000
VAD_THRESHOLD=0.5
CAPTURE_SYSTEM_AUDIO=true
AUTO_ANSWER_ENABLED=true
```

## Testing

```bash
make test            # All tests
make inference-test  # Python tests only
make platform-test   # Go tests only
```

## Regenerating Protobuf Code

After modifying `proto/cognition.proto`:

```bash
make proto
```
## Design Decisions

**Go for orchestration.** Native goroutines and channels provide:
- Proper backpressure (bounded channels)
- Graceful cancellation (context)
- Efficient concurrency without the GIL
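The bounded-channel backpressure idea can be sketched as follows. A drop-oldest policy is assumed here for illustration; the actual `internal/audio` package may choose a different strategy.

```go
// Minimal sketch of bounded-channel backpressure with a drop-oldest policy.
package main

import "fmt"

// pushFrame offers a frame to a bounded channel without ever blocking the
// capture path: if the buffer is full, the oldest frame is evicted first.
func pushFrame(ch chan int, frame int) (dropped bool) {
	select {
	case ch <- frame:
		return false
	default:
		select {
		case <-ch: // evict oldest
		default:
		}
		ch <- frame
		return true
	}
}

func main() {
	frames := make(chan int, 4) // bounded buffer: capture never blocks
	drops := 0
	for i := 0; i < 20; i++ {
		if pushFrame(frames, i) {
			drops++
		}
	}
	close(frames)
	fmt.Println("dropped:", drops) // dropped: 16
	for f := range frames {
		fmt.Print(f, " ") // 16 17 18 19 (only the most recent survive)
	}
	fmt.Println()
}
```

In the real pipeline this pairs with `context.Context` so a slow or cancelled consumer never stalls the audio device callback.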
**Python for ML inference.** Keeps the ML ecosystem:
- PyTorch/faster-whisper for transcription
- LangChain for LLM abstraction
- ChromaDB for vector storage
**gRPC for communication:**
- Type-safe API contracts
- Streaming support for audio/LLM
- Language-agnostic
## License

MIT