Original Reddit post

*Sample call audio at the bottom of this post.*

I had a seven-hour train ride and started out just wanting to mess around with PersonaPlex. Somewhere along the way, Claude Code and I built an entire production-grade AI phone agent that makes and receives real phone calls over Asterisk, talks like a human, records everything, and manages outbound campaigns, without me writing a single line of code by hand.

No frameworks. No magic SaaS. Just Claude, prompts, and a lot of "okay, now what if it did this?"

This thing is called VocAgent.

**What it actually does**

You give it:

- a phone number
- a prompt
- a voice

It dials out over a real PSTN line. From there:

- PersonaPlex handles the conversation in real time with a natural AI voice
- VocAgent records both sides (stereo), transcribes the call, and tracks the outcome
- Everything shows up in a web UI with call history, audio playback, and analytics

Inbound calls work too! Callers land on an IVR that lets them select which AI agent they want to talk to (different personas, prompts, or voices). Once selected, the call is handed off to PersonaPlex and handled end-to-end the same way as outbound.

**What PersonaPlex does vs. what VocAgent does**

PersonaPlex (open source) is the voice brain:

- takes audio in
- generates natural speech out
- streams responses in real time from a GPU

VocAgent is the glue that makes it usable in the real world:

- connects PersonaPlex to Asterisk
- manages calls, campaigns, retries, recordings
- adds safety rails so the AI doesn't say dumb things like "thanks for calling" on an outbound call
- wraps everything in a clean web UI

Think: LLM voice model meets actual phone infrastructure.
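Those safety rails mostly come down to telling the model what kind of call it is before the persona prompt is applied. The post doesn't show the actual code, so here is a minimal hypothetical sketch of that idea; the function and parameter names are my own, not VocAgent's:

```python
def build_prompt(base_prompt: str, direction: str, callee: str = "") -> str:
    """Prepend call-context so the voice model knows who called whom.

    Hypothetical sketch -- the real VocAgent guardrail logic is not public.
    """
    if direction == "outbound":
        prefix = (
            f"You are PLACING this call to {callee}. You initiated it, "
            "so never say 'thanks for calling'; open by introducing "
            "yourself and stating why you are calling."
        )
    else:
        prefix = (
            "You are ANSWERING an inbound call. Greet the caller "
            "and ask how you can help."
        )
    return prefix + "\n\n" + base_prompt
```

The point of the prefix is that the same persona prompt can be reused for both directions without the model defaulting to inbound-receptionist phrasing on an outbound dial.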
**The stack (Claude wrote all of this)**

- Total: ~4,200 lines
- Hand-written by me: 0

**Features that somehow kept getting added**

- Inbound + outbound AI phone calls
- 17 built-in PersonaPlex voices + custom voice cloning from samples
- Bulk campaign dialer (CSV upload, rate limits, retries, dispositions)
- Stereo call recording (caller left, AI right) + transcription
- Reusable call templates
- Prompt-prefix injection so the AI understands call context
- Token-bucket rate limiting and stale-call recovery
- Full web UI: calls, campaigns, voices, analytics, settings

At no point did I plan all of this. It just... happened.

**The audio pipeline (simplified)**

```
Caller -> Asterisk (8 kHz G.711) -> VocAgent (resample to 16 kHz)
       -> GPU bridge (resample to 24 kHz + Opus) -> PersonaPlex (WebSocket)
       <- same path back
```

Both directions stream simultaneously. The GPU bridge handles codec translation and captures both sides for clean stereo recordings.

```
+-----------+       +-------------+       +---------------+
| Asterisk  | <---> |  VocAgent   | <---> |  PersonaPlex  |
|  (PBX)    |  ARI  |  (Node.js)  |  TCP  |  (GPU voice)  |
+-----------+       +-------------+       +---------------+
                          |
                     HTTP :8089
                          |
                       Web UI
```

Two machines. Two systemd services.

**What Claude Code handled (all of it)**

- Asterisk ARI integration and call state machine
- RTP packet handling and real-time audio resampling
- Async Python GPU bridge with Opus encoding/decoding
- Campaign engine with retries and rate limits
- SQLite schema (8 tables), migrations, WAL mode
- Entire web UI (file uploads, audio playback, dashboards)
- Prompt engineering and behavioral guardrails

I described behavior. Claude wrote code. I tested on real calls. Gave feedback. Iterated. That's it.

**Deployment**

- Node.js service on the Asterisk box
- Python GPU bridge on the PersonaPlex server

Call with Benny
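To make the pipeline's first resampling hop concrete: going from telephony-rate audio (8 kHz, after the G.711 payload has been decoded to 16-bit PCM) up to 16 kHz is just doubling the sample count. A toy sketch using linear interpolation; a production bridge would use a proper windowed-sinc or polyphase resampler instead:

```python
def upsample_2x(pcm):
    """Double the sample rate of mono 16-bit PCM samples
    (e.g. 8 kHz -> 16 kHz) by linear interpolation.

    Toy illustration only: real resamplers filter to avoid
    imaging artifacts, and G.711 must be expanded from 8-bit
    companded bytes to linear PCM before this step.
    """
    if not pcm:
        return []
    out = []
    for a, b in zip(pcm, pcm[1:]):
        out.append(a)
        out.append((a + b) // 2)  # midpoint between neighbouring samples
    out.append(pcm[-1])
    out.append(pcm[-1])  # repeat the last sample to keep exactly 2x length
    return out
```

The return path (24 kHz Opus back down to 8 kHz G.711 for Asterisk) is the same idea in reverse, with decimation plus a low-pass filter.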
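The token-bucket rate limiting listed in the features is a standard technique for pacing a campaign dialer: dials spend tokens, and tokens refill at a fixed rate, allowing short bursts up to the bucket's capacity. A minimal sketch (the class and parameter names are assumptions, not VocAgent's actual API):

```python
import time


class TokenBucket:
    """Token-bucket limiter, as one might pace outbound campaign dials.

    Hypothetical sketch -- not VocAgent's actual implementation.
    """

    def __init__(self, rate, capacity):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def try_acquire(self, cost=1.0):
        """Return True and deduct tokens if a dial is allowed right now."""
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A campaign loop would call `try_acquire()` before each dial and requeue the lead when it returns False, which also makes retries naturally respect the same limit.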

Originally posted by u/LaysWellWithOthers on r/ClaudeCode