Original Reddit post

Quick context: I use Claude Code and Codex daily, and I noticed I was spending half my “agent is working” time just sitting there watching the screen. I kept thinking: what if Claude or Codex could just talk back to me, the way Jarvis talks to Iron Man, so I don’t have to wade through the output soup? So I built Heard.

What it does: speaks your agent’s intermediate output - tool calls, status updates, the prose between actions. You can get up, make coffee, and still hear when it hits a failure or needs input.

Stack:

  • Python daemon, Unix socket, fire-and-forget hooks (never blocks the agent; sketched after the list)
  • ElevenLabs for cloud TTS, Kokoro for fully local (no key needed); backend selection also sketched below
  • Optional Claude Haiku 4.5 for in-character persona rewrites
  • Adapters for Claude Code + Codex; heard run wraps anything else
  • macOS app + CLI, Apache 2.0
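
The fire-and-forget hook in the first bullet is what keeps the daemon from ever blocking the agent: each hook fires one event at the daemon’s socket and returns immediately, whether or not anything is listening. Here is a minimal sketch of that pattern, assuming a JSON-per-datagram protocol and a socket at ~/.heard/heard.sock - both guesses at the wiring, not the repo’s actual protocol:

```python
import json
import os
import socket

# Hypothetical socket path; the real daemon may live elsewhere.
SOCKET_PATH = os.path.expanduser("~/.heard/heard.sock")

def emit(event_type: str, text: str) -> None:
    """Fire one event at the daemon and return immediately.

    If the daemon is down or its buffer is full, the event is dropped
    rather than ever stalling the agent that called the hook.
    """
    payload = json.dumps({"type": event_type, "text": text}).encode()
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
            sock.setblocking(False)  # never wait, even on a full buffer
            sock.sendto(payload, SOCKET_PATH)
    except OSError:
        pass  # no daemon listening: drop silently, the agent keeps going

# e.g. a hook narrating a tool call:
emit("tool_call", "Running the test suite")
```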
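
The no-key local path implies a simple backend dispatch: cloud voice when an ElevenLabs key is configured, Kokoro otherwise. A sketch of that selection logic with the actual synth calls stubbed out - the helper names and the ELEVENLABS_API_KEY env var are illustrative assumptions, not Heard’s real API:

```python
import os

def synthesize(text: str) -> bytes:
    """Dispatch to a TTS backend: cloud if a key is set, local otherwise."""
    if os.environ.get("ELEVENLABS_API_KEY"):
        return _elevenlabs_tts(text)  # cloud voice, needs an API key
    return _kokoro_tts(text)          # fully local Kokoro, no key needed

def _elevenlabs_tts(text: str) -> bytes:
    raise NotImplementedError("call the ElevenLabs SDK here")

def _kokoro_tts(text: str) -> bytes:
    raise NotImplementedError("run the local Kokoro model here")
```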

What I learned building it: the hard part wasn’t TTS, it was deciding what NOT to say. The first version narrated everything and was unbearable within 90 seconds. Now there are four verbosity profiles, plus a “swarm mode” for when two or more agents run concurrently - background agents only pierce through on failures, so you don’t get audio soup.
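
That swarm-mode rule - the foreground agent follows its verbosity profile, background agents only pierce on failures - reduces to a small filter over events. A sketch under an assumed event schema and placeholder profile names (Heard’s real ones may differ):

```python
from dataclasses import dataclass

# Verbosity profiles: which event types get spoken. These four names
# are placeholders, not necessarily the profiles Heard ships.
PROFILES = {
    "quiet":   {"failure", "needs_input"},
    "normal":  {"failure", "needs_input", "status"},
    "chatty":  {"failure", "needs_input", "status", "tool_call"},
    "verbose": {"failure", "needs_input", "status", "tool_call", "prose"},
}

@dataclass
class Event:
    agent: str  # which agent emitted this
    type: str   # e.g. "tool_call", "status", "failure"
    text: str

def should_speak(event: Event, foreground_agent: str, profile: str) -> bool:
    """Swarm mode: the foreground agent follows its verbosity profile;
    background agents only pierce through on failures."""
    if event.agent != foreground_agent:
        return event.type == "failure"
    return event.type in PROFILES[profile]
```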

Roadmap: Cursor + Aider adapters, then Linux/Windows.

Repo: https://github.com/heardlabs/heard
Voice samples: https://heard.dev/

Would love feedback on features that broke or on stuff you’d like to see! And if anyone else hates staring at the screen too lol

Originally posted by u/decentralizedbee on r/ArtificialInteligence