Original Reddit post

Hey everyone, I’ve been working on a project called POMP, and I’ve finally reached a stage where I need some “in the wild” feedback. A first simple demo video: https://www.youtube.com/watch?v=WHHVK-p24pY

The core idea is an Ambient Agentic System. Unlike a standard chatbot, POMP is designed to stay in the background 24/7. It’s primarily voice-controlled (it has “ears” via mic and “eyes” via camera), but what makes it unique is how it handles tasks that require a screen.

The “Program that doesn’t exist” concept: When the agent needs to show you something (like a dashboard, a specific Gmail thread, or a WhatsApp summary), it doesn’t just send text. It generates a custom HTML interface on the fly, an ephemeral GUI created specifically for that moment’s context.

Current Capabilities (MCP Architecture): I’m leveraging the Model Context Protocol (MCP) to give it real-world agency. Currently, it can:

- WhatsApp: send and summarize messages.
- Gmail: interact with your inbox.
- Chrome DevTools: connect to and interact with your browser.
- Weather/Tools: standard API integrations via MCP.

The Tech Stack:

- Node.js backend
- Voice-to-action pipeline
- Generative HTML/UI rendering
- Model Context Protocol (MCP) servers for tool use

Fair Warning: This is an early alpha. It’s buggy, the latency needs work, and I’m still refining the agentic loops. I’m looking for feedback from people interested in ambient computing and generative UI. I’ve put the code on GitHub because I want to see which other MCP servers the community thinks would be game-changers for an always-on agent.

GitHub / Demo: https://github.com/mrqc/pomp

Would love to hear your thoughts on the “headless” approach. Is voice-first + generative UI the right direction for the next generation of OS-level agents? I enjoy working on it because it scratches my itch for the kinds of interactions I’ve seen in Star Trek, Star Wars, Minority Report, and others.
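To make the “program that doesn’t exist” idea concrete, here is a hypothetical sketch, not code from the POMP repo: the function and field names (`renderEphemeralPanel`, `context.items`, etc.) are made up for illustration. It shows the general shape of turning structured tool output (here, a mock WhatsApp summary) into a one-off HTML panel that exists only for that moment’s context. In a real system the LLM would generate the markup itself rather than fill a fixed template.

```javascript
// Hypothetical sketch of ephemeral GUI generation (NOT from the POMP repo).
// Plain Node.js, no dependencies.

// Escape untrusted tool output before embedding it in HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build a throwaway HTML panel from a generic "context" object.
// A real agent would have the LLM generate richer markup directly;
// this template only illustrates the input/output shape.
function renderEphemeralPanel(context) {
  const items = context.items
    .map((item) => `<li><b>${escapeHtml(item.from)}</b>: ${escapeHtml(item.text)}</li>`)
    .join("\n      ");
  return `<!doctype html>
<html>
  <body>
    <h1>${escapeHtml(context.title)}</h1>
    <ul>
      ${items}
    </ul>
  </body>
</html>`;
}

// Example: an ephemeral "WhatsApp summary" panel from mock tool output.
const html = renderEphemeralPanel({
  title: "WhatsApp summary",
  items: [
    { from: "Alice", text: "Running 10 min late" },
    { from: "Bob", text: "Meeting moved to <tomorrow>" },
  ],
});
console.log(html);
```

The agent would hand this string to whatever surface is available (a browser tab, a webview, a kiosk display) and discard it once the interaction ends.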

Originally posted by u/erhard-dinhobl on r/ArtificialInteligence