Original Reddit post

Hi everyone, I wanted to share a project that started as an “impossible” experiment and turned into a bit of an obsession over the last few months.

The Problem: I’ve always been uneasy about the fact that every time I need to transcribe an important meeting or translate a sensitive conversation, my data has to travel across the world, sit on a Big Tech server, and stay there indefinitely. I wanted the power of AI, but with the privacy of a locked paper diary.

The Challenge (The “RAM Struggle”): Most people told me: “You can’t run a reliable Speech-to-Text (STT) model AND an LLM for real-time summaries on a phone without it melting.” And honestly, they were almost right. Calibrating the CPU and RAM usage to prevent the app from crashing while multitasking was a nightmare. I spent countless nights optimizing model weights and fine-tuning memory management to ensure the device could handle the load without a 5-second latency.

The Result: After endless testing and optimization, I finally got it working. I’ve built an app that:

- Transcribes in real-time with accuracy I’m actually proud of.
- Generates instant AI summaries and translations.
- Works 100% LOCALLY. No cloud, no external APIs, zero bytes leaving the device. It even works perfectly in Airplane Mode.

It’s been a wild ride of C++ optimizations and testing on mid-range devices to see how far I could push the hardware.

I’m not here to sell anything; I’m just genuinely curious to hear from the privacy-conscious and dev communities:

- Would you trust an on-device AI for your sensitive work meetings knowing the data never touches the internet?
- Do you know of other projects that have successfully tamed LLMs on mobile without massive battery drain?
- What “privacy-first” feature would be a dealbreaker for you in a tool like this?

I’d love to chat about the technical hurdles or the use cases for this kind of “offline-first” approach!

Originally posted by u/dai_app on r/ArtificialInteligence