I’m the builder. Sharing because the technical constraints were interesting to work within.

Prelude runs two agents sequentially. The first conducts a voice conversation before your therapy session to surface what’s actually on your mind. The second takes the conversation output and generates a structured brief you bring into the session.

The hard constraint: everything runs on-device. Apple Intelligence for inference, premium on-device voices for TTS. No network calls at all.

What that actually cost me: on-device TTS quality is a noticeable step behind cloud equivalents, and the context window is significantly tighter than hosted models’, which meant prompt design had to be leaner than I’d have written otherwise. Every token justified.

What it forced me to learn: chaining two agents with a constrained local context means the handoff between them has to be clean. If the first agent’s output is noisy, the brief agent compounds the noise. I spent more time on that transition than anything else (rough sketches at the end of the post).

Unexpected finding: users share more freely when inference is local. For pre-therapy thoughts specifically, that behavioral shift matters.

The on-device constraint ended up making the product better: tighter prompts, a cleaner agent handoff, and a trust dynamic with users that a cloud version couldn’t replicate. Sometimes the limitation is the design.

Free forever, offline, and no ads.

Demo / App Store: https://apps.apple.com/us/app/prelude-therapy-prep/id6761587576
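For anyone curious about the handoff, here’s a minimal sketch of the pattern using Apple’s FoundationModels framework. The types, field names, and prompts are my illustrative guesses, not Prelude’s actual code; the point is that a `@Generable` type forces the first agent’s output into a fixed shape before the second agent ever sees it.

```swift
import FoundationModels

// Hypothetical handoff schema — not the app's real one.
// Constraining agent 1 to this shape is what keeps noise
// from compounding into the brief agent.
@Generable
struct SessionThemes {
    @Guide(description: "Top concerns the user raised, most pressing first")
    var concerns: [String]

    @Guide(description: "One sentence describing the emotional tone")
    var tone: String
}

@Generable
struct TherapyBrief {
    @Guide(description: "Three talking points for the upcoming session")
    var talkingPoints: [String]

    @Guide(description: "A single opening sentence the user could say")
    var opener: String
}

func makeBrief(from transcript: String) async throws -> TherapyBrief {
    // Agent 1: distill the raw voice transcript into the typed handoff.
    // Instructions kept short — the on-device context window is tight.
    let listener = LanguageModelSession(
        instructions: "Extract the user's main concerns from this pre-therapy conversation."
    )
    let themes = try await listener.respond(
        to: transcript,
        generating: SessionThemes.self
    ).content

    // Agent 2: a fresh session, so the transcript's tokens never enter
    // its context — only the distilled, structured handoff does.
    let briefWriter = LanguageModelSession(
        instructions: "Write a short brief the user can bring into a therapy session."
    )
    let prompt = "Concerns: \(themes.concerns.joined(separator: "; ")). Tone: \(themes.tone)."
    return try await briefWriter.respond(
        to: prompt,
        generating: TherapyBrief.self
    ).content
}
```

Starting the second agent in a fresh session is the design choice that matters: the raw transcript stays out of its context window, which is what makes a two-agent chain survivable with a small local context.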
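On the TTS side, the standard AVSpeechSynthesizer approach is to prefer a premium on-device voice when the user has one downloaded and degrade gracefully otherwise. Roughly:

```swift
import AVFoundation

// Pick the best fully on-device voice for a language.
// Premium/enhanced voices are user-downloaded in Settings,
// so always fall back to the default voice.
func bestLocalVoice(for language: String = "en-US") -> AVSpeechSynthesisVoice? {
    let candidates = AVSpeechSynthesisVoice.speechVoices()
        .filter { $0.language == language }
    return candidates.first { $0.quality == .premium }
        ?? candidates.first { $0.quality == .enhanced }
        ?? AVSpeechSynthesisVoice(language: language)
}

// Keep the synthesizer alive for the duration of playback.
let synthesizer = AVSpeechSynthesizer()

func speak(_ text: String) {
    let utterance = AVSpeechUtterance(string: text)
    utterance.voice = bestLocalVoice()
    synthesizer.speak(utterance)
}
```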
Originally posted by u/Emojinapp on r/ArtificialInteligence
