Original Reddit post

I’m the sole developer and founder. I built a local-first desktop AI assistant for Windows that uses Ollama as the inference backend but adds an orchestration layer on top for tool use, persistent memory, and voice interaction. Sharing because it sits in an interesting spot between raw local chat UIs and cloud-heavy agent frameworks.

What it does technically:

The app runs a local Director model (qwen3:8b by default) that doesn’t just chat but produces structured action plans. A safety layer validates every plan before execution: no tool call runs without passing through a policy gate, file writes require user approval, and screen automation is opt-in and off by default.

There are 30+ tools available: web search, file management, calculator, weather, dictionary, screen reading, timers, reminders, notes, document ingestion, and offline Wikipedia lookup. The system selects only the tools relevant to each query to keep prompt size manageable for local context windows.

Memory persists across sessions in a local SQLite database. The model has context about who you are and what you’ve discussed before, without any of that leaving your machine. An offline reflection process consolidates and cleans memory over time.

Voice runs fully local: faster-whisper for speech-to-text, Kokoro for text-to-speech, and Silero for voice activity detection. Common queries (time, weather, math) take a shortcut path that bypasses the LLM entirely for near-instant voice responses.

Hardware detection at install profiles your GPU, RAM, and CPU, then assigns appropriate models and context window sizes automatically. Works on my RTX 3080 10GB without issues.

Limitations:

- Context window is still the main bottleneck for complex tasks on 8B models
- Windows only for now
- Speaker identification is broken due to a dependency conflict (non-fatal, just disabled)
- A single model handles all routing; no multi-agent setup yet

Stack: Python, PyWebView, Ollama, SQLite. No Docker, no server, no account required.
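To make the plan-validation idea concrete, here is a minimal sketch of a policy gate sitting between a structured action plan and tool execution. The source is proprietary, so every name here (`Action`, `Policy`, `execute_plan`, the tool names) is an illustrative assumption, not the app's actual API:

```python
# Hypothetical sketch: a Director model emits a list of Actions, and a policy
# gate checks each one before the corresponding tool is allowed to run.
from dataclasses import dataclass, field

@dataclass
class Action:
    tool: str                       # e.g. "calculator", "file_write"
    args: dict = field(default_factory=dict)

@dataclass
class Policy:
    allowed_tools: set             # tools the user has enabled at all
    needs_approval: set            # tools that need per-call user confirmation

    def check(self, action, approve):
        if action.tool not in self.allowed_tools:
            return False            # e.g. screen automation is off by default
        if action.tool in self.needs_approval:
            return approve(action)  # e.g. prompt the user before file writes
        return True

def execute_plan(plan, policy, tools, approve):
    """Run each planned action only if the policy gate passes it."""
    results = []
    for action in plan:
        if not policy.check(action, approve):
            results.append((action.tool, "blocked"))
            continue
        results.append((action.tool, tools[action.tool](**action.args)))
    return results
```

The point of the design, as described in the post, is that the model never executes anything directly; it only proposes, and the gate (plus the user, for sensitive tools) disposes.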
Optional cloud mode if you want to plug in your own API keys (DeepSeek, OpenAI, Anthropic, Google, Qwen), but local is the default and it works fully offline. Source is proprietary (solo commercial project), but the app is free with no data collection.

GitHub releases: https://github.com/zotex12/innerzero-releases
Info: https://innerzero.com/
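The fast-path routing the post describes for voice (answering time and math instantly, without invoking the LLM) can be sketched as a tiny pre-router. This is a guess at the mechanism, not the app's code; `fast_path` and `route` are hypothetical names:

```python
# Illustrative sketch: trivial queries get an instant canned/computed answer;
# anything else falls through to the local Director model.
import re
from datetime import datetime

def fast_path(query):
    """Return an instant answer for trivial queries, or None to fall through."""
    q = query.strip().lower().rstrip("?")
    if q in ("what time is it", "what's the time"):
        return datetime.now().strftime("It's %H:%M")
    # Simple arithmetic like "what is 12 * 7"
    m = re.fullmatch(r"(?:what is |what's )?(\d+)\s*([+\-*/])\s*(\d+)", q)
    if m:
        a, op, b = int(m.group(1)), m.group(2), int(m.group(3))
        ops = {"+": a + b, "-": a - b, "*": a * b,
               "/": a / b if b else None}
        return str(ops[op])
    return None  # not trivial: route to the Director model instead

def route(query, ask_llm):
    answer = fast_path(query)
    return answer if answer is not None else ask_llm(query)
```

For voice, skipping the LLM on these queries is what makes near-instant responses possible on consumer hardware: speech-to-text, a dictionary/regex lookup, and text-to-speech are all far cheaper than an 8B-model generation.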

Originally posted by u/unstoppableXHD on r/ArtificialInteligence