Original Reddit post

For the last 3 months I have been building and improving my local LLM orchestrator. It started as an AI calendar assistant and is now my server's AI coordinator, with 4 nodes, tools, and multi-agent dispatch. The main session is stateless; I interact with it through a WSL terminal or through my dedicated Android app. This session dispatches agents and is allowed to perform some inline tasks. Its injected preamble is everything: identity, rules, behavior, tools, instructions, and especially memory.

It has multi-tier memory, using RAG and Graphiti. I tried a permanent session that only recycled at midnight, but by the end of the day it was sluggish, confused, and bloated from a long day of messages. A stateless session with a well-designed preamble (<8k tokens) gives the best context, awareness, and continuity across conversations.

The memory tiers: a Today's memory with raw and compressed messages that it injects into its preamble; a Yesterday's memory with Graphiti entries plus a summary (only the summary is injected); and a Past memory, a growing archive built from the Yesterday files. On top of that it has daily message compression, nightly introspection, and a context YAML file it uses at its discretion for reminders, which is also injected back. For example, after a temporary change to a file or the server, it writes a note there for awareness.

The Graphiti memory is not injected into the preamble, but there is a direct query tool that pulls from Graphiti + RAG based on multiple criteria. In addition, all of its agent dispatches and their reports are recorded in the DB and can be queried, so it can look back a few weeks at results and correlate them with current discussions.

Isn't this what developers already do with AI agents? Why does memory seem to be such a major issue with AI? Am I missing something? I am working on a repository for my system; it is a frontier LLM orchestrator and assistant with full system control.

submitted by /u/chryseobacterium
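To make the preamble design concrete, here is a minimal sketch of how a stateless session's preamble could be assembled from the tiers described above (identity/rules, today's messages, yesterday's summary, context notes) under the ~8k-token budget. All names and the 4-chars-per-token heuristic are my own illustration, not the author's actual code.

```python
# Hypothetical sketch: assemble a <8k-token preamble for a fresh stateless session.

TOKEN_BUDGET = 8000  # the post mentions a <8k-token preamble


def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token; a real tokenizer
    # (e.g. tiktoken) would be more accurate.
    return len(text) // 4


def build_preamble(identity: str, rules: str, today_msgs: list,
                   yesterday_summary: str, context_notes: str) -> str:
    """Assemble the injected preamble: fixed sections first, then as many
    of today's messages (newest first) as the token budget allows."""
    sections = [
        "## Identity\n" + identity,
        "## Rules\n" + rules,
        "## Yesterday (summary only)\n" + yesterday_summary,
        "## Context notes\n" + context_notes,
    ]
    used = sum(estimate_tokens(s) for s in sections)
    kept = []
    for msg in reversed(today_msgs):  # newest messages get priority
        cost = estimate_tokens(msg)
        if used + cost > TOKEN_BUDGET:
            break
        kept.append(msg)
        used += cost
    # Insert today's memory between Rules and Yesterday, in original order.
    sections.insert(2, "## Today\n" + "\n".join(reversed(kept)))
    return "\n\n".join(sections)


preamble = build_preamble(
    identity="You are the server AI coordinator.",
    rules="Dispatch agents; record all reports in the DB.",
    today_msgs=["user: check disk space", "agent: 42% used on /"],
    yesterday_summary="Rotated backups; updated nginx config.",
    context_notes="Temporary: /etc/nginx edited by hand, revert Friday.",
)
```

The point of the sketch is the trade-off the post describes: the fixed sections (identity, rules, yesterday's summary, context notes) are always present, while today's raw/compressed messages are trimmed oldest-first to stay under budget, so the session never starts "bloated" the way the midnight-recycled permanent session did.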

Originally posted by u/chryseobacterium on r/ArtificialInteligence