Original Reddit post

Been building Lakon, initially as a prompt compression tool, because I personally kept running into token/credit limits while using ChatGPT, Claude, Gemini, etc. At first I thought: “people just need shorter prompts.” But after talking to users and thinking more deeply, I realized something interesting: prompt length is only a small part of the problem now. The real token drain usually comes from:

  • long conversation history
  • repeated context
  • AI re-explaining things
  • carrying entire chats forward
  • losing context between models/tools

For example, a single ongoing chat can become more expensive than the prompt itself. So now I’m thinking of evolving Lakon from a “prompt compressor” into something more like an “AI context optimizer.”

Current idea for the next patch: the user pastes an entire AI conversation (via shortcuts, by pasting the chat link, or with our extension, which fetches the exact complete conversation). Lakon then extracts:

  • goals
  • decisions
  • important context
  • unresolved tasks

It then creates a compact continuation snapshot that can be reused in a new chat or model. Kind of like compressing the working memory instead of only compressing prompts. Still brainstorming the architecture, because ultra-long chats can exceed LLM context limits themselves.

Curious: do you think this is a real pain point, or am I overestimating it because I’m a heavy AI user?
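A minimal sketch of what that continuation snapshot could look like. Everything here is an assumption for illustration: the names `ContextSnapshot` and `chunk_transcript` are hypothetical, not Lakon’s actual API, and the LLM call that would do the actual extraction is left out. The chunking helper hints at one way to handle chats that exceed a model’s context window (summarize each chunk, then merge the partial snapshots, map-reduce style).

```python
# Hypothetical sketch of a "continuation snapshot" data structure.
# Not Lakon's real API; the extraction step would be done by an LLM.

from dataclasses import dataclass, field


@dataclass
class ContextSnapshot:
    goals: list[str] = field(default_factory=list)
    decisions: list[str] = field(default_factory=list)
    key_context: list[str] = field(default_factory=list)
    unresolved: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Compact text block to paste at the top of a new chat/model."""
        sections = [
            ("Goals", self.goals),
            ("Decisions", self.decisions),
            ("Key context", self.key_context),
            ("Unresolved tasks", self.unresolved),
        ]
        lines = ["# Continuation snapshot"]
        for title, items in sections:
            lines.append(f"## {title}")
            lines.extend(f"- {item}" for item in items)
        return "\n".join(lines)


def chunk_transcript(transcript: str, max_chars: int = 8000) -> list[str]:
    """Split an ultra-long chat into pieces that each fit the model's
    context window. Each chunk would be summarized separately and the
    partial snapshots merged afterwards (map-reduce over the chat)."""
    return [transcript[i:i + max_chars]
            for i in range(0, len(transcript), max_chars)]
```

Usage would be something like: fetch the conversation, chunk it, run extraction per chunk, merge into one `ContextSnapshot`, and hand `snapshot.render()` to the user as the seed for a fresh chat.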

Originally posted by u/PriorNervous1031 on r/ArtificialInteligence