Khazad – a transparent semantic cache for LLM API calls, zero code changes

www.reddit.com

eifachposteMB to AI (Reddit RSS) · English · 2 hours ago

I built Khazad, a semantic cache for LLM API calls that needs zero changes to your app code. Instead of wrapping SDKs or running a proxy, it patches the httpx transport layer. After you call init(), it intercepts outgoing LLM requests, embeds the conversation, and serves cached responses for semantically equivalent requests from a Redis 8 Vector Set. Any httpx-based SDK works out of the box: OpenAI, Anthropic, Gemini, Azure OpenAI, Mistral. Highlights:

Model-aware
Conversation-aware
Streaming both ways

Best for repetitive traffic like FAQ bots, RAG front-ends, and dev/CI runs. Python 3.10+, Redis 8, MIT licensed. Feedback welcome.

GitHub: https://github.com/GuglielmoCerri/khazad

submitted by /u/GugliC

Originally posted by u/GugliC on r/ArtificialInteligence
