Original Reddit post

I wanted Claude to answer questions using my personal notes without uploading them anywhere, so I wired it up through a local MCP server called Kwipu (open source, MIT). How it works with Claude: Kwipu runs as an MCP server (FastMCP, stdio) and exposes a single tool, query_graph(question). In Claude Desktop or Claude Code, Claude calls that tool whenever a question touches my notes, and gets back an answer that already cites the source files it came from. The retrieval and the local LLM step happen entirely on my machine, so only the question and the synthesized answer ever reach Claude. That keeps the context Claude receives small and keeps my notes off the cloud. What surprised me is how well this composes with Claude’s own reasoning. Kwipu does the heavy retrieval across a property graph of all my notes (semantic + keyword + recency), returns a grounded, cited answer, and Claude then reasons on top of that instead of trying to grep files one by one. Multi-note questions (“what did I conclude about X across these projects”) actually work. Needs Ollama running locally (llama3.1:8b + nomic-embed-text). Also works with Claude Code. Repo: https://github.com/benmaster82/Kwipu Mostly posting because the local-MCP + Claude combo turned out better than I expected, and I’m curious how others here are giving Claude access to private knowledge bases without sending data out. Anti-hallucination is enforced with mandatory citations, but I’d love to hear how people are validating that Claude actually stays grounded in the returned sources. submitted by /u/WritHerAI

Originally posted by u/WritHerAI on r/ClaudeCode