Original Reddit post

Hey everyone, I wanted to share a project I’ve been working on called Glia. It is a 100% offline, local-first RAG and memory layer designed to connect your AI web chats (Claude, ChatGPT, DeepSeek) with your local developer tools (Claude Code, Cursor, Windsurf) using a unified local database. I wanted something lightweight that did not require pulling heavy Docker containers or subscribing to third-party memory APIs. I settled on a Node.js + SQLite architecture running sqlite-vec (for 768-dim float32 embeddings) alongside SQLite FTS5 for hybrid search, powered completely by local Ollama instances. We just launched a live website that outlines the details and demonstrates the features in action: Website: https://glia-ai.vercel.app/ Codebase: https://github.com/Eshaan-Nair/Glia-AI Technical Stack & Features: Hybrid Search Retrieval: SQLite-vec (using nomic-embed-text locally) + FTS5 keyword prefix matching (porter stemmer). Surgical Sentence-level Trimming: Chunks are sliced into sentences. When a prompt is intercepted, only the exact matching sentences are pulled out of the vector store instead of the whole paragraph. It cuts LLM prompt bloat by ~90-95% in my benchmarks. Knowledge Graph Extraction: An offline task queue uses a local LLM (llama3.1:8b via Ollama) to extract entity triples (subject-relation-object). These are stored in a SQLite facts table (or Neo4j if you run the full Docker compose profile) and fused with the vector retrieval score. HyDE (Hypothetical Document Embeddings): Queries are pre-processed to generate a hypothetical answer, which is embedded together with the original query to bridge semantic gaps. Concurrency: Running SQLite in WAL (Write-Ahead Logging) mode allows the browser extension dashboard and active MCP sessions to read/write concurrently without locking. PII Redaction: Aggressive scrubbing of JWTs, API keys, emails, and IPs in the extension before data is saved. The extension works on Claude.ai , ChatGPT, DeepSeek, Gemini, Grok, and Mistral. The MCP server runs out of the same backend database for your terminal agent or Cursor. You can set it up with a single command: npx glia-ai-setup Glia is completely open-source (MIT). If you like the local-first approach or want to contribute to the SQLite vector pipeline, PRs are very welcome, and a star⭐ on GitHub helps the project get discovered! I would appreciate any feedback on the SQLite hybrid search scaling, the scoring fusion algorithm (RAG pipeline details are in RAG_PIPELINE.md), or local graph extraction performance! submitted by /u/Better-Platypus-3420

Originally posted by u/Better-Platypus-3420 on r/ArtificialInteligence