Original Reddit post

Full disclosure: I’m the developer.

Most AI agents in 2026 are powerful, but you still have to tell them what to do and how. I wanted my OpenClaw and Claude Code to just know what needs to be done and how without me explaining. You can get incredible output from these agents, but they don’t know how you specifically do your work: which apps you open, in what order, what decisions you make between steps, how you handle edge cases, your voice and tone per task/platform, and so on.

AgentHandover is a Mac menu bar app that watches your screen, figures out your actual workflows, and packages them into structured, self-improving Skills that any AI agent can pick up and run: playbooks with strategy, decision logic, step sequences, guardrails, and writing voice. One-click connect with commonly available agents.

Two modes:

- Focus Record: hit record, do the task once, answer a couple of clarifying questions, and a Skill is generated.
- Passive Discovery: runs in the background for days, classifies what’s real work versus noise (8-class activity classifier), clusters similar actions across different days and interruptions, and after three or more observations synthesizes the pattern into a Skill automatically.

Technical breakdown: the pipeline has 11 stages, all running locally.

- Screen capture uses perceptual hashing (dHash) for ~70% frame deduplication.
- A local VLM (Qwen 3.5 2B, 2.7 GB via Ollama) annotates every frame: app context, URL, current action, predicted next action.
- Activity classification uses an 8-class taxonomy to separate real work from noise.
- nomic-embed-text (274 MB) generates 768-d text embeddings; optional SigLIP adds 1152-d image embeddings.
- Semantic clustering groups similar workflows even when surface-level actions look different.
- Cross-session linking reconnects interrupted tasks across days.
- Behavioral synthesis (Qwen 3.5 4B, 3.4 GB) extracts decision patterns, strategy, and reasoning after 3+ observations.
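To make the dHash deduplication step concrete, here is a minimal pure-Python sketch: hash each downscaled frame by comparing horizontally adjacent pixels, then drop any frame whose hash is within a small Hamming distance of the previous one. The app's actual capture path (a Rust daemon) surely differs; the grid size and threshold below are illustrative assumptions.

```python
def dhash(pixels, hash_size=8):
    """Difference hash: one bit per left/right pixel comparison.

    `pixels` is a hash_size x (hash_size + 1) grid of grayscale values,
    i.e. a frame that has already been downscaled; real code would
    resize the screenshot to this grid first.
    """
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")

def is_duplicate(prev_hash, cur_hash, threshold=4):
    # Frames within `threshold` differing bits are treated as the
    # same screen state and dropped (threshold is a guess, not the
    # app's tuning).
    return hamming(prev_hash, cur_hash) <= threshold
```

Because the hash encodes gradients rather than raw pixels, a blinking cursor or a clock tick barely moves the Hamming distance, which is how most consecutive frames end up deduplicated.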
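The semantic-clustering stage can be approximated with a greedy centroid pass over the embedding vectors: each observation joins the closest existing cluster if cosine similarity clears a threshold, otherwise it starts a new one. This is a sketch of the general technique, not the app's actual algorithm, and the 0.8 threshold is made up.

```python
import math

def cosine(a, b):
    """Cosine similarity; assumes non-zero embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def cluster(embeddings, threshold=0.8):
    """Greedy single-pass clustering over embedding vectors.

    Each embedding joins the most similar existing cluster whose
    running centroid clears `threshold`, otherwise it founds a new
    cluster. Returns one cluster index per input, in order.
    """
    centroids, labels = [], []
    for emb in embeddings:
        best, best_sim = None, threshold
        for i, c in enumerate(centroids):
            sim = cosine(emb, c)
            if sim >= best_sim:
                best, best_sim = i, sim
        if best is None:
            centroids.append(list(emb))
            labels.append(len(centroids) - 1)
        else:
            # incremental mean so the centroid drifts with its members
            n = labels.count(best)
            centroids[best] = [(c * n + e) / (n + 1)
                               for c, e in zip(centroids[best], emb)]
            labels.append(best)
    return labels
```

Clustering in embedding space rather than on raw actions is what lets "reply to support email in Gmail" and "reply to support email via the help-desk web app" land in the same workflow even though the surface-level steps differ.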
Voice analysis captures writing style from the user’s own text. Output is a structured Skill file with a confidence score that improves with successful agent execution and degrades on failure.

Limitations: macOS only for now (Windows is on the roadmap). The pipeline is compute-heavy on first run; initial Skill generation can take a few minutes depending on session length. Passive Discovery needs several days of data before it surfaces anything useful. Qwen 3.5 2B occasionally misannotates complex multi-window layouts. Confidence scoring is still being tuned and can be conservative early on.

Stack: Rust daemon, SwiftUI menu bar app, Python worker, TypeScript Chrome extension, MCP server with 8 tools, local SQLite vector store. Runs on Apple Silicon.

Privacy: screenshots are deleted after VLM annotation; PII, passwords, and API keys are auto-redacted; everything is encrypted at rest (XChaCha20-Poly1305); zero telemetry.

Works with Claude Code, OpenClaw, Codex, Cursor, Windsurf, and anything MCP-compatible. Apache 2.0.

Repo: https://github.com/sandroandric/AgentHandover
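For a concrete picture of what "a structured Skill file" means, here is roughly the shape one could take. The field names and values are my illustration of the strategy / decision-logic / steps / guardrails / voice breakdown described above, not the app's actual schema.

```python
import json

# Hypothetical Skill file contents — every field name and value here is
# an invented example, not AgentHandover's real format.
skill = {
    "name": "triage-support-inbox",
    "strategy": "Answer billing questions first; escalate bugs immediately.",
    "decision_logic": [
        {"if": "email mentions a refund", "then": "use the refund template"},
        {"if": "sender is an enterprise account", "then": "cc the account manager"},
    ],
    "steps": [
        "Open the support inbox and filter to unread",
        "Draft the reply in the user's recorded voice",
        "Log the ticket in the issue tracker",
    ],
    "guardrails": [
        "never promise a specific refund amount",
        "never delete or archive unread email",
    ],
    "voice": {"tone": "friendly, concise", "sign_off": "Cheers"},
    "confidence": 0.62,   # updated after each agent execution
    "observations": 4,    # how many sessions this pattern was seen in
}

print(json.dumps(skill, indent=2))
```

The point of keeping strategy and guardrails as separate fields is that an agent can follow the step sequence while still knowing which constraints override it when an edge case appears.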
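The confidence loop — improves on successful execution, degrades on failure — could be as simple as an asymmetric update rule. A sketch with made-up learning rates (the post notes the real scoring is still being tuned):

```python
def update_confidence(confidence, success, lr_up=0.10, lr_down=0.25):
    """Nudge a Skill's confidence toward 1.0 on success, toward 0.0
    on failure. Weighting failures more heavily (lr_down > lr_up)
    makes a flaky Skill degrade faster than a reliable one improves.
    Both rates are illustrative, not the app's actual tuning.
    """
    if success:
        return confidence + lr_up * (1.0 - confidence)
    return confidence - lr_down * confidence
```

A rule of this shape keeps the score in (0, 1) without clamping and naturally produces the "conservative early on" behavior mentioned above: a new Skill needs several consecutive successes before its confidence climbs meaningfully.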

Originally posted by u/Objective_River_5218 on r/ArtificialInteligence