I built a Claude Code plugin called HandsOn that gives Claude actual screen access. It can take screenshots, see what’s on your screen, click buttons, type text, scroll, and drag: full desktop control through the accessibility tree, OCR, and framework detection.

What Claude can do with it

- Desktop automation: Automate any app on your computer, not just browsers. Settings, install wizards, legacy apps with no API.
- Accessibility-first targeting: Reads UI through the Windows UIA / macOS AXUIElement accessibility tree, with automatic OCR fallback.
- App interaction: Fill out forms, click through dialogs, manage windows, launch programs.
- Visual diffing: Baseline a screenshot, make changes, see exactly what pixels moved.
- Dialog monitoring: Background watcher catches new popups/toasts while Claude is working.
- Framework detection: Identifies Qt, WPF, Electron, WinForms, etc. and adapts its approach.

How it works

It’s not pixel-guessing. HandsOn uses a layered targeting strategy:

1. Accessibility tree (UIA / AXUIElement): fast, precise, DPI-aware
2. OCR: finds any visible text when accessibility can’t
3. Framework detection: tells Claude why something failed and what to try
4. Claude’s vision: screenshot fed directly to Claude for everything else

33 tools across vision, input, accessibility, OCR, window management, visual diff, and more. Works on Windows and macOS.

Install
```
# From GitLab
claude plugin marketplace add git@gitlab.com:3spky5u/HandsOn.git

# From Codeberg
claude plugin marketplace add git@codeberg.org:3spky5u/HandsOn.git

# Then install
claude plugin install handson
```
Still alpha — Claude will occasionally misclick or need a retry on complex workflows — but it's genuinely useful and getting better with each release.
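
To illustrate the layered targeting order described under "How it works", here is a minimal Python sketch of a fallback chain. It is not HandsOn's actual code: `Target`, `find_target`, the two stand-in locators, and the hard-coded coordinates are all hypothetical and exist only to show the "try the accessibility tree first, then OCR" ordering. A real locator would query UIA / AXUIElement or an OCR engine instead of a dictionary.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# A locator takes a visible label and returns screen coordinates, or None on a miss.
Locator = Callable[[str], Optional[Tuple[int, int]]]


@dataclass
class Target:
    x: int
    y: int
    source: str  # name of the layer that produced the hit


def find_target(label: str, layers: List[Tuple[str, Locator]]) -> Optional[Target]:
    """Try each layer in order and return the first hit, or None if all miss."""
    for name, locate in layers:
        point = locate(label)
        if point is not None:
            return Target(point[0], point[1], name)
    return None


# Hypothetical stand-ins for the real layers (assumed data, for illustration only).
def accessibility_lookup(label: str) -> Optional[Tuple[int, int]]:
    exposed_controls = {"OK": (120, 340)}
    return exposed_controls.get(label)


def ocr_lookup(label: str) -> Optional[Tuple[int, int]]:
    visible_text = {"OK": (121, 341), "Cancel": (220, 340)}
    return visible_text.get(label)


LAYERS = [("accessibility", accessibility_lookup), ("ocr", ocr_lookup)]

print(find_target("OK", LAYERS))       # found by the first layer, OCR is never asked
print(find_target("Cancel", LAYERS))   # accessibility misses, OCR finds it
print(find_target("Missing", LAYERS))  # every layer misses
```

In this sketch, a `None` result is the point where a tool like this would move on to the later layers from the list (framework detection to explain the miss, then a screenshot for Claude's vision).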
Repo: GitLab | Codeberg | MIT licensed | Happy to answer questions.
Originally posted by u/3spky5u-oss on r/ClaudeCode
