Original Reddit post

been working on this for a while now. it’s a Swift MCP server that reads the accessibility tree of any running app on your Mac, so Claude can see buttons, text fields, menus, everything, and click/type into them.

way more reliable than screenshot + coordinate clicking because you get the actual UI element tree with roles and labels. no vision model needed for basic navigation.

works with Claude Desktop or any MCP client. you point it at an app and it traverses the whole UI hierarchy, then you can interact with specific elements by their accessibility properties.

curious if anyone else has been building MCP servers for desktop automation or if most people are sticking with browser-only tools.
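for anyone wondering what "traverse the hierarchy, then act on an element by its accessibility properties" could look like, here's a rough sketch using the macOS Accessibility C API from Swift. this is not the server's actual code: the helper names (`stringAttribute`, `children`, `dumpTree`, `findElement`), the depth limits, and the "OK" button title are all made up for illustration, and it assumes the process has already been granted Accessibility permission in System Settings.

```swift
import AppKit
import ApplicationServices

// Hypothetical helper: read a string attribute (role, title, ...) from an element.
func stringAttribute(_ element: AXUIElement, _ name: String) -> String? {
    var value: CFTypeRef?
    guard AXUIElementCopyAttributeValue(element, name as CFString, &value) == .success else {
        return nil
    }
    return value as? String
}

// Hypothetical helper: return the element's direct children, or [] if it has none.
func children(of element: AXUIElement) -> [AXUIElement] {
    var value: CFTypeRef?
    guard AXUIElementCopyAttributeValue(element, kAXChildrenAttribute as CFString, &value) == .success,
          let kids = value as? [AXUIElement] else {
        return []
    }
    return kids
}

// Depth-first walk that prints one indented "role + title" line per element.
// maxDepth is an arbitrary safety limit, since some apps expose very deep trees.
func dumpTree(_ element: AXUIElement, depth: Int = 0, maxDepth: Int = 8) {
    let role = stringAttribute(element, kAXRoleAttribute) ?? "?"
    let title = stringAttribute(element, kAXTitleAttribute) ?? ""
    print(String(repeating: "  ", count: depth) + "\(role) \"\(title)\"")
    guard depth < maxDepth else { return }
    for child in children(of: element) {
        dumpTree(child, depth: depth + 1, maxDepth: maxDepth)
    }
}

// Depth-first search for the first element matching a role and title.
func findElement(in element: AXUIElement, role: String, title: String,
                 depth: Int = 0, maxDepth: Int = 12) -> AXUIElement? {
    if stringAttribute(element, kAXRoleAttribute) == role,
       stringAttribute(element, kAXTitleAttribute) == title {
        return element
    }
    guard depth < maxDepth else { return nil }
    for child in children(of: element) {
        if let hit = findElement(in: child, role: role, title: title,
                                 depth: depth + 1, maxDepth: maxDepth) {
            return hit
        }
    }
    return nil
}

// The Accessibility API returns errors for every call unless the process is trusted.
guard AXIsProcessTrusted() else {
    print("grant Accessibility permission to this process first")
    exit(1)
}
guard let app = NSWorkspace.shared.frontmostApplication else {
    print("no frontmost application")
    exit(1)
}

// Root of the tree for the target app, addressed by process ID.
let root = AXUIElementCreateApplication(app.processIdentifier)
dumpTree(root)

// Example interaction: press a button titled "OK" if one exists (the title is a placeholder).
if let button = findElement(in: root, role: kAXButtonRole, title: "OK") {
    let result = AXUIElementPerformAction(button, kAXPressAction as CFString)
    print(result == .success ? "pressed" : "press failed: \(result.rawValue)")
}
```

matching on role + title is the simplest possible lookup. a real server would probably also match on things like the description or identifier attributes, since plenty of buttons have no title at all.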

Originally posted by u/Deep_Ad1959 on r/ClaudeCode