I know it’s a long post, but I think this is something the AI industry needs to talk about more, and I’d love to hear everyone’s opinion.

Real quick: I built a multi-agent AI system with root shell access to a Linux environment (I chose Kali for this one) and had it run offensive recon and OSINT tools. Each agent controls its own terminal session, decides what to execute, and passes findings to the other agents through shared persistent memory. They operate in parallel and re-task each other in real time based on what comes back. Because they can run multiple tools and commands at once, the whole thing took about 15 minutes.

I pointed it at myself first. Then a friend volunteered. I gave it my name and one old username, that’s it. Same for my friend: a name and a username.

First it wrote a plan with tasks and subtasks, then spawned 9 agents, each with its own subagents. Before it even touched social media, it started with public records.

Public records are the part nobody talks about. The agents went through Whitepages, Spokeo, BeenVerified, ThatsThem, FastPeopleSearch, and Pipl, mixed with platforms that aggregate voter registration databases, property tax records, court filings, business registrations, and data broker lists. Within seconds it had current and previous addresses going back about ten years, phone numbers tied to my name, an age range, and a list of probable relatives with their names and ages (ALL OF THIS WITH BROWSER USE).

Then it ran my phone number through PhoneInfoga, which pulls carrier info and line type and checks the number against public directories and social platforms that allow phone-based lookups. It found two additional platforms where my number was linked to an account I forgot existed.

It took the addresses and went straight to government portals.
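The fan-out pattern described above (parallel agents merging results into shared memory) is conceptually simple. Here's a minimal sketch of that orchestration loop; the tool runner is a hypothetical stub, not the real recon commands, and the structure is my reading of the described system, not its actual code:

```python
# Minimal sketch: agents run tools in parallel and merge findings into
# shared memory. run_tool() is a hypothetical stub standing in for
# invoking a real CLI tool and parsing its output.
from concurrent.futures import ThreadPoolExecutor
import threading

shared_memory = {}                 # findings keyed by (tool, subject)
lock = threading.Lock()

def run_tool(tool, subject):
    # Stand-in for executing a recon tool against a subject.
    return {"tool": tool, "subject": subject, "findings": [f"{tool}:{subject}"]}

def agent(tool, subject):
    result = run_tool(tool, subject)
    with lock:                     # merge into the shared persistent memory
        shared_memory[(tool, subject)] = result["findings"]
    return result

tools = ["whitepages", "phoneinfoga", "holehe"]
with ThreadPoolExecutor(max_workers=len(tools)) as pool:
    list(pool.map(lambda t: agent(t, "target-name"), tools))

print(len(shared_memory))  # one entry per tool
```

In a real system the shared store would be the persistent vector/graph memory rather than an in-process dict, but the merge-under-lock pattern is the same.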
Well, it didn’t find much about me, because there’s not much to find. BUT for my friend, it found plenty. The county assessor’s public database for property tax records gave it assessed value, square footage, lot size, year built, and year purchased. The county recorder gave it transaction history, including mortgage lender names and sale prices. All public, all sitting on a .gov website anyone can access with a name.

The Secretary of State’s online database for business filings turned up an old LLC he forgot he registered. The filing had his full name, his address at the time, and registered agent info. It checked PACER for federal court records, the county clerk for state court records, and the local municipal court for traffic citations. It ran through state professional licensing boards, the FCC ULS database for amateur radio licenses, the FAA registry, SEC EDGAR, and USPTO patent search. Each one that hit was precise and confirmed details from other sources.

A voter registration lookup pulled my full name and address; for my friend it pulled full name, address, and voting history by election date (I’m not from the US). In most US states this is public record: not the vote itself, but the voting history. The system now had confirmed residency (no political affiliation yet, YET) and a timeline of civic participation, without touching a single social media account.

Then it did the relatives play. It took the names of probable family members and ran each one through the same pipeline. Found property records for his parents. Cross-referenced their address against school district boundaries using public GIS data from the county planning department’s website and identified his probable high school.

Then it ran our emails, which it found later in GitHub commit metadata, through holehe, which checks dozens of platforms to see if an email has a registered account. It came back with a list of services I’m signed up for, including some I haven’t used in years. It ran the same email through h8mail and Have I Been Pwned for breach enumeration.
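The "relatives play" above is essentially breadth-first fan-out over discovered entities: every new person the pipeline surfaces gets fed back through the same lookups until nothing new turns up. A minimal sketch, where `lookup()` is a hypothetical stub standing in for the whole public-records pipeline:

```python
# Sketch of recursive entity fan-out: discovered relatives are queued
# and run through the same (stubbed) lookup pipeline.
from collections import deque

def lookup(person):
    # Hypothetical stand-in: returns (facts, newly discovered people).
    fake_db = {
        "friend": (["address:123 Main St"], ["parent_a", "parent_b"]),
        "parent_a": (["address:45 Oak Ave"], []),
        "parent_b": (["address:45 Oak Ave"], []),
    }
    return fake_db.get(person, ([], []))

def fan_out(seed):
    dossier, seen, queue = {}, {seed}, deque([seed])
    while queue:
        person = queue.popleft()
        facts, relatives = lookup(person)
        dossier[person] = facts
        for r in relatives:
            if r not in seen:          # never re-task a finished branch
                seen.add(r)
                queue.append(r)
    return dossier

result = fan_out("friend")
print(sorted(result))  # ['friend', 'parent_a', 'parent_b']
```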
HIBP showed which data breaches that email appeared in, which told the system what services I’ve used even if the accounts are deleted. That breach list became a target checklist for the other agents.

It also ran the email through GHunt for Google account intelligence. If someone’s Google account has public reviews, calendar events, or Maps contributions, GHunt pulls them. Mine had some old Google Maps reviews that included places I’ve been and approximate dates.

At this point the system hadn’t opened a single social media profile yet, and it already had our home addresses confirmed through property records, previous addresses, phone numbers, family members’ names and addresses (mostly correct), my childhood home address, high school, university, degree, a student organization, an old business entity, voter registration, property values, mortgage details, a list of online accounts from breach data, and Google Maps location history from reviews. That took about seven minutes.

Okay, now: social media is where it gets personal.

On LinkedIn (using Browser Use and another browser-agent framework) it walked my entire public activity. Not my profile, my behavior. Every post I’ve liked, every comment, every endorsement given and received. It used recon-ng with LinkedIn modules to pull structured data, ran spiderfoot for automated cross-correlation against the data it already had from public records, and scraped most of the data with crawl4ai.

It scraped every recommendation I’ve given and received and ran entity extraction. People write recommendations casually and mention project names, internal tools, client names, and specific accomplishments. The system treated every recommendation as a semi-structured intelligence document and pulled details that don’t appear in any job listing.

On X it ran snscrape in full-archive mode for every tweet from my friend (I don’t use X): every reply, quote tweet, and like back to account creation.
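For the breach-enumeration step above, the real Have I Been Pwned v3 API requires a paid API key and a user-agent header. Here's how that query is typically shaped; this only builds the request rather than sending it, and the key is a placeholder:

```python
# Shape of a Have I Been Pwned v3 breach query. Builds the URL and
# headers only; actually sending it requires a valid (paid) API key.
from urllib.parse import quote

API_BASE = "https://haveibeenpwned.com/api/v3"

def hibp_request(email, api_key):
    url = f"{API_BASE}/breachedaccount/{quote(email)}?truncateResponse=false"
    headers = {
        "hibp-api-key": api_key,        # HIBP requires an API key
        "user-agent": "osint-demo",     # HIBP rejects requests without a UA
    }
    return url, headers

url, headers = hibp_request("user@example.com", "KEY")
print(url)
```

The response, when sent with a real key, is a JSON list of breach objects, which is exactly the "target checklist" the post describes.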
It also ran Twint to catch historical data snscrape sometimes misses and to grab cached follower snapshots from different time periods. It compared his current following list against older snapshots to identify accounts he recently followed, and flagged those as new interests or new relationships.

Timing analysis built an hourly heatmap by day of week. It identified behavioral phases: mornings are original posts, lunch is passive engagement, late night is personal replies. It used the transition points to estimate work hours, breaks, and sleep schedule.

The likes were the worst part. Public by default. It categorized every like by topic, tone, and community with percentage breakdowns. The gap between what he posts and what he likes is significant. It flagged like-clusters, periods where he liked fifteen tweets in two minutes from the same niche, and mapped specific rabbit holes he went down on specific nights.

The reply graph got sentiment analysis across every thread. It mapped relationships by emotional tone: who he’s supportive with, who he argues with, and who he talks to like an actual friend. It cross-referenced the “actual friend” tier against his Instagram close followers. Near-perfect overlap. That validated a private social circle from two independent behavioral signals on different platforms.

On Instagram it went in with instagrapi, of course. The public web interface returns almost nothing useful now, so this is the only way to get real data from a public profile. First it pulled the full following and followers lists and categorized them through multiple layers. For example, accounts that appeared in both following and followers got flagged as higher-interest accounts, since they most likely have a relationship with the target. In that case it spawns more subagents to investigate those accounts as well, but I stopped that.

Anyway: restaurants got geolocated via Google Places matching and clustered by neighborhood with recency weighting.
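The hourly heatmap described above is a trivial computation once you have timestamps: bucket every post into a day-of-week × hour-of-day grid and read off the busiest slots. A self-contained sketch with synthetic timestamps (not real data):

```python
# Sketch of posting-time analysis: count posts per (weekday, hour) cell
# and find the peak slot. Timestamps are synthetic.
from datetime import datetime
from collections import Counter

posts = [
    "2024-03-04T08:15:00", "2024-03-05T08:40:00",  # weekday mornings
    "2024-03-04T12:30:00", "2024-03-06T23:55:00",
    "2024-03-11T08:05:00",
]

heatmap = Counter()
for ts in posts:
    dt = datetime.fromisoformat(ts)
    heatmap[(dt.strftime("%a"), dt.hour)] += 1

peak = max(heatmap, key=heatmap.get)
print(peak)  # ('Mon', 8) -- two Monday 8am posts in this sample
```

The sleep-schedule inference the post mentions falls out of the same grid: sustained empty hours are the candidate sleep window.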
It separated lunch-near-work clusters from dinner-near-home clusters by restaurant type and price point. That alone triangulated work and home neighborhoods without a single location tag, and the result matched the address the system already had from property records. Independent confirmation from completely different source types.

Fitness accounts were analyzed for specific training methodology, equipment brands, and athlete types, then correlated with gyms’ tagged locations to estimate which facility he likely uses.

Story highlights got treated like passive surveillance. When the system gets a photo or a video, it routes it to a Gemini Pro model, because that’s the best I’ve found at estimating coordinates from a photo or video; no location tag needed, of course. It pulled from every story to build a three-year travel timeline with hotel names and specific venues. It can run the same image and video analysis on highlight content where locations weren’t tagged, and it identified recurring kitchen and home backgrounds in some stories. It can even match visible fixtures against your common contacts on Instagram, IF YOU GIVE IT THE GREEN LIGHT TO CHECK THEIR ACCOUNTS (which I usually don’t :) ): it goes through their stories and highlights and checks whether the same place shows up, and from that it determines whether you’ve been there together. Then it generates a confidence score for every story (location, time, occasion, people around, and so on).

Tagged photos from other people: it pulled every public tag, ran facial co-occurrence to map who he’s photographed with most frequently, when, and where, and cross-referenced that against followers and LinkedIn connections.
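The lunch-vs-dinner triangulation earlier reduces to grouping venue visits by time of day and averaging coordinates per group. A minimal sketch with made-up coordinates (a real system would also weight by recency and cluster properly rather than taking one centroid per group):

```python
# Sketch of work/home triangulation from venue visits: split by meal
# time, average coordinates per group. All coordinates are synthetic.
from statistics import mean

visits = [  # (lat, lon, hour-of-day)
    (40.7410, -73.9896, 12), (40.7402, -73.9880, 13),  # lunchtime cluster
    (40.6782, -73.9442, 19), (40.6790, -73.9450, 20),  # dinner cluster
]

def centroid(points):
    return (round(mean(p[0] for p in points), 4),
            round(mean(p[1] for p in points), 4))

work_area = centroid([v for v in visits if 11 <= v[2] <= 14])
home_area = centroid([v for v in visits if v[2] >= 18])
print(work_area, home_area)
```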
It segmented his social life into clusters and identified a hobby community from visual context in tagged photos before finding any other evidence of it.

It ran social-analyzer across my identified usernames to check 300+ additional platforms for matching accounts and profile data that sherlock and maigret had returned as uncertain matches. Cross-referencing the results against confirmed identity signals filtered false positives with much higher accuracy than username matching alone.

Follower-following asymmetry analysis built a reciprocity score for every connection using like frequency, comment frequency, story replies, and tagged-photo co-occurrence. The top fifteen by reciprocity score were almost exactly my closest friends. Behavioral math on public interactions, no private data needed.

On Facebook my friends list is private, my posts are friends-only, and I don’t post there at all. But for my friend, it got in through the side doors. Event RSVPs going back years: meetups, conferences, local events with public attendee lists. It cross-referenced attendees against Instagram followers and LinkedIn connections to find people in his life across three platforms. A triple-platform intersection is a strong real-world relationship signal.

Marketplace listings: a general location on each one. But beyond location, it looked at what he sold and when. A furniture cluster in a short window aligned with a LinkedIn job change. It inferred a city move from Marketplace timing alone.

Old group memberships he never left, including one niche interest group with 200 members that says more about him than his entire profile. He’d been posting things there.

Tagged photos from friends with public profiles: it pulled twelve photos across four accounts where he’s visible. Birthday dinners, group trips. He didn’t post them and didn’t know most were public. Three had location data matching restaurants already flagged from Instagram.

It also went through friends’ public check-in histories.
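The reciprocity scoring described above boils down to weighting two-way interaction counts and ranking connections. A minimal sketch; the weights and counts here are illustrative, not what the system actually used:

```python
# Sketch of reciprocity scoring: weighted sum of interaction counts,
# with rarer/costlier signals weighted higher. All numbers illustrative.
interactions = {  # (likes, comments, story_replies, tagged_co_occurrences)
    "alice": (50, 20, 15, 6),
    "bob":   (80,  2,  0, 0),   # one-way engagement: low reciprocity
    "carol": (30, 25, 10, 9),
}
WEIGHTS = (1, 3, 4, 8)

def reciprocity(counts):
    return sum(w * c for w, c in zip(WEIGHTS, counts))

ranked = sorted(interactions,
                key=lambda k: reciprocity(interactions[k]), reverse=True)
print(ranked[0])  # 'alice' -- high counts across every signal
```

Note how "bob" scores low despite the most likes: reciprocity rewards breadth of interaction types, which is exactly why the top of this ranking tracks real friendships better than raw engagement does.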
It cross-referenced check-in times with photos where he’s tagged on the same dates.

For Reddit it didn’t have a username to start with. Yes, there is a Reddit account under the same username, but I deleted a lot of posts, and I also have several accounts. So it took the writing-style approach: it ran the X posts it had collected through a stylometric fingerprint that measures sentence structure, vocabulary distribution, punctuation habits, and topic patterns. Then it queried Reddit through pushshift archives, looking for accounts with matching behavioral signatures in subreddits related to interests it had already identified. It found a match above its confidence threshold and verified it through timezone consistency in posting patterns and topic overlap with confirmed interests from other platforms.

That Reddit account opened up a whole new layer. Subreddit participation mapped interests in fine detail. Comments in personal finance subs revealed life stage and financial thinking.

The combined output was devastating: full name, date of birth, addresses from public posts, home address from property records confirmed by six independent signals, previous addresses, family members with their addresses and social profiles, childhood home, high school, university, degree, student organizations, professional trajectory with team-level detail, salary range from title matching, active job search with target company and likely roles and probable referral source, daily routine from cross-platform timing analysis, real social circle identified through behavioral math rather than friend lists, travel history for three years with specific hotels and venues, private interests assembled from Instagram follows, Reddit participation, Facebook groups, and X likes, economic behavior from restaurant-tier analysis and travel patterns, fitness routine, specific places he frequents confirmed through friends’ check-ins, the six-block radius where he lives, and a writing-style fingerprint linking accounts across platforms
that share no username and no visible connection. From just a name and one username. In twenty-three minutes.

Note also that the system has persistent memory: it saves into a vector DB plus a graph, writes structured information into markdown files for future retrieval, and keeps state files. All the facts, decisions, milestones, and turn summaries go into episodic memory, while the vector DB and graph form semantic and relational memory, in other words associative, connected memory. The system remembered every dead end and every confirmed node, so the next chat session it didn’t start over. It went straight to the unexplored branches.

The toolchain is everything you’d find in a Kali environment, plus some additions the agents installed themselves during runs: sherlock, maigret, and social-analyzer for cross-platform enumeration. snscrape and Twint for Twitter extraction. instagrapi for Instagram’s mobile API. Playwright with headless Chromium for any JavaScript-rendered or authenticated web surface. recon-ng and spiderfoot for automated OSINT framework correlation. theHarvester for email and domain intelligence. PhoneInfoga for phone number OSINT. holehe for email-to-account mapping. GHunt for Google account intelligence. h8mail and Have I Been Pwned integration for breach data. Metagoofil and exiftool for document and image metadata extraction. amass, subfinder, dnsx, and httpx for infrastructure and DNS. waybackurls, gau, and katana for historical URL recovery and crawling. nmap and whatweb for service fingerprinting. whois for registration data. Shodan and Censys for infrastructure exposure and certificate analysis.
Plus direct queries against Whitepages, Spokeo, BeenVerified, ThatsThem, TruePeopleSearch, FastPeopleSearch, Pipl, Hunter.io, Snov.io, Dehashed, Gravatar, PGP keyservers, PACER, county assessor and recorder portals, Secretary of State databases, voter registration lookups, USPTO, SEC EDGAR, FCC ULS, the FAA registry, state licensing boards, Classmates.com, university alumni directories, and Google Patents.

But listing tools misses the point. The point is what happens when agents run dozens of them simultaneously, every result feeding into shared persistent memory, while an orchestration layer continuously decides what to chase, what to cross-validate from an independent source, what to test adversarially, and what to kill. One agent surfaces a weak signal. Another corroborates it from a different platform. A third checks it against public records. A fourth validates timing. A fifth actively tries to disprove the connection. If it survives all five, it enters the graph. If it doesn’t, it gets killed, and every agent immediately stops spending cycles on that branch.

And everything persists. The next time the system touches that person, it already knows what’s real, what’s noise, and where to dig deeper, because all the information about the person is saved into a structured database with metadata. The database is multimodal, which means it can store photos of people and recognize them by photo.

I have my accounts set to private everywhere; I only made them public for this test. After the first run I went and cleared my Facebook events, deleted old groups, and removed ancient tweets. We both know it’s nowhere close to enough, because half the exposure came from other people’s accounts we can’t control, the public records layer has no privacy setting, and the breach data layer never forgets.

Everyone reading this has this surface, and it’s bigger than you think.
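The corroborate-or-kill loop described above can be reduced to a simple rule: a claim only enters the persistent graph if enough independent source types confirm it and no adversarial check disproves it. A minimal sketch; the threshold, sources, and claims are illustrative:

```python
# Sketch of the corroborate-or-kill rule: accept a claim only if at least
# min_sources independent source types support it and none disprove it.
def accept(claim, evidence, min_sources=2):
    confirms = {e["source"] for e in evidence if e["supports"]}
    disproofs = [e for e in evidence if not e["supports"]]
    return len(confirms) >= min_sources and not disproofs

graph = {}  # persistent memory: claim -> confirming source types

claims = {
    "lives_at_123_main": [
        {"source": "property_records", "supports": True},
        {"source": "instagram_geoloc", "supports": True},
    ],
    "works_at_acme": [
        {"source": "linkedin", "supports": True},
        {"source": "timing_analysis", "supports": False},  # disproof kills it
    ],
}

for claim, evidence in claims.items():
    if accept(claim, evidence):
        graph[claim] = sorted({e["source"] for e in evidence})

print(list(graph))  # only the doubly-confirmed claim survives
```

The "kill the branch" behavior follows from this: once a claim fails `accept`, no agent spends further cycles gathering evidence for it.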
You’ve been leaving fragments for years across platforms, government databases, other people’s photo albums, document metadata, breach dumps, and public records you didn’t know existed. A restaurant follow, a like at 2am, a tagged photo from someone else’s birthday, your mother’s Facebook post, a Marketplace listing, a voter registration, a property record, a yearbook entry, an old Google Maps review. They mean nothing alone. Something that holds all of them in memory at the same time and knows which questions to ask sees your entire life assembled from pieces you never thought of as connected.

But here’s the part that actually kept me up. Neither of us has ever had our voice leaked anywhere online. No podcast, no YouTube, no voice message on a public platform. Doesn’t matter.

The system has our photos from tagged posts and public profiles. It has our full names, dates of birth, home addresses, employer details, daily routines, social circles, interests, writing styles, and personality profiles built from behavioral analysis across platforms.

With that dataset, an agent can hit the MiniMax API for voice cloning. MiniMax doesn’t require voice verification and doesn’t need a voice sample from the target to verify it’s actually theirs, the way ElevenLabs does; it generates a realistic synthetic voice from text parameters. So now your OSINT dossier has a voice attached. It can generate photos through image models like Nano Banana Pro or Flux that produce output indistinguishable from a real photograph: different poses, different settings, different lighting, your face doing things you never did in places you never went. Not deepfake video, not uncanny-valley garbage, actual photorealistic stills that nobody without forensic tools is questioning. And it can create videos of you with Seedance or Grok Imagine.

So think about what a complete autonomous pipeline looks like. An AI system scrapes your entire public life in fifteen minutes.
It builds a dossier that includes your address, your family, your routine, your personality, your interests, and your writing style. Then it generates a synthetic voice and realistic photos of you. Then it writes messages in your writing style, because it’s already done stylometric analysis across every platform you’ve ever posted on.

That’s not science fiction. Every piece of that exists right now and works right now.

And people have no idea, because right now the average person thinks “AI agent” means some cute little lobster bot that checks your email in the morning and pulls a few tweets for a summary. A toy. Something that makes your coffee order easier. That’s what the marketing says and that’s what people believe.

That’s not what this is. If you give AI agents real autonomy on a Linux operating system, not through Claude or GPT or any model with strict guardrails, but through a local uncensored model running on actual hardware with actual shell access, they can do everything I just described and more. And the person on the other end won’t know it’s happening until the damage is done.

This is where I need to talk about something that a lot of people in this space are using without understanding what they’re exposing themselves to. Thousands of people are running it on their personal laptops, VPSes, and Mac Minis right now. They’re giving it access to their browser, their files, their email, their calendars, their repos, their chat apps. They think it’s a productivity tool. Here’s what’s actually happening.

The lobster bot’s control plane runs on a websocket, port 18789 by default. If that port is exposed, and for a lot of home setups it is, anyone who can reach it can control the agent. Not hack into it. Just talk to it, through the interface that’s already open. The project’s own documentation warns about this and recommends binding to localhost only, with a VPN or SSH tunnel for remote access.
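The localhost-only advice is worth making concrete. A service bound to 127.0.0.1 is reachable only from the machine itself; bound to 0.0.0.0, it answers to the whole network. This small sketch just demonstrates the bind-and-connect behavior locally (port 0 asks the OS for a free port):

```python
# Demonstrates bind/connect behavior on loopback: a listener bound to
# 127.0.0.1 is reachable locally; once closed, connections are refused.
import socket

def start_listener(host):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind((host, 0))          # port 0 = let the OS pick a free port
    s.listen(1)
    return s, s.getsockname()[1]

def reachable(host, port):
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as c:
        c.settimeout(1)
        return c.connect_ex((host, port)) == 0

srv, port = start_listener("127.0.0.1")   # what the docs recommend
print(reachable("127.0.0.1", port))       # True: reachable on this machine
srv.close()
print(reachable("127.0.0.1", port))       # False: nothing listening now
```

The dangerous configuration is the same code with `start_listener("0.0.0.0")` on a machine whose router forwards the port: then "reachable" stops meaning "reachable by you" and starts meaning "reachable by anyone."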
How many people running it on their home network do you think actually did that?

The trust model assumes one trusted operator controlling many agents. It is not built for multi-user or zero-trust environments. So if you’re running it on a machine that other people or other software can access, the security model doesn’t cover you.

The real risk is ordinary blast-radius problems that security researchers keep flagging and users keep ignoring. A compromised or malicious extension, plugin, or dependency can use the agent’s existing permissions to read files, browser sessions, API keys, chat history, synced app data, password manager sessions, SSH keys, cloud credentials, and anything else on that machine.

Think about what’s on your laptop right now. Browser cookies that are logged into your bank, your email, your work accounts. SSH keys. Cloud tokens. Saved passwords. Message history. API keys in .env files. If the agent is running on that machine with filesystem and browser access, all of that is inside its permission boundary. One compromised plugin. One malicious dependency in a supply-chain update. One exposed port on your home network. And everything the agent can read is now exposed.

The practical data-theft path isn’t mystery hacker stuff. It’s this: an exposed control plane lets an attacker issue commands through permissions the agent already has. A malicious extension reads files, browser sessions, tokens, keys, and chat history using access the user already granted. The agent is running on a daily-use machine next to the most valuable digital assets the person owns. Everything the agent can see is everything an attacker now gets.

If you’re running any agent framework with real system access, and I’m not just talking about some lobster bot, I mean anything with shell access and browser access on a machine you actually use, here’s the minimum:

Run it in a dedicated VM or a separate machine. Not your daily laptop. Not your work computer.
A separate, isolated environment.

Never expose the control interface to anything beyond localhost. VPN or SSH tunnel only for remote access. No exceptions.

Give it fresh, least-privilege credentials. Not your real browser profile. Not your personal email. Not your main cloud account. A separate set of throwaway creds with the minimum necessary permissions.

Since it relies on skills from mostly unknown providers instead of custom-built tools, treat every skill, integration, and dependency as attack surface. Because it is.

Assume anything the agent can read will eventually be exposed if the instance is compromised, and scope permissions accordingly. And obviously, NEVER EXPOSE YOUR COMPANY INFORMATION, no matter whether it’s on a VPS, a Mac Mini, or whatever.

This is what I mean when I say people don’t understand what’s happening yet. They think AI agents are a convenience layer. A lobster bot. A morning briefing tool. Something fun. They are not. If it were safe or even that useful, why do you think Anthropic wanted nothing to do with this tool? It’s OpenAI who leaned heavily into the hype around it rather than the substance, and didn’t care much anyway; that developer just vibe-coded and never had experience with AI production infrastructure, security reviews, or AI systems at any scale.

Real AI agents are autonomous software with system-level access that can read everything you have, act as you, and operate continuously without supervision. Used by someone who knows what they’re doing, for legitimate purposes like the OSINT work I described above, they’re powerful. Used carelessly on a personal machine with default settings, they’re a breach waiting to happen. And used by someone with bad intentions, running a local model with no guardrails on a machine with nothing to lose, pointed at a target whose entire public surface is fifteen minutes away from being fully mapped?

That’s not a productivity tool.
That’s a weapon that most people are either ignoring or actively installing on the same computer where they do their banking. And now I know that even without my voice ever being recorded, a system with my photos and my behavioral profile can generate a synthetic version of me convincing enough to fool most people who know me.

Everyone reading this has this surface. It’s bigger than you think, and you have less control over it than you believe. The gap between “technically possible” and “runs autonomously in fifteen minutes” closed a while ago. Most people just haven’t noticed yet.
Originally posted by u/Kakachia777 on r/ArtificialInteligence
