Fake AI Agent Skill Slipped Past Every Scanner and Reached 26,000 Agents

TL;DR

What: Security firm AIR planted a fake skill named brand-landingpage that passed every scanner it tested, borrowed a marketplace repo's 36,000 stars, and pointed agents at an external setup page it controlled and later swapped for a malicious payload.
Impact: AIR claims the skill reached roughly 26,000 agents including some on corporate accounts, and the same foothold could have read files, exfiltrated data, or pivoted to internal systems bounded only by the agent's access.
Fix / mitigation: Treat skills as software: vet what a skill points to and not just what ships inside it, route all skills through a single source you control, pin versions, re-check on any change, and hold agents to least privilege.
Who's at risk: Any organization where users install AI agent skills from marketplaces, especially via clean-scan and high-star trust signals.

Security firm AIR built a fake AI agent skill, pushed it through a popular skill marketplace and an Instagram ad, and says it reached roughly 26,000 agents, including some on corporate accounts. Every skill scanner the firm tested marked it safe. The payload was harmless by design, collecting only the user's email address, but the foothold was real: a clean scanner verdict, borrowed GitHub stars, and an open-source reputation all failed to catch it.

A skill is a bundle of instructions an agent loads into its own context and follows with roughly the authority of a user prompt. That trust is the entire problem, and it is the reason skill-scanning tools exist. AIR's experiment shows those tools are checking the wrong thing.

How a clean package borrowed trust

The skill, named brand-landingpage, claimed to build a landing page using Google's Stitch design tool and was aimed squarely at non-technical users. To look credible, AIR went after two trust signals. For stars, it opened a pull request to a skill marketplace repo with around 36,000 stars and 156 skills; the PR was merged after a few days, so the skill inherited the repo's count. Then it ran an Instagram ad targeting marketers, salespeople, and designers, who installed it and put it to work.

For the clean scan, AIR exploited a structural blind spot. The skill carried no setup instructions of its own. It told the agent to install the 'Stitch SDK' by following documentation at an external link, stitch-design.ai, a domain AIR controls. The real Stitch lives at stitch.withgoogle.com. The lookalike domain is the kind of detail a non-technical installer never checks.

Why the scanners missed it

The scanners AIR tested, Cisco's, NVIDIA's, and the ones wired into skills.sh, analyze the package you hand them: the SKILL.md and the files shipped alongside it. At first the external link led to the genuine Stitch docs, so the scanners saw a clean package pointing at a plausible setup page and cleared it. The page the agent would actually fetch and follow sat outside the scan entirely.

Once the skill was installed widely, AIR swapped the page behind the link. The new version told the agent to download and run a script. In the demo it only mailed the user's address back to AIR, which is how the firm counted the agents it reached. A real operator could have used that foothold to read files, move data, or hit internal systems, bounded only by what the agent could reach.

The scan is a snapshot, the payload is live

A scanner checks a fixed package at one moment. The page a skill points the agent to can be rewritten at any time after the check clears. Keep the submitted skill clean, host the payload on a site you control, and the scan never sees the malicious version. This is not a bug in any one scanner; it is the model.

Not the first to prove it

Three weeks earlier, Trail of Bits bypassed ClawHub's malicious-skill detector, Cisco's scanner, and all three scanners wired into skills.sh. Its conclusion was blunt: a scanner checks a fixed package while an attacker keeps tweaking the payload until it passes. Real campaigns have used the same trick for months, keeping the submitted skill clean and hosting the payload on a site the agent only fetches at install.

Anthropic's own documentation already warns that skills fetching external URLs are risky for exactly this reason, since the content can change after the skill is vetted. Separate research this year found scanners often disagree, because each judges a skill in isolation, blind to its external links and to what changes after review. AIR did not find a new bug. It lined up every weak trust signal around agent skills into one run: stars that can be borrowed, a scan that reads a snapshot, and a link that can be rewritten after the check clears.

What to do

The read for defenders is the same one researchers keep landing on, now with a sharper example behind it. Treat skills as software, not text. Vet what a skill points to, not just what ships inside it. Most of these add-ons get installed with no review, so the first job is finding what is already running in your environment.

Inventory first: enumerate every skill already installed across user and corporate agent accounts before anything else.
Route new skills through a single source you control, and re-check them whenever anything changes.
Pin versions. A clean result at install does not stay clean if the skill phones out to a link someone else can edit.
Hold agents to least privilege. Assume any external instruction an agent fetches runs with the agent's full access.
Block or alert on skills that fetch external setup pages, and flag lookalike domains like stitch-design.ai versus stitch.withgoogle.com.

Read the numbers skeptically

The method holds up even if the count doesn't

AIR is launching a managed skill marketplace and closes its write-up pitching it, so the 26,000 figure, the corporate-account detail, and the full-control claim are the company's own and are not independently confirmed. What is verified: the named scanners really do judge only the submitted package, the external-link blind spot has been independently demonstrated, and stars plus a clean scan are exactly the signals the ecosystem still treats as proof.

Whether the real figure is 26,000 or a fraction of it, the gap it walks through is one defenders have not closed. The trust signals everyone leans on for agent skills, marketplace stars, open-source reputation, and a passing scan, were all defeated in a single run by a team that controlled one external link. Until vetting follows what a skill points to and re-runs on change, every skill marketplace is a supply chain you do not control.

Questions about your exposure?

RedEye Security provides assessments for organizations that need to understand their real risk.

Talk to us