CLI-Anything Exposes a Structural Blind Spot: No Security Scanner Can Detect Malicious AI Agent Instructions

CLI-Anything, a tool from the University of Hong Kong’s Data Intelligence Lab that generates structured command-line interfaces for AI coding agents, has accumulated 30,000 GitHub stars since its March 2026 launch. It supports Claude Code, Codex, OpenClaw, Cursor, and GitHub Copilot CLI. On May 5, VentureBeat published an investigation revealing that the same mechanism powering CLI-Anything represents a structural gap no security tool currently addresses: malicious instructions embedded in agent skill definitions execute with full system privileges, and no mainstream scanner has a detection category for them.

The problem is not CLI-Anything itself. The problem is what CLI-Anything makes visible: an entire execution layer operating between source code and package dependencies where traditional security tooling has zero coverage.

Enterprise security stacks monitor two layers. Static application security testing (SAST) scans source code for injection flaws, hardcoded secrets, and insecure patterns. Software composition analysis (SCA) checks dependency versions against known vulnerabilities and generates software bills of materials. Both are mature, well-understood, and entirely irrelevant to how AI agents receive their operating instructions.

Agent bridge tools, including CLI-Anything, MCP connectors, Cursor rules files, and Claude Code skills, operate on a third layer. Cisco’s engineering team confirmed the gap in April: “Traditional application security tools were not designed for this. SAST scanners analyze source code syntax. SCA tools check dependency versions. Neither understands the semantic layer where MCP tool descriptions, agent prompts, and skill definitions operate.”

Merritt Baer, CSO of Enkrypt AI and former Deputy CISO at Amazon Web Services, told VentureBeat: “SAST and SCA were built for code and dependencies. They don’t inspect instructions.”

CLI-Anything generates SKILL.md files. These are markdown documents containing setup instructions, code examples, and configuration templates. Nothing in them is executable in the traditional sense. A code reviewer would approve them because they look like documentation. But an AI coding agent parses those same documents as operational directives and executes them with its own credentials, shell access, file system permissions, and messaging capabilities.

The Snyk Audit: 13.4% Critical, 36.8% Flawed

Snyk’s ToxicSkills research, published in February 2026, represents the first comprehensive security audit of the agent skills ecosystem. Snyk researchers scanned 3,984 skills from ClawHub and skills.sh. The results:

534 skills (13.4%) contain at least one critical-level security issue, including malware distribution, prompt injection, and credential exfiltration
1,467 skills (36.82%) have at least one security flaw at any severity level
76 confirmed malicious payloads designed specifically for credential theft, backdoor installation, and data exfiltration
8 of those malicious skills remained publicly available on ClawHub at the time of publication

Daily skill submissions to ClawHub jumped from under 50 in mid-January to over 500 by early February 2026. The barrier to publishing: a SKILL.md markdown file and a GitHub account one week old. No code signing. No security review. No sandbox by default.

The parallel to early npm and PyPI is obvious but incomplete. Unlike traditional packages that execute in isolated contexts, agent skills inherit the full permissions of the AI agent they extend. According to Snyk, installing an OpenClaw skill grants that skill shell access, read/write file system permissions, access to credentials in environment variables and config files, the ability to send messages via email and messaging platforms, and persistent memory that survives across sessions.

Document-Driven Implicit Payload Execution

The academic security community has already formalized the attack methodology. Researchers at Griffith University, Nanyang Technological University, the University of New South Wales, and the University of Tokyo published “Supply-Chain Poisoning Attacks Against LLM Coding Agent Skill Ecosystems” in April 2026. The paper introduces Document-Driven Implicit Payload Execution (DDIPE), a technique that embeds malicious logic inside code examples within skill documentation.

The code examples look like standard tutorials. An agent reads them, interprets the embedded patterns as instructions, and executes. Across four agent frameworks and five large language models, DDIPE achieved bypass rates between 11.6% and 33.5%. Static analysis caught most samples, but 2.5% evaded all four detection layers. The researchers conducted responsible disclosure, resulting in four confirmed vulnerabilities and two vendor fixes.

Carter Rees, VP of AI at Reputation, identified the architectural weakness that amplifies this attack class. “A significant vulnerability in enterprise AI is broken access control, where the flat authorization plane of an LLM fails to respect user permissions,” he told VentureBeat. A compromised skill definition riding that flat authorization plane does not need privilege escalation. It already has the agent’s full permissions.

Real Attacks Already Documented

This is not a theoretical exercise. Multiple production incidents demonstrate the kill chain in action.

In April 2026, a documented attack used a crafted GitHub issue title to trigger an AI triage bot wired into Cline. The bot exfiltrated a GITHUB_TOKEN, which the attacker used to publish a compromised npm dependency. That dependency installed a second agent on approximately 4,000 developer machines for eight hours before detection. One issue title. Eight hours of access. No human approved any action in the chain.

Pillar Security demonstrated a related attack against Cursor in January 2026, assigned CVE-2026-22708. Implicitly trusted shell built-in commands could be poisoned through indirect prompt injection, converting benign developer commands into arbitrary code execution vectors. Users saw only the final command. The poisoning happened through earlier commands the IDE never surfaced for review.

The ClawHavoc campaign, first reported by Koi Security in late January 2026, initially identified 341 malicious skills on ClawHub. A follow-up analysis by Antiy CERT expanded the count to 1,184 compromised packages. The campaign delivered Atomic Stealer (AMOS) through skills with professional documentation and names matching what developers actively searched for: solana-wallet-tracker, polymarket-trader.

MCP Marketplaces: Same Pattern, Wider Surface

The Model Context Protocol layer carries identical exposure. OX Security reported in April that researchers poisoned nine out of 11 MCP marketplaces using proof-of-concept servers. Trend Micro initially found 492 MCP servers exposed to the internet with zero authentication. By April, that number reached 1,467.

The root issue, as The Register reported, lies in Anthropic’s MCP SDK transport mechanism. Any developer using the official SDK inherits the vulnerability class. This is not a configuration mistake at scale. It is a design-level exposure baked into the protocol’s reference implementation.

First-Generation Defense Tools

Two tools represent the earliest attempts to address this layer. Cisco released its open-source AI Agent Security Scanner for IDEs in April 2026. Snyk shipped mcp-scan, purpose-built for detecting malicious MCP server configurations.

Both tools acknowledge the same constraint: they are first-generation, scanning for known-bad patterns rather than understanding semantic intent. Baer’s diagnosis applies broadly: “Current scanners look for known bad artifacts, not adversarial instructions embedded in otherwise valid skills.”

The SafeDep project has published an agent skills threat model demonstrating supply chain attacks through PEP 723 inline metadata. GitHub added gh skill publish in April with optional tag protection, secret scanning, and code scanning checks for skills repositories. These are recommended, not required.

The Pre-Exploitation Window

The current situation maps to a specific moment in security history. Before SolarWinds (2020), supply chain attacks through build systems were a known theoretical risk with minimal tooling investment. The industry built detection capabilities after the first major incident, not before.

Agent instruction poisoning is in its pre-SolarWinds phase. The attack vectors are documented. The tooling gap is confirmed by Cisco, Snyk, and multiple academic papers. Active exploitation is already happening at small scale (ClawHavoc, the Cline bot incident). What has not happened yet is a single high-profile compromise that forces industry-wide tooling investment.

Security teams that inventory their agent bridge tools now, deploy Cisco’s scanner or Snyk’s mcp-scan, and assign ownership of the instruction layer get ahead of that incident. Everyone else will be retrofitting after the breach disclosure.

The open question is whether the security industry can build detection capabilities for semantic-layer attacks before the attack community scales what it already knows works. CLI-Anything’s 30,000 stars suggest the adoption curve is not waiting.

CLI-Anything Exposes a Structural Blind Spot: No Security Scanner Can Detect Malicious AI Agent Instructions

Three Layers, One Blind Spot

The Snyk Audit: 13.4% Critical, 36.8% Flawed

Document-Driven Implicit Payload Execution

Real Attacks Already Documented

MCP Marketplaces: Same Pattern, Wider Surface

First-Generation Defense Tools

The Pre-Exploitation Window

Get our morning briefing in your inbox

Keep Reading

Microsoft Agent 365 Reaches General Availability With OpenClaw Detection, Shadow AI Controls, and Cross-Cloud Agent Governance

OpenAI Turns ChatGPT Into the Billing Layer for 3.2 Million OpenClaw Users. Anthropic Shut the Same Door a Month Ago.

Seven Agent Payment Systems Launched in 72 Hours: How the Commerce Stack for Autonomous AI Crystallized in One Week