security February 23, 2026

The ClawHavoc Attack: 1,200 Malicious AI Skills and How to Protect Yourself

In January 2026, the ClawHavoc campaign planted 1,200 malicious skills in AI agent marketplaces. Here's what happened, why it matters, and how to verify your agent skills are safe.

V
Varun Pratap Bhardwaj

What Happened

In January 2026, security researchers uncovered something that changed the conversation around AI agent safety overnight. A coordinated attack campaign — now known as ClawHavoc — had planted over 1,200 malicious skills across major AI agent marketplaces.

These were not crude, obvious payloads. They were polished, well-documented skills that appeared to do exactly what they advertised. A Markdown formatter. A database query helper. A file organizer. Legitimate descriptions, legitimate-looking code, legitimate reviews. Underneath, they carried data exfiltration routines, credential theft mechanisms, and remote code execution capabilities.

The attackers understood something that most of the AI community had not yet internalized: agent skills are executable code with broad system access. A skill installed in your Claude Desktop, Cursor, or any MCP-compatible tool can read your files, execute shell commands, make network requests, and interact with APIs on your behalf. The trust model is implicit. If you install it, it runs with your permissions.

ClawHavoc exploited that implicit trust at scale.

If you installed agent skills between November 2025 and January 2026

You should audit your installed skills immediately. The ClawHavoc campaign was active for approximately two months before discovery. Skills installed during this window from unverified sources should be treated as potentially compromised until verified.

The First CVE for Agent Software

The ClawHavoc discovery led directly to CVE-2026-25253 — the first Common Vulnerability and Exposure identifier ever assigned to agent software. This was not a symbolic gesture. It was the formal recognition by the global security community that AI agent skills represent a real, categorized attack surface.

Before this CVE, agent skills existed in a security gray zone. They were not traditional software packages (npm, PyPI), not browser extensions, not mobile apps. They fell outside existing vulnerability taxonomies. CVE-2026-25253 changed that. Agent skills are now a tracked threat category with formal reporting infrastructure.

The Scale Nobody Expected

ClawHavoc was the headline, but it was not the whole story. Subsequent research catalogued over 6,000 malicious agent tools across various marketplaces and distribution channels. The 1,200 from ClawHavoc were just one coordinated campaign. Other malicious skills were uploaded independently by different threat actors, each with their own objectives.

The breakdown of malicious capabilities found across these 6,000+ tools:

  • Data exfiltration — Skills that silently copied file contents, environment variables, and configuration data to external endpoints
  • Credential harvesting — Skills that extracted API keys, tokens, and passwords from project files and system keychains
  • Remote code execution — Skills that established reverse shells or downloaded secondary payloads
  • Prompt injection — Skills that manipulated the AI agent’s context to alter its behavior on other tasks
  • Persistence mechanisms — Skills that modified system configurations to survive uninstallation attempts

This is not theoretical. These capabilities were found in skills that real developers had installed and were actively using.

Why Traditional Scanning Fails

The natural response to a supply chain attack is to reach for a scanner. Run a security tool, get a report, remediate the findings. That is the playbook for npm audit, Snyk, Dependabot, and every other software composition analysis tool.

It does not work for agent skills. Here is why.

Heuristic scanners are reactive. They maintain databases of known bad patterns — signatures of known malware, flagged package names, suspicious API calls. If an attack matches a known pattern, the scanner catches it. If it does not, the attack passes through. One widely-used security scanner states explicitly in its documentation: “No findings does not mean no risk.”

That caveat is honest, but it reveals the fundamental limitation. Heuristic scanning tells you whether a skill matches something that was already discovered. It cannot tell you whether a skill does something it should not.

Agent skills compound this problem because they operate at a higher abstraction level than traditional packages. A malicious npm package typically needs to exploit a known vulnerability or execute suspicious system calls. A malicious agent skill just needs to describe a capability the AI agent will execute faithfully. The attack surface is the capability model itself, not a buffer overflow or a SQL injection.

The gap between what a skill claims to do and what it actually does is the attack surface. Traditional scanners do not measure that gap.

How to Protect Yourself

The defense against capability-level attacks requires capability-level verification. You need a tool that does not ask “does this match a known bad pattern?” but instead asks “what can this skill actually do, and does that match what it claims?”

SkillFortify provides formal verification for agent skills. It mathematically proves what a skill CAN do versus what it CLAIMS to do. Five soundness theorems guarantee the analysis is complete — not heuristic, not probabilistic, but provably correct. Independent evaluation shows F1 = 96.95% with zero false positives.

Here is the practical workflow:

1. Scan Your Project

skillfortify scan ./your-project

This scans all agent skill files, MCP configurations, and tool definitions in your project directory. Every skill is analyzed against its declared capabilities. Discrepancies are flagged with severity ratings.

2. Verify Individual Skills

skillfortify verify ./skills/suspicious-skill.md

Deep verification of a single skill. Returns a detailed capability analysis showing exactly what the skill can do — file system access, network requests, command execution, data access patterns — compared against its stated purpose.

3. Lock Your Configuration

skillfortify lock ./your-project

Generates a cryptographic lockfile for your current skill configuration. Any modification to skill files, MCP configs, or tool definitions will be detected on the next scan. This is the agent equivalent of package-lock.json — reproducible, auditable, tamper-evident.

4. Generate a Software Bill of Materials

skillfortify sbom ./your-project

Produces a complete inventory of all agent skills, their declared capabilities, their verified capabilities, and their verification status. Essential for compliance workflows and security audits.

Why formal verification, not heuristic scanning

Heuristic scanners ask: “Does this match something bad we have seen before?” Formal verification asks: “What can this provably do?” The first approach misses novel attacks by definition. The second approach catches any capability discrepancy, regardless of whether the attack pattern has been seen before. That is the difference between reactive security and mathematical security.

What You Should Do Today

The ClawHavoc campaign is a wake-up call, not a one-time event. Agent skill supply chain attacks will increase as AI agent adoption grows. The attack economics are too favorable — high access, low detection, broad distribution.

Immediate actions:

  1. Audit every agent skill you have installed. If you cannot explain what each skill does and why you need it, remove it.
  2. Install SkillFortify and run a scan. pip install skillfortify — then scan your project directories and MCP configurations.
  3. Lock your configurations. Generate lockfiles so you will know immediately if anything changes.
  4. Add verification to your workflow. Run skillfortify scan before accepting any new skill, the same way you would review a dependency before adding it to package.json.

The agent skill ecosystem is where npm was in 2015 — widespread adoption, minimal verification. We know how that story played out. The difference is that agent skills have more access to your system than npm packages ever did.

Do not wait for your own incident to start verifying.

Resources