Researchers Build Self-Replicating AI Worm That Operates Entirely on Local, Open-Weight Models

## A Self-Replicating AI Worm Just Infected 62% of a Test Network — No Cloud, No API, No Kill Switch

University of Toronto researchers have demonstrated something that cybersecurity experts have feared for years: an AI-driven computer worm that can autonomously exploit vulnerabilities, replicate itself across a network, and adapt its attack strategy in real time — all without relying on any commercial AI service that could be shut down.

The proof-of-concept worm, documented in a preprint posted to arXiv on June 2, used a locally hosted open-weight large language model to reason its way through a 33-host test network. Across 15 independent runs, it gained elevated access on an average of 23.1 hosts and successfully replicated to 20.4 of them — roughly 62% of the full network — over seven days, with no human input whatsoever.
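The headline percentage follows directly from the reported means. As a quick check, the sketch below recomputes the shares from the figures quoted above; the variable names are illustrative, and the last ratio divides one mean by another, so it only approximates a per-run conversion rate.

```python
# Figures as reported in the article; names are illustrative only.
NETWORK_HOSTS = 33
MEAN_ELEVATED = 23.1      # mean hosts with elevated access per run
MEAN_REPLICATED = 20.4    # mean hosts the worm replicated to per run


def share(count: float, total: float) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


elevated_pct = share(MEAN_ELEVATED, NETWORK_HOSTS)
replicated_pct = share(MEAN_REPLICATED, NETWORK_HOSTS)
# Ratio of the two means: an approximation, not a per-run statistic.
conversion_pct = share(MEAN_REPLICATED, MEAN_ELEVATED)

print(f"elevated access: {elevated_pct:.1f}% of network")
print(f"replicated:      {replicated_pct:.1f}% of network")
print(f"replicated / elevated: {conversion_pct:.1f}%")
```

On these numbers, the worm gained elevated access on about 70% of the network and replicated to about 62% of it.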

This is not a theoretical exercise. It is a working proof of concept of an autonomous threat that rewrites its own code, adapts to newly disclosed vulnerabilities, and spreads without any centralized infrastructure for defenders to target.

### How It Works

Led by associate professor Nicolas Papernot, the CleverHans Lab team (spanning the University of Toronto, Vector Institute, University of Cambridge, and ServiceNow) built a worm that fundamentally differs from traditional malware. Conventional worms ship with a fixed exploit payload chosen at build time. Patch those specific bugs, and the worm stops spreading.

This worm uses an open-weight LLM running on a single GPU to generate attack logic at runtime, tailored to whatever it finds on each target. No pre-encoded exploit chain. No dependency on OpenAI, Anthropic, or any other API that a platform could revoke or rate-limit.

In the tiered design, infected GPU-capable hosts serve as distributed reasoning nodes that provide inference for lower-compute devices on the network that cannot run the model themselves. The worm reached up to seven generations of self-replication, with a mean of 5.1 across test runs. Full GPU-tier replication — where the worm stages a local LLM instance on a compromised GPU host — succeeded in 68.8% of attempts.

### The Patching-Window Problem

The most alarming finding: the worm worked around its own training cutoff by ingesting public advisory text at runtime. It successfully exploited three vulnerabilities disclosed after the model was trained: CVE-2026-39987 (a pre-authentication RCE in Marimo with a CVSS score of 9.3), CVE-2026-31431 (a Linux kernel privilege-escalation flaw), and CVE-2026-43284/43500 (DirtyFrag).

Against the three hosts carrying those post-training vulnerabilities, the worm reached root in 41 of 67 attempts (about 61%). CVE-2026-39987 was disclosed on April 8, 2026, and Sysdig observed exploitation in honeypots just 9 hours and 41 minutes later. The parallel to WannaCry is the patch gap, not the blast radius: Microsoft had shipped the fix for EternalBlue roughly two months before WannaCry struck in May 2017, yet many systems remained unpatched. An adaptive worm can keep testing new paths while defenders are still validating fixes.

### Zero Marginal Cost, No Central Kill Switch

Two characteristics make this fundamentally harder to contain than a traditional worm.

First, the cost shifts from rented API access to whatever compute the worm can capture. Once a GPU-capable victim exists, the attacker no longer pays per attempt. The infected infrastructure itself becomes the attack engine.

Second, because everything runs on open-weight models with no vendor dependency, provider-side controls are irrelevant. No API key to revoke. No service to suspend. No rate limit to enforce. Containment has to happen at the network and host layer.

The researchers also observed the worm rewriting its own code on several occasions to bypass local security controls — behavior they never coded for.

### Context: This Is Not Isolated

The Toronto work sits at the intersection of two emerging trends. Morris II (Cohen et al., 2025) demonstrated self-replicating adversarial prompts spreading across AI email assistants. In March 2026, ClawWorm showed self-replicating attacks across LLM agent ecosystems. But in those cases, the AI application layer was the target. Here, the LLM is the attack engine, and ordinary network infrastructure is the target.

Real-world operations are already testing the same boundary. Anthropic disclosed in November 2025 that it disrupted a large AI-orchestrated espionage campaign attributed to Chinese state-sponsored group GTG-1002, where Claude Code handled 80-90% of the operation. Google's Threat Intelligence Group reported in May 2026 what it assessed as the first zero-day exploit developed with AI assistance. The Toronto research is the lab version of that same direction, pushed into autonomous host-level propagation.

### What Defenders Should Do Now

The current prototype was deliberately built without stealth features — no encryption, no polymorphic code, no persistence mechanisms. A malicious variant with those additions would eliminate the easy detection signals this version leaves behind. The window to develop defenses is now.

**Segment GPU-capable machines aggressively.** The worm's design routes LLM inference through any compromised GPU host it can reach. In a flat network, one compromised deep-learning server becomes a reasoning hub for every infected device on the same subnet. Apply zero-trust controls to prevent lateral reach to and from those hosts.
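One way to act on this is to audit existing allow rules for paths into GPU hosts. The sketch below is a minimal illustration, assuming IPv4 rules exported as source CIDR, destination CIDR, and port; the `AllowRule` type, the addresses, and the approved-client list are all hypothetical, and the check covers inbound reach only.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class AllowRule:
    src: str   # source CIDR
    dst: str   # destination CIDR
    port: int  # destination TCP port


def overly_broad_rules(rules, gpu_hosts, approved_clients):
    """Return rules that let a non-approved source range reach a GPU host."""
    gpu = [ipaddress.ip_address(h) for h in gpu_hosts]
    approved = [ipaddress.ip_network(c) for c in approved_clients]
    findings = []
    for rule in rules:
        src_net = ipaddress.ip_network(rule.src)
        dst_net = ipaddress.ip_network(rule.dst)
        reaches_gpu = any(h in dst_net for h in gpu)
        # The source is acceptable only if it sits entirely inside an approved range.
        src_ok = any(src_net.subnet_of(a) for a in approved)
        if reaches_gpu and not src_ok:
            findings.append(rule)
    return findings


# Hypothetical rule set for illustration.
rules = [
    AllowRule("10.0.0.0/16", "10.0.5.0/24", 22),    # whole campus to GPU subnet
    AllowRule("10.0.9.0/28", "10.0.5.10/32", 443),  # approved jump hosts only
    AllowRule("10.0.0.0/16", "10.0.7.0/24", 80),    # does not touch a GPU host
]
findings = overly_broad_rules(rules, ["10.0.5.10"], ["10.0.9.0/28"])
for rule in findings:
    print("review:", rule)
```

A fuller audit would apply the same test in the outbound direction, since the advice covers reach both to and from GPU hosts.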

**Treat published advisories as near-term weaponization targets.** For internet-facing CVEs, the exploitation window is measured in hours. Verify exploitability fast, patch internet-facing exposure first, and use compensating controls when deployment cannot happen before the next business cycle.
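A patch queue that reflects this ordering can be sketched in a few lines. The example below is an assumption-laden illustration: the `Advisory` type and the `EXAMPLE-*` entries are hypothetical, the disclosure time of day is assumed to be midnight UTC, and whether a given service is internet-facing depends on the local deployment.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Advisory:
    cve_id: str
    cvss: float
    disclosed: datetime      # public disclosure time (UTC)
    internet_facing: bool    # is the affected service exposed in this environment?
    patched: bool = False


def triage(advisories):
    """Order open advisories: internet-facing first, then highest CVSS, then oldest."""
    open_items = [a for a in advisories if not a.patched]
    return sorted(open_items, key=lambda a: (not a.internet_facing, -a.cvss, a.disclosed))


def hours_exposed(advisory, now):
    """Hours elapsed between disclosure and `now`."""
    return (now - advisory.disclosed).total_seconds() / 3600


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


advisories = [
    Advisory("EXAMPLE-INTERNAL", 9.8, utc(2026, 4, 1), internet_facing=False),
    # CVSS and date are from the article; exposure is assumed for illustration.
    Advisory("CVE-2026-39987", 9.3, utc(2026, 4, 8), internet_facing=True),
    Advisory("EXAMPLE-EDGE", 7.5, utc(2026, 4, 7), internet_facing=True),
    Advisory("EXAMPLE-DONE", 9.9, utc(2026, 4, 2), internet_facing=True, patched=True),
]

now = utc(2026, 4, 9)
queue = triage(advisories)
for item in queue:
    print(item.cve_id, f"{hours_exposed(item, now):.0f}h since disclosure")
```

Sorting on exposure before severity is a deliberate choice here: a lower-scored flaw on an internet-facing service outranks a higher-scored one that is reachable only internally.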

**Rotate credentials on any compromised or suspected host.** The worm demonstrated systematic credential reuse as a propagation path. Harvested credentials move laterally faster than most detection cycles.
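Because reuse is the propagation path, rotation should extend to every host that shares a credential with a compromised one. The sketch below assumes an inventory mapping each host to the fingerprints of credentials stored on it; the inventory format, host names, and fingerprints are hypothetical.

```python
def hosts_needing_rotation(credential_map, compromised):
    """Map each non-compromised host to the credentials it shares with a compromised host.

    credential_map: dict of host -> set of credential fingerprints found on that host.
    compromised: set of hosts known or suspected to be compromised.
    """
    exposed = set()
    for host in compromised:
        exposed |= credential_map.get(host, set())

    affected = {}
    for host, creds in credential_map.items():
        shared = creds & exposed
        if shared and host not in compromised:
            affected[host] = shared
    return affected


# Hypothetical inventory for illustration.
inventory = {
    "gpu-01": {"key:aa11", "svc:backup"},
    "web-01": {"svc:backup", "key:bb22"},
    "db-01": {"key:cc33"},
}
to_rotate = hosts_needing_rotation(inventory, {"gpu-01"})
for host, creds in sorted(to_rotate.items()):
    print(host, sorted(creds))
```

In this example a single compromised GPU host pulls `web-01` into the rotation scope through a shared service credential, while `db-01` is unaffected.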

**Monitor for agent-specific behavioral signals.** Non-standard port activity, automated SSH public key injection, and clusters of LLM inference appearing on unexpected endpoints are the observable artifacts this prototype leaves behind. They are the starting point for detection logic.
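Two of these signals can be turned into first-pass detection rules. The sketch below runs over a hypothetical stream of host events; the `Event` schema is invented for illustration, and the port set is an assumption based on the default ports of two common local inference servers (Ollama on 11434, vLLM's OpenAI-compatible server on 8000), since the article does not say which runtime the prototype used.

```python
from dataclasses import dataclass

# Assumption: default ports of common local inference servers, not taken from the paper.
INFERENCE_PORTS = {11434, 8000}


@dataclass(frozen=True)
class Event:
    host: str
    kind: str    # "listen" or "file_write"
    detail: str  # port number for "listen", file path for "file_write"


def flag_events(events, approved_inference_hosts):
    """Return (host, reason) pairs for events matching either signal."""
    alerts = []
    for e in events:
        if e.kind == "listen" and e.detail.isdigit():
            if int(e.detail) in INFERENCE_PORTS and e.host not in approved_inference_hosts:
                alerts.append((e.host, "unexpected inference listener"))
        elif e.kind == "file_write" and e.detail.endswith("/.ssh/authorized_keys"):
            alerts.append((e.host, "authorized_keys modified"))
    return alerts


# Hypothetical events for illustration.
events = [
    Event("ml-01", "listen", "11434"),                            # approved inference host
    Event("hr-laptop-7", "listen", "11434"),                      # inference where none is expected
    Event("web-01", "file_write", "/root/.ssh/authorized_keys"),  # SSH key injection signal
    Event("web-01", "file_write", "/var/log/app.log"),            # benign write
]
alerts = flag_events(events, approved_inference_hosts={"ml-01"})
for host, reason in alerts:
    print(host, "->", reason)
```

Port-based matching will produce false positives, because port 8000 in particular is widely used by unrelated services, so these rules are a starting point for correlation rather than a verdict.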

The implementation is not publicly released. The University of Toronto is establishing a vetting process for qualified defensive researchers to request access.

### What This Means For You

If you work in cybersecurity, this research should reshape your threat model. AI-powered attacks no longer have to depend on cloud APIs or centralized infrastructure. With open-weight models, an attacker who controls a single GPU can run autonomous, adaptive malware that draws on public vulnerability disclosures faster than most patch cycles run.

For IT leaders, the implications are direct: flat networks are now existential liabilities. GPU infrastructure — increasingly common in everything from research labs to corporate data science teams — must be segmented and monitored. And the assumption that published CVEs give you weeks or months to patch is no longer valid when an AI worm can read the same advisory and exploit it within hours.

For everyone else, this is a reminder that the cybersecurity arms race is accelerating. The tools are getting smarter, cheaper, and more autonomous. Your best defense remains the basics: update immediately, segment your networks, use multi-factor authentication, and treat every public vulnerability disclosure as a countdown timer, not a suggestion.
