Anthropic’s New AI Can Hack Into Almost Anything
Anthropic has built an AI model, Mythos Preview, so capable at hacking that the company has decided not to release it to the public. Instead, Anthropic is using it defensively, to find and patch the very vulnerabilities it could exploit, through a program called Project Glasswing.
The findings are hard to overstate. The model found a 27-year-old bug in OpenBSD, an operating system known specifically for its security focus, and demonstrated how it could be used to remotely crash any machine running the OS. It also discovered a 16-year-old flaw in FFmpeg, the video processing library that virtually every major streaming service relies on: a bug introduced in 2003 that had survived years of dedicated fuzzing by security researchers. And Mythos Preview didn't just find bugs; it built working exploits. For a 17-year-old vulnerability in FreeBSD's file-sharing server, the model autonomously found the flaw and wrote a complete attack that gives a total stranger on the internet full control of the machine, no password required. Perhaps most striking: engineers at Anthropic with no formal security training asked the model to find vulnerabilities overnight and woke up the next morning to a complete, working exploit.
The gap between this model and its predecessors is enormous. The previous flagship model, Opus 4.6, succeeded at autonomously building exploits for Firefox vulnerabilities just twice out of several hundred attempts. Mythos Preview succeeded 181 times in the same test. Crucially, Anthropic says this capability wasn’t deliberately trained in — it emerged as a side effect of general improvements in reasoning, coding ability, and autonomous operation.
One of the more alarming capabilities is the model's ability to combine multiple bugs into a single attack, something that previously required highly skilled human researchers working over weeks. In web browsers, Mythos Preview independently found the necessary attack components and chained them together into what's called a "JIT heap spray," a sophisticated technique that can help an attacker escape the browser's security sandbox entirely. In one case, the team turned this into an attack where visiting a single malicious webpage could give the attacker write access to the operating system's kernel. On Linux, the model successfully chained together two, three, and sometimes four separate vulnerabilities to achieve full administrator access on machines with all modern defenses enabled.
So why is Anthropic telling us this? They frame this as a “watershed moment” for cybersecurity. Their argument: the same capabilities that make Mythos Preview dangerous also make it enormously useful for defense. Used at scale, it can find and report vulnerabilities faster than any human team. They’ve already identified thousands of high- and critical-severity vulnerabilities and are working through a coordinated process to notify software maintainers. Because over 99% haven’t been patched yet, they can’t share specifics — but they’ve published cryptographic fingerprints of the bug reports as a form of accountability, proving they have the findings without revealing what they are.
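Anthropic hasn't published the details of its fingerprinting scheme, but the idea behind publishing "cryptographic fingerprints" is a standard hash commitment: release a one-way digest of each report now, reveal the report after a patch ships, and anyone can verify the two match. A minimal sketch of the idea (the report text and salting choice here are invented for illustration; a random salt keeps outsiders from brute-force guessing the report's contents):

```python
import hashlib
import secrets

def fingerprint(report: bytes, salt: bytes) -> str:
    """Commit to a report's contents: SHA-256 digest over salt + report."""
    return hashlib.sha256(salt + report).hexdigest()

# Publish the digest today; keep the report and salt private.
report = b"Hypothetical report: heap overflow in example parser"
salt = secrets.token_bytes(16)  # random salt prevents guessing the report
published = fingerprint(report, salt)

# After the patch ships, release the report and salt. Anyone can
# recompute the digest and confirm the finding predates the patch.
assert fingerprint(report, salt) == published

# Any change to the report, however small, breaks the commitment.
assert fingerprint(report + b"!", salt) != published
```

Because SHA-256 is one-way, the published digest reveals nothing about the bug, but it binds the publisher to exactly one report.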
Their practical advice is urgent. Turning a known public vulnerability into a working exploit — something that used to take a skilled researcher days or weeks — can now happen in hours, cheaply and automatically. The window between “patch released” and “attackers can exploit it” has shrunk dramatically. They also argue that even today’s publicly available models are already capable of finding serious bugs, and that organizations that haven’t explored this are leaving real value on the table. As vulnerability discovery accelerates, the volume of security incidents will likely grow to match — and most incident response teams cannot staff their way through that volume. Models should be carrying much of the technical work.
Anthropic compares this moment to when fuzzing tools were first deployed at scale — there were fears they’d empower attackers, and they did, briefly. But today, fuzzers are a cornerstone of software security. They expect AI to follow the same arc: disruptive in the short term, net-positive for defenders in the long run. But they acknowledge the transition period “may be tumultuous.”
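For readers unfamiliar with fuzzing: the technique is to hammer a program with large volumes of random or mutated input and watch for crashes. A minimal sketch of the idea (both the toy parser and its planted off-by-one bug are invented for illustration):

```python
import random

def parse(data: bytes) -> bytes:
    """Toy length-prefixed parser with a planted off-by-one bounds check."""
    if not data:
        raise ValueError("empty input")
    n = data[0]                  # declared payload length
    if n > len(data):            # bug: should be n > len(data) - 1
        raise ValueError("truncated payload")
    return bytes(data[1 + i] for i in range(n))  # IndexError when n == len(data)

def fuzz(runs: int = 5000, seed: int = 1) -> list:
    """Throw random inputs at the parser; collect the ones that crash it."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(runs):
        data = bytes(rng.randrange(256) for _ in range(rng.randrange(1, 6)))
        try:
            parse(data)
        except ValueError:
            pass                    # graceful rejection: working as intended
        except IndexError:
            crashes.append(data)    # unexpected crash: the fuzzer found the bug
    return crashes

print(len(fuzz()), "crashing inputs found")
```

Real fuzzers like AFL and libFuzzer are far more sophisticated, using coverage feedback to mutate inputs toward unexplored code paths, but the core loop is the same: generate input, run, record crashes.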
The report ends on a note of genuine alarm wrapped in cautious optimism. The security equilibrium of the past 20 years may be breaking down, and the field needs to start preparing now — not for a hypothetical future threat, but for one that is, by Anthropic’s own account, already here.
Anthropic has not made Mythos Preview publicly available and says it has no plans to do so in the near term. Project Glasswing is currently limited to critical infrastructure partners and open source developers.
