A rumor is circulating that Anthropic may soon release a public version of its highly advanced AI model, Mythos.
This isn't just another AI launch; it's a major event precisely because of Mythos's demonstrated power. According to recent reports, the model can identify newly discovered software vulnerabilities and write functional hacking code—or 'exploits'—to take advantage of them in a matter of hours. In one test, it created eight different exploits for just over $15,000 in computing credits, making the cost per exploit less than $2,000. This capability, while useful for defense, raises serious security concerns if it falls into the wrong hands.
These concerns have not gone unnoticed by policymakers. First, the U.S. government recently issued an executive order creating a framework for companies to voluntarily submit their most powerful 'frontier models' for a 30-day security review before public release. The timing of this order and the Mythos release rumor are likely connected. It suggests that any public release of Mythos would have to pass a significant safety inspection, shifting the narrative from a simple performance showcase to a demonstration of responsible AI governance.
Second, beyond its cyber capabilities, the rumored public version of Mythos is said to have greatly improved performance on long-term, multi-step tasks. This is a key challenge in the AI industry, often referred to as 'long-horizon' problems. A breakthrough here would mean that AI could function more like an 'autopilot' for complex work, a significant step toward more capable AI agents that can plan and execute tasks over extended periods.
So, Anthropic's strategy appears to be a careful balancing act. They are not planning to release the raw, unrestricted model. Instead, the public version is expected to come with very strong 'guardrails'—safety filters and restrictions—to prevent misuse, especially for creating offensive cyber tools. This approach builds on their limited-access 'Project Glasswing,' where trusted partners are already using a preview version of Mythos for defensive purposes, like finding and fixing vulnerabilities.
In essence, the potential launch of a 'Mythos-class' model for the public is a pivotal test. It's about whether a company can successfully deploy an AI with potentially dangerous capabilities by building a robust safety framework around it, navigating both technological risks and government oversight.
- Guardrails: In AI, these are safety measures, rules, or filters designed to prevent the model from generating harmful, unethical, or inappropriate content.
- Exploit: A piece of code or a sequence of commands that takes advantage of a bug or vulnerability in a software system to cause unintended or unanticipated behavior.
- Frontier Model: A term for the most powerful and capable AI models at the cutting edge of technology, which may have risks that are not yet fully understood.
