Leading artificial intelligence developers OpenAI and Anthropic are taking a more stringent approach to releasing their most advanced AI models, particularly those with potent cybersecurity capabilities. Reports indicate that both companies are moving towards restricting public access to these powerful tools, opting instead for controlled rollouts to vetted organizations. The shift underscores a growing concern within the AI industry, echoed by cybersecurity experts, about the potential misuse of sophisticated AI.
Key Takeaways
- OpenAI is reportedly developing a cybersecurity product exclusively for its “Trusted Access for Cyber” program.
- Anthropic previously restricted access to its highly capable AI model, Claude Mythos, due to its proficiency in discovering security vulnerabilities.
- Both companies are prioritizing controlled access for defensive security operators over broad public release of their most powerful AI.
- This trend suggests a move towards distributing advanced AI capabilities more like classified research, with select access granted to trusted entities.
- The restrictions reflect a proactive measure to mitigate risks associated with AI-powered cyber threats and potential regulatory scrutiny.
OpenAI is reportedly building a dedicated cybersecurity product that will be accessible exclusively through its “Trusted Access for Cyber” initiative. The previously announced program is designed as a controlled deployment channel, keeping certain cutting-edge products out of general release and routing them instead to defensive security professionals. The move follows the release of GPT-5.3-Codex, currently considered OpenAI’s most advanced cybersecurity offering, and the company is backing program participants with significant API credits.
This development comes amid escalating concern among cybersecurity professionals that increasingly powerful AI models could overwhelm existing security infrastructure. Earlier this week, Anthropic voiced similar apprehension about its own advanced model, Claude Mythos. The company described Mythos as its most capable AI to date, noting its exceptional effectiveness at identifying security vulnerabilities, including zero-days across major operating systems and browsers. As a result, Anthropic decided to limit access to Mythos to a carefully selected group of organizations.
OpenAI’s reported actions mirror this cautious approach, and both companies are operating under heightened government attention. Anthropic is currently engaged in a legal dispute after the Pentagon designated it a supply chain risk over its refusal to relax Claude’s usage restrictions for surveillance and autonomous weapons applications, and federal agencies have intensified their scrutiny of AI companies’ safety protocols in recent months.
While OpenAI has not officially confirmed or denied the reports, the rationale behind such restrictions is evident. Anthropic’s Mythos Preview, which reportedly leaked prior to its formal launch, demonstrated an alarming ability to pinpoint “tens of thousands of vulnerabilities” that even seasoned human experts would struggle to discover. The model is characterized as highly autonomous, with reasoning capabilities akin to those of a senior security researcher. Such power, if widely available, would present a serious concern for security teams.
In response, Anthropic launched Project Glasswing, a controlled-access initiative that provides Mythos Preview only to vetted organizations, including major players such as Amazon Web Services, Apple, Google, and Microsoft, along with entities that maintain critical infrastructure.

OpenAI’s decision to restrict access to its advanced products appears aimed at preempting potential regulatory pressure. By voluntarily implementing limitations, the company positions itself as a responsible stakeholder in a landscape where other firms face increasing scrutiny.
These access restrictions also highlight a deeper industry trend. Anthropic’s own safety assessment acknowledged that Cybench, the benchmark used to evaluate AI cyber risk, is no longer sufficient to measure the capabilities of current frontier models, as Mythos surpassed it entirely. In other words, the tools designed to measure AI-driven threats are falling behind the pace of innovation. Anthropic further noted that many of its evaluations involve subjective judgment calls and fundamental uncertainties.
As part of its initiative, Anthropic has committed substantial resources, including usage credits and direct donations, to open-source security organizations. OpenAI has not yet announced similar commitments alongside its access program. However, both companies frame their restricted access models as ultimately beneficial for defensive security, arguing that equipping defenders with superior tools before adversaries can access them outweighs the limitations on general availability.
Long-Term Technological Impact on the Industry
The emerging pattern of restricted access to frontier AI models marks a critical evolution in how advanced technology is distributed and managed within the tech ecosystem, with particular implications for blockchain innovation, AI integration, and Web3 development. The controlled-release strategy suggests a maturing understanding of the dual-use nature of powerful AI tools.

For blockchain, this could mean that future AI integrations into decentralized systems are carefully curated, favoring applications that enhance security, transparency, and efficiency without introducing undue systemic risk. Layer 2 solutions, which sit at the intersection of scaling and security, might benefit from highly specialized AI tools vetted for robustness and safety, helping fortify these critical infrastructure components against sophisticated threats. In Web3, AI models designed to detect and prevent fraud could bolster decentralized identity and ownership, though access to such potent tools will likely be governed by strict protocols to prevent their weaponization. Ultimately, the trend points towards a future in which the most impactful AI technologies are not broadly deployed but are disseminated like sensitive research: selectively, under rigorous agreements, and primarily to entities demonstrably equipped and committed to using them responsibly.
Original article: decrypt.co
