AI ships into the SOC: Mythos in curl, GPT-5.5-Cyber, and Anthropic plugs into SpaceX

Tue, May 12, 2026 · 9 stories

Mozilla shipped patches for 271 Firefox vulnerabilities sourced from Anthropic's Claude Mythos Preview. The curl maintainer publicly thanked the same model for a real finding. OpenAI opened GPT-5.5-Cyber to verified defenders. AI-assisted vulnerability discovery stopped being a research demo this week and started landing as diffs in code that the entire internet audits. The floor for "specialized security model" moved permanently.

On the consumer surface, OpenAI replaced ChatGPT's default model with GPT-5.5 Instant, claiming a roughly 52 percent drop in hallucinated claims on high-stakes prompts in medicine, law, and finance. The number comes from OpenAI's own internal evaluations and deserves third-party probing. If it holds, the working definition of "ChatGPT-grade reliable" just shifted under every team integrating the free tier.

And on the infrastructure side, Anthropic raised Claude's usage limits on the back of a new SpaceX compute partnership. Two hyperscalers already hold equity in the company. Reaching for a third compute supplier, and a non-hyperscaler at that, is what "frontier-model demand outpaces hyperscaler supply" looks like in practice.

Top Stories

Mozilla Hacks / daniel.haxx.se

Mozilla ran Anthropic's Claude Mythos Preview against Firefox and shipped patches for 271 vulnerabilities with minimal false positives. Separately, curl maintainer Daniel Stenberg disclosed that the same model flagged a real bug in curl. Both findings landed the same week OpenAI launched GPT-5.5-Cyber for verified defenders, signaling a broad industry move into AI-assisted vulnerability discovery on production codebases.

Why this matters:

AI-driven vulnerability hunting just stopped being a research demo. When two of the most-audited codebases on the planet take fixes from a single Anthropic preview model in one week, the floor for "specialized security model" capability moved permanently. Expect a wave of CVEs in the next six months that originated with an LLM, and expect vendors who did not run their code through one to look suddenly negligent.

OpenAI Replaces ChatGPT's Default With GPT-5.5 Instant, Claims 52% Fewer Hallucinations

OpenAI Blog / The Verge

OpenAI replaced ChatGPT's default model with GPT-5.5 Instant, claiming 52.5 percent fewer hallucinated claims on high-stakes prompts in medicine, law, and finance, and a 37.3 percent drop in inaccurate claims on flagged conversations per its internal evaluations. The model is rolling out to free and paid ChatGPT users with a system card published alongside.

Why this matters:

The default model matters more than benchmark headlines. Free users outnumber API users by orders of magnitude, and they form public opinion on whether AI hallucinates in court filings and medical questions. A 52 percent cut, if it survives third-party probing, narrows the gap between "AI assistant" and "fact-grade reference" enough that some regulated workflows will quietly reclassify it. Worth watching whether Anthropic and Google publish comparable internal numbers within the quarter.

Anthropic Plugs Into SpaceX and Raises Claude Usage Limits

Anthropic

Anthropic announced a multi-year compute partnership with SpaceX alongside higher usage limits across the Claude API and Claude.ai. The deal extends Anthropic's compute supply beyond AWS and Google Cloud, the two hyperscalers that already hold equity in the company.

Why this matters:

This is the first concrete signal that frontier-model demand has outpaced even the supply of hyperscalers that co-invest in the labs. Two hyperscalers already hold structural positions in Anthropic. Adding a third compute supplier, and a non-hyperscaler at that, says the existing $45B-plus in committed capacity is not enough. For VCs, it confirms that AI infrastructure plays now include the non-obvious physical layer (launch capacity, satellite-adjacent data centers, energy-coupled sites). For everyone else, the higher limits attached to the announcement mean the API ceiling that throttled real workloads this quarter just moved.

Quick Hits

Computer Use Is 45x More Expensive Than Structured APIs

Reflex

A side-by-side cost analysis from the Reflex team shows screenshot-and-click agents cost roughly 45x more per task than the structured-API equivalent. It is the number every agent founder should write on a sticky note.
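
To see how that multiplier compounds with volume, here is a minimal back-of-the-envelope sketch. Only the roughly 45x ratio comes from the Reflex analysis. The task volume, the per-task API cost, and the function name `monthly_cost` are hypothetical placeholders chosen for illustration, not figures from the report.

```python
# Back-of-the-envelope comparison of agent costs at a given task volume.
# ASSUMPTION: only the ~45x ratio is from the Reflex analysis; every other
# number below is a made-up placeholder for illustration.

COST_RATIO = 45  # computer-use cost per task relative to a structured API


def monthly_cost(tasks_per_month: int, api_cost_per_task: float) -> dict[str, float]:
    """Return monthly spend for each approach, given an assumed per-task API cost."""
    structured_api = tasks_per_month * api_cost_per_task
    computer_use = structured_api * COST_RATIO
    return {
        "structured_api": structured_api,
        "computer_use": computer_use,
        "difference": computer_use - structured_api,
    }


if __name__ == "__main__":
    # Hypothetical workload: 10,000 tasks a month at one cent per API task.
    costs = monthly_cost(tasks_per_month=10_000, api_cost_per_task=0.01)
    for name, dollars in costs.items():
        print(f"{name}: ${dollars:,.2f}")
```

Swapping in a team's own volume and per-task cost shows quickly whether the convenience of a screenshot-driven agent is worth the premium for a given workflow.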

Cloudflare Agents Can Now Create Accounts, Buy Domains, and Deploy Infrastructure

Cloudflare Blog

Native agentic primitives for the full domain-to-deploy loop, with no human in the registration path.

AlphaEvolve: DeepMind's Gemini-Powered Coding Agent Ships Real Results Across Science and Infra

DeepMind Blog

The agent is producing measurable impact in scientific discovery and Google's own infrastructure optimization, moving past benchmark claims into deployed value.

OpenAI Open-Sources MRC Supercomputer Networking via the Open Compute Project

OpenAI Blog

OpenAI contributed its Multipath Reliable Connection (MRC) protocol to OCP, with NVIDIA already adopting it in its Spectrum-X Ethernet fabric.

Microsoft Targeted $92B Return on Its Early OpenAI Investment

Bloomberg Technology

Internal modeling on one of the most consequential venture bets in tech history. Same week, Ilya Sutskever disclosed his OpenAI stake at roughly $7B.

Cerebras Systems Seeks $4.8B in Upsized AI-Chip IPO

Bloomberg Technology

Investor demand pushed the offering up by a third, a clean read on public-market appetite for AI infrastructure.

Daniel Ryan
