Back to archive

AI ships into the SOC: Mythos in curl, GPT-5.5-Cyber, and Anthropic plugs into SpaceX

Tue, May 12, 2026 · 9 stories

Mozilla shipped patches for 271 Firefox vulnerabilities sourced from Anthropic's Claude Mythos Preview. The curl maintainer publicly thanked the same model for a real finding. OpenAI opened GPT-5.5-Cyber to verified defenders. AI-assisted vulnerability discovery stopped being a research demo this week and started landing as diffs in code that the entire internet audits. The floor for "specialized security model" moved permanently.

On the consumer surface, OpenAI replaced ChatGPT's default model with GPT-5.5 Instant, claiming a 52 percent drop in hallucinated claims on high-stakes prompts in medicine, law, and finance. The number deserves third-party probing. If it holds, the working definition of "ChatGPT-grade reliable" just shifted under every team integrating the free tier.

And on the infrastructure side, Anthropic raised Claude's usage limits on the back of a new SpaceX compute partnership. Two hyperscalers already hold equity in the company. Reaching for a third compute supplier, and a non-hyperscaler at that, is what "frontier-model demand outpaces hyperscaler supply" looks like in practice.


Top Stories

Claude Mythos Finds 271 Firefox Vulnerabilities and a Curl Bug

Mozilla Hacks / daniel.haxx.se

Mozilla ran Anthropic's Claude Mythos Preview against Firefox and shipped patches for 271 vulnerabilities with minimal false positives, while curl maintainer Daniel Stenberg disclosed that the same model flagged a real bug in curl. Both findings landed the same week OpenAI launched GPT-5.5-Cyber for verified defenders, signaling a coordinated industry move into AI-assisted vulnerability discovery on production codebases.

Why this matters:

AI-driven vulnerability hunting just stopped being a research demo. When two of the most-audited codebases on the planet take fixes from a single Anthropic preview model in one week, the floor for "specialized security model" capability moved permanently. Expect a wave of CVEs in the next six months that originated with an LLM, and vendors who did not run their code through one looking suddenly negligent.

OpenAI Replaces ChatGPT's Default With GPT-5.5 Instant, Claims 52% Fewer Hallucinations

OpenAI Blog / The Verge

OpenAI replaced ChatGPT's default model with GPT-5.5 Instant, claiming 52.5 percent fewer hallucinated claims on high-stakes prompts in medicine, law, and finance, and a 37.3 percent drop in inaccurate claims on flagged conversations per its internal evaluations. The model is rolling out to free and paid ChatGPT users with a system card published alongside.

Why this matters:

The default model matters more than benchmark headlines. Free users outnumber API users by orders of magnitude, and they form public opinion on whether AI hallucinates in court filings and medical questions. A 52 percent cut, if it survives third-party probing, narrows the gap between "AI assistant" and "fact-grade reference" enough that some regulated workflows will quietly reclassify it. Worth watching whether Anthropic and Google publish comparable internal numbers within the quarter.

Anthropic Plugs Into SpaceX and Raises Claude Usage Limits

Anthropic

Anthropic announced a multi-year compute partnership with SpaceX alongside higher usage limits across the Claude API and Claude.ai. The deal extends Anthropic's compute supply beyond AWS and Google Cloud, the two hyperscalers that already hold equity in the company.

Why this matters:

This is the first concrete signal that frontier-model demand has outpaced even hyperscaler-co-investor supply. Two hyperscalers already hold structural positions in Anthropic. Adding a third compute supplier, in a non-hyperscaler, says the existing $45B-plus in committed capacity is not enough. For VCs, it confirms that AI infrastructure plays now include the non-obvious physical layer (launch capacity, satellite-adjacent data centers, energy-coupled sites). For everyone else, the higher limits attached to the announcement mean the API ceiling that throttled real workloads this quarter just moved.


Quick Hits

Computer Use Is 45x More Expensive Than Structured APIs

Reflex

A side-by-side cost analysis from the Reflex team shows screenshot-and-click agents run roughly 45x more expensive per task than the structured-API equivalent. The number every agent founder should write on a sticky note.

Share:
Daniel Ryan

Daniel Ryan

Founder of Stratavore

Daniel Ryan has been shipping software for 15 years and is now building Stratavore. He reads too much AI news so you don't have to.

Get this in your inbox

Subscribe for free weekly AI intelligence digests.