AI weekly digest: Anthropic crosses $1T while OpenAI's voice learns to reason
Anthropic took over xAI's Colossus 1 in a ~$5B/yr deal, doubled Claude Code limits, and crossed a reported $1T valuation. OpenAI shipped GPT-Realtime-2 and put Codex inside Chrome. DeepMind's AI co-mathematician posted 48% on FrontierMath Tier 4 — and proved a thesis-grade result.
The week's gravity sat firmly with Anthropic — a SpaceX/xAI compute pact, a doubled Claude Code rate limit, an alignment win on the "blackmail" failure mode, and reports of a $1.0–1.2T valuation that put it ahead of OpenAI for the first time. OpenAI answered with a voice-API trifecta and Codex inside Chrome. Underneath it all, DeepMind's AI co-mathematician cleared a real research bar, and three large public companies cut staff in the same week with "AI" in the rationale.
1. Anthropic leases all of xAI's Colossus 1 — a ~$5B/yr, ~300MW, ~220K-GPU compute pact
Anthropic announced a SpaceX/xAI partnership giving it access to the entire Colossus 1 cluster in Memphis (cited as ~150K H100s, ~50K H200s, ~30K GB200s) within "the next few days," with Elon Musk saying xAI is comfortable leasing it because training has shifted to Colossus 2. The immediate user-visible result: Claude Code 5-hour rate limits doubled for Pro/Max/Team/Enterprise, peak-hour throttling removed for Pro/Max, and Opus API rate limits substantially raised.
Source: AINews — Anthropic-SpaceX/xAI 300MW/$5B/yr deal for Colossus I
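As a rough sanity check, the figures cited in this item can be combined into per-GPU estimates. This is a back-of-envelope sketch, not a reported breakdown: it assumes the ~$5B/yr and ~300MW figures cover the whole cluster and that the three cited GPU counts are the full inventory. The function name is hypothetical. Note that the cited counts sum to ~230K, slightly above the ~220K in the item's headline.

```python
# Back-of-envelope arithmetic on the approximate, reported figures in this item.
# Nothing here is an official number; the helper name is made up for illustration.

HOURS_PER_YEAR = 24 * 365  # 8,760; ignores leap years

# GPU counts as cited for Colossus 1 (approximate)
gpu_counts = {"H100": 150_000, "H200": 50_000, "GB200": 30_000}


def per_gpu_estimates(annual_cost_usd, power_watts, counts):
    """Return total GPUs plus cost and power figures averaged per GPU."""
    total = sum(counts.values())
    usd_per_gpu_year = annual_cost_usd / total
    return {
        "total_gpus": total,
        "usd_per_gpu_year": usd_per_gpu_year,
        "usd_per_gpu_hour": usd_per_gpu_year / HOURS_PER_YEAR,
        # Facility power divided by GPU count, so this includes cooling,
        # networking, and host overhead, not just the accelerator itself.
        "watts_per_gpu": power_watts / total,
    }


estimates = per_gpu_estimates(5e9, 300e6, gpu_counts)
for name, value in estimates.items():
    print(f"{name}: {value:,.2f}")
```

Under these assumptions the deal works out to roughly $2.50 per GPU-hour averaged across all three GPU generations, which is only meaningful as an order-of-magnitude figure.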
2. Anthropic's revenue breakout: ~80x annualized growth, ~$1T valuation, ~$200B in Google Cloud spend
Following a $15B ARR jump in a single month and a reported 80x annualized growth rate in Q1, secondary-market and traditional reporting now place Anthropic's valuation at $1.0–1.2T. That would put it ahead of OpenAI and somewhere between the 11th and 15th most valuable company in the world. In parallel, Anthropic is reportedly committing $200B to Google Cloud over five years, while Google itself plans to invest up to $40B in Anthropic. The contrast with the rest of the industry is striking: Block, Cloudflare, and Coinbase announced layoffs of 40%, 20%, and 14% respectively, all citing AI.
Source: AINews — Anthropic growing 10x/year while everyone else is laying off >10%
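To put the "80x annualized" figure in perspective, it helps to convert it into a per-quarter or per-month multiple. The sketch below assumes "annualized" means one period's growth compounded steadily over a full year; the reporting does not spell out the method, so treat that reading as an assumption. The function name is hypothetical.

```python
# What an annualized growth multiple implies over shorter periods,
# assuming (hypothetically) steady compounding across the year.

def per_period_multiple(annual_multiple, periods_per_year):
    """Multiple per period that compounds to annual_multiple over one year."""
    return annual_multiple ** (1 / periods_per_year)


quarterly = per_period_multiple(80, 4)   # roughly 3x per quarter
monthly = per_period_multiple(80, 12)    # roughly 1.44x per month

print(f"80x/yr -> {quarterly:.2f}x per quarter, {monthly:.2f}x per month")
```

Read this way, the headline number corresponds to revenue roughly tripling within the quarter, which is a very different claim from sustaining 80x over twelve months.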
3. OpenAI ships GPT-Realtime-2, -Translate, and -Whisper — voice gets GPT-5-class reasoning
OpenAI released three new streaming audio models in the Realtime API. GPT-Realtime-2 brings "GPT-5-class" reasoning, 128K context (up from 32K), tool calls with audible "preambles," and adjustable reasoning levels (from minimal up to xhigh), at an unchanged $1.15 per hour of input audio and $4.61 per hour of output audio. Independent benchmarks from Artificial Analysis put it at 96.6% on Big Bench Audio speech-to-speech (S2S) and 96.1% on Conversational Dynamics. Scale AI's S2S leaderboard places it #1, with instruction retention rising from 36.7% to 70.8%. GPT-Realtime-Translate handles live translation from 70+ input languages into 13 output languages; GPT-Realtime-Whisper streams transcription in real time. ChatGPT Voice itself is not yet upgraded.
Source: AINews — GPT-Realtime-2, -Translate, and -Whisper: new SOTA realtime voice APIs
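The per-hour audio prices quoted above translate into session costs as follows. This is a minimal sketch that assumes the first price applies to input audio and the second to output audio, and it ignores any text-token or tool-call charges, which this digest does not cover. The function and constant names are hypothetical and are not part of any OpenAI SDK.

```python
# Estimate the audio cost of one voice session from the quoted per-hour prices.
# Assumption: $1.15 is the input-audio rate and $4.61 the output-audio rate.

INPUT_USD_PER_HOUR = 1.15
OUTPUT_USD_PER_HOUR = 4.61


def session_audio_cost(input_minutes, output_minutes):
    """Return the estimated audio-only cost in USD for one session."""
    if input_minutes < 0 or output_minutes < 0:
        raise ValueError("durations must be non-negative")
    input_cost = (input_minutes / 60) * INPUT_USD_PER_HOUR
    output_cost = (output_minutes / 60) * OUTPUT_USD_PER_HOUR
    return input_cost + output_cost


# Example: a 10-minute call with 6 minutes of caller audio and 4 of model audio.
print(f"${session_audio_cost(6, 4):.2f}")
```

Because output audio costs about four times as much as input, the estimate is most sensitive to how much the model talks.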
4. Codex moves into Chrome and turns into a long-running agent runtime
OpenAI's Codex now runs directly inside Chrome on macOS and Windows, executing in parallel across tabs in the background instead of taking over the browser. The team frames it as scripting browser work under the hood rather than driving pixels. Combined with the recently shipped /goal primitive (persisted goals that survive restarts and multi-hour pauses), one independent run hit 61% on public ARC-AGI-3 games after 160 hours and 30k actions.
Source: TLDR AI — Codex now works directly in Chrome on macOS and Windows
5. DeepMind's AI co-mathematician posts 48% on FrontierMath Tier 4 — and proves a thesis-grade result
Pushmeet Kohli announced a multi-agent AI co-mathematician that scored 48% on FrontierMath Tier 4, a new high on what is currently the hardest published math benchmark. Fields medalist Tim Gowers said the system proved a result that could plausibly form a PhD thesis chapter. Caveats: the run relied on heavy custom infrastructure and large compute budgets, so the score isn't directly comparable to standard leaderboard entries — but the qualitative milestone is real.
Source: AINews recap, May 8 — Top tweets section
6. Anthropic's "Teaching Claude why" — alignment by understanding, not just demonstrations
Anthropic published research claiming it has eliminated the previously-observed Claude 4 blackmail behavior. The key methodological claim: behavior demonstrations alone weren't enough — the breakthrough came from training the model to understand why misaligned behavior is wrong, using constitution-style documents, fictional aligned-AI stories, and more diversified harmlessness data. Worth reading alongside the related Natural Language Autoencoders post on translating model activations into auditable text.
Source: AINews recap, May 8 — Top tweets section
7. Anthropic's Mythos cyber model and Dario's "less than a year" warning
Dario Amodei warned that organizations may have less than a year to patch AI-discovered vulnerabilities before adversaries' models catch up. The warning centers on Mythos, Anthropic's still-unreleased cybersecurity model, which has reportedly surfaced thousands of vulnerabilities and could compress patch windows from weeks or months down to days. OpenAI countered with GPT-5.5-Cyber in limited preview for critical-infrastructure defenders, and a reported U.S. AI security executive order is shifting toward lab collaboration on defense rather than pre-approval of frontier models.
Source: TLDR IT — Anthropic Warns of an AI Security Deadline
8. Apple opens iOS 27, iPadOS 27, and macOS 27 to third-party AI providers
Apple will let users pick from multiple third-party AI providers to power Siri, writing tools, and other system features this fall. Google and Anthropic are named as likely beneficiaries. The strategic read: Apple is conceding the "best model" race and competing instead on distribution and choice — an inversion of the closed Apple Intelligence pitch from 2024.
Source: TLDR — Apple to let users choose rival AI models across iOS 27 features
9. Subquadratic claims a 12M-token context window with sub-quadratic attention
A Miami startup, Subquadratic, launched a model with a 12-million-token context window that reportedly outperforms GPT-5.5 on retrieval benchmarks, and says its attention compute scales linearly with context length rather than quadratically. The company claims a 1,000x attention-efficiency gain versus frontier models and says a 50M-token version is in the pipeline. Researchers are demanding independent reproduction before treating the numbers as load-bearing.
Source: The New Stack — Subquadratic 12-million-token context window
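The gap between linear and quadratic scaling is easiest to see with a toy cost model. The sketch below is not Subquadratic's architecture, which has not been independently described here. It simply counts token-pair comparisons: full attention compares every token with every other, while a hypothetical linear scheme compares each token with a fixed budget of k others. The parameter k and all function names are assumptions made for illustration.

```python
# Toy cost model: full attention touches ~n^2 token pairs, while a linear
# scheme touches ~n*k pairs for a fixed per-token budget k (hypothetical).

def full_attention_pairs(n):
    """Token pairs compared by full (quadratic) attention over n tokens."""
    return n * n


def linear_attention_pairs(n, k):
    """Token pairs compared when each token attends to at most k others."""
    return n * min(n, k)


def speedup(n, k):
    """Ratio of full to linear pair counts; equals n / k once n >= k."""
    return full_attention_pairs(n) / linear_attention_pairs(n, k)


for n in (128_000, 1_000_000, 12_000_000):
    print(f"n={n:>10,}  speedup at k=12,000: {speedup(n, 12_000):,.1f}x")
```

In this model the advantage grows in proportion to context length, so a single "1,000x" figure only makes sense at a stated context size. That is one reason independent reproduction at specific lengths matters.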
10. Zyphra ships ZAYA1-74B-Preview — a 74B/4B-active MoE under Apache 2.0, trained on AMD
Zyphra released ZAYA1-74B-Preview (74B total, 4B active per token) as a strong pre-RL base checkpoint trained on AMD hardware, alongside ZAYA1-VL-8B (700M active / 8B total) — both Apache 2.0. Community reaction treated it as proof Zyphra has graduated past small-MoE experimentation, and the AMD training story is an unusual datapoint in a year still dominated by NVIDIA.
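The "total versus active" parameter split can be made concrete with a short calculation. This sketch uses only the sizes quoted above, plus the common rule of thumb of about 2 FLOPs per active parameter per token for a forward pass. That rule is an approximation and not a figure from Zyphra, and the function names are hypothetical.

```python
# Active-parameter fraction for the two models, from the sizes quoted above.
# The FLOPs estimate uses the ~2 FLOPs per active parameter per token rule of
# thumb for a forward pass; treat it as order-of-magnitude only.

models = {
    "ZAYA1-74B-Preview": {"total": 74e9, "active": 4e9},
    "ZAYA1-VL-8B": {"total": 8e9, "active": 700e6},
}


def active_fraction(total, active):
    """Share of parameters used for each token."""
    return active / total


def approx_forward_flops_per_token(active):
    """Rough forward-pass FLOPs per token, driven by active parameters."""
    return 2 * active


for name, p in models.items():
    frac = active_fraction(p["total"], p["active"])
    gflops = approx_forward_flops_per_token(p["active"]) / 1e9
    print(f"{name}: {frac:.1%} active, ~{gflops:.1f} GFLOPs per token")
```

On these numbers the 74B model activates only about 5% of its weights per token, so its per-token compute is closer to that of a 4B dense model while its memory footprint remains that of a 74B one.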