When AI Builds Itself

Anthropic published a remarkable paper this week called "When AI Builds Itself." Their engineers now ship roughly eight times more code per quarter than they did in 2024. More than 80% of the code merged into Anthropic’s codebase is written by Claude, up from low single digits before Claude Code launched in February 2025. Continue Reading →

Claude 4.8 is Here

Anthropic yesterday released Claude Opus 4.8, which the company calls "a modest but tangible improvement" over Opus 4.7. When a vendor undersells its own launch, pay attention to which number it is quietly proud of. Continue Reading →
Andrej Karpathy is joining Anthropic. He'll work on the pre-training team under team lead Nick Joseph, where he will start a new initiative focused on using Claude to accelerate Claude's own pre-training research. In his announcement, he said "the next few years at the frontier of LLMs will be especially formative" and that he was getting "back to R&D." Continue Reading →
OpenAI put Codex on your phone yesterday. The mobile app (iOS and Android) now connects to your running desktop Codex session through a secure relay layer. You can review diffs, approve commits, and monitor agent progress from anywhere. No SSH, no VPN, no laptop required. During the preview period, it's available on every plan, including Free. Continue Reading →
Anthropic launched Claude Platform on AWS on Sunday, two weeks after OpenAI put GPT-5.5 and GPT-5.4 on Amazon Bedrock. AWS now hosts both frontier model families on a single bill with IAM authentication, CloudTrail logging, and consumption-based pricing. Continue Reading →
This has already been a crazy week in the world of AI. The Wall Street Journal reported that OpenAI missed its monthly revenue targets multiple times this year. The company also missed its internal goal of one billion weekly active ChatGPT users by year-end 2025, and OpenAI CFO Sarah Friar reportedly told colleagues she’s worried Continue Reading →
Anthropic released Claude Opus 4.7 on Wednesday with impressive numbers: 10.9 percentage points higher on SWE-bench Pro (the gold-standard coding test), 3x more production tasks resolved on Rakuten’s benchmark, 98.5% on visual acuity up from 54.5%, and state-of-the-art scores on finance evaluations. For devs, this is a genuine step forward. For consumers, the story is a bit different. Continue Reading →

Managed Agents Are Here

Anthropic launched Claude Managed Agents in public beta at eight cents per agent runtime hour plus model usage fees. Developers get sandboxed containers, authentication, checkpointing, error recovery, session persistence, and end-to-end execution tracing: every piece of infrastructure that separates a demo from a production deployment, available as a set of composable APIs. Continue Reading →