Andrej Karpathy is joining Anthropic. He’ll work on the pre-training team under team lead Nick Joseph, where he will start a new initiative focused on using Claude to accelerate Claude’s own pre-training research. In his announcement, he said “the next few years at the frontier of LLMs will be especially formative” and that he was getting “back to R&D.”
Karpathy’s resume speaks for itself. He was a co-founder of OpenAI. He left in 2017 to join Tesla as Senior Director of AI, where he ran Full Self-Driving and Autopilot until 2022. He returned to OpenAI in 2023, stayed into early 2024, then founded Eureka Labs, an AI-native education company he described as a deep personal passion. If you don’t know his work, subscribe to his YouTube channel. His videos about how he uses LLMs are worth your time. (I’m a huge fan.)
Andrej joining a new pre-training research effort fascinates me because he would have no incentive to take a job he didn’t believe in. For most of last year, the consensus was that pre-training had hit diminishing returns and the real action had shifted to post-training and reinforcement learning. Using Claude to accelerate Claude’s own pre-training is a bet on recursive self-improvement. Maybe we haven’t hit the pre-training ceiling, or maybe the ceiling will move once models start improving themselves.
Every company needs a Claw strategy. Do you have one?
Author’s note: This is not a sponsored post. I am the author of this article and it expresses my own opinions. I am not, nor is my company, receiving compensation for it. This work was created with the assistance of various generative AI models.