Shelly Palmer

How Smart is AI on a Scale of 1 to 5? We’re Going to Find Out.

How smart is any particular AI model? How would you set benchmarks? How would you rate them? AI models can’t have an IQ; that’s a test designed to measure human intelligence. (Plus, raw intelligence isn’t really what you want to test anyway.) A useful scoring system would have to cover far more capabilities, such as safety, alignment with human values (whatever those are), etc. It’s a monumental challenge – and OpenAI is taking it on.

In an attempt to put some quantifiable numbers around this problem, OpenAI is implementing a process to determine the power of its AI systems, particularly as they approach the capabilities of artificial general intelligence (AGI) and artificial super intelligence (ASI), neither of which have agreed-upon definitions yet.

OpenAI has formed a new superalignment team, which (over the next four years) will dedicate 20% of OpenAI’s compute resources to solving alignment challenges. The team will focus on developing scalable training methods, validating alignment models, and conducting adversarial testing to ensure the AI systems align with human intent and do not go rogue.

Additionally, OpenAI is collaborating with industry leaders like Anthropic, Google, and Microsoft through the Frontier Model Forum. This initiative aims to advance AI safety research, identify best practices, and facilitate information sharing among policymakers, academia, and civil society. The Forum will focus on developing standardized evaluations and benchmarks for frontier AI models to ensure their responsible development and deployment.

I really like everything about these initiatives. While I think the alignment problem is intractable, I also think we need a way to understand how close we are getting to AGI, which is loosely defined as an AI model that is as capable as (or more capable than) a human. We won’t need a scoring system to know when someone creates an ASI system; we’ll know immediately. ASI will probably remain science fiction for a few more years, although at the rate this tech is improving… who knows?

“Level 1” = dumb as a tack. “Level 5” = HAL 9000, Skynet, The Matrix, The Machines, R. Daneel Olivaw, Wintermute, GLaDOS, The Culture Minds, VIKI, TARS, CASE, or even WOPR. I’ve got 10 points for Gryffindor if you can name the works of science fiction that made these AI systems famous.

P.S. If you want to get a handle on the attributes that would make an AI model more or less intelligent, consider taking our free online course “Generative AI for Brand Marketers.” It will help you unlock the power of AI for your business.

Author’s note: This is not a sponsored post. I am the author of this article and it expresses my own opinions. I am not, nor is my company, receiving compensation for it. This work was created with the assistance of various generative AI models.