How smart is any particular AI model? How would you set benchmarks? How would you rate models against them? AI models can't have an IQ; that's a test designed for human intelligence. (Plus, raw intelligence isn't really what you want to test anyway.) There are far more capabilities you'd want to include in your scoring system, such as safety and alignment with human values (whatever they are). It's a monumental challenge, and OpenAI is taking it on.