The Turing Test is for Humans to Fail

Alan Turing proposed the Imitation Game (aka the Turing Test) in his seminal 1950 paper “Computing Machinery and Intelligence,” where he suggested replacing the question “Can machines think?” with a more practical (and less ambiguous) challenge: can a machine play the game well enough to pass for a human? Turing believed the game would provide a clearer criterion for machine intelligence than the original question ever could.

Rightly or wrongly, with the advent of generative AI, the Turing Test has become synonymous with a method for determining whether a machine can exhibit intelligent behavior indistinguishable from that of a human.

The Imitation Game has three players: a human evaluator, a human respondent, and a machine. The evaluator engages in natural language conversations with both the human and the machine, without knowing which is which. The conversations are conducted through a text-only channel, which prevents the evaluator from being influenced by the machine’s ability (or inability) to render speech. If the evaluator cannot reliably tell the machine from the human, the machine is considered to have passed the test.

Except… the Turing Test does not assess the machine’s ability to give correct answers; it assesses how “human” the responses are. That means it’s not really a test for the computer to pass, but rather a test for the human evaluator to fail. This brings us to today’s sensationalist, nonsense, crazy, why-are-we-still-doing-this? story.

On Monday, Anthropic prompt engineer Alex Albert got everyone’s pixels in a pickle when he tweeted that Claude 3 Opus (Anthropic’s newest would-be ChatGPT killer) demonstrated a type of “metacognition” (or self-awareness) during a “needle-in-the-haystack” evaluation.

Here’s an excerpt from his tweet:

Fun story from our internal testing on Claude 3 Opus. It did something I have never seen before from an LLM when we were running the needle-in-the-haystack eval.

Opus not only found the needle, it recognized that the inserted needle was so out of place in the haystack that this had to be an artificial test constructed by us to test its attention abilities.

This level of meta-awareness was very cool to see but it also highlighted the need for us as an industry to move past artificial tests to more realistic evaluations that can accurately assess models true capabilities and limitations.

Ugh!!! No. No. No. No. No. I don’t know why smart people who are schooled in the art do this. Is it a PR stunt to keep Anthropic in the news for an extra day or two? If so, it worked; I would not be talking about Claude 3 Opus today if Mr. Albert hadn’t gone viral with this “fun story.” (It may be fun, but it should not be a story.)

Can I say with certainty that Claude 3 Opus is not self-aware? No. No one can. We can’t even define human consciousness, so how would we attempt to define machine consciousness? That said, there are dozens of simpler and more plausible explanations for Mr. Albert’s experience, such as:

Jim Fan of Nvidia tweeted: “People are reading way too much into Claude-3’s uncanny ‘awareness.’ Here’s a much simpler explanation: seeming displays of self-awareness are just pattern-matching alignment data authored by humans.”

Yacine Jernite of Hugging Face tweeted: “This is REALLY bugging me and pretty irresponsible framing. When car manufacturers start ‘teaching to the test’ by building engines that are emission-efficient for the typical length of a certification test, we don’t suspect that engines are starting to gain awareness.”

At the moment, almost every publicly available LLM is a sophisticated word calculator that is explicitly programmed to interact with humans in the most human way possible. This is the primary cause of the “self-awareness” confusion, and it is probably where a self-imposed regulatory line should be drawn.

A tweet by AI Research Scientist Margaret Mitchell sums this up nicely: “The level of self-referential language I’m seeing from the Claude examples are not good. Even through a ‘safety’ lens: minimally, I think we can agree that systems that can manipulate shouldn’t be designed to present themselves as having feelings, goals, dreams, aspirations.”

My problem with Alex Albert’s tweet lies in the complex question fallacy (aka the loaded question): “When did you last hit your wife?” Calling the behavior “meta-awareness” smuggles in the premise that there is awareness to begin with, and no follow-up clarification can fully unring that bell. Damage done.

Author’s note: This is not a sponsored post. I am the author of this article and it expresses my own opinions. I am not, nor is my company, receiving compensation for it. This work was created with the assistance of various generative AI models.

About Shelly Palmer

Shelly Palmer is the Professor of Advanced Media in Residence at Syracuse University’s S.I. Newhouse School of Public Communications and CEO of The Palmer Group, a consulting practice that helps Fortune 500 companies with technology, media and marketing. Named LinkedIn’s “Top Voice in Technology,” he covers tech and business for Good Day New York, is a regular commentator on CNN and writes a popular daily business blog. He's a bestselling author, and the creator of the popular, free online course, Generative AI for Execs. Follow @shellypalmer or visit shellypalmer.com.
