A new study out of California is causing more than just a ripple in the tech world. Researchers found that OpenAI’s latest language model, GPT-4.5, was able to fool people into thinking it was human more often than actual humans did. And no, that’s not a typo.
The preprint study, which hasn’t been peer-reviewed yet but is already sparking debate, shows that when GPT-4.5 was given a made-up online persona — like a savvy young netizen — it was seen as more “human” than the real people it was up against.
The AI That Beat the Turing Test — and the Humans
This wasn’t just some vague social media poll or a casual experiment in a university lab.
The team at the University of California recreated a modern spin on the classic Turing Test — a benchmark invented by Alan Turing in 1949 to see whether machines could imitate human thinking closely enough to fool people.
Here’s how it worked: Participants chatted with both an AI and a human and were then asked to guess which was which. The twist? Sometimes both were pretending to be someone else — a “persona” prompt, like a Gen Z college student or an online gaming fan.
And GPT-4.5 crushed it.
In the persona-based tests, people guessed the AI was human 73% of the time. That’s not just winning — that’s blowing past the 50/50 chance mark expected by random guessing.
Just one sentence to let that sink in.
GPT-4.5 wasn’t just passing the Turing Test. It was acing it in a way even real people couldn’t match.
Humans vs. Machines: The Scoreboard
To put it all in perspective, the researchers compared a range of AI models, including:
-
OpenAI’s GPT-4.5
-
GPT-4o (another new OpenAI model)
-
Meta’s LLama 3.1-405B
-
ELIZA, the grandparent of chatbots from the 1960s
-
And actual, living, breathing people
Now, here’s where things get even weirder.
Without the “persona” layer, GPT-4.5’s performance fell off a cliff — down to 36%.
GPT-4o? Only 21% — which is honestly kind of embarrassing.
ELIZA? 23%, and that model was made when color TV was still a novelty.
That means in some setups, 1960s tech outperformed today’s AI. But give GPT-4.5 a role to play, and it lights up like Broadway.
Check out the breakdown below:
Model | Human Perceived Score (Persona) | Human Perceived Score (No Persona) |
---|---|---|
GPT-4.5 | 73% | 36% |
GPT-4o | 21% | 21% |
LLama 3.1-405B | Not disclosed | Not disclosed |
ELIZA (1960s) | — | 23% |
Real Humans | < 73% (in persona test) | — |
Personas Turned the Tables
So, what gives GPT-4.5 the edge?
It’s the persona.
Once the AI adopted a believable identity, people dropped their guard. The chatbot came off as relatable, culturally aware, and — strangely enough — more authentic than the human users.
One short paragraph here for breathing room.
This idea of AI “pretending” better than we do is unsettling. It’s like someone showing up at your party, pretending to be your friend — and your real friends not even noticing the difference.
Experts Are Not Exactly Celebrating
Carsten Jung, a macroeconomist who studies AI policy, didn’t mince words.
“We’ve gone past the ‘uncanny valley’ now,” he told Newsweek, referring to that eerie in-between phase where robots look almost human, but not quite. “We’re in completely new territory.”
His warning? We aren’t ready for this. At all.
Jung said the study shows that AI can now be more believable than people in digital chats — which opens the door to serious questions:
-
What happens when bots start running social media accounts?
-
What about customer service? Online dating? Therapy?
-
Could AI companions become indistinguishable from real human interaction?
The potential for misinformation, manipulation, and even emotional entanglement is growing — fast.
It’s Not Just Tech, It’s Policy
Beyond the wow factor, this research hits hard at the policy level.
Governments around the world are scrambling to figure out how to regulate AI. But according to Jung, policy is lagging behind — by miles.
Cameron Jones, a lead researcher on the study, put it bluntly in a post on X (formerly Twitter): “LLMs could substitute for people in short interactions without anyone being able to tell.”
And that’s a real problem. Because it’s not just about fake chats.
It’s about jobs, elections, scams, and trust.
If a chatbot can convincingly act human, who do we trust online?
The Weird, Uncertain Road Ahead
Let’s be honest — AI outsmarting people in human mimicry was once sci-fi. Now it’s our Tuesday headline.
This research throws gasoline on a conversation that was already heating up.
Some folks are thrilled — thinking about how this could revolutionize mental health services, customer support, or education. Others are uneasy — seeing shades of manipulation, identity confusion, and even existential questions about what “being human” really means anymore.
One more short paragraph for pacing.
For now, GPT-4.5 has done more than pass a test. It’s stirred up something deeper — a very human kind of discomfort.