The AI Interview Study Everyone’s Getting Wrong
How a hiring process got so weak that a script now looks like progress
A new study of roughly 70,000 job interviews crossed my feed this week.
At first glance, it looks like evidence that AI is better at interviewing than humans.
The paper is:
“Voice AI in Firms: A Natural Field Experiment on Automated Job Interviews”
(Brian Jabarian & Luca Henkel, 2025).
The headline results are striking. AI-led interviews were associated with more job offers, more job starts, and higher 30-day retention. Unsurprisingly, many are treating this as evidence that AI interviewing “works.”
That conclusion doesn’t follow from the data.
The secret sauce isn’t intelligence — it’s consistency.
The AI interviewer followed a rigid, standardised script. Human interviewers did not. When a scripted system outperforms humans, the primary finding is not that AI is superior — it’s that unstructured human interviews allow too much variation in what is asked, what is followed up on, and what behaviour is actually surfaced.
This is not a technological breakthrough.
It is a reduction in interactional variance, misattributed to technology.
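To see how much consistency alone can buy, here is a minimal simulation in Python. It is not from the paper; the pool size, noise levels, and hiring fraction are all invented for illustration. Each candidate has a fixed true quality, the interview score is that quality plus interviewer noise, and the only difference between the two conditions is how noisy the interviewer is.

```python
import numpy as np

rng = np.random.default_rng(42)
n_candidates, n_hired = 10_000, 1_000

# True (unobserved) candidate quality; standard normal by assumption.
quality = rng.normal(0, 1, n_candidates)

def mean_quality_of_hires(noise_sd):
    """Hire the top scorers by observed score = quality + interviewer noise."""
    score = quality + rng.normal(0, noise_sd, n_candidates)
    hired = np.argsort(score)[-n_hired:]
    return quality[hired].mean()

# Noise levels are illustrative: an unstructured interview lets what is
# asked and how it is rated vary widely; a rigid script removes most of that.
print("unstructured (noise sd 1.5):", round(mean_quality_of_hires(1.5), 3))
print("scripted     (noise sd 0.5):", round(mean_quality_of_hires(0.5), 3))
```

Same candidates, same signal, no added intelligence: only the noise shrank, and the average quality of the hired group rises anyway.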
We are also measuring the wrong outcomes.
“More offers,” “more starts,” and “30-day retention” are efficiency metrics suited to high-volume, entry-level hiring. They say nothing about how people perform once the novelty wears off, how they handle ambiguity, how they learn, or how they contribute in less scripted environments. In short, the study optimised for predictable behaviour in a predictable process, not for strength of judgment or adaptability at work.
There is also a quiet distortion in how decisions are made downstream.
The paper notes that recruiters placed more weight on standardised test scores when evaluating AI-led interviews. As interaction becomes more constrained, judgment shifts away from interpretation and toward artefacts that feel easier to justify. What looks like “cleaner data” is often just narrower data, and narrower data tends to carry existing biases more efficiently, not less.
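To sketch why narrower data can be worse, consider a toy model (again invented, not drawn from the study): a standardised test carries a fixed penalty against one group that is unrelated to true quality, while a second signal is noisier but unbiased. An evaluator who blends both dilutes the penalty; one who leans on the test alone transmits it at full strength.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 20_000
group = rng.integers(0, 2, n)    # 0 / 1 group labels; sizes and effects illustrative
quality = rng.normal(0, 1, n)    # true quality, identical across groups by construction

test = quality - 0.5 * group + rng.normal(0, 0.3, n)   # precise but carries a group penalty
other = quality + rng.normal(0, 0.8, n)                 # noisier but unbiased signal

def minority_share_of_hires(score, top_frac=0.10):
    """Share of group 1 among the top-scoring fraction of candidates."""
    hired = np.argsort(score)[-int(top_frac * n):]
    return group[hired].mean()

print("blended judgment :", round(minority_share_of_hires(0.5 * test + 0.5 * other), 3))
print("test score only  :", round(minority_share_of_hires(test), 3))
```

The narrow channel is not cleaner; it is simply undiluted, so whatever bias it carries arrives intact.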
Candidate reactions matter more than they appear to.
Around 5% of candidates exited the process upon realising they were being interviewed by an AI. That is not a technical footnote. It is a behavioural filter. The system is not only evaluating candidates — it is shaping who remains willing to participate and under what conditions.
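The mechanics of that filter are easy to demonstrate. In the hypothetical sketch below, the 5% exit rate is borrowed from the study’s headline figure, but the 0.3 correlation between willingness to face an AI interviewer and retention propensity is pure assumption. Removing the least willing candidates improves measured retention without the interview assessing anyone more accurately.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000

# Hypothetical latent traits: retention propensity and willingness to proceed
# with an AI interview, mildly correlated (rho = 0.3 is an assumption).
cov = [[1.0, 0.3], [0.3, 1.0]]
retention_propensity, ai_willingness = rng.multivariate_normal([0, 0], cov, n).T
retained = retention_propensity > 0          # ~50% base retention, for illustration

# The 5% least willing exit once they realise the interviewer is an AI.
stayed_in_process = ai_willingness > np.quantile(ai_willingness, 0.05)

print("retention, full pool      :", round(retained.mean(), 3))
print("retention, after attrition:", round(retained[stayed_in_process].mean(), 3))
```

If willingness to be screened by a machine correlates with anything the outcome metric tracks, the metric moves on its own.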
This matters beyond this single experiment. As AI interviewing tools proliferate, organisations risk adopting systems that reduce behavioural complexity until outcomes look better on paper — and mistaking that reduction for insight.
This is not evidence that AI interviews work.
It is evidence that organisations have tolerated weak interview practice for far too long.
The real question isn’t whether hiring should be automated —
it’s why basic interview discipline still isn’t enforced.


