The AI Interview Study Everyone’s Getting Wrong
How a hiring process got so weak that a script now looks like progress
A new study of roughly 70,000 job interviews crossed my feed this week.
At first glance, it seems to show that AI is better at interviewing than humans.
The paper, Voice AI in Firms by Brian Jabarian and Luca Henkel, reports striking headline results: AI-led interviews were associated with more job offers, more job starts, and higher 30-day retention.
It is easy to see why people are already treating this as proof that AI interviewing works.
The data supports a narrower conclusion.
The AI interviewer followed a rigid, standardised script. Human interviewers did not. A scripted system will usually reduce variation in what gets asked, what gets probed, and how much room there is for interviewer drift. If that kind of system outperforms a looser human process, the first thing it reveals is the weakness of the human process it was compared against.
That is the interesting finding here.
The broader interview literature has been pointing in this direction for years. Structured interviews tend to outperform unstructured ones because they reduce noise and make candidates easier to compare (Campion, Palmer, & Campion, 1997; Levashina, Hartwell, Morgeson, & Campion, 2014). Even then, structure by itself guarantees little: structured interviews vary widely in quality depending on how they are designed and used (Huffcutt & Murphy, 2023). So when a rigid AI script beats a weak human process, the result says a great deal about how low the standard of ordinary interviewing has been allowed to fall.
The outcome measures call for the same restraint. More offers, more starts, and higher 30-day retention are useful indicators in high-volume, entry-level hiring. They tell us something about process flow and short-term stability. They do not tell us much about how people perform once novelty wears off, how they handle ambiguity, how they learn, or how they function once the script ends. The study is strongest when read as evidence about early hiring flow. It carries far less weight as proof that AI identifies better employees in any broader sense.
Another point in the paper deserves more attention than it is getting. Recruiters placed more weight on standardised test scores when evaluating AI-led interviews. That shift is not trivial. As interaction becomes narrower and more standardised, decisions can start leaning more heavily on artefacts that are easier to compare and easier to defend. What looks like cleaner data can simply be narrower data. Narrower data often carries existing biases more efficiently rather than reducing them.
Candidate reactions also tell part of the story. Around 5% of candidates exited the process when they realised they were being interviewed by AI. At the same time, most applicants who had a choice selected the AI interviewer. That probably reflects predictability, lower social strain, or reduced fear of arbitrary human judgement. It still leaves an important distinction in place. Candidate comfort and selection quality are related questions, but they are not the same question. The process is screening candidates while also shaping who stays willing to participate.
The bigger issue sits beyond this single experiment. As AI interviewing tools spread, organisations may flatten the interview, reduce behavioural complexity, improve a few early metrics, and then read that narrowing as insight. The process may become more orderly, more scalable, and easier to justify internally. None of that proves it has become especially good at reading people.
The study does not show that AI has solved the hard problem of selection. It shows that many organisations have tolerated weak interview practice for so long that basic consistency now passes for innovation.
The more useful question is not whether hiring should be automated. It is why interview quality was allowed to become so loose that a script now looks like progress.
References
Campion, M. A., Palmer, D. K., & Campion, J. E. (1997). A review of structure in the selection interview. Personnel Psychology, 50(3), 655–702.
Huffcutt, A. I., & Murphy, S. A. (2023). Structured interviews: Moving beyond mean validity. Industrial and Organizational Psychology, 16(3), 344–348.
Jabarian, B., & Henkel, L. (2025). Voice AI in firms: A natural field experiment on automated job interviews (Working paper).
Levashina, J., Hartwell, C. J., Morgeson, F. P., & Campion, M. A. (2014). The structured employment interview: Narrative and quantitative review of the research literature. Personnel Psychology, 67(1), 241–293.


