The Cheating Problem Is an AI Mirror

Jay Caspian Kang's New Yorker piece on academic despair in the AI age poses the question with uncomfortable clarity: was it always true that half of students would cheat if it were easy enough? The question sounds like it's about students. It is actually about systems. And two academic papers published this week suggest that the systems we are deploying as solutions are themselves broken in structurally similar ways.

Overconfident Machines, Unverified Answers

A 2026 paper on arXiv by Michael, BenShushan, Bien, and Moore, 'Confidence Calibration in Large Language Models', finds that LLMs systematically misrepresent their own certainty across diverse tasks. They are, in a very specific sense, confident cheaters: they produce fluent outputs that signal authority without possessing it. Separately, Imran and Bulathwela's paper on AI tutoring systems finds that intelligent tutoring tools routinely reward correct-looking answers without examining the reasoning behind them, which is precisely the problem professors say AI-assisted student submissions create. The irony is close to perfect: we are deploying overconfident, reasoning-blind AI systems to catch students who are using overconfident, reasoning-blind AI systems to cheat.

What the Classroom Actually Measures Now

The despair Kang documents is not really about academic integrity in the traditional sense. It is about what higher education was actually selling: the performance of effortful reasoning as a proxy for genuine understanding. AI has simply made the gap between performance and understanding visible, which is uncomfortable because that gap was always there. A 2026 paper on arXiv by Belotti et al., 'Artificial Effort', examines how AI disrupts real-effort tasks in which cognitive cost was assumed to be the point. When the cost collapses, so does the signal. The classroom is not uniquely broken. It is just the first institution to have its measurement infrastructure publicly exposed. TurboFund's piece on flipping screen time from passive to active offers, accidentally, the most useful framework here.