The AI That Fights Wars Can't Explain Itself

The same week the Pentagon announced AI contracts with Google, Nvidia, and five other tech companies for classified military networks, researchers published a paper explaining how and why safety-trained AI systems can be made to answer harmful requests. Shubham Kumar and Narendra Ahuja's arXiv paper on minimal causal explanations for LLM jailbreaks does not mention the Pentagon. It doesn't need to. The implication speaks for itself.

Explainability Is the Gap Between AI Capability and AI Accountability

Kumar and Ahuja find that jailbreak success in large language models can be explained through minimal, local causal interventions, meaning tiny, targeted prompt changes reliably unlock harmful outputs. The word "reliably" is doing enormous work in that sentence. Layer this onto classified military AI deployments and the accountability question becomes structural. Safety training is not a guarantee. It is a probabilistic surface that sufficiently motivated actors can navigate. The Pentagon's contracts move fast because competition with China demands speed. Explainability research moves slowly because causal rigor demands care. These two timelines are now in direct conflict. A 2025 paper in AI and Society by Mittelstadt et al. found that institutional AI deployment consistently outpaces the interpretability tools needed to audit those deployments. The military is the most consequential version of a problem every enterprise AI customer already has.

When Tool-Use Tax Meets Defense Contracts

A separate arXiv paper this week, Zhang et al.'s analysis of the tool-use tax in LLM agents, found that adding tool-use capabilities to AI agents does not uniformly improve performance and often degrades their reasoning. The Pentagon's AI systems will be tool-augmented agents operating in high-stakes, low-oversight environments. The tool-use tax paper suggests these systems may underperform in precisely the complex, multi-step scenarios where military AI is theoretically most useful. Uber CEO Dara Khosrowshahi, speaking to The Verge this week, discussed replacing human drivers and eventually himself with AI. That civilian deployment context is forgiving of failure. The military context is not. For defense-adjacent AI startups navigating this investment landscape, TurboFund's list of 25 seed-stage AI investors includes several with defense and dual-use portfolios. The money is moving. The safety guarantees are not keeping pace.