Cheap models, broken loops: the agent infrastructure gap

The hardware headline isn't your problem

Nvidia and LG are building humanoid robots in South Korea. It's a 10-point story on Hacker News, which tells you how much builders currently care about hardware partnerships versus their own shipping pipelines. The announcement is here: https://blogs.nvidia.com/blog/nvidia-and-lg-group-ai-factory/

The more relevant news is cheaper and quieter. DeepSeek drove inference costs down, but keeping them there will take billions in capital expenditure: https://chinacompany.substack.com/p/deepseek-made-ai-cheap-now-it-needs. That matters because cheap tokens were supposed to make agents economical. They do. But only if the agent doesn't get stuck.

Intent debt eats cheap tokens

Addy Osmani argues that AI agents can't help with intent debt, the gap between what a user wants and what they articulate: https://addyosmani.com/blog/intent-debt/. You can pump cheap completions through a model, but if the intent is muddy, the agent amplifies the ambiguity instead of resolving it.

This isn't theoretical. In the BusellAI community, builders are reporting infinite loops in ReAct agents unless they hard-cap iterations. The fix is mechanical: enforce a maximum step count. The cause is structural. The agent lacks enough context to know it's going in circles, because the intent was never fully translated into constraints. Every wasted loop is a tax on the very cost savings DeepSeek created. A ReAct cycle that runs ten steps when two would have done isn't a pricing problem. It's a design problem.
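Here is a minimal sketch of what an iteration guard can look like. It is not taken from any particular framework: the step signature, the function name run_capped, and the default limits are all assumptions made for illustration. It stops on a hard cap, and it also stops early if the agent keeps issuing the same action, which is one cheap signal that it is going in circles.

```python
from typing import Callable, Dict, Hashable, Optional, Tuple

# Hypothetical step signature (an assumption, not a framework API):
# takes the turn index, returns (action, final_answer).
# final_answer stays None until the agent decides it is done.
Step = Callable[[int], Tuple[Hashable, Optional[str]]]


def run_capped(
    step: Step, max_iters: int = 8, max_repeats: int = 2
) -> Tuple[str, Optional[str]]:
    """Run an agent loop with a hard iteration cap and a repeated-action check."""
    seen: Dict[Hashable, int] = {}
    for turn in range(max_iters):
        action, answer = step(turn)
        if answer is not None:
            return "done", answer
        # Count identical actions; too many repeats means the agent is stuck.
        seen[action] = seen.get(action, 0) + 1
        if seen[action] > max_repeats:
            return "loop_detected", None
    # The agent never finished and never repeated enough to trip the check.
    return "cap_reached", None


def stuck_step(turn: int):
    # Toy agent that issues the same search forever.
    return ("search", "weather in Seoul"), None


def good_step(turn: int):
    # Toy agent that searches once, then answers.
    if turn == 0:
        return ("search", "weather in Seoul"), None
    return ("finish", ""), "Sunny"


if __name__ == "__main__":
    print(run_capped(stuck_step))
    print(run_capped(good_step))
```

Returning a status string instead of raising lets the caller decide whether a tripped guard should trigger a clarification question, a retry with more context, or a plain failure.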

State tracking is the new bottleneck

AgentBench v2 results surfaced a specific failure mode: state tracking bottlenecks in multi-turn tasks. When agents run across multiple tool calls or reasoning steps, internal state decays or collides. The model weights haven't changed. The prompt hasn't changed. The state management failed. When state collides, you don't always get an error. You get a confident wrong answer. That's worse.
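One way to turn a silent collision into a loud one is to version every state key and reject writes made against a stale read. The sketch below assumes that "collision" means two steps writing the same key without seeing each other's update, which is an interpretation, not something the benchmark results spell out. AgentState, StaleWriteError, and the method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


class StaleWriteError(Exception):
    """Raised when a step writes against a version it did not read."""


@dataclass
class AgentState:
    values: Dict[str, Any] = field(default_factory=dict)
    versions: Dict[str, int] = field(default_factory=dict)
    # Per-turn write log: (turn, key, value), for observability.
    log: List[Tuple[int, str, Any]] = field(default_factory=list)

    def read(self, key: str) -> Tuple[Any, int]:
        """Return the current value and the version the caller must echo back."""
        return self.values.get(key), self.versions.get(key, 0)

    def write(self, turn: int, key: str, value: Any, expected_version: int) -> None:
        """Write only if nobody else changed the key since the caller read it."""
        current = self.versions.get(key, 0)
        if expected_version != current:
            raise StaleWriteError(
                f"turn {turn}: {key!r} is at v{current}, writer saw v{expected_version}"
            )
        self.values[key] = value
        self.versions[key] = current + 1
        self.log.append((turn, key, value))


if __name__ == "__main__":
    state = AgentState()
    _, version = state.read("city")
    state.write(0, "city", "Seoul", version)
    try:
        # A later step reuses the old version number and is rejected.
        state.write(1, "city", "Busan", version)
    except StaleWriteError as err:
        print("rejected:", err)
    print(state.log)
```

The log doubles as the per-turn instrumentation: when an answer is wrong, you can replay which turn wrote which value instead of guessing.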

Adrian Ferrera makes a parallel point about software quality: AI doesn't write good code, the environment does: https://adrianferrera.dev/en/blog/ai-does-not-write-good-code. Linting, testing, type systems, and review catch what the model misses. For agents, the equivalent infrastructure is state observability, iteration guards, and intent clarification loops. Nobody has standardized these yet. Every team is rebuilding guardrails from scratch.

The crawler war and the cost stack

On the infrastructure side, the Caddy Defender plugin returns garbage responses to AI crawlers: https://github.com/jasonlovesdoggo/caddy-defender. It scored 3 points on Hacker News, the same as the environmental cost report. Both signal fatigue. The plugin is publishers fighting scrapers. The report is UNU data showing carbon, water, and land footprints climbing: https://unu.edu/inweh/collection/environmental-cost-of-AIs-Enrgy-Use-Carbon-water-and-land-footprints.

Builders are now caught between wanting cheap inference and paying for it in compute overhead and hostile web infrastructure. The Caddy plugin is a small open-source salvo in a larger war over who gets to train on what, and who pays. The Democrats' new AI proposals add another layer of uncertainty: https://www.wsj.com/tech/ai/democrats-unveil-flood-of-ai-proposals-in-potential-challenge-to-tech-giants-16819bfc. None of this changes what you ship tomorrow, but it thickens the air around capital allocation.

What this means for builders

Treat agent failures as architecture problems, not model problems. Cap your ReAct iterations, instrument state at every turn, and build intent clarification into the first step. The models are cheap enough now; your job is to keep them from wasting money on loops.
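As a rough illustration of intent clarification as a first step, the sketch below refuses to start the agent until the request carries a goal, a stopping condition, and a budget. The Intent fields and the wording of the questions are assumptions chosen for the example. A real system would pick constraints that fit its own tasks.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Intent:
    # Hypothetical fields: what to do, how to know it is done, and a budget.
    goal: str
    done_when: Optional[str] = None
    max_tool_calls: Optional[int] = None


def clarifying_questions(intent: Intent) -> List[str]:
    """Return the questions to ask the user before the agent takes any step."""
    questions: List[str] = []
    if not intent.goal.strip():
        questions.append("What should the agent accomplish?")
    if not intent.done_when:
        questions.append("How will we know the task is finished?")
    if intent.max_tool_calls is None:
        questions.append("How many tool calls is this task worth?")
    return questions


if __name__ == "__main__":
    vague = Intent(goal="Summarize the report")
    for question in clarifying_questions(vague):
        print(question)

    clear = Intent(
        goal="Summarize the report",
        done_when="A summary under 200 words exists",
        max_tool_calls=5,
    )
    print(clarifying_questions(clear))
```

An empty list means the intent has been translated into constraints, and the budget it carries can feed straight into whatever iteration cap the loop enforces.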

Today's discussions

ReAct loops are a design tax, not a model failure.
AgentBench v2 shows state decay is the new latency.
Cheap inference only helps if your agent stops when it's done.