I had a 600-line LangChain mess. Replaced with 40 lines: prompt + 4 tools + a while loop + retry-on-tool-error. Same accuracy on my eval set. 3x faster. 8x cheaper. Frameworks are training wheels — keep using them while you learn, drop them when you ship.
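A minimal sketch of that 40-line shape: prompt + tools + a while loop + retry-on-tool-error. The model call is stubbed so this runs standalone; the tool name and messages are invented for illustration, not from the original post.

```python
def search_orders(order_id: str) -> str:
    """Hypothetical tool; a real one would hit your orders DB."""
    return f"order {order_id}: shipped"

TOOLS = {"search_orders": search_orders}

def call_model(messages):
    # Stub standing in for an LLM call. A real version would send
    # `messages` to the model and parse out either a tool call or a
    # final answer.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "search_orders", "args": {"order_id": "A17"}}
    return {"answer": "Your order A17 has shipped."}

def run_agent(user_msg: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": user_msg}]
    for _ in range(max_steps):
        out = call_model(messages)
        if "answer" in out:
            return out["answer"]
        try:
            result = TOOLS[out["tool"]](**out["args"])
        except Exception as e:
            # Retry-on-tool-error: feed the failure back as text and loop.
            result = f"tool error: {e}"
        messages.append({"role": "tool", "content": result})
    return "gave up after max_steps"

print(run_agent("Where is order A17?"))
```

The whole trick is that the loop, not a framework, owns control flow: the model proposes, the loop executes, errors become ordinary messages.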
1. One agent per stable verb. ("Triage", "Refund", "Schedule")
2. State lives in Postgres, not the agent.
3. Tool errors get returned as text, never thrown.
4. Every action is reversible OR confirmed by a human.
5. If the agent loops more than 3 times, kill it.
Boring works.
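Rules 3 and 5 fit in a few lines. A sketch, with invented names (`safe_tool`, `flaky_lookup`) standing in for real tools:

```python
MAX_LOOPS = 3

def safe_tool(fn, **kwargs) -> str:
    """Rule 3: tool errors get returned as text, never thrown."""
    try:
        return str(fn(**kwargs))
    except Exception as e:
        return f"TOOL_ERROR: {type(e).__name__}: {e}"

def flaky_lookup(order_id: str) -> str:
    # Stand-in for a tool that fails at runtime.
    raise TimeoutError("orders db unreachable")

def agent_step(loop_count: int) -> str:
    """Rule 5: if the agent loops more than MAX_LOOPS times, kill it."""
    if loop_count > MAX_LOOPS:
        raise RuntimeError("agent killed: loop limit exceeded")
    return safe_tool(flaky_lookup, order_id="A17")

print(agent_step(1))  # the model sees the failure as plain text
```

Returning errors as text keeps the model in the loop (it can retry or ask for help); the hard cap keeps the loop from becoming a bill.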
Spent a week debugging brittleness in JSON-mode tool calls. Switched to native function calling — same model, same tools, same prompts. Brittleness gone. The model understands "call this function" much better than "produce this JSON shape." Don't fight the trained behavior.
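The switch looks like this in practice: instead of prompting "reply with this JSON shape," you pass the tool as a schema through the API's native tool-calling field. OpenAI-style schema shown as one example; the tool name and fields are made up, and the actual API call is commented out since it needs a key.

```python
# A tool declared as a schema, not as a JSON shape described in prose.
refund_tool = {
    "type": "function",
    "function": {
        "name": "issue_refund",
        "description": "Issue a refund for an order.",
        "parameters": {
            "type": "object",
            "properties": {
                "order_id": {"type": "string"},
                "amount_cents": {"type": "integer"},
            },
            "required": ["order_id", "amount_cents"],
        },
    },
}

# A real call would look roughly like (requires the openai package + key):
# resp = client.chat.completions.create(
#     model="gpt-4o", messages=messages, tools=[refund_tool]
# )
# The model then emits structured tool_calls instead of free-form JSON text,
# which is the behavior it was actually trained on.

print(refund_tool["function"]["name"])
```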
I see all the agent-graph posts. I've never met one that made it past day 30 in production without one of: (a) infinite loops, (b) cost blowups, (c) silent stalls. Convince me yours is different. Specific architectures welcome.
Claude 3.7 with a fairly normal customer-support prompt. Every response opens with "I apologize for any confusion." Even when there's no confusion. Tried 5 prompt variations. Still apologizes. Is this a known failure mode or is my prompt cursed?
The voice
Editorial. Specific. Real numbers. Don't bury the lede. Don't leverage, unlock, or empower anything. If you wouldn't say it in a coffee shop, don't post it here.