Evaluating an agent before buying
Before you make an offer on a listing, walk through these five checks.
1. Eval suite exists
If the seller can't show you a passing eval run, assume the agent breaks on edge cases.
2. Tool schemas are stable
Breaking changes to tool schemas are the most common post-transfer regression.
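One way to test schema stability is to diff two versions of a tool's parameter schema and flag anything that would break an existing caller. The sketch below assumes the schemas are JSON-Schema-style dicts with `properties` and `required` keys. The function name `breaking_changes` and the sample `search` schemas are hypothetical and not taken from any particular listing or framework.

```python
from typing import Any


def breaking_changes(old: dict[str, Any], new: dict[str, Any]) -> list[str]:
    """List changes that would break a caller written against `old`.

    Assumes JSON-Schema-style dicts; only top-level parameters are checked.
    """
    problems: list[str] = []
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})
    old_required = set(old.get("required", []))
    new_required = set(new.get("required", []))

    for name, spec in old_props.items():
        if name not in new_props:
            # Existing calls that pass this parameter may now be rejected.
            problems.append(f"removed parameter: {name}")
        elif spec.get("type") != new_props[name].get("type"):
            problems.append(
                f"type changed for {name}: "
                f"{spec.get('type')} -> {new_props[name].get('type')}"
            )

    # A newly required parameter breaks every existing call that omits it.
    for name in sorted(new_required - old_required):
        problems.append(f"newly required parameter: {name}")

    return problems


# Hypothetical schemas for a "search" tool, before and after a change.
old_schema = {
    "type": "object",
    "properties": {"query": {"type": "string"}, "limit": {"type": "integer"}},
    "required": ["query"],
}
new_schema = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "limit": {"type": "string"},
        "region": {"type": "string"},
    },
    "required": ["query", "region"],
}

for problem in breaking_changes(old_schema, new_schema):
    print(problem)
```

Adding an optional parameter is not flagged, since existing calls keep working. Nested objects, enums, and renamed tools would need extra checks that this sketch leaves out.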
3. Memory is exportable
You should be able to dump every memory cell to JSON and re-import it.
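A quick exportability check is a round trip: dump the memory to JSON, load it back, and confirm nothing changed. The sketch below assumes each memory cell can be represented as a plain dict. The helper names and the sample cell fields (`id`, `text`, `created_at`) are hypothetical, and a real agent's memory store may expose a different shape.

```python
import json


def export_memory(cells: list[dict]) -> str:
    """Serialize memory cells to a JSON string."""
    return json.dumps(cells, sort_keys=True, ensure_ascii=False)


def import_memory(payload: str) -> list[dict]:
    """Load memory cells back from a JSON string."""
    return json.loads(payload)


def round_trips(cells: list[dict]) -> bool:
    """True if exporting then importing yields identical cells."""
    return import_memory(export_memory(cells)) == cells


# Hypothetical memory cells, for illustration only.
sample_cells = [
    {"id": "m1", "text": "Customer prefers email", "created_at": "2024-01-05T10:00:00Z"},
    {"id": "m2", "text": "Refund policy is 30 days", "created_at": "2024-02-11T08:30:00Z"},
]

print(round_trips(sample_cells))

# A tuple comes back from JSON as a list, so this cell does not survive intact.
lossy_cells = [{"id": "m3", "tags": ("billing", "vip")}]
print(round_trips(lossy_cells))
```

The second case shows why the check matters: an export can succeed and still lose information if the memory holds types that JSON cannot represent exactly.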
4. Observability is included
Traces for the last 30 days should transfer with the business.
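To verify the 30-day claim, you can check which days in the window have no traces at all. This is a minimal sketch that assumes you can extract a timestamp per trace from whatever observability tool the seller uses. The function name `missing_trace_days` and the sample data are hypothetical.

```python
from datetime import date, datetime, timedelta


def missing_trace_days(
    trace_timestamps: list[datetime], today: date, window_days: int = 30
) -> list[date]:
    """Return the days in the window before `today` that have no traces."""
    covered = {ts.date() for ts in trace_timestamps}
    # The window is the `window_days` days before today, oldest first.
    window = [today - timedelta(days=offset) for offset in range(window_days, 0, -1)]
    return [day for day in window if day not in covered]


# Hypothetical data: one trace per day in March 2024, except March 15.
traces = [datetime(2024, 3, day, 12, 0) for day in range(1, 31) if day != 15]

gaps = missing_trace_days(traces, today=date(2024, 3, 31))
print(gaps)
```

An empty day is not proof of missing data, since the agent may simply have been idle, but each gap is worth asking the seller about before the transfer.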
5. Runbook is specific
A one-line runbook is a red flag. A ten-page runbook with concrete, step-by-step procedures is a good sign.