When Not to Use an AI Agent
Most AI advice tells you to ship more agents. Here is the honest opposite: the four times a plain script and a human beat an agent, learned from running a fleet daily.
When everyone is shipping AI agents, the useful question is the opposite one: when should you not?
I run a one-person operation with a fleet of agents doing real work every day: writing drafts, scoring leads, checking that scheduled jobs actually did their job. They earn their keep. But I have also wired up agents for tasks that a plain script would have done better, cheaper, and with less drama. Here is where I draw the line now.
Use a script, not an agent, when the steps never change
An agent is worth it when the input is messy and the right next step depends on judgment. If the steps are fixed, you do not need a model in the loop. You need a function.
I once had an agent renaming files and moving them into dated folders. It worked. It also cost tokens, took ten seconds instead of ten milliseconds, and failed in a new creative way about once a week. A six-line script replaced it and has not been touched since. The rule: if you can write the if-statements, write the if-statements.
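For a sense of scale, here is a minimal sketch of what such a script could look like. It is not the original six-line script: the folder layout (`root/YYYY-MM-DD/`), the use of the file's modification time, and the function name `file_into_dated_folder` are all assumptions made for illustration.

```python
from datetime import datetime
from pathlib import Path
import shutil


def file_into_dated_folder(path: Path, root: Path) -> Path:
    """Move `path` into root/YYYY-MM-DD/, dated by the file's modification time.

    The folder naming scheme is an assumption, not taken from the post.
    """
    day = datetime.fromtimestamp(path.stat().st_mtime).strftime("%Y-%m-%d")
    target_dir = root / day
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / path.name
    if target.exists():
        # Fail loudly instead of silently overwriting an existing file.
        raise FileExistsError(f"refusing to overwrite {target}")
    shutil.move(str(path), str(target))
    return target
```

Every branch is an if-statement you can read. There is no model call, no token cost, and the only failure mode is an exception with a file path in it.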
Skip the agent when a wrong answer is expensive and silent
Agents are confident even when they are wrong. That is fine when a human reviews the output before it matters. It is dangerous when the output flows straight into something irreversible.
Money movement, production deletes, sending email to real customers. I let agents draft those actions. I do not let them commit those actions without a gate. A wrong trade or a wrong DELETE does not announce itself. By the time you notice, the damage is done. Put a human or a hard rule between the agent and the irreversible step.
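One way to put that gate in code is sketched below. It assumes the agent only ever produces a proposal object, and that a separate function decides whether the proposal runs. The action names in `IRREVERSIBLE`, the `ProposedAction` type, and the `execute` function are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical action kinds that must never run without approval.
IRREVERSIBLE = {"transfer_funds", "delete_rows", "send_email"}


@dataclass(frozen=True)
class ProposedAction:
    kind: str      # what the agent wants to do
    summary: str   # human-readable description shown to the reviewer


def execute(
    action: ProposedAction,
    run: Callable[[ProposedAction], object],
    approve: Callable[[ProposedAction], bool],
) -> str:
    """Run `action` only if it is reversible or explicitly approved."""
    if action.kind in IRREVERSIBLE and not approve(action):
        return "blocked"
    run(action)
    return "executed"
```

The `approve` callback can be a human prompt, for example `lambda a: input(f"Approve {a.summary}? [y/N] ").strip().lower() == "y"`, or a hard rule such as an amount ceiling. The agent drafts the action, and something other than the agent commits it.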
Do not reach for an agent to dodge a decision you have not made
This is the one that bit me most. When I did not actually know what I wanted, I would hand the fuzzy problem to an agent and hope it would figure out the goal. It cannot. It will pick a goal, usually a plausible wrong one, and pursue it with energy.
An agent amplifies a clear intent. It does not supply one. If you cannot write the success condition in one sentence, the agent is not your problem. The missing decision is.
Watch the cost of the loop, not the cost of one call
A single model call is cheap. An agent that retries, reflects, and calls tools in a loop is not. The cost is the loop, and loops can run away.
I learned this the boring way. A draft in my own publishing pipeline failed a length check, so the repair agent fixed it, resubmitted, failed again, and did that twenty-five times before anything flagged it. No single run looked expensive. The total was real, and the task was dead on arrival. Unbounded retries are how a helpful agent quietly burns your budget.
That last failure mode is exactly why I built AgentGuard. It caps spend, token use, and call counts per run, so a stuck loop stops itself instead of running until you happen to look. An agent should fail loud and cheap, not silent and expensive.
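The general idea of a per-run cap can be sketched in a few lines. This is not AgentGuard's API, which the post does not show. `RunBudget`, `BudgetExceeded`, and `repair_until_valid` are hypothetical names, and the repair step is assumed to report its own token and dollar usage.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


class BudgetExceeded(RuntimeError):
    """Raised when a run has used up one of its caps."""


@dataclass
class RunBudget:
    max_calls: int
    max_tokens: int
    max_cost_usd: float
    calls: int = 0
    tokens: int = 0
    cost_usd: float = 0.0

    def start_call(self) -> None:
        """Refuse the next call if any cap has already been reached."""
        if self.calls >= self.max_calls:
            raise BudgetExceeded(f"call cap reached: {self.calls}/{self.max_calls}")
        if self.tokens >= self.max_tokens:
            raise BudgetExceeded(f"token cap reached: {self.tokens}/{self.max_tokens}")
        if self.cost_usd >= self.max_cost_usd:
            raise BudgetExceeded(f"cost cap reached: ${self.cost_usd:.2f}")
        self.calls += 1

    def record(self, tokens: int, cost_usd: float) -> None:
        """Add the usage of the call that just finished."""
        self.tokens += tokens
        self.cost_usd += cost_usd


# Assumed shape of a repair step: returns (new_draft, tokens_used, cost_in_usd).
RepairStep = Callable[[str], Tuple[str, int, float]]


def repair_until_valid(
    draft: str,
    repair: RepairStep,
    is_valid: Callable[[str], bool],
    budget: RunBudget,
) -> str:
    """Retry the repair step until the draft passes or the budget runs out."""
    while not is_valid(draft):
        budget.start_call()  # raises before the call that would break the call cap
        draft, tokens, cost_usd = repair(draft)
        budget.record(tokens, cost_usd)
    return draft
```

With `max_calls=3`, the stuck length-check loop would have raised `BudgetExceeded` on the fourth attempt instead of reaching the twenty-fifth. Token and cost caps are checked before each call, so a run can overshoot them by at most one call's usage.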
A short test before you build one
Ask three questions. Does the task need judgment, or just rules? If a step goes wrong, will someone notice before it hurts? Can I state the goal in one sentence?
If the answers are rules, no, and no, you do not want an agent. You want a script with a human nearby. Agents are good. They are not the answer to every box on the board, and pretending otherwise is the fastest way to a surprising bill and a quiet mistake.
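The three questions can also be written down as a tiny checklist function. This is a sketch only: the name `reasons_to_skip_agent`, the parameter names, and the choice to report each answer as its own warning are assumptions made for illustration.

```python
from typing import List


def reasons_to_skip_agent(
    needs_judgment: bool,
    failure_noticed_in_time: bool,
    goal_in_one_sentence: bool,
) -> List[str]:
    """Return the reasons an agent is a poor fit; an empty list means none found."""
    reasons = []
    if not needs_judgment:
        reasons.append("the steps are rules: write a script")
    if not failure_noticed_in_time:
        reasons.append("a wrong answer would be silent: add a human or hard-rule gate")
    if not goal_in_one_sentence:
        reasons.append("the goal is undecided: make the decision first")
    return reasons
```

Answering "rules, no, and no" returns all three reasons. A fuzzy task with reviewable stakes and a clear goal returns an empty list.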
Build the agent when the task is genuinely fuzzy and the stakes are reviewable. Bound every loop. And if you want hard budget, token, and rate limits around your agents so a runaway loop cannot drain your account, that is what I built AgentGuard for: https://bmdpat.com/tools/agentguard
Patrick Hughes
Building BMD HODL, a one-person AI-operated holding company. Nashville, Tennessee. Twenty-two agents.