May 6, 2026 | 6 min read

Your AI Agent Will Eventually Delete Prod

PocketOS lost their production database backups to a Cursor agent. Here's what runtime spend rails actually catch, what they don't, and the layered defense your agents need before production.

#agent-safety #agentguard #incident-postmortem #ai-agents

PocketOS lost their production database backups to a Cursor agent. The team posted the postmortem on r/devops and it spread fast because every builder reading it saw their own setup in the failure mode.

Here is what happened, what runtime budget rails like AgentGuard actually catch in this scenario, and the honest gap that no single tool fixes.

What the PocketOS team reported

The team was using Cursor as a coding agent against a real production environment. The agent was instructed to clean up some unused files. It interpreted the scope wider than intended and ran destructive commands against the host. The blast radius included the production database and the backup volume that was mounted on the same host.

Two facts from the postmortem matter most:

The agent had host-level shell access with credentials that could touch prod data.
The backups were not isolated from the destruction path. Same host, same mount, same blast radius.

That is not an AI problem first. It is an infrastructure problem that AI made acute.

The four-bullet root cause

Agent ran with permissions far wider than the task required.
Backups lived on the same blast radius as the thing they were backing up.
No guardrail between "agent decides to act" and "destructive command executes."
No audit trail or pause point that would have caught the scope drift before it ran.

Pull any one of those threads and the incident gets contained. Pull two and it does not happen.
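Two of those root causes can be checked mechanically before an agent starts. Here is a minimal sketch of a pre-flight check, assuming a Unix host. The function name `preflight_problems` and both directory arguments are hypothetical, not something from the PocketOS postmortem. A shared device ID is only a rough proxy for "same blast radius": passing this check does not prove the backups are isolated.

```python
import os


def preflight_problems(data_dir: str, backup_dir: str) -> list:
    """Return reasons the agent should not start. An empty list means OK."""
    problems = []
    # Root can touch everything on the host, so refuse to run as root.
    # os.geteuid exists on Unix only; the check is skipped elsewhere.
    if hasattr(os, "geteuid") and os.geteuid() == 0:
        problems.append("agent process is running as root")
    # The same st_dev means both paths sit on the same filesystem.
    if os.stat(data_dir).st_dev == os.stat(backup_dir).st_dev:
        problems.append("backups share a filesystem with the data they protect")
    # If this process can write to the backup directory, it can delete backups.
    if os.access(backup_dir, os.W_OK):
        problems.append("agent process has write access to the backup directory")
    return problems
```

Run it at agent startup and refuse to continue if the list is non-empty. It will not catch every bad layout, but it would flag the same-host, same-mount setup described above.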

What runtime spend rails actually catch

I build AgentGuard, a Python wrapper that puts a budget and rate limiter around agent loops. People sometimes assume tools like this would have prevented PocketOS. They would not have. Be honest about the boundary.

Here is what AgentGuard and tools in its category catch in real incidents:

Outbound model API spend. If an agent loops on Anthropic or OpenAI calls, the wrapper kills the process when the dollar cap is hit. This is the most common failure mode for new agent builders and the reason AgentGuard exists.
Token burn from runaway loops. Same idea. The agent decided to retry forever, the wrapper notices, the wrapper stops it.
Per-process call rate. Catches the "agent is hammering an endpoint" pattern before the bill or the rate limit does.

That is the actual surface area. It is the dollars and the API calls flowing out of the agent process.
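To make that surface area concrete, here is a minimal sketch of a dollar cap plus a per-minute call limit. This is not AgentGuard's API. `BudgetGuard`, `BudgetExceeded`, and `record_call` are hypothetical names for illustration, and the cost per call is assumed to be supplied by the caller. The sketch raises an exception instead of killing the process.

```python
import time
from typing import Optional


class BudgetExceeded(RuntimeError):
    """Raised when the agent loop should stop."""


class BudgetGuard:
    def __init__(self, max_usd: float, max_calls_per_minute: int):
        self.max_usd = max_usd
        self.max_calls_per_minute = max_calls_per_minute
        self.spent_usd = 0.0
        self._call_times = []

    def record_call(self, cost_usd: float, now: Optional[float] = None) -> None:
        """Call once per model API request with the caller's cost estimate."""
        now = time.monotonic() if now is None else now
        # Sliding window: keep only calls made in the last 60 seconds.
        self._call_times = [t for t in self._call_times if now - t < 60.0]
        self._call_times.append(now)
        self.spent_usd += cost_usd
        if self.spent_usd > self.max_usd:
            raise BudgetExceeded(
                f"spent ${self.spent_usd:.2f}, cap is ${self.max_usd:.2f}"
            )
        if len(self._call_times) > self.max_calls_per_minute:
            raise BudgetExceeded(
                f"more than {self.max_calls_per_minute} calls in 60 seconds"
            )
```

Notice what the guard sees: a cost and a timestamp per model call. Nothing in it looks at the filesystem, the database, or the shell.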

What runtime spend rails do not catch

What these tools miss is the part most posts skip. Here is the gap:

Install-time post-hooks. If a malicious dependency runs code on npm install or pip install, that happens before your wrapper is in the loop. AgentGuard cannot see it.
Host-level destruction. rm -rf against the host filesystem, dropped tables on a connected DB, deleted volumes. None of that goes through the model API. The wrapper has no signal.
Third-party binary side-effects. Agent shells out to a tool. The tool does whatever it does. The wrapper sees the call started and ended. It does not see what the binary touched.
Credential exfiltration. If the agent reads a secret and sends it somewhere, that is a network call but not a model API call. Out of scope.

This is not a knock on the category. Runtime budget rails were never designed to be the only defense. Anyone selling them as such is overselling.

The layered defense PocketOS needed

Spend rails are one layer. The full set looks like this:

1. Permissions hygiene at the agent boundary. The agent should run with the minimum credentials the task requires. Read-only when possible. Scoped tokens always. No root, no host shell, no DB superuser.
2. Backup isolation. Backups live in a separate account or at least a separate host and storage class. Different credentials. The thing that can write to prod cannot delete backups.
3. Git-based DB protections. Migrations through PRs. Schema changes through review. Direct destructive SQL through a break-glass path with logging, not through the agent's default permissions.
4. Dependency scanning. Catches the install-time post-hook class. Tools like Socket, Snyk, or Dependabot for the basics.
5. Runtime spend and rate rails. This is where AgentGuard fits. Catches the most common cost incident, plus the runaway-loop class.
6. Human-in-the-loop on destructive actions. Any DROP, DELETE, rm -rf, or chmod against prod paths pauses for confirmation. Even a brief speed bump would stop most of these incidents.

PocketOS had none of layers 1, 2, 3, or 6 in the path of the agent. That is the real story.
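The human-in-the-loop layer is the easiest one to prototype. Below is a minimal sketch of a confirmation gate for shell and SQL strings. The names `DESTRUCTIVE_PATTERNS`, `is_destructive`, and `gate` are hypothetical. The patterns are illustrative and easy to evade, and this version flags every chmod instead of only those against prod paths. Treat it as a speed bump, not a security boundary.

```python
import re

# Illustrative patterns only; a real deny-list needs far more coverage.
DESTRUCTIVE_PATTERNS = [
    re.compile(r"\brm\s+-[a-zA-Z]*[rR]"),  # rm -r, rm -rf, rm -fr
    re.compile(r"\bchmod\b"),
    re.compile(r"\bdrop\s+(table|database|schema)\b", re.IGNORECASE),
    re.compile(r"\bdelete\s+from\b", re.IGNORECASE),
]


def is_destructive(command: str) -> bool:
    """True if the command matches any pattern in the deny-list."""
    return any(p.search(command) for p in DESTRUCTIVE_PATTERNS)


def gate(command: str, confirm=input) -> bool:
    """Return True if the command may run. Destructive ones need a typed 'yes'."""
    if not is_destructive(command):
        return True
    answer = confirm(f"Agent wants to run: {command!r}. Type 'yes' to allow: ")
    return answer.strip().lower() == "yes"
```

Call `gate(command)` before every command the agent wants to execute, and skip the command when it returns False. The `confirm` parameter defaults to `input`, so it can be swapped for a chat approval or a ticket check without touching the matching logic.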

The pattern to steal

Treat your agent process like a brand-new contractor with shell access. Would you give that contractor root on prod, mounted backups, and no review on destructive commands? No. Then do not give them to the agent either.

The agent does not have to be malicious or hallucinating. It just has to be wrong about scope once. PocketOS shows that the cost of being wrong once can be the entire database and its backups.
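One small way to apply the contractor rule: do not let tools the agent launches inherit your shell's secrets. Here is a minimal sketch, assuming a Unix host. `ALLOWED_ENV`, `scrubbed_env`, and `run_tool` are hypothetical names. This only limits what the child process inherits. It is not a sandbox, and it does nothing about credentials stored on disk.

```python
import os
import subprocess

# Hypothetical allowlist: only these variables reach the child process.
ALLOWED_ENV = ("PATH", "HOME", "LANG")


def scrubbed_env(allowed=ALLOWED_ENV, source=None) -> dict:
    """Copy only allowlisted variables so secrets are not inherited."""
    source = os.environ if source is None else source
    return {k: source[k] for k in allowed if k in source}


def run_tool(argv: list, workdir: str) -> subprocess.CompletedProcess:
    """Run a tool with a minimal environment, a fixed cwd, and a timeout."""
    return subprocess.run(
        argv,
        cwd=workdir,
        env=scrubbed_env(),
        capture_output=True,
        text=True,
        timeout=60,
        check=False,
    )
```

With this in place, a variable like a database URL or cloud key in the parent shell never reaches the tool unless someone adds it to the allowlist on purpose.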

What we ship in agent47

I keep a "Real Incidents" section in the agent47 README where postmortems like this one get logged. PocketOS is the first entry. The point is not to dunk. The point is that every incident in that section is a free lesson about which layer of defense was missing. Read them before you ship your next agent.

If you want the runtime spend layer, AgentGuard is one pip install. It will not save you from a PocketOS-style incident on its own. No single tool will. But it covers one of the six layers, which is one more than most agents ship with today.

Patrick Hughes

Building BMD HODL, a one-person AI-operated holding company. Nashville, Tennessee. Twenty-two agents.