[AgentGuard]
All writing
6 min read

AI-powered hacking went industrial. Here's what changes if you run agents.

Google found the first AI-built zero-day in a planned mass-exploitation event. A builder's read on what changes for small operators running agents.

Share LinkedIn

AI-powered hacking went industrial. Here's what changes if you run agents.

In May 2026 Google's Threat Intelligence Group (GTIG) put out a report that should change how you think about the agents in your business. The full write-up is on the Google Cloud blog.

The headline: they found the first AI-built zero-day used in a planned mass-exploitation event. A threat actor had a previously unknown flaw, a 2FA bypass in a Python script for a popular open-source system, and used AI-assisted code to weaponize it. Google caught it early and likely stopped the mass-exploitation use. Separately, GTIG and Mandiant reporting also describes a North Korean state group using AI to test and validate large batches of exploits.

That is the shift. AI-assisted hacking moved from scattered experiments to repeatable industrial work. Same models you use to build. Now run at scale against everyone.

This is not a fear piece. I run AI agents in a small business and I write a budget guard for them. So here is the honest read: what this actually changes for a small operator, and what it does not.


What did NOT change

Start here, because the headlines make it sound like the sky moved. It did not.

No new class of attack. Google was clear on this. AI did not invent a new way in. The 2FA bypass was a normal software flaw. The exploits the state actors tested were normal exploits. What AI gave attackers was speed and scale: find flaws faster, weaponize them faster, test thousands instead of dozens, write better malware. It is a capability uplift on attacks that already existed.

The defense did not change either. Patch your stuff. Use real auth. Do not run untrusted code. Cap what an automated system can do without a human looking. None of that is new advice. AI raised the volume of attacks. It did not retire the playbook.

You were never the target of the zero-day. A planned mass-exploitation event goes after high-value, widely-deployed systems. Your client's invoice agent is not on that list. If you run a small operation, the realistic threat is the volume stuff: more phishing, more credential stuffing, more automated probing of anything you expose.

So: do not panic. The fundamentals still work.


What DID change

Now the part that matters.

The same capability that makes your agent useful is now standard equipment for the other side. An AI agent that can read a codebase, reason about it, and write code is a productivity tool when you point it at your work. It is an exploitation tool when someone points it at yours. The skill is symmetric. The cost of running it dropped to near zero.

For a small operator, that cashes out as two concrete things.

One: automated attacks got cheaper, so you will see more of them. Probing your endpoints, fuzzing your forms, guessing your credentials. It used to take attacker effort to do that at scale. Now it does not. Volume goes up. Anything you have exposed and forgotten about is more likely to get found.

Two: your agents are now part of the attack surface. This is the one builders miss. An AI agent that reads email, browses the web, or processes user-submitted text is consuming untrusted input. A prompt-injection payload buried in that input can turn your helpful agent into someone else's tool. The agent has your credentials, your API keys, your database access. That is the prize. You built it. You handed it the keys. An attacker just needs to talk to it.


The defensive practice

Four things. None of them are hard. All of them are skipped constantly.

#DefenseWhat it stops
1Fence untrusted inputPrompt injection from emails, web pages, files
2Vet your skills and MCP serversMalicious instructions loaded as helpers
3Cap unsupervised blast radiusHijacked agent doing irreversible things alone
4Runtime budget guardRunaway cost and damage when 1-3 fail

Fence untrusted input. Anything your agent reads that came from outside, email bodies, web pages, uploaded files, user form text, is untrusted. Treat it as data, never as instructions. Keep it in a clearly separated part of the prompt. Tell the model explicitly that content in that section cannot override its rules. This is the single highest-use habit. Most agents skip it because the agent works fine in testing, and testing input is never hostile.

Know where your skills come from. If your agent loads skills, MCP servers, or plugins, you are running someone else's instructions inside your agent. A skill is just markdown the agent treats as a playbook. A malicious one is a playbook for getting owned. Pin versions. Read what you install. Do not pull an agent skill off a random repo and wire it into a system that holds client data.

Cap the unsupervised blast radius. Decide what your agent can do with zero human in the loop, and make that set small. Reading data, drafting a response, flagging something for review: fine to run alone. Sending money, deleting records, emailing a client, changing prod: those get a human gate, or a hard limit, or both. The question to answer for every agent is simple. If this thing gets hijacked at 3am, what is the worst it can do before anyone notices? Make that answer boring.

Put a runtime budget guard on your agents. A hijacked agent does not announce itself. It just starts doing more: more API calls, more tokens, more loops. A budget ceiling turns "surprise four-figure bill and a breach" into "the agent hit its cap and stopped." It is not a security tool by itself. It is the backstop that limits the damage when one of the other three fails. And one of them will.


The honest summary

AI-powered hacking going industrial is real, and Google's report is worth taking seriously. But for a small operator the takeaway is not panic. It is discipline.

The attacks got cheaper and more frequent. Your agents became a target worth hijacking. The defense is the same boring practice it always was, applied to a new surface: do not trust outside input, do not run code you did not vet, do not let automation do irreversible things alone, and cap the damage when something slips.

If you run agents in production, the cheapest of those four to fix first is the budget ceiling. AgentGuard is a runtime budget, token, and rate limiter for AI agents. One decorator, a hard cap, no surprise bills, and a backstop if an agent ever gets turned against you.

Install AgentGuard.

Want more like this?

AI agent builds, real costs, what works. M-F only when there is something worth sending. No fluff.

PH

Patrick Hughes

Building BMD HODL — a one-person AI-operated holding company. Nashville, Tennessee. Twenty-Two agents.

More writing