Writing · Tag

#agentguard

59 posts tagged #agentguard.

Jul 5, 2026 · 5 min read
Your local LLM is not a worse Claude. It is a different tool.
Stop scoring your local model on how close it gets to Opus. It is a different tool with a different sweet spot. Here is the line, and which side your work sits on.
#local-llm #open-weights #ai-agents #agentguard
Jul 4, 2026 · 5 min read
A verifier loop beats a faster local model
Local LLMs are useful when the loop proves the output, not when the benchmark looks good. This is the small gate I use before a local coding agent gets more rope.
#local ai #local llm #5090 reports #agentguard
Jun 29, 2026 · 5 min read
Use Owner Gates and AgentGuard to Keep AI Agents Moving
AI agents need two rails before they can run unattended: owner gates for judgment and AgentGuard for spend. Without both, the operator becomes the fallback.
#AI agents #agent ops #async workflows #one-person company
Jun 27, 2026 · 7 min read
AI Agent Memory: What Actually Works in 2026
Most agent memory systems add complexity faster than value. This is the small set that actually compounds for one person running a fleet: files, ledgers, and strict verification.
#ai-agents #memory #agent-ops #one-person-company
Jun 24, 2026 · 4 min read
Your AI Agent Says "Done." Make It Prove It.
AI agents report work as done that they never did. Make every completion a falsifiable claim a script can verify before you trust it.
#AI agents #agent ops #verification #one-person company
Jun 21, 2026 · 4 min read
A self-healing system can't heal an empty queue
Automated recovery only fixes a broken machine. When the real failure is an empty queue, retrying does nothing forever. Two failures, one red box, opposite repairs.
#ai-agents #automation #monitoring #self-healing
Jun 19, 2026 · 4 min read
Missing AI agent cost data is not zero
A spend ledger that counts missing billing data as $0 hides exactly the unattended agent spend you built it to catch.
#ai-agents #cost-control #observability #spend-tracking
Jun 18, 2026 · 5 min read
What Salesforce's 20,000 AI Agent Deployments Teach a Solo Builder
Salesforce shipped roughly 20,000 Agentforce deployments and found 90% of agent work happens after launch. Here is what that means for a solo builder running a small agent fleet.
#ai-agents #agent-deployment #agent-reliability #agentguard
Jun 17, 2026 · 5 min read
What Anthropic's MITRE ATT&CK report means for solo AI builders
#ai-security #agents #anthropic #agentguard
Jun 17, 2026 · 5 min read
57-71% of AI agents leak data between users. Here's what to do.
A June 2026 Mem0 survey of 8 major agent harnesses found that over half of them leak memory across users. Here is why keyword retrieval is a security risk and how to fix it.
#ai-agents #security #agent-memory #agentguard
Jun 12, 2026 · 4 min read
Agentic coding moved my bottleneck to code review
Agentic coding made writing code free. The slow part is now reviewing a queue of plausible PRs.
#agentic coding #ai agents #solo dev #code review
Jun 11, 2026 · 4 min read
Stop Telling People You Have 11 AI Agents
Agent count is a vanity metric. It tells you about volume, not value. Here is what I track instead after running a one-person AI fleet.
#ai-agents #one-person-ops #agent-cost #agentguard
Jun 10, 2026 · 5 min read
Your AI agent doesn't need memory. It needs a file.
I run a one-person company on scheduled agents and gave almost none of them memory. They write to files instead. Here is why that wins.
#ai-agents #one-person-ops #agent-memory #agentguard
Jun 7, 2026 · 6 min read
How to Close the AI Agent Cost Gap at the Call Site
The cost gap between what an AI agent could cost and what it does cost is 40%. You close it at the call site, not in a dashboard. Here is how.
#ai-agents #cost-control #agentguard #python
Jun 7, 2026 · 5 min read
57-71% of AI Agents Leak Data Between Users. Here's the Fix.
A 2026 Mem0 survey found 57-71% cross-user memory contamination across major agent frameworks. Here is why it happens and how to stop it.
#ai-agents #security #agent-memory #agentguard
Jun 7, 2026 · 4 min read
When JPMorgan's AI bill goes up, who controls it?
JPMorgan turned on AI for 250k people. The quiet line is that the usage racks up fees. Here is how to control the bill before it arrives.
#ai-agents #enterprise-ai #cost-control #agentguard
Jun 7, 2026 · 4 min read
Anthropic's IPO and the 40% Cost-Savings Gap: Why Your Spend Cap Matters More Now
Anthropic filed for IPO at a $47B run-rate while 40% of enterprise customers report under 10% cost savings from Claude. Here is how to close that gap.
#ai-agents #cost-control #anthropic #agentguard
Jun 7, 2026 · 4 min read
Your AI Agent's Retry Loop Is a Cost Bug Waiting to Happen
A repair agent in my own pipeline failed the same check 27 times in a row. Each try was a paid model call. Here is why uncapped retries quietly burn money, and the two-line fix.
#ai-agents #cost-control #agentguard
Jun 6, 2026 · 4 min read
What Anthropic's MITRE ATT&CK Report Means for Teams Running AI Agents
Anthropic banned 832 accounts for AI-enabled attacks. What it means for teams running AI agents.
#ai-security #agents #anthropic #agentguard
Jun 6, 2026 · 5 min read
When JPMorgan Turns On AI Bank-Wide, Who Controls the Bill?
JPMorgan just switched on AI for 250,000 employees. The headline is workforce shift. The quiet story is enterprise AI cost, and why token spend runs away without controls.
#ai-agents #enterprise-ai #cost-control #agentguard
Jun 5, 2026 · 4 min read
When Not to Use an AI Agent
Most AI advice tells you to ship more agents. Here is the honest opposite: the four times a plain script and a human beat an agent, learned running a fleet daily.
#ai-agents #agentguard #automation
Jun 4, 2026 · 4 min read
What GitHub Copilot Users Wish They Had a Week Ago
Copilot went usage-based and bills spiked. The fix is a runtime budget cap at the call site.
#ai-agents #cost-control #github-copilot #agentguard
Jun 4, 2026 · 5 min read
What Uber's $1,500/Developer AI Cap Tells You About Your Own Bill
Uber caps every employee at $1,500/month per AI coding tool. The real fix is a per-identity cap in code, not a policy memo.
#ai-cost-control #agentguard #claude-code #runtime-governance
Jun 3, 2026 · 4 min read
When Your Blog Repair Loop Fails 23 Times, Stop Repairing
My blog repair loop chewed on a stale draft for 23 mornings and reported "blocked" every time. The fix was not a smarter retry. It was a TTL and a heal path.
#agents #agentguard #devlog
Jun 1, 2026 · 6 min read
AI-powered hacking went industrial. Here's what changes if you run agents.
Google found the first AI-built zero-day in a planned mass-exploitation event. A builder's read on what changes for small operators running agents.
#ai-security #ai-agents #prompt-injection #agentguard
May 31, 2026 · 5 min read
Token budget wars are starting. Most companies are paying for vibes.
AI billing is shifting from seats and tokens to outcomes. If you cannot tie an agent run to a dollar of work, you are paying for vibes.
#ai-agents #pricing #agentguard #cost-control
May 28, 2026 · 4 min read
AI Jobs vs Entry-Level Work: A Reality Check for Builders
MIT Tech Review says the AI-jobs hysteria is overstated. The real story is cost discipline, not displacement.
#ai-agents #builder-notes #agentguard #market-analysis
May 27, 2026 · 5 min read
Microsoft Told Engineers to Ease Off Claude Code
If Microsoft can't absorb agent inference costs, neither can you. Make the cap a config change, not a memo.
#AI agents #cost control #Claude Code #AgentGuard
May 24, 2026 · 4 min read
Why Starbucks Killed Its AI Inventory Tool After 9 Months
Starbucks pulled its AI inventory tool after 9 months. Here is the pattern that killed it and three guardrails that catch it.
#ai-failures #agentguard
May 15, 2026 · 4 min read
Enterprise AI just shifted: Claude +128%, OpenAI -8%. What it means if you're building.
SaaStr data shows enterprise AI share shifting hard toward Claude. The lesson isn't pick Claude. It's stop hard-coding one vendor.
#ai-agents #model-routing #cost-control #agentguard
May 15, 2026 · 4 min read
AI software runs on 17% margins. SaaS runs on 70%. The token bill is the problem.
AI-native software is shipping at roughly 17% gross margins while traditional SaaS sits near 70%. The token bill ate the unit economics. Here's what's actually broken and how to claw margin back.
#ai-agents #unit-economics #agentguard #pricing
May 14, 2026 · 5 min read
An AI Agent in Sweden Ordered 6,000 Napkins. Here's the 12 Lines of Python That Would Have Stopped It.
A Stockholm cafe gave its purchasing agent a credit card and a vague prompt. $21,000 later it owned 6,000 napkins and no bread. Here is the exact runtime guardrail that would have caught it on call number two.
#agentguard #ai-safety #agent-budget #incident-postmortem
May 14, 2026 · 5 min read
I gave an autotrader $360 and 30 days. I am not adding live money yet.
The May 14 autotrader review is done. The account is up 7.7% before compute, still negative after compute, and still lagging SPY and BTC. Decision: keep V2 paper-only, add no new live money, and revisit after the next scorecard.
#autotrader #agent-runtime-safety #kill-switch #build-in-public
May 9, 2026 · 4 min read
One Agent Skill, Three Registries: PyPI, Claude, and skills.sh
Agent skills are becoming a distribution layer for developer tools. The practical move is one source package that can show up in PyPI, Claude-style skills, and skills.sh.
#agentguard #skills #ai-agents #python
May 9, 2026 · 4 min read
April 2026: Every AI Subscription Plan Broke for Builders
April 2026 made one thing clear: chat subscriptions are best-effort tools. Builders need API-level budgets, rate limits, and kill switches when the work matters.
#ai-agents #agentguard #cost-control #codex
May 7, 2026 · 4 min read
Computer Use Is 45x More Expensive Than APIs. Here's When To Use Each.
Reflex.dev measured a 45x token cost gap between computer-use agents and structured APIs for the same task. Here's why, and the decision rule that keeps your bill sane.
#ai-agents #cost #computer-use #agentguard
May 6, 2026 · 6 min read
Your AI Agent Will Eventually Delete Prod
PocketOS lost their production database backups to a Cursor agent. Here's what runtime spend rails actually catch, what they don't, and the layered defense your agents need before production.
#agent-safety #agentguard #incident-postmortem #ai-agents
May 4, 2026 · 5 min read
7% of vibe-coded apps ship with wide-open databases
A 1,764-app audit found 7% had open Supabase databases and 15% of Bolt apps had hardcoded secrets. The fix takes ten minutes.
#agentguard #ai-agents #security #vibe-coding
May 4, 2026 · 7 min read
Uber Burned Its 2026 AI Budget on Claude Code by April
No metering, no per-team caps, no dashboards. Uber spent its entire 2026 AI budget on Claude Code in just 4 months. The 5-step pattern behind every runaway AI bill — and the fix that stops it.
#agentguard #cost-control #ai-agents
May 3, 2026 · 6 min read
MCP vs Skills: a practical decision guide for builders
I need my agent to do X. Skill or MCP? A short decision rule with worked examples for small-business agent builders.
#mcp #claude-skills #ai-agents #agentguard
May 1, 2026 · 6 min read
Cloudflare agents can now buy domains. The case for runtime spend rails just got concrete.
Cloudflare shipped agent flows that create accounts, buy domains via Stripe, and deploy infrastructure end-to-end. Good news for builders. Sharper case for runtime budget enforcement than any hypothetical we have used.
#ai-agents #cloudflare #stripe #agent-payments
Apr 30, 2026 · 4 min read
OpenAI's guardrails don't control costs. Here's the gap.
OpenAI shipped guardrails in the Agents SDK last month. They validate behavior. They do not enforce spend. Here is the gap and how to close it.
#openai-agents-sdk #agentguard #cost-control #ai-agents
Apr 30, 2026 · 4 min read
agent-sre on PyPI: what SRE for AI agents actually means
Microsoft just shipped agent-sre on PyPI. Seven packages: SLOs, error budgets, circuit breakers. Here is what it does, what it does not, and why solo builders still need agentguard47.
#agent-sre #agentguard #sre #ai-agents
Apr 27, 2026 · 5 min read
If AI agents can spend money, who's holding the credit card?
I built a memory API agents can pay for. The actual problem isn't whether they can pay. It's per-tool caps, per-agent budgets, kill switches, and spend visibility.
#agentguard #ai-agents #spend-controls #ai-budgets
Apr 17, 2026 · 5 min read
When Tokens Cost 12 Cents Per Million, The Bottleneck Isn't Cost. It's Control.
NVIDIA Blackwell delivers 35x lower cost per token vs Hopper. That makes AI agents cheaper to run and harder to stop. Here's why that flips the runtime guard argument upside down.
#ai-agents #agentguard #nvidia #blackwell
Apr 16, 2026 · 4 min read
AI security is now a token-burning contest. Who's watching the bill?
Simon Willison frames AI-assisted security research as proof of work: more tokens in, more bugs found. That's an economic reality. Here's what the spend curve actually looks like and how to put a floor under it.
#AI Security #AI Costs #Simon Willison #AgentGuard
Apr 16, 2026 · 4 min read
The flat-fee era is over. How to control your AI agent costs in 2026.
Anthropic shifted enterprise billing to per-token pricing. Every provider is expected to follow within six months. Here's how agent costs change and how to cap them at runtime.
#AI Costs #Anthropic #Token Pricing #AgentGuard
Apr 16, 2026 · 5 min read
Claude Code Prompt Caching Costs: Pick the Right TTL (2026)
Claude Code has two prompt-caching TTLs, and most devs pay the pricier tier without knowing. Here's how cache writes quietly inflate your Anthropic bill — and how to cap it.
#Claude Code #AI Costs #Prompt Caching #AgentGuard
Apr 15, 2026 · 5 min read
GPU Prices Up 48% in Two Months. I Run LLMs in My Garage.
Blackwell rental hit $4.08/hr. CoreWeave raised prices 20%. Anthropic restricted their newest model to 40 orgs. Meanwhile, consumer GPUs are sitting idle.
#local-llm #gpu #infrastructure #agentguard
Apr 15, 2026 · 5 min read
When to Replace Your AI Agent With a Script
Will Larson says agents should be scaffolding, not permanent infrastructure. I run 12 agents overnight. Here's what I kept as agents and what I converted to code.
#ai-agents #agentic-coding #architecture #agentguard
Apr 15, 2026 · 5 min read
Your AI Agent's MCP Server Is a Security Hole
1 in 35 GenAI prompts carries high risk of data leakage. MCP makes the attack surface worse. Here's what builders need to know.
#ai-agents #security #mcp #agentguard
Apr 14, 2026 · 6 min read
Anthropic's Advisor Tool Is the Cost-Split Pattern You Should Already Be Running
Anthropic shipped a pattern where a cheap model runs the loop and escalates to Opus only when it needs to. The pattern works on any two-model setup. Here is the math and the playbook.
#ai-agents #cost-optimization #anthropic #agentguard
Apr 14, 2026 · 7 min read
We Built Martin Fowler's Feedback Flywheel Before He Published It
Martin Fowler published a pattern for turning individual AI interactions into collective improvement. We had already built it. Here is how our 12-agent vault system maps to his four signal types.
#ai-agents #feedback-loops #martin-fowler #agentguard
Apr 14, 2026 · 7 min read
Three Studies This Month Changed Everything About AI Agent Safety
Mythos found zero-days in every major OS. Nature documented AI deception in peer review. War games showed AI escalating to nukes. Three studies, one conclusion: your agents need hard limits.
#ai-agents #ai-safety #agentguard #mythos
Apr 14, 2026 · 5 min read
Meta Burned 60T Tokens: Cap Your AI Agent Budget in 3 Steps
Your AI agent bill is climbing and nobody set a cap. Meta burned 60T tokens across 85K staff in 30 days. Here are the 3 budget controls they skipped — and guardrails that catch overruns early. (2026)
#ai-agents #cost-control #agentguard #token-management
Apr 14, 2026 · 6 min read
9 Out of 428 LLM API Routers Are Injecting Malicious Code Right Now
Researchers tested 428 LLM API routers. Nine were actively injecting malicious code. One drained ETH from a private key. Here is what this means for your AI agents.
#ai-agents #security #supply-chain #agentguard
Apr 13, 2026 · 3 min read
AI Chose Nukes 95% of the Time. Here's What That Means for Your Agents.
Three AI safety papers came out this week. Reading them back to back was jarring. If you run agents in production, this is worth 5 minutes.
#AI Agents #Safety #AgentGuard #Runtime Enforcement
Apr 9, 2026 · 8 min read
We Built Fowler's AI Feedback Flywheel (Before He Named It)
Martin Fowler named the AI feedback flywheel. We built the same system independently. Here's our exact implementation — vault, agents, guardrails, and weekly cadence.
#AI Agents #Feedback Flywheel #Martin Fowler #AI Teams
Mar 25, 2026 · 8 min read
Stop Runaway LLM Spend: AI Agent Cost Control (Python)
One bad loop and an AI agent burned $200 in minutes. AgentGuard is a Python SDK that enforces hard cost limits at runtime — here is how to ship it.
#AI Agents #Python #Cost Control #AgentGuard

The AI agent build notes

Real costs, real tools, no fluff. M-F when I ship, publish, or learn something worth sending.
