Writing · Tag
33 posts tagged #agentguard.
MIT Tech Review says the AI-jobs hysteria is overstated. The real story is cost discipline, not displacement.
If Microsoft can't absorb agent inference costs, neither can you. Make the cap a config change, not a memo.
Starbucks pulled its AI inventory tool after 9 months. Here is the pattern that killed it and three guardrails that catch it.
SaaStr data shows enterprise AI share shifting hard toward Claude. The lesson isn't pick Claude. It's stop hard-coding one vendor.
AI-native software is shipping at roughly 17% gross margins while traditional SaaS sits near 70%. The token bill ate the unit economics. Here's what's actually broken and how to claw margin back.
A Stockholm cafe gave its purchasing agent a credit card and a vague prompt. $21,000 later it owned 6,000 napkins and no bread. Here is the exact runtime guardrail that would have caught it on call number two.
The May 14 autotrader review is done. The account is up 7.7% before compute, still negative after compute, and still lagging SPY and BTC. Decision: keep V2 paper-only, add no new live money, and revisit after the next scorecard.
Agent skills are becoming a distribution layer for developer tools. The practical move is one source package that can show up in PyPI, Claude-style skills, and skills.sh.
April 2026 made one thing clear: chat subscriptions are best-effort tools. Builders need API-level budgets, rate limits, and kill switches when the work matters.
Reflex.dev measured a 45x token cost gap between computer-use agents and structured APIs for the same task. Here's why, and the decision rule that keeps your bill sane.
PocketOS lost their production database backups to a Cursor agent. Here's what runtime spend rails actually catch, what they don't, and the layered defense your agents need before production.
A 1,764-app audit found 7% had open Supabase databases and 15% of Bolt apps had hardcoded secrets. The fix takes ten minutes.
No metering, no per-team caps, no usage dashboards. Uber spent its full 2026 AI budget on Claude Code in 4 months. The 5-step pattern behind every runaway AI bill — and the instrumentation that stops it.
I need my agent to do X. Skill or MCP? A short decision rule with worked examples for small-business agent builders.
Cloudflare shipped agent flows that create accounts, buy domains via Stripe, and deploy infrastructure end-to-end. Good news for builders. Sharper case for runtime budget enforcement than any hypothetical we have used.
OpenAI shipped guardrails in the Agents SDK last month. They validate behavior. They do not enforce spend. Here is the gap and how to close it.
Microsoft just shipped agent-sre on PyPI. Seven packages: SLOs, error budgets, circuit breakers. Here is what it does, what it does not, and why solo builders still need agentguard47.
I built a memory API agents can pay for. The actual problem isn't whether they can pay. It's per-tool caps, per-agent budgets, kill switches, and spend visibility.
NVIDIA Blackwell delivers 35x lower cost per token vs Hopper. That makes AI agents cheaper to run and harder to stop. Here's why that flips the runtime guard argument upside down.
Simon Willison frames AI-assisted security research as proof of work: more tokens in, more bugs found. That's an economic reality. Here's what the spend curve actually looks like and how to put a floor under it.
Anthropic shifted enterprise billing to per-token pricing. Every provider is expected to follow within six months. Here's how agent costs change and how to cap them at runtime.
Claude Code has two caching TTLs and most developers pay the wrong tier without knowing. Here is how cache writes quietly inflate your Anthropic bill — and how to stop it.
Blackwell rental hit $4.08/hr. CoreWeave raised prices 20%. Anthropic restricted their newest model to 40 orgs. Meanwhile, consumer GPUs are sitting idle.
Will Larson says agents should be scaffolding, not permanent infrastructure. I run 12 agents overnight. Here's what I kept as agents and what I converted to code.
1 in 35 GenAI prompts carries high risk of data leakage. MCP makes the attack surface worse. Here's what builders need to know.
Anthropic shipped a pattern where a cheap model runs the loop and escalates to Opus only when it needs to. The pattern works on any two-model setup. Here is the math and the playbook.
Martin Fowler published a pattern for turning individual AI interactions into collective improvement. We had already built it. Here is how our 12-agent vault system maps to his four signal types.
Mythos found zero-days in every major OS. Nature documented AI deception in peer review. War games showed AI escalating to nukes. Three studies, one conclusion: your agents need hard limits.
Meta gamified internal AI usage. 85,000 employees, 60 trillion tokens, 30 days — then they killed the leaderboard. The 3 budget controls they skipped and the guardrails that would have caught it early.
Researchers tested 428 LLM API routers. Nine were actively injecting malicious code. One drained ETH from a private key. Here is what this means for your AI agents.
Three AI safety papers came out this week. Reading them back to back was jarring. If you run agents in production, this is worth 5 minutes.
Martin Fowler named the AI feedback flywheel. We built the same system independently. Here's our exact implementation — vault, agents, guardrails, and weekly cadence.
One bad loop and an AI agent burned $200 in minutes. AgentGuard is a Python SDK that enforces hard cost limits at runtime — here is how to ship it.
Real costs, real tools, no fluff. M-F when I ship, publish, or learn something worth sending.