Writing · Tag

#ai-agents

54 posts tagged #ai-agents.

Jul 9, 2026 · 6 min read
When Claude hits a weekly limit, your agent fleet still needs a third CLI
Claude and Codex both went dark. Here is the tertiary Gemini path I wired, with honest qa_reviewer stamps.
#local-llm #ai-agents #gemini #agent-ops
Jul 8, 2026 · 8 min read
I built a self-improving code model on one RTX 5090. Here is what actually worked.
Six pieces, one consumer GPU, no cloud. The honest results: some parts worked, some were flat, and one idea changed everything.
#local-llm #ai-agents #fine-tuning #rtx-5090
Jul 6, 2026 · 5 min read
Local LLMs Need a Timeout Before They Need a Bigger Model
A bigger local model will not fix a stuck runtime. Add a bounded inference doctor first, then trust the benchmark.
#local-ai #ollama #5090-reports #ai-agents
Jul 5, 2026 · 5 min read
Your local LLM is not a worse Claude. It is a different tool.
Stop scoring your local model on how close it gets to Opus. It is a different tool with a different sweet spot. Here is the line, and which side your work sits on.
#local-llm #open-weights #ai-agents #agentguard
Jun 27, 2026 · 7 min read
AI Agent Memory: What Actually Works in 2026
Most agent memory systems add complexity faster than value. This is the small set that actually compounds for one person running a fleet: files, ledgers, and strict verification.
#ai-agents #memory #agent-ops #one-person-company
Jun 21, 2026 · 4 min read
A self-healing system can't heal an empty queue
Automated recovery only fixes a broken machine. When the real failure is an empty queue, retrying does nothing forever. Two failures, one red box, opposite repairs.
#ai-agents #automation #monitoring #self-healing
Jun 19, 2026 · 4 min read
Missing AI agent cost data is not zero
A spend ledger that counts missing billing data as $0 hides exactly the unattended agent spend you built it to catch.
#ai-agents #cost-control #observability #spend-tracking
Jun 18, 2026 · 5 min read
What Salesforce's 20,000 AI Agent Deployments Teach a Solo Builder
Salesforce shipped roughly 20,000 Agentforce deployments and found 90% of agent work happens after launch. Here is what that means for a solo builder running a small agent fleet.
#ai-agents #agent-deployment #agent-reliability #agentguard
Jun 17, 2026 · 5 min read
57-71% of AI agents leak data between users. Here's what to do.
A June 2026 Mem0 survey of 8 major agent harnesses found that over half of them leak memory across users. Here is why keyword retrieval is a security risk and how to fix it.
#ai-agents #security #agent-memory #agentguard
Jun 11, 2026 · 4 min read
Stop Telling People You Have 11 AI Agents
Agent count is a vanity metric. It tells you about volume, not value. Here is what I track instead after running a one-person AI fleet.
#ai-agents #one-person-ops #agent-cost #agentguard
Jun 10, 2026 · 5 min read
Your AI agent doesn't need memory. It needs a file.
I run a one-person company on scheduled agents and gave almost none of them memory. They write to files instead. Here is why that wins.
#ai-agents #one-person-ops #agent-memory #agentguard
Jun 7, 2026 · 6 min read
How to Close the AI Agent Cost Gap at the Call Site
The cost gap between what an AI agent could cost and what it does cost is 40%. You close it at the call site, not in a dashboard. Here is how.
#ai-agents #cost-control #agentguard #python
Jun 7, 2026 · 5 min read
57-71% of AI Agents Leak Data Between Users. Here's the Fix.
A 2026 Mem0 survey found 57-71% cross-user memory contamination across major agent frameworks. Here is why it happens and how to stop it.
#ai-agents #security #agent-memory #agentguard
Jun 7, 2026 · 4 min read
When JPMorgan's AI bill goes up, who controls it?
JPMorgan turned on AI for 250k people. The quiet line is that the usage racks up fees. Here is how to control the bill before it arrives.
#ai-agents #enterprise-ai #cost-control #agentguard
Jun 7, 2026 · 4 min read
Anthropic's IPO and the 40% Cost-Savings Gap: Why Your Spend Cap Matters More Now
Anthropic filed for IPO at a $47B run-rate while 40% of enterprise customers report under 10% cost savings from Claude. Here is how to close that gap.
#ai-agents #cost-control #anthropic #agentguard
Jun 7, 2026 · 4 min read
Your AI Agent's Retry Loop Is a Cost Bug Waiting to Happen
A repair agent in my own pipeline failed the same check 27 times in a row. Each try was a paid model call. Here is why uncapped retries quietly burn money, and the two-line fix.
#ai-agents #cost-control #agentguard
Jun 6, 2026 · 5 min read
When JPMorgan Turns On AI Bank-Wide, Who Controls the Bill?
JPMorgan just switched on AI for 250,000 employees. The headline is workforce shift. The quiet story is enterprise AI cost, and why token spend runs away without controls.
#ai-agents #enterprise-ai #cost-control #agentguard
Jun 5, 2026 · 4 min read
When Not to Use an AI Agent
Most AI advice tells you to ship more agents. Here is the honest opposite: the four times a plain script and a human beat an agent, learned running a fleet daily.
#ai-agents #agentguard #automation
Jun 4, 2026 · 4 min read
What GitHub Copilot Users Wish They Had a Week Ago
Copilot went usage-based and bills spiked. The fix is a runtime budget cap at the call site.
#ai-agents #cost-control #github-copilot #agentguard
Jun 2, 2026 · 4 min read
Your Cron Jobs Lie - Why I Built an Outcome Checker
Scheduled tasks exit 0 even when the work never happened. Here is the outcome layer I built on top of my agent fleet, and why it shipped before any new dashboard.
#ai-agents #automation #monitoring
Jun 1, 2026 · 6 min read
AI-powered hacking went industrial. Here's what changes if you run agents.
Google found the first AI-built zero-day in a planned mass-exploitation event. A builder's read on what changes for small operators running agents.
#ai-security #ai-agents #prompt-injection #agentguard
May 31, 2026 · 5 min read
Token budget wars are starting. Most companies are paying for vibes.
AI billing is shifting from seats and tokens to outcomes. If you cannot tie an agent run to a dollar of work, you are paying for vibes.
#ai-agents #pricing #agentguard #cost-control
May 29, 2026 · 5 min read
The Silent-Success Trap: Your Monitoring Is Green and You Still Shipped Nothing
Every dashboard was green and zero blog posts went live. Exit codes tell you the job ran, not that the outcome happened. Here is how to check the real artifact instead.
#monitoring #ai-agents #reliability #automation
May 28, 2026 · 6 min read
auth.md: How AI Agents Will Sign Your Users Up
A new open protocol lets AI agents register users with your app, no signup form. Here is how it works and what breaks.
#ai-agents #authentication #mcp #automation
May 28, 2026 · 4 min read
AI Jobs vs Entry-Level Work: A Reality Check for Builders
MIT Tech Review says the AI-jobs hysteria is overstated. The real story is cost discipline, not displacement.
#ai-agents #builder-notes #agentguard #market-analysis
May 19, 2026 · 5 min read
Securing Your AI Agents: Essential Practices for On-Device Automation
The AI world is buzzing, but recent events highlight the critical need for secure and efficient AI agents. Discover practical engineering steps for building reliable automation directly on your hardware.
#ai-agents #automation #on-device-ai #security
May 15, 2026 · 4 min read
Enterprise AI just shifted: Claude +128%, OpenAI -8%. What it means if you're building.
SaaStr data shows enterprise AI share shifting hard toward Claude. The lesson isn't pick Claude. It's stop hard-coding one vendor.
#ai-agents #model-routing #cost-control #agentguard
May 15, 2026 · 4 min read
AI software runs on 17% margins. SaaS runs on 70%. The token bill is the problem.
AI-native software is shipping at roughly 17% gross margins while traditional SaaS sits near 70%. The token bill ate the unit economics. Here's what's actually broken and how to claw margin back.
#ai-agents #unit-economics #agentguard #pricing
May 9, 2026 · 4 min read
One Agent Skill, Three Registries: PyPI, Claude, and skills.sh
Agent skills are becoming a distribution layer for developer tools. The practical move is one source package that can show up in PyPI, Claude-style skills, and skills.sh.
#agentguard #skills #ai-agents #python
May 9, 2026 · 4 min read
April 2026: Every AI Subscription Plan Broke for Builders
April 2026 made one thing clear: chat subscriptions are best-effort tools. Builders need API-level budgets, rate limits, and kill switches when the work matters.
#ai-agents #agentguard #cost-control #codex
May 7, 2026 · 4 min read
Computer Use Is 45x More Expensive Than APIs. Here's When To Use Each.
Reflex.dev measured a 45x token cost gap between computer-use agents and structured APIs for the same task. Here's why, and the decision rule that keeps your bill sane.
#ai-agents #cost #computer-use #agentguard
May 6, 2026 · 6 min read
Your AI Agent Will Eventually Delete Prod
PocketOS lost their production database backups to a Cursor agent. Here's what runtime spend rails actually catch, what they don't, and the layered defense your agents need before production.
#agent-safety #agentguard #incident-postmortem #ai-agents
May 4, 2026 · 5 min read
7% of vibe-coded apps ship with wide-open databases
A 1,764-app audit found 7% had open Supabase databases and 15% of Bolt apps had hardcoded secrets. The fix takes ten minutes.
#agentguard #ai-agents #security #vibe-coding
May 4, 2026 · 7 min read
Uber Burned Its 2026 AI Budget on Claude Code by April
No metering, no per-team caps, no dashboards. Uber spent its entire 2026 AI budget on Claude Code in just 4 months. The 5-step pattern behind every runaway AI bill — and the fix that stops it.
#agentguard #cost-control #ai-agents
May 3, 2026 · 6 min read
MCP vs Skills: a practical decision guide for builders
I need my agent to do X. Skill or MCP? A short decision rule with worked examples for small-business agent builders.
#mcp #claude-skills #ai-agents #agentguard
May 1, 2026 · 6 min read
Cloudflare agents can now buy domains. The case for runtime spend rails just got concrete.
Cloudflare shipped agent flows that create accounts, buy domains via Stripe, and deploy infrastructure end-to-end. Good news for builders. Sharper case for runtime budget enforcement than any hypothetical we have used.
#ai-agents #cloudflare #stripe #agent-payments
Apr 30, 2026 · 4 min read
OpenAI's guardrails don't control costs. Here's the gap.
OpenAI shipped guardrails in the Agents SDK last month. They validate behavior. They do not enforce spend. Here is the gap and how to close it.
#openai-agents-sdk #agentguard #cost-control #ai-agents
Apr 30, 2026 · 4 min read
agent-sre on PyPI: what SRE for AI agents actually means
Microsoft just shipped agent-sre on PyPI. Seven packages: SLOs, error budgets, circuit breakers. Here is what it does, what it does not, and why solo builders still need agentguard47.
#agent-sre #agentguard #sre #ai-agents
Apr 27, 2026 · 5 min read
If AI agents can spend money, who's holding the credit card?
I built a memory API agents can pay for. The actual problem isn't whether they can pay. It's per-tool caps, per-agent budgets, kill switches, and spend visibility.
#agentguard #ai-agents #spend-controls #ai-budgets
Apr 27, 2026 · 5 min read
HTTP 402 & x402: The Web's Payment Code Finally Works (2026)
Reserved since 1991, HTTP 402 sat unused for 33 years. Now x402 ships the client half — so AI agents can pay per request. Here's how it works and why it matters.
#http #x402 #payments #ai-agents
Apr 27, 2026 · 5 min read
Why API keys break for autonomous AI agents
Stripe doesn't ship to LLMs. Every vendor signup form assumes a human at the door. Here is what changes when wallets become the access primitive.
#ai-agents #x402 #autonomous-agents #api-design
Apr 27, 2026 · 4 min read
I built a memory API that AI agents can pay for
An LLM just paid me $0.001 to remember something. The agent has no account, no API key, no credit card. It just signs a USDC transfer and gets back a 200.
#x402 #ai-agents #agentic-payments #base
Apr 17, 2026 · 6 min read
One Person, 12 Agents, a Holding Company
Stanford, Karpathy, and Bridgewater independently confirmed that one person plus N agents is the right architecture. I have been running it for a holding company. Here is what it looks like.
#ai-agents #holding-company #automation #solo-founder
Apr 17, 2026 · 5 min read
When Tokens Cost 12 Cents Per Million, The Bottleneck Isn't Cost. It's Control.
NVIDIA Blackwell delivers 35x lower cost per token vs Hopper. That makes AI agents cheaper to run and harder to stop. Here's why that flips the runtime guard argument upside down.
#ai-agents #agentguard #nvidia #blackwell
Apr 15, 2026 · 5 min read
When to Replace Your AI Agent With a Script
Will Larson says agents should be scaffolding, not permanent infrastructure. I run 12 agents overnight. Here's what I kept as agents and what I converted to code.
#ai-agents #agentic-coding #architecture #agentguard
Apr 15, 2026 · 5 min read
Your AI Agent's MCP Server Is a Security Hole
1 in 35 GenAI prompts carries high risk of data leakage. MCP makes the attack surface worse. Here's what builders need to know.
#ai-agents #security #mcp #agentguard
Apr 14, 2026 · 6 min read
Your Agent Project Might Be in the Wrong Quadrant
Tomasz Tunguz published a 2x2 for categorizing AI projects. Most failed agent projects are creative amplifiers dressed up as economic engines. Here is how to tell which quadrant you are actually in.
#ai-agents #strategy #tunguz-matrix #agent-economics
Apr 14, 2026 · 6 min read
Anthropic's Advisor Tool Is the Cost-Split Pattern You Should Already Be Running
Anthropic shipped a pattern where a cheap model runs the loop and escalates to Opus only when it needs to. The pattern works on any two-model setup. Here is the math and the playbook.
#ai-agents #cost-optimization #anthropic #agentguard
Apr 14, 2026 · 7 min read
We Built Martin Fowler's Feedback Flywheel Before He Published It
Martin Fowler published a pattern for turning individual AI interactions into collective improvement. We had already built it. Here is how our 12-agent vault system maps to his four signal types.
#ai-agents #feedback-loops #martin-fowler #agentguard
Apr 14, 2026 · 7 min read
Three Studies This Month Changed Everything About AI Agent Safety
Mythos found zero-days in every major OS. Nature documented AI deception in peer review. War games showed AI escalating to nukes. Three studies, one conclusion: your agents need hard limits.
#ai-agents #ai-safety #agentguard #mythos
Apr 14, 2026 · 5 min read
Nation-State Hackers Are Targeting Your AI Agent Keys
North Korean threat actors are targeting AI coding tools. Trojanized npm packages hunt for .cursor, .claude, .gemini, and .windsurf directories to steal API keys and source code.
#ai-agents #security #supply-chain #npm
Apr 14, 2026 · 6 min read
PostHog Rebuilt Their AI Architecture Twice. Here Are the 5 Rules They Learned.
PostHog ships to thousands of daily agent users. They rebuilt their AI architecture twice before getting it right. Here are the 5 rules they distilled, reframed for builders shipping agent features.
#ai-agents #agent-first #product-engineering #posthog
Apr 14, 2026 · 5 min read
Meta Burned 60T Tokens: Cap Your AI Agent Budget in 3 Steps
Your AI agent bill is climbing and nobody set a cap. Meta burned 60T tokens across 85K staff in 30 days. Here are the 3 budget controls they skipped — and guardrails that catch overruns early. (2026)
#ai-agents #cost-control #agentguard #token-management
Apr 14, 2026 · 6 min read
9 Out of 428 LLM API Routers Are Injecting Malicious Code Right Now
Researchers tested 428 LLM API routers. Nine were actively injecting malicious code. One drained ETH from a private key. Here is what this means for your AI agents.
#ai-agents #security #supply-chain #agentguard

The AI agent build notes

Real costs, real tools, no fluff. M-F when I ship, publish, or learn something worth sending.
