Writing
AI agents, runtime safety, local LLMs, and what it looks like to run a one-person AI-operated holding company in public.
I need my agent to do X. Skill or MCP? A short decision rule with worked examples for small-business agent builders.
Before you ship an AI agent for a client, prove budget caps, loop detection, alert proof, remote kill, and retained incident history.
The demo worked. Then the same CrewAI tool call retried until the run became an operator problem.
A trace tells you what happened. A kill switch changes what happens next.
Cloudflare shipped agent flows that create accounts, buy domains via Stripe, and deploy infrastructure end-to-end. Good news for builders. Sharper case for runtime budget enforcement than any hypothetical we have used.
OpenAI shipped guardrails in the Agents SDK last month. They validate behavior. They do not enforce spend. Here is the gap and how to close it.
Microsoft just shipped agent-sre on PyPI. Seven packages: SLOs, error budgets, circuit breakers. Here is what it does, what it does not, and why solo builders still need agentguard47.
I built a memory API agents can pay for. The actual problem isn't whether they can pay. It's per-tool caps, per-agent budgets, kill switches, and spend visibility.
402 Payment Required has been in the HTTP spec since 1991. Reserved. Unused. x402 finally shipped the client half. Here is why that matters now.
Stripe doesn't ship to LLMs. Every vendor signup form assumes a human at the door. Here is what changes when wallets become the access primitive.
An LLM just paid me $0.001 to remember something. The agent has no account, no API key, no credit card. It just signs a USDC transfer and gets back a 200.
Three studies dropped in the last few months. GPT-5.2, Claude Sonnet 4, and Gemini 3 Flash all escalated to nuclear options 95% of the time in war game scenarios. AI found exploitable vulnerabilities in every major OS and browser. And a Nature paper documented AI disabling its own oversight. Here is what that means if you are running agents in production today.
Stanford, Karpathy, and Bridgewater independently confirmed that one person plus N agents is the right architecture. I have been running it for a holding company. Here is what it looks like.
NVIDIA Blackwell delivers 35x lower cost per token vs Hopper. That makes AI agents cheaper to run and harder to stop. Here's why that flips the runtime guard argument upside down.
Simon Willison frames AI-assisted security research as proof of work: more tokens in, more bugs found. That's an economic reality. Here's what the spend curve actually looks like and how to put a floor under it.
Flatiron Health toured AI-native startups in SF. One PM covers five companies, Claude Code is replacing Cursor, non-engineers are shipping production. I'm running the same model from Tennessee as a solo holding company. Here's what that actually looks like.
Anthropic shifted enterprise billing to per-token pricing. Every provider is expected to follow within six months. Here's how agent costs change and how to cap them at runtime.
Claude Code's prompt caching has two TTLs and two price tiers. Most developers have no idea which one they're paying for. Here's how cache writes eat budgets and how to stop it.
Blackwell rental hit $4.08/hr. CoreWeave raised prices 20%. Anthropic restricted their newest model to 40 orgs. Meanwhile, consumer GPUs are sitting idle.
Will Larson says agents should be scaffolding, not permanent infrastructure. I run 12 agents overnight. Here's what I kept as agents and what I converted to code.
1 in 35 GenAI prompts carries high risk of data leakage. MCP makes the attack surface worse. Here's what builders need to know.
Tomasz Tunguz published a 2x2 for categorizing AI projects. Most failed agent projects are creative amplifiers dressed up as economic engines. Here is how to tell which quadrant you are actually in.
Anthropic shipped a pattern where a cheap model runs the loop and escalates to Opus only when it needs to. The pattern works on any two-model setup. Here is the math and the playbook.
Martin Fowler published a pattern for turning individual AI interactions into collective improvement. We had already built it. Here is how our 12-agent vault system maps to his four signal types.
Mythos found zero-days in every major OS. Nature documented AI deception in peer review. War games showed AI escalating to nukes. Three studies, one conclusion: your agents need hard limits.
Dario Amodei says continual learning will be solved this year. Here is what AI agent memory actually means for builders shipping agents right now. Three patterns, real tradeoffs, practical guidance.
North Korean threat actors are targeting AI coding tools. Trojanized npm packages hunt for .cursor, .claude, .gemini, and .windsurf directories to steal API keys and source code.
PostHog ships to thousands of daily agent users. They rebuilt their AI architecture twice before getting it right. Here are the 5 rules they distilled, reframed for builders shipping agent features.
Meta gamified AI usage across 85,000 employees. They burned 60 trillion tokens in a month. Then they shut the leaderboard down. Here is what went wrong and how to prevent it.
Researchers tested 428 LLM API routers. Nine were actively injecting malicious code. One drained ETH from a private key. Here is what this means for your AI agents.
Three AI safety papers came out this week. Reading them back to back was jarring. If you run agents in production, this is worth 5 minutes.
OpenClaw promises production-ready AI agents out of the box. We ran it on 3 real use cases. Here's what worked, what didn't, and who it's actually for.
Martin Fowler named the AI feedback flywheel. We built the same system independently. Here's our exact implementation — vault, agents, guardrails, and weekly cadence.
AI agent builds range from $500 DIY to $150K enterprise. Real cost breakdown by complexity tier, with API, compute, and dev hour estimates for 2026.
The market is flooded with people claiming to build AI agents. Here's how to tell who can actually ship one—and what questions to ask before you pay anything.
Google's A2A protocol enables AI agents to communicate across tools and vendors. Here's what it means for your business in 2026.
n_gpu_layers -1 offloads every layer to GPU. Learn what each value means, the exact VRAM math, and pick the right setting with real benchmarks.
Aymo AI pricing plans start free and top out at $39/month. Full 2026 plan breakdown, where limits bite, and how it compares to ChatGPT, Claude Pro, and custom builds.
Build a 100% offline voice AI on Raspberry Pi 5 with no cloud or fees. Real benchmarks, top models tested, latency results, and what we would change.
JustPaid ran 7 AI agents 24/7 with OpenClaw, shipped 10 features in a month for $4K/week. Here is the real cost breakdown and what it means for you.
Anthropic accidentally leaked Claude Code's source. I read through it. Here are 6 architecture patterns that are changing how I build agents for clients.
AI agents can be hijacked through the content they read. Here is what prompt injection looks like in production, why your existing security stack will not catch it, and what to build instead.
Model Context Protocol (MCP) is the open standard that lets AI agents talk to your real tools — databases, APIs, files — without custom glue code. Here's what it is, how it works, and whether you actually need it.
88% of AI agent pilots never ship to production. We analyzed why — and built a 5-step playbook used by the 12% of teams that actually make it.
RTX 5070 Ti runs Llama 3.1 at 50 req/s for $0 per call. Real benchmarks, cloud cost comparison, and the exact production setup that works today.
Off-the-shelf AI agents fail when your workflow is the edge. Here's when custom development actually pays off for small business.
One bad loop and an AI agent burned $200 in minutes. AgentGuard is a Python SDK that enforces hard cost limits at runtime — here is how to ship it.
OpenClaw is faster to start — but custom AI agents often win on ROI. Real side-by-side on cost, flexibility, and time-to-deploy for your use case.
Most businesses do not need multi-agent AI yet — but some do. 5 questions to find out which camp you are in, with real cost and complexity benchmarks.
AI agent builds range from $500 DIY to $150K enterprise in 2026. Real cost breakdown by complexity tier, API costs, and dev hours — no fluff.
A practical comparison of the three main approaches to workflow automation, based on 20+ automations I've built for myself and clients.
My framework for identifying, scoping, and building automations without a single meeting. Used on every client engagement.
I let an autonomous agent run 100 ML experiments while I slept. 7 succeeded. Net result: 25% model improvement. Here's the setup.
How AI agents are changing the way we build with Next.js — from agentic development to shipping 10x faster as a solo engineer.
Real costs, real tools, no fluff. One email per week with what I'm building, what's working, and what's not.