Your Agent Project Might Be in the Wrong Quadrant
Tomasz Tunguz published a 2x2 for categorizing AI projects. Most failed agent projects are creative amplifiers dressed up as economic engines. Here is how to tell which quadrant you are actually in.
Tomasz Tunguz just published a 2x2 matrix for categorizing AI projects. Most people will read it, nod, and file it under "interesting frameworks I will never use."
Here is what it actually means for builders shipping agent features this month.
The Matrix
Two axes. Vertical: is the loop closed (can the system verify its own output) or open (needs a human to judge quality). Horizontal: is demand for the task infinite (always more work to do) or finite (fixed scope).
Four quadrants:
Q1: Economic Engine. Closed loop + infinite demand. The agent can verify its own work, and there is always more to do. Example: coding agents. Tests pass or fail. There is always more code to write. This is where the 10-20x productivity claims come from.
Q2: Efficiency Play. Closed loop + finite demand. The output is verifiable, but the volume is capped. Example: bookkeeping reconciliation. The numbers either match or they do not. But there are only so many transactions to reconcile. Solid ROI, but ceiling-limited.
Q3: Creative Amplifier. Open loop + infinite demand. Unlimited work to do, but no automated way to know if the output is good. Example: marketing content generation. You can always write more blog posts. But "is this post good?" requires human judgment.
Q4: Utility Tool. Open loop + finite demand. Human judgment needed, and the scope is bounded. Example: contract review. A lawyer still has to read the output. And there are only so many contracts.
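The two axes make the matrix mechanical: each quadrant is just a pair of booleans. Here is a minimal sketch of that mapping. The function and its labels are my illustration, not code from Tunguz's post.

```python
# Classify a project on the 2x2: (loop closed?, demand infinite?) -> quadrant.
# Illustrative sketch; names are mine, not from the original framework.

def quadrant(closed_loop: bool, infinite_demand: bool) -> str:
    """Map the two axes to a quadrant label."""
    if closed_loop and infinite_demand:
        return "Q1: Economic Engine"      # verifiable output, unbounded work
    if closed_loop:
        return "Q2: Efficiency Play"      # verifiable output, capped volume
    if infinite_demand:
        return "Q3: Creative Amplifier"   # unbounded work, human judges quality
    return "Q4: Utility Tool"             # human judgment, bounded scope

print(quadrant(True, True))    # coding agents
print(quadrant(False, True))   # marketing content generation
```

Two yes/no answers, four outcomes. The hard part is answering the loop question honestly, not computing the label.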
The Misclassification Problem
Most failed agent projects are Q3 (creative amplifiers) dressed up as Q1 (economic engines). They get funded with automation budgets and fail because the loop does not close without a human.
Here is what this looks like in practice.
A team builds an agent that writes marketing emails. They pitch it as "automated outbound at scale" (sounds like Q1: infinite demand, closed loop). But the actual loop is: agent writes email, human reviews email, human edits email, human sends email. The agent did not automate outbound. It generated drafts.
Draft generation is useful. But it is Q3, not Q1. The pricing, staffing, and ROI math are completely different.
The diagnostic question: can a pass/fail test decide whether the output is correct?
If the answer is "well, kind of, depending on the situation," you are Q3. The loop is not closed. You need a human in it. Plan accordingly.
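The distinction is concrete in code: a closed loop means verification is a pure pass/fail function, while an open loop means no such function exists. Both checks below are my own hypothetical examples, not from any real system.

```python
# Illustrative only: a closed-loop check vs. an open-loop non-check.

def reconciles(ledger_total: float, invoice_total: float) -> bool:
    """Closed loop (Q1/Q2 style): the numbers match or they do not.
    No human needed to call pass or fail."""
    return abs(ledger_total - invoice_total) < 0.01

def is_good_post(draft: str) -> bool:
    """Open loop (Q3/Q4 style): there is no test that decides this."""
    raise NotImplementedError("requires human judgment")

print(reconciles(1042.50, 1042.50))  # True: the loop closes without a human
```

If your "verifier" looks like the second function, you are Q3 or Q4 no matter what the pitch deck says.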
Worked Examples
Let me walk through each quadrant with a concrete project.
Q1 (Economic Engine): Cursor's Bugbot. Bugbot reviews pull requests and flags bugs. Tests pass or fail. CI is green or red. The loop is closed: the system can verify its own suggestions by running the test suite. And there are always more PRs. Cursor reports an 80% resolution rate across 110,000+ repos. That is a real economic engine.
Q2 (Efficiency Play): Invoice matching. An agent matches purchase orders to invoices. The numbers either reconcile or they do not. Verifiable. But the volume is fixed by the number of invoices your company processes. Good ROI for the accounts payable team. Not a growth vector.
Q3 (Creative Amplifier): AI-generated social media posts. Infinite demand (always more content to post). But "is this post on-brand and actually good?" requires a human. I know this one personally. We run 12 AI agents that draft content. Every draft goes through a human review step. The agents get better over time (see our feedback flywheel post), but the loop is not fully closed. We are Q3 and we know it.
Q4 (Utility Tool): Contract clause extraction. An agent highlights risky clauses in NDAs. A lawyer still reviews every flagged clause. The scope is bounded by deal flow. Harvey (reportedly at ~$200M ARR in 3 years according to a16z data) is building Q4 into something bigger, but most contract tools are capped utility.
Where We Bet
Coding-agent runtime safety (AgentGuard) is a Q1 picks-and-shovels play.
Here is why. The loop is closed: a budget guard either fires or it does not. A loop detector either catches the repeat or it does not. Tests verify the guards. And the demand is infinite: every coding agent that ships needs cost controls and safety rails. More agents, more demand.
We are not building the agents. We are building the thing that keeps them from burning your budget at 3am.
```python
from agentguard import Tracer, BudgetGuard

tracer = Tracer(guards=[
    BudgetGuard(max_cost_usd=5.00, warn_at_pct=0.8),
])
```
Q1. Closed loop. Infinite demand. Verifiable output.
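"Tests verify the guards" is the whole point, and it can be shown with a stand-in. The class below is a minimal mock I wrote for illustration, not AgentGuard's actual implementation: firing is a pure function of cumulative spend, so a test can assert it directly.

```python
# Hypothetical stand-in for a budget guard, to show why the loop closes:
# fire/no-fire is deterministic, so it is testable without a human.

class MiniBudgetGuard:
    def __init__(self, max_cost_usd: float):
        self.max_cost_usd = max_cost_usd
        self.spent = 0.0

    def record(self, cost_usd: float) -> bool:
        """Record spend; return True once the cap is reached (guard fires)."""
        self.spent += cost_usd
        return self.spent >= self.max_cost_usd

guard = MiniBudgetGuard(max_cost_usd=5.00)
assert guard.record(3.00) is False  # under budget: no fire
assert guard.record(2.50) is True   # cap crossed: fires
```

That is the closed loop in miniature: the verifier is the test suite, not a reviewer.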
The Uncomfortable Question
If you are building an agent feature right now, ask yourself:
- Can the system verify its own output without a human? (closed vs. open loop)
- Is the volume of work theoretically unlimited? (infinite vs. finite demand)
- Which quadrant does that put you in?
- Are your budget, team, and timeline sized for THAT quadrant, or for the quadrant you wish you were in?
Q1 and Q2 projects can justify automation budgets. Q3 projects need augmentation budgets (smaller, different ROI model). Q4 projects need to be honest about their ceiling.
Most of the pain in AI product development comes from misclassifying Q3 as Q1. The technology works. The business model does not.
Know your quadrant. Build for it.
AgentGuard: runtime safety for coding agents. Zero dependencies. MIT license. Budget guards, loop detection, and cost tracking that verify themselves.
Patrick Hughes
Building BMD HODL — a one-person AI-operated holding company. Nashville, Tennessee. Fifteen agents.