Your Agent Project Might Be in the Wrong Quadrant
Tomasz Tunguz published a 2x2 for categorizing AI projects. Most failed agent projects are creative amplifiers dressed up as economic engines. Here is how to tell which quadrant you are actually in.
Tomasz Tunguz just published a 2x2 matrix for categorizing AI projects. Most people will read it, nod, and file it under "interesting frameworks I will never use."
Here is what it actually means for builders shipping agent features this month.
The Matrix
Two axes. Vertical: is the loop closed (can the system verify its own output) or open (needs a human to judge quality). Horizontal: is demand for the task infinite (always more work to do) or finite (fixed scope).
Four quadrants:
Q1: Economic Engine. Closed loop + infinite demand. The agent can verify its own work, and there is always more to do. Example: coding agents. Tests pass or fail. There is always more code to write. This is where the 10-20x productivity claims come from.
Q2: Efficiency Play. Closed loop + finite demand. The output is verifiable, but the volume is capped. Example: bookkeeping reconciliation. The numbers either match or they do not. But there are only so many transactions to reconcile. Solid ROI, but ceiling-limited.
Q3: Creative Amplifier. Open loop + infinite demand. Unlimited work to do, but no automated way to know if the output is good. Example: marketing content generation. You can always write more blog posts. But "is this post good?" requires human judgment.
Q4: Utility Tool. Open loop + finite demand. Human judgment needed, and the scope is bounded. Example: contract review. A lawyer still has to read the output. And there are only so many contracts.
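The two axes make the matrix mechanical: each quadrant is just a pair of booleans. Here is a minimal sketch of that mapping. The function and its labels are my illustration, not code from Tunguz's post.

```python
# Classify a project on the 2x2: (loop closed?, demand infinite?) -> quadrant.
# Illustrative sketch; names are mine, not from the original framework.

def quadrant(closed_loop: bool, infinite_demand: bool) -> str:
    """Map the two axes to a quadrant label."""
    if closed_loop and infinite_demand:
        return "Q1: Economic Engine"      # verifiable output, unbounded work
    if closed_loop:
        return "Q2: Efficiency Play"      # verifiable output, capped volume
    if infinite_demand:
        return "Q3: Creative Amplifier"   # unbounded work, human judges quality
    return "Q4: Utility Tool"             # human judgment, bounded scope

print(quadrant(True, True))    # coding agents
print(quadrant(False, True))   # marketing content generation
```

Two yes/no answers, four outcomes. The hard part is answering the loop question honestly, not computing the label.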
The Misclassification Problem
Most failed agent projects are Q3 (creative amplifiers) dressed up as Q1 (economic engines). They get funded with automation budgets and fail because the loop does not close without a human.
Here is what this looks like in practice.
A team builds an agent that writes marketing emails. They pitch it as "automated outbound at scale" (sounds like Q1: infinite demand, closed loop). But the actual loop is: agent writes email, human reviews email, human edits email, human sends email. The agent did not automate outbound. It generated drafts.
Draft generation is useful. But it is Q3, not Q1. The pricing, staffing, and ROI math are completely different.
The diagnostic question: can a pass/fail test decide whether the output is correct?
If the answer is "well, kind of, depending on the situation," you are Q3. The loop is not closed. You need a human in it. Plan accordingly.
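The distinction is concrete in code: a closed loop means verification is a pure pass/fail function, while an open loop means no such function exists. Both checks below are my own hypothetical examples, not from any real system.

```python
# Illustrative only: a closed-loop check vs. an open-loop non-check.

def reconciles(ledger_total: float, invoice_total: float) -> bool:
    """Closed loop (Q1/Q2 style): the numbers match or they do not.
    No human needed to call pass or fail."""
    return abs(ledger_total - invoice_total) < 0.01

def is_good_post(draft: str) -> bool:
    """Open loop (Q3/Q4 style): there is no test that decides this."""
    raise NotImplementedError("requires human judgment")

print(reconciles(1042.50, 1042.50))  # True: the loop closes without a human
```

If your "verifier" looks like the second function, you are Q3 or Q4 no matter what the pitch deck says.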
Worked Examples
Let me walk through each quadrant with a concrete project.
Q1 (Economic Engine): Cursor's Bugbot. Bugbot reviews pull requests and flags bugs. Tests pass or fail. CI is green or red. The loop is closed: the system can verify its own suggestions by running the test suite. And there are always more PRs. Cursor reports an 80% resolution rate across 110,000+ repos. That is a real economic engine.
Q2 (Efficiency Play): Invoice matching. An agent matches purchase orders to invoices. The numbers either reconcile or they do not. Verifiable. But the volume is fixed by the number of invoices your company processes. Good ROI for the accounts payable team. Not a growth vector.
Q3 (Creative Amplifier): AI-generated social media posts. Infinite demand (always more content to post). But "is this post on-brand and actually good?" requires a human. I know this one personally. We run 12 AI agents that draft content. Every draft goes through a human review step. The agents get better over time (see our feedback flywheel post), but the loop is not fully closed. We are Q3 and we know it.
Q4 (Utility Tool): Contract clause extraction. An agent highlights risky clauses in NDAs. A lawyer still reviews every flagged clause. The scope is bounded by deal flow. Harvey (reportedly at ~$200M ARR in 3 years according to a16z data) is building Q4 into something bigger, but most contract tools are capped utility.
Where We Bet
Coding-agent runtime safety (AgentGuard) is a Q1 picks-and-shovels play.
Here is why. The loop is closed: a budget guard either fires or it does not. A loop detector either catches the repeat or it does not. Tests verify the guards. And the demand is infinite: every coding agent that ships needs cost controls and safety rails. More agents, more demand.
We are not building the agents. We are building the thing that keeps them from burning your budget at 3am.
```python
from agentguard import Tracer, BudgetGuard

tracer = Tracer(guards=[
    BudgetGuard(max_cost_usd=5.00, warn_at_pct=0.8),
])
```
Q1. Closed loop. Infinite demand. Verifiable output.
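"Tests verify the guards" is the whole point, and it can be shown with a stand-in. The class below is a minimal mock I wrote for illustration, not AgentGuard's actual implementation: firing is a pure function of cumulative spend, so a test can assert it directly.

```python
# Hypothetical stand-in for a budget guard, to show why the loop closes:
# fire/no-fire is deterministic, so it is testable without a human.

class MiniBudgetGuard:
    def __init__(self, max_cost_usd: float):
        self.max_cost_usd = max_cost_usd
        self.spent = 0.0

    def record(self, cost_usd: float) -> bool:
        """Record spend; return True once the cap is reached (guard fires)."""
        self.spent += cost_usd
        return self.spent >= self.max_cost_usd

guard = MiniBudgetGuard(max_cost_usd=5.00)
assert guard.record(3.00) is False  # under budget: no fire
assert guard.record(2.50) is True   # cap crossed: fires
```

That is the closed loop in miniature: the verifier is the test suite, not a reviewer.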
The Uncomfortable Question
If you are building an agent feature right now, ask yourself:
- Can the system verify its own output without a human? (closed vs. open loop)
- Is the volume of work theoretically unlimited? (infinite vs. finite demand)
- Which quadrant does that put you in?
- Are your budget, team, and timeline sized for THAT quadrant, or for the quadrant you wish you were in?
Q1 and Q2 projects can justify automation budgets. Q3 projects need augmentation budgets (smaller, different ROI model). Q4 projects need to be honest about their ceiling.
Most of the pain in AI product development comes from misclassifying Q3 as Q1. The technology works. The business model does not.
Know your quadrant. Build for it.
AgentGuard: runtime safety for coding agents. Zero dependencies. MIT license. Budget guards, loop detection, and cost tracking that verify themselves.
Patrick Hughes
Building BMD HODL — a one-person AI-operated holding company. Nashville, Tennessee. Fifteen agents.