
We Built Martin Fowler's Feedback Flywheel Before He Published It

Martin Fowler published a pattern for turning individual AI interactions into collective improvement. We had already built it. Here is how our 12-agent vault system maps to his four signal types.


Martin Fowler just published Feedback Flywheel, a pattern for converting individual AI interactions into collective team improvement.

Four signal types feed back into shared artifacts. Four cadences keep them fresh. The key metric is not speed. It is declining instances of "why did the AI do that?"

We read it and thought: we already built this.

Not because we are smarter than Fowler. Because we have been running 12 AI agents in production for two months, and the feedback problem hit us so hard we had to solve it or shut the system down.

Here is what our implementation looks like, mapped to his framework.

Fowler's Four Signal Types

Fowler identifies four categories of feedback that make AI systems improve:

  1. Context Signal. Background information the AI needs before acting. Priming documents, team standards, domain knowledge.
  2. Instruction Signal. Direct commands and corrections that tell the AI what to do differently.
  3. Workflow Signal. Process-level patterns about when and how AI fits into team workflows.
  4. Failure Signal. What went wrong, why, and what guardrails prevent recurrence.

Every signal feeds into a shared artifact that every future AI session reads. The flywheel spins because each session starts smarter than the last.

Our Implementation

We run BMD Pat LLC as a one-person operation with 12 scheduled AI agents. Each agent runs on a cadence (nightly code agents, daily content agents, weekly review). The vault they all share is an Obsidian knowledge base with ~200 files.

Here is how each of Fowler's signals maps to what we actually built.

Context Signal: Playbook/

Every agent reads Playbook/ before acting. It contains our ICP (ideal customer profile), voice rules, pricing, and current metrics.

When Patrick (the human) says "that post was too soft," the system updates Playbook/voice.md. The next day, every agent that writes content reads the updated voice rules.
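In code terms, the pattern is simple: read the priming documents, then act. A minimal sketch of what that might look like (paths and helper names here are illustrative, not the vault's actual layout):

```python
# Sketch: an agent assembles its context from Playbook/ before acting.
# The directory layout and function names are illustrative.
from pathlib import Path

def load_playbook(vault: Path) -> str:
    """Concatenate every Playbook/ note into one priming block."""
    sections = []
    for note in sorted((vault / "Playbook").glob("*.md")):
        sections.append(f"## {note.stem}\n{note.read_text()}")
    return "\n\n".join(sections)

def build_prompt(vault: Path, task: str) -> str:
    # Priming documents come first, the task last: every session
    # starts from the current Playbook, never from a stale copy.
    return f"{load_playbook(vault)}\n\n# Task\n{task}"
```

Because the Playbook is read fresh on every run, an edit to voice.md tonight changes every content agent's behavior tomorrow, with no prompt rewrites.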

Fowler calls this "priming documents." We call it Playbook.

Instruction Signal: Agent Prompts + Memory Feedback Files

Each agent has a role-specific prompt (Prompts/Daily/morning brief.md, Prompts/Nightly/queue sweep.md, etc.). These prompts are updated weekly based on what worked and what did not.

When the human corrects something ("stop showing me investment signals"), the correction goes into a memory feedback file. Future sessions load those files automatically.

Fowler calls this "commands." We use prompts and persistent memory files.
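The append-and-reload loop behind those memory files can be sketched in a few lines (the file naming is illustrative, not the vault's actual convention):

```python
# Sketch: persist a human correction so every future session loads it.
# One dated markdown file per day, one bullet per correction (illustrative).
from datetime import date
from pathlib import Path

def record_correction(memory_dir: Path, text: str) -> None:
    """Append a correction to today's memory feedback file."""
    memory_dir.mkdir(parents=True, exist_ok=True)
    f = memory_dir / f"{date.today().isoformat()}.md"
    with f.open("a") as fh:
        fh.write(f"- {text}\n")

def load_corrections(memory_dir: Path) -> list[str]:
    """Future sessions read all accumulated corrections, oldest first."""
    lines: list[str] = []
    for f in sorted(memory_dir.glob("*.md")):
        lines.extend(l[2:] for l in f.read_text().splitlines() if l.startswith("- "))
    return lines
```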

Workflow Signal: Evolution Protocol + Review Cadences

Our Evolution Protocol.md defines the multi-cadence review system:

  • Daily: Morning brief compiles overnight results. Human decides in 20 minutes.
  • Weekly: Review agent analyzes the full week. Updates Playbook/ and Knowledge/.
  • Ad-hoc: When something breaks or a pattern emerges mid-week.

Each cadence produces artifacts that the next cadence reads. The daily feeds the weekly. The weekly feeds the quarterly.
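That artifact chain is mechanical, not clever. A sketch of the daily-to-weekly handoff (directory names are illustrative):

```python
# Sketch: each cadence writes an artifact the next cadence reads.
# "Briefs" is an illustrative directory name, not the vault's actual one.
from pathlib import Path

def daily_brief(vault: Path, day: str, results: list[str]) -> Path:
    """The daily cadence writes one artifact per day."""
    out = vault / "Briefs"
    out.mkdir(exist_ok=True)
    artifact = out / f"{day}.md"
    artifact.write_text("\n".join(results))
    return artifact

def weekly_review(vault: Path) -> str:
    """The weekly cadence consumes every daily artifact produced so far."""
    briefs = sorted((vault / "Briefs").glob("*.md"))
    return "\n---\n".join(b.read_text() for b in briefs)
```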

Fowler calls this "playbooks." Our review cadence IS the playbook update mechanism.

Failure Signal: AgentGuard + Knowledge Wiki

This is where it gets concrete.

When an agent loops, burns budget, or produces garbage, two things happen:

  1. AgentGuard (our open-source runtime safety SDK) fires a guard. BudgetGuard stops spend. LoopGuard stops repeated tool calls. RetryGuard stops retry storms. The agent stops mid-run.
  2. The failure gets logged to Knowledge/sources/ as a write-once signal. The weekly review synthesizes patterns from those failures into Knowledge/syntheses/ (our opinion layer).
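A write-once signal is just a file that refuses to be overwritten. A sketch, with an illustrative naming scheme:

```python
# Sketch: log a failure as a write-once signal in Knowledge/sources/.
# The file naming scheme here is illustrative, not the vault's schema.
import json
from datetime import datetime, timezone
from pathlib import Path

def log_failure(vault: Path, agent: str, reason: str) -> Path:
    sources = vault / "Knowledge" / "sources"
    sources.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%f")
    record = sources / f"failure-{agent}-{stamp}.json"
    # Write-once: an existing signal is never modified.
    if record.exists():
        raise FileExistsError(record)
    record.write_text(json.dumps({"agent": agent, "reason": reason}))
    return record
```

The weekly review can then read every file in Knowledge/sources/ and synthesize recurring failure patterns without ever mutating the raw signals.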

Fowler says the failure signal feeds into guardrails. Our guardrails are literal Python guards that raise exceptions.

```python
from agentguard import Tracer, BudgetGuard, LoopGuard

tracer = Tracer(
    guards=[
        BudgetGuard(max_cost_usd=5.00, warn_at_pct=0.8),
        LoopGuard(max_repeats=3),
    ]
)
```

No model can talk its way past a dollar limit. That is the point.
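The guard pattern itself is small enough to sketch. This is not AgentGuard's implementation, just the shape of a guard as plain Python that raises (the class names are illustrative):

```python
# Minimal sketch of the guard pattern: plain Python that raises an
# exception the agent loop cannot negotiate with. Illustrative names,
# not AgentGuard's actual internals.
class BudgetExceeded(Exception):
    pass

class SpendGuard:
    def __init__(self, max_cost_usd: float):
        self.max_cost_usd = max_cost_usd
        self.spent = 0.0

    def record(self, cost_usd: float) -> None:
        self.spent += cost_usd
        if self.spent > self.max_cost_usd:
            # The run stops here regardless of what the model "wants".
            raise BudgetExceeded(
                f"spent ${self.spent:.2f} > limit ${self.max_cost_usd:.2f}"
            )
```

An exception is not a suggestion. That is why this works where prompt-level instructions ("please stay under budget") do not.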

The Three Layers

Our system has three layers that change at different speeds:

Layer 1: Principles (change annually). Mission, ethics, financial discipline. Decision filters that constrain everything below.

Layer 2: Knowledge (compounds monthly). Customer patterns, market intelligence, operational lessons. This is the brain that gets smarter over time.

Layer 3: Implementation (changes weekly). Agent prompts, Playbook rules, and the actual code. This is what ships.

Principles constrain Knowledge. Knowledge informs Implementation. Implementation produces signals that feed back into Knowledge.

Fowler's flywheel runs at the Implementation layer. Our system adds the Knowledge layer above it: a persistent, structured wiki that survives prompt rewrites and technology changes.

What Actually Compounds

The compounding timeline looks like this:

  • Week 1: Agents are generic. Patrick edits heavily.
  • Week 4: Playbook is tuned. Edits are decreasing.
  • Month 3: Agents know voice and ICP. Minimal edits.
  • Month 6: System output exceeds manual capacity. Knowledge is an asset.
  • Year 1: Playbook is a moat. A year of market data no competitor can replicate.

We are at Week 8. The edit rate has dropped significantly. The morning brief now surfaces insights Patrick did not ask for. The weekly review catches patterns across 7 different product queues.

Fowler's metric (declining instances of "why did the AI do that?") is real. We track it implicitly through the volume of memory feedback files created per week. It is going down.
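Counting those files takes a dozen lines. A sketch, assuming one dated file per correction (the YYYY-MM-DD-*.md naming is illustrative):

```python
# Sketch: track Fowler's metric implicitly by counting memory feedback
# files created per ISO week. Assumes files are named YYYY-MM-DD-*.md
# (illustrative, not the vault's actual convention).
from collections import Counter
from datetime import date
from pathlib import Path

def corrections_per_week(memory_dir: Path) -> Counter:
    counts: Counter = Counter()
    for f in memory_dir.glob("*.md"):
        y, m, d = f.stem.split("-")[:3]
        week = date(int(y), int(m), int(d)).isocalendar()[:2]  # (year, week)
        counts[week] += 1
    return counts  # a falling trend means the flywheel is working
```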

The Honest Part

We did not design this by reading Fowler. We designed it because agents kept making the same mistakes. The Playbook/ folder exists because the LinkedIn drafter kept using the wrong tone. The memory files exist because corrections were getting lost between sessions. The Knowledge wiki exists because we kept re-researching things we had already learned.

Fowler gave it a name and a clean framework. We built the messy version that works at 2am when the nightly sweep is running and you need the system to be better tomorrow than it was today.

If you are running AI agents in production and your system is not getting smarter over time, you do not have a feedback flywheel. You have a treadmill.

Start With the Failure Signal

If you are building this for your own agents, start with the failure signal. It is the most concrete, the most automatable, and the highest leverage.

AgentGuard is a zero-dependency Python SDK that adds budget guards, loop detection, and retry protection to any AI agent. It stops the agent mid-run when something goes wrong. That is your failure signal, automated.

```shell
pip install agentguard47
agentguard demo
```

The flywheel starts spinning when failures get captured, not when someone writes a strategy doc about capturing failures.


Patrick Hughes

Building BMD HODL — a one-person AI-operated holding company. Nashville, Tennessee. Fifteen agents.

More writing