[bmdpat]
All writing
5 min read

Your AI agent doesn't need memory. It needs a file.

I run a one-person company on scheduled agents and gave almost none of them memory. They write to files instead. Here is why that wins.

Share LinkedIn

The pitch for agent memory sounds right. An agent should remember what it did yesterday, what you told it last week, what the user prefers. So you reach for a vector store, embed everything, and query it on every run. Now your agent has a brain.

Then it breaks in ways you cannot see. The retrieval pulls the wrong chunk. Two facts contradict and the agent picks one at random. You cannot tell what it knows without running a query yourself. Debugging means inspecting embeddings. That is a bad afternoon.

Here is what I do instead. Every agent reads and writes plain markdown files in a known folder. The morning brief writes to one folder. The digest writes to a wiki folder. State that matters goes to a file with a name I can predict. When an agent needs context, it reads the file. That is the whole system.

This is the same instinct behind every control I put around an agent. Files make what an agent knows visible. AgentGuard, the budget and loop guard I build, makes what an agent spends and how often it loops visible and capped. Memory you can read. Spend you can see. Loops you can bound. An agent you cannot inspect is an agent you cannot trust, whether the thing you cannot see is its memory or its token bill.

Why files win for a one-person operation

You can read them. Open the folder, read the markdown, know exactly what the agent knows. No query. No embedding inspection. If the agent did something dumb, the evidence is sitting in a file I can open in any editor.

They do not silently rot. A contradiction in a file is visible the next time anyone opens it. A contradiction in a vector store hides until it produces a wrong answer in production.

They version cleanly. Sync, git, whatever you use, you get history for free. Try diffing two states of a vector database and tell me what changed.

They go append-only when you want. My event log is one line per run. The agent appends, never edits. I can replay a whole day by reading the file top to bottom.

A concrete example

One of my agents publishes a blog post each day. It does not need to remember which posts it already wrote. It reads the folder of published files before it picks a topic. The folder is the memory. If I want to know what shipped last week, I read the same folder. Agent and human share one source of truth, and it is just files on disk.

Compare that to the vector-store version. The agent embeds each published post, queries for similar topics, and trusts the retrieval to catch duplicates. The day it embeds a near-duplicate and the cosine score lands just under the threshold, it writes the same post twice and you find out from a reader.

When you actually need real memory

I am not saying vector search is useless. If you are searching ten thousand support tickets for the three relevant ones, embeddings are the right tool. The rule is about scale, not ideology.

The test I use: can a human read the entire memory in a few minutes? If yes, use files. A one-person company generates maybe a few hundred markdown files of state. A human can grep that. You do not need approximate nearest-neighbor search over something you could read with a cup of coffee.

Most agent projects are not at ten-thousand-document scale. They are at remember-the-last-few-decisions-and-yesterday-output scale. That fits in files. Reaching for a vector store there solves a problem you do not have and adds a failure mode you cannot see.

The deeper rule: write, do not remember

If it matters, it goes to a file. Not to the context window. Not to a hope that retrieval surfaces it. The context window is scratch space. The file is the truth. When a new run starts, it loads the files it needs and goes. No run depends on a previous run remembering anything, because nothing is remembered. It is all written down.

This also fixes the cost problem nobody warns you about. Every memory query is tokens. Embed on write, embed the query, retrieve, stuff the results in context. That runs on every invocation, forever. Files cost nothing to read.

So the two halves of the rule go together. Files keep an agent legible: you can read what it knows. AgentGuard keeps it contained: a hard budget and loop limit so a runaway query or retry storm cannot drain your account before you notice. Legible plus contained is an agent you can leave running while you go live your life. Free, pip install agentguard: https://bmdpat.com/tools/agentguard

Want more like this?

AI agent builds, real costs, what works. M-F only when there is something worth sending. No fluff.

PH

Patrick Hughes

Building BMD HODL — a one-person AI-operated holding company. Nashville, Tennessee. Twenty-Two agents.

More writing