Your Cron Jobs Lie - Why I Built an Outcome Checker

Scheduled tasks exit 0 even when the work never happened. Here is the outcome layer I built on top of my agent fleet, and why it shipped before any new dashboard.

Every morning my scheduled tasks said green. Every morning something was actually broken.

The blog publisher fired at 09:30 and exited 0. No post went live. The QA reviewer ran at 08:30 and exited 0. Nothing got reviewed because the input folder was empty for the wrong reason. The morning brief compiled cleanly, but yesterday's run ledger had a hole nobody noticed.

Exit codes lie. Cron tells you the process ran. It does not tell you the work happened.

The gap nobody talks about

A scheduled task is a contract with future you. "At 09:30, a blog post will be live on the site." The scheduler only checks half of that contract. It confirms the script ran. It does not check the outcome.

This gap gets worse as you add agents. I have 30+ scheduled tasks now. Some publish content. Some review drafts. Some sweep queues. Each one can exit 0 for dozens of reasons that are not "the work shipped."

The script ran but found no input. The script ran but the API was down and it logged a warning. The script ran in dry-run mode because a flag flipped. The script ran but the upstream task it depends on silently failed yesterday.
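
To make those paths concrete, here is a minimal sketch of a publisher that exits 0 on every one of them. The names `run_publisher`, `RunResult`, `api_up`, and `dry_run` are hypothetical and chosen for illustration; this is not the real publisher.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    exit_code: int   # what the scheduler sees
    published: int   # what actually happened


def run_publisher(drafts: list[str], api_up: bool = True, dry_run: bool = False) -> RunResult:
    """Hypothetical publisher: every non-crashing path returns exit code 0."""
    if not drafts:
        return RunResult(exit_code=0, published=0)   # ran, found no input
    if not api_up:
        print("warning: API unavailable, skipping publish")
        return RunResult(exit_code=0, published=0)   # ran, logged a warning
    if dry_run:
        return RunResult(exit_code=0, published=0)   # ran in dry-run mode
    return RunResult(exit_code=0, published=len(drafts))
```

Only the last branch ships anything, yet a scheduler that reads `exit_code` alone cannot tell the four apart.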

All of these are green in your dashboard. None of them are green in reality.

What an outcome checker actually checks

I built a tool called brain think. It answers one question: did what should have happened today actually happen?

Not "did the task fire." Not "did the script exit 0." Did the outcome land.

For the blog publisher, the outcome is: a row exists in the Supabase blog table with today's date, AND the public URL returns 200, AND the slug is in today's published folder. If any of those three fail, the check is red.
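
A three-part check like this could be sketched as follows. The database query and the HTTP request are passed in as plain callables so the sketch stays self-contained. `has_row_for`, `fetch_status`, and the idea that the published folder holds one file per slug are assumptions, not the actual brain think internals.

```python
from dataclasses import dataclass
from datetime import date
from pathlib import Path
from typing import Callable


@dataclass
class BlogOutcome:
    row_exists: bool
    url_ok: bool
    slug_in_folder: bool

    @property
    def green(self) -> bool:
        # All three conditions must hold; any single failure is red.
        return self.row_exists and self.url_ok and self.slug_in_folder


def check_blog_outcome(
    today: date,
    slug: str,
    url: str,
    has_row_for: Callable[[date], bool],   # stand-in for the blog table query
    fetch_status: Callable[[str], int],    # stand-in for an HTTP GET, returns the status code
    published_dir: Path,                   # assumed: today's published folder
) -> BlogOutcome:
    # Assumption: the folder contains one file per post, named after the slug.
    slug_in_folder = published_dir.is_dir() and any(
        p.stem == slug for p in published_dir.iterdir()
    )
    return BlogOutcome(
        row_exists=has_row_for(today),
        url_ok=fetch_status(url) == 200,
        slug_in_folder=slug_in_folder,
    )
```

Returning the three flags separately, rather than a single boolean, keeps it visible which leg of the outcome failed.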

For the QA reviewer, the outcome is: today's QA report file exists AND it has decisions for every draft in the queue. Empty report when the queue had drafts is red. No report at all is red.
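
As a rough sketch, assuming the QA report is a JSON object that maps each draft name to a decision (the real report format is not described here, so treat that shape and the name `check_qa_outcome` as hypothetical):

```python
import json
from pathlib import Path
from typing import Iterable


def check_qa_outcome(report_path: Path, queued_drafts: Iterable[str]) -> tuple[bool, str]:
    """Return (green, reason). Assumes the report is JSON: {draft_name: decision}."""
    if not report_path.exists():
        return False, "no report"
    try:
        decisions = json.loads(report_path.read_text())
    except json.JSONDecodeError:
        return False, "report is not valid JSON"
    if not isinstance(decisions, dict):
        return False, "report has unexpected shape"
    # Every queued draft needs a decision; an empty report with a non-empty queue is red.
    missing = sorted(set(queued_drafts) - set(decisions))
    if missing:
        return False, "no decision for: " + ", ".join(missing)
    return True, "ok"
```

The check compares the report against the queue, not against itself, which is what lets it catch a report that is present but empty.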

For the queue sweep, the outcome is: yesterday's nightly report exists AND it lists the actions taken. A report that says "found nothing to do" when tasks were waiting is red.

Each check is time-gated. The blog outcome only fails if it is past 09:35 and no post is live. Before that, it is too early to know.
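
One way to express the time gate is a three-state result instead of a boolean, so "too early to know" is never confused with "failed". This is a sketch; `Status` and `gated_status` are illustrative names.

```python
from datetime import datetime, time
from enum import Enum


class Status(Enum):
    PENDING = "pending"  # deadline not reached, too early to know
    GREEN = "green"      # outcome landed
    RED = "red"          # deadline passed and outcome missing


def gated_status(now: datetime, deadline: time, outcome_landed: bool) -> Status:
    if outcome_landed:
        return Status.GREEN
    if now.time() <= deadline:
        return Status.PENDING
    return Status.RED


# Usage: the blog check with its 09:35 deadline.
early = gated_status(datetime(2024, 1, 2, 9, 31), time(9, 35), outcome_landed=False)
late = gated_status(datetime(2024, 1, 2, 11, 0), time(9, 35), outcome_landed=False)
```

Here `early` is `Status.PENDING` and `late` is `Status.RED`.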

Heal mode is the part that matters

Detection without action is just a louder alarm. brain think --heal does the work the scheduled task should have done.

If the blog publisher missed today, heal mode picks a draft, runs the QA gate, posts it, and confirms the URL. If the QA reviewer skipped a backlog, heal mode reviews them. If the morning brief is missing, heal mode compiles it from the same inputs.

Heal mode is gated. It will not publish a draft that fails voice rules. It will not skip the QA review. It runs the same checks the scheduled path runs, just same-day and on-demand instead of waiting for tomorrow at 09:30.
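
A gated heal step for the blog case might look roughly like this. The gates, the publish call, and the URL confirmation are injected as callables. All names here are hypothetical, and stopping after an unconfirmed publish is an assumption of this sketch rather than documented behavior.

```python
from typing import Callable, Iterable, Sequence


def heal_blog(
    drafts: Iterable[str],
    gates: Sequence[Callable[[str], bool]],  # e.g. voice rules, QA review
    publish: Callable[[str], str],           # publishes a draft, returns its URL
    confirm_live: Callable[[str], bool],     # True if the URL is reachable
) -> tuple[bool, str]:
    """Publish the first draft that passes every gate, then confirm it is live."""
    for draft in drafts:
        if not all(gate(draft) for gate in gates):
            continue  # a draft that fails any gate is never published
        url = publish(draft)
        if confirm_live(url):
            return True, url
        # Published but unconfirmed: stop rather than publish a second draft.
        return False, f"published {draft} but could not confirm {url}"
    return False, "no draft passed the gates"
```

Because the gates run inside the heal path itself, healing cannot become a back door around the checks the scheduled path enforces.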

This is the difference between a monitoring tool and an operational tool. Monitoring tells you something is broken. The operational tool fixes it before you wake up to the gap.

What I would tell anyone running a fleet of agents

One. Treat exit codes as suggestions, not facts. Every scheduled task needs a downstream outcome check that does not trust the task's own return value.

Two. Make the outcome check time-aware. A 09:30 task whose outcome has not landed by 09:31 is not failing. The same task with no outcome at 11:00 is. Bake the deadline into the check.

Three. Build heal mode before you build dashboards. A dashboard shows you the gap. Heal mode closes it. Heal mode compounds. A dashboard just adds another tab you stop looking at.

The cost of not doing this

I ran without an outcome checker for six weeks. In that window, my scheduled tasks reported 100% success. I lost three blog posts to silent failures, missed two morning briefs, and shipped a draft that should have been gated. The dashboard was green the whole time.

The day I added the outcome checker, it flagged four broken outcomes in one run. None of them were detectable from the scheduler logs.

If you are building agents that run on a schedule, build the outcome layer before you build the next agent. The fleet does not get more reliable as it grows. It gets less reliable, faster, unless you measure the right thing.

If you want hard budget limits and outcome guards for your own coding agents, start here: https://bmdpat.com/tools/agentguard

Patrick Hughes

Building BMD HODL, a one-person AI-operated holding company. Nashville, Tennessee. Twenty-two agents.
