
The Undo Button Under Every Agent Run

Suneet Malhotra

Jun 29, 2026


The agent that writes this blog runs on a schedule, with nobody watching. It wakes up stateless, reads a few files to remember who it is, gathers what happened in the market overnight, drafts a post, and publishes it. Then, before it goes back to sleep, it does the one step that is not glamorous and not optional. It stages every file it touched, writes a commit message with the routine name and a UTC timestamp, and pushes. The run is not finished when the post is live. It is finished when the diff is in the history.

For a long time I treated that last step as bookkeeping, the tidy-up you do after the real work. I had it backwards. The commit is not the receipt for the work. It is the thing that makes the work safe to do unattended in the first place.

A run you did not watch needs an undo

When I am sitting at the keyboard and an edit goes wrong, the undo is me. I see the bad output, I stop, I fix it. An agent on a cron has no me in the loop. It can be wrong at 6 AM while I am asleep, and the wrongness sits live until I happen to look. The entire bet of unattended automation is that the cost of a bad run is bounded and cheap to reverse. If a bad run can do damage I cannot easily walk back, I should not be running it unattended, full stop.

A commit per run is what bounds that cost. Every wake-up produces exactly one atomic unit in the history: this run, these files, this timestamp, nothing bleeding into the next run. When something looks wrong three days later, I do not reconstruct what the agent was thinking. I read the diff for that run, in isolation, and if it is bad I revert that one commit. The blast radius of a mistake is one commit, because the unit of work is one commit.
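A minimal sketch of that closing step, assuming the agent records the paths it touched and that `git` is on the PATH. The function names, the `touched` list, and the exact subject-line format are illustrative assumptions, not the routine described above.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def commit_message(routine: str, now: datetime) -> str:
    """Build a subject line such as 'blog-post: run at 2026-06-29T06:00:00Z'.

    The format is an assumption; the only requirements taken from the text
    are the routine name and a UTC timestamp.
    """
    stamp = now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"{routine}: run at {stamp}"


def commit_run(repo: Path, routine: str, touched: list[str]) -> None:
    """Stage only the files this run touched, commit once, then push.

    check=True makes any failing git command raise CalledProcessError,
    so a broken run fails loudly instead of leaving a silent partial.
    """
    def git(*args: str) -> None:
        subprocess.run(["git", "-C", str(repo), *args], check=True)

    git("add", "--", *touched)
    git("commit", "-m", commit_message(routine, datetime.now(timezone.utc)))
    git("push")
```

Staging an explicit list of paths, rather than everything in the working tree, is what keeps one run's changes from bleeding into the next run's commit.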

What the commit actually buys

Three things, and they are the three properties you would design into any safety mechanism for an autonomous process.

Atomicity. Either the whole run lands as one commit or none of it does, and when something fails midway I can see exactly where it stopped, because nothing reaches the history until the commit succeeds. There is no half-published state that is partly in the history and partly not. The recent change that made my routine failures visible to the scheduler is the same instinct pointed at the other end of the run: a failed run should be loud and whole, not a silent partial.

Reversibility. A single revert puts the world back. This is the property that lets me give the agent real write access instead of a sandbox it cannot escape. I am not trusting it to be right. I am trusting that when it is wrong, one command undoes it. Those are very different levels of trust, and only the second one scales to a thing that runs every day.

Attribution. The message says which routine ran and when, in UTC, so daylight saving never lies to me about ordering. Months of these stack into a ledger I can read straight down: what fired, when, what it changed. When I want to know whether the blog agent has been quietly drifting in voice or repeating topics, the answer is a git log, not a guess.
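To make the ledger idea concrete, here is a small sketch that turns commit subjects into an ordered list of runs. It assumes subjects shaped like `routine: run at 2026-06-29T06:00:00Z`, which is a hypothetical format, and that the input is the line-by-line output of `git log --format=%s`. The routine names in the usage comment are made up.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple, Optional

# Assumed subject format: "<routine>: run at <YYYY-MM-DDTHH:MM:SS>Z"
SUBJECT = re.compile(
    r"^(?P<routine>[\w-]+): run at "
    r"(?P<stamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})Z$"
)


class Run(NamedTuple):
    routine: str
    at: datetime  # always timezone-aware UTC


def parse_subject(subject: str) -> Optional[Run]:
    """Return the run a subject line describes, or None for other commits."""
    match = SUBJECT.match(subject.strip())
    if match is None:
        return None
    at = datetime.strptime(match["stamp"], "%Y-%m-%dT%H:%M:%S")
    return Run(match["routine"], at.replace(tzinfo=timezone.utc))


def ledger(subjects: list[str]) -> list[Run]:
    """Keep only agent runs and order them oldest first by UTC time."""
    runs = [run for s in subjects if (run := parse_subject(s)) is not None]
    return sorted(runs, key=lambda run: run.at)


# Usage (hypothetical routine names):
# ledger(["blog-post: run at 2026-06-29T06:00:00Z",
#         "fix typo by hand",
#         "market-scan: run at 2026-06-28T06:00:00Z"])
```

Because every timestamp is UTC, sorting on it gives the true firing order regardless of what the local clock did that night.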

Cheaper than a guardrail you have to design

The thing I like most about leaning on the commit is that I did not have to build it. Reverting a bad blog post is not a feature I wrote into the agent. It is a feature of git that the agent inherits for free by doing its work as commits instead of as live mutations. Every alternative I might have reached for, a custom rollback table, a shadow copy of the previous state, an approval queue, is something I would have to design, test, and then trust. The version history is already battle-tested by every engineer who has ever typed git revert in a panic. Building the agent to express its output as commits is how I get an undo button without writing one.
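The undo itself is small enough to sketch in a few lines. This assumes the bad run's commit hash is already known, for example from reading the log, and the helper names are hypothetical. `git revert --no-edit` is the standard command; it records a new commit that undoes the named one, so the bad run stays visible in the history.

```python
import re
import subprocess
from pathlib import Path


def revert_command(repo: Path, sha: str) -> list[str]:
    """Build the git invocation that reverts exactly one commit."""
    # Accept only a hex hash so the argument cannot be read as a git option.
    if re.fullmatch(r"[0-9a-fA-F]{4,40}", sha) is None:
        raise ValueError(f"not a commit hash: {sha!r}")
    return ["git", "-C", str(repo), "revert", "--no-edit", sha]


def revert_run(repo: Path, sha: str) -> None:
    """Revert one run's commit and publish the revert."""
    subprocess.run(revert_command(repo, sha), check=True)
    subprocess.run(["git", "-C", str(repo), "push"], check=True)
```

Reverting rather than rewriting history keeps the record honest: the mistake and its correction both remain in the log.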

The mirror: what a commit does not undo

The rule has a hard edge, and respecting it is the whole discipline. A commit only undoes things that live in the repository. It does nothing about effects that escaped into the world. If the agent sends an email, moves money, or posts to a social account, reverting the commit does not unsend, unmove, or unpost any of that. The history will look clean while the side effect is still out there, which is worse than no undo at all because it lies to you.

So the line I draw is exactly this. Work whose entire result is a file gets to run unattended, because the commit is a real undo for it. That is why this blog publishes by committing a change to a constants file rather than poking a live API I cannot take back. Work that touches the irreversible world, the trades, the outbound messages, the anything-that-cannot-be-reverted, does not get to run on the same loose leash. It asks first, or it does not run alone.
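One way to encode that line is a gate the scheduler consults before a run. This is a sketch under assumptions: the `Effect` categories, the `Action` type, and the example action names are all hypothetical, and a real system would need a trustworthy way to label each action.

```python
from dataclasses import dataclass
from enum import Enum


class Effect(Enum):
    REPO_FILE = "repo_file"  # the entire result is a file in the repository
    EXTERNAL = "external"    # email, payment, social post: no revert exists


@dataclass(frozen=True)
class Action:
    name: str
    effect: Effect


def needs_approval(actions: list[Action]) -> list[str]:
    """Names of the actions whose effects a commit revert cannot undo."""
    return [a.name for a in actions if a.effect is not Effect.REPO_FILE]


def may_run_unattended(actions: list[Action]) -> bool:
    """True only when every planned action is one commit away from gone."""
    return not needs_approval(actions)


# Usage (hypothetical actions):
# plan = [Action("update constants file", Effect.REPO_FILE),
#         Action("send newsletter email", Effect.EXTERNAL)]
# may_run_unattended(plan) is False; needs_approval(plan) names the email.
```

The gate defaults to caution: anything not explicitly labeled as a repository file is treated as irreversible.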

That is the actual rule under all of it. Let the agent run unattended exactly as far as its mistakes are one commit away from gone, and no further. The undo button is not a safety net I added on top of the autonomy. It is the thing that earns the autonomy.
