Three Lines I Cut From My Agent's Prompt

Suneet Malhotra

May 11, 2026

The system prompt for my CMO agent, the one that drafts these blog posts, schedules the social cadence, and runs the editorial loop, grew to roughly four hundred lines over a year. Most of those lines I do not remember writing. They accreted the way settings accrete: a behavior surprised me, I added a rule, the rule worked, the rule stayed. A year of that and you have a prompt that nobody, including the model, can fully hold in their head.

Last week I sat down and cut three of those rules. The cuts felt riskier than the additions did. That asymmetry is the post.

What I cut

The first cut was a sentence I had added in my third week. It read: "Never use exclamation points." I added it because the early drafts came back breathless. Every paragraph closed with a sales-line. The instruction worked. The agent stopped using exclamation points.

Then I forgot about it. Months later I noticed the agent had not used an exclamation point in any recent output, even when I removed the line in a side-by-side test. The model was reading the voice samples I had loaded earlier in the prompt and matching them. The negative rule was no longer doing any work. It was, in the technical sense, dead code in the prompt. I cut it and nothing changed in the output. A handful of tokens saved per call, a slightly less defensive prompt, and one fewer rule for me to remember the next time I wondered why a draft sounded a certain way.

The second cut was harder. I had added a rule that read: "Always end with a one-line summary of what changed." It was meant to give me a clean tail to skim when I was reviewing twelve drafts in a row. It worked for about a week. Then I added a counter-instruction higher up in the prompt, in a section called Tone, that said the agent should not write trailing summaries because the diff already showed the change.

The two instructions contradicted each other. The agent resolved the contradiction in a way I had not predicted. It followed the more specific one, which was the no-summary rule, and ignored the older summary rule. For months I had a line in the prompt that the model was already disregarding. The cut was free. What it taught me is that contradictory instructions are not failures the model surfaces. They are failures the model silently arbitrates, and the rule the model picks is not always the one I would have picked.

The third cut was a vestigial schema. Early in the project I had the agent return drafts in a structured JSON shape. I abandoned that workflow eight months ago when I moved drafting back into plain Markdown. The schema rule stayed in the prompt the entire time. The model had been quietly satisfying it for prompts that no longer asked for structured output, because the rule was unconditional. Sixty lines, removed.

The asymmetry of removal

The three cuts have something in common that adding never has. Each one required me to verify the model's current behavior before I made the change. Adding a rule verifies itself: the next call confirms whether it worked. Removing a rule is a hypothesis: I think this rule is no longer doing anything. If I am wrong, the regression might not show up in the next call, or the call after, but in some edge case I will not catch until a reader does.

The asymmetry is real. Prompts grow by accretion because adding is cheap and removing is expensive. The cost of removing is that you have to know what the rule was doing, and a year-old rule's effect is rarely a thing you remember clearly.

This is the same shape as cruft in any codebase. The reason a function nobody calls stays in the repo is that nobody is sure no one calls it. The reason a six-line config flag stays after the feature is removed is that nobody wants to be the person who took out the flag that turned out to still be wired into one obscure code path.

The audit I added

I added a small ritual. Once a quarter I read the entire system prompt cold, top to bottom, and for every instruction I ask one question. Is this rule still defending against a behavior the current model would produce if the rule were absent?

If yes, keep it. If no, or if I cannot tell, mark it for a side-by-side test. The test is mechanical. Run the same input through the prompt with and without the rule, ten times each, and read the outputs blind. If I cannot distinguish the two batches, the rule is dead and I cut it.
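The mechanical part of that test could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not a description of any particular harness: `generate` is a hypothetical callable standing in for whatever client sends a system prompt and an input to the model, `remove_rule` and `blind_batches` are names invented for illustration, and the rule is assumed to sit on its own line in the prompt.

```python
import random
from typing import Callable, List, Tuple


def remove_rule(prompt: str, rule: str) -> str:
    """Return the prompt with every line that exactly matches `rule` dropped."""
    lines = prompt.splitlines()
    kept = [line for line in lines if line.strip() != rule.strip()]
    if len(kept) == len(lines):
        # Assumption: a rule that cannot be found is a caller error, not a no-op.
        raise ValueError("rule not found in prompt; nothing to test")
    return "\n".join(kept)


def blind_batches(
    generate: Callable[[str, str], str],  # hypothetical: (system_prompt, input) -> output
    prompt: str,
    rule: str,
    user_input: str,
    runs: int = 10,
    seed: int = 0,
) -> Tuple[List[str], List[str]]:
    """Run the same input `runs` times with and without the rule.

    Returns (outputs, key). The outputs are shuffled so they can be read
    without knowing their arm; key[i] is "with" or "without" for outputs[i].
    """
    ablated = remove_rule(prompt, rule)
    labeled = [("with", generate(prompt, user_input)) for _ in range(runs)]
    labeled += [("without", generate(ablated, user_input)) for _ in range(runs)]
    random.Random(seed).shuffle(labeled)
    outputs = [text for _, text in labeled]
    key = [label for label, _ in labeled]
    return outputs, key
```

Reading `outputs` before looking at `key` keeps the comparison blind. Ten runs per arm matches the count above, though nothing in the sketch depends on that number.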

I have not run this audit on enough quarters to know whether the cuts compound. The first audit cut twelve rules out of a prompt that had roughly forty effective rules. That is a thirty percent reduction. If the next audit cuts another thirty percent of what remains, the cuts will shrink each quarter and the prompt will settle toward a stable core. If it cuts the same absolute number, the prompt will be near-empty in a year. I do not know which. The honest answer is that the value of the audit is not the cuts. It is the act of re-reading the prompt with a critical eye, which surfaces the rules I had stopped justifying to myself.
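The two scenarios are easy to put side by side. The sketch below is illustrative arithmetic only: it starts from the forty rules and twelve-rule first cut mentioned above, assumes four quarterly audits, and rounds the proportional cut down to whole rules. The `project` helper is a hypothetical name, and neither trajectory is a measured result.

```python
from typing import Callable, List


def project(start: int, audits: int, cut: Callable[[int], int]) -> List[int]:
    """Rule count after each audit, given a function that says how many to cut."""
    counts = [start]
    for _ in range(audits):
        remaining = counts[-1]
        counts.append(max(0, remaining - cut(remaining)))
    return counts


# Scenario 1: every audit cuts thirty percent of what remains (rounded down).
proportional = project(40, 4, lambda n: n * 30 // 100)

# Scenario 2: every audit cuts the same twelve rules' worth.
fixed = project(40, 4, lambda n: 12)

print(proportional)  # [40, 28, 20, 14, 10]
print(fixed)         # [40, 28, 16, 4, 0]
```

Under the proportional assumption each cut is smaller than the last, so the prompt shrinks toward a core instead of emptying. Under the fixed assumption it is gone by the fourth audit.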

The closing read

The hardest part of maintaining an agent over a long time is not adding new capabilities. It is keeping the existing prompt honest. Every model upgrade shifts the default behaviors. Every workflow change leaves vestiges. The rule you added in week three may not be doing the thing you remember it doing, and you cannot tell from a single run.

The audit is the smallest discipline I have found that catches this. Read the prompt cold. Ask what each rule is defending against. Cut the ones that are no longer defending against anything.

The diff is unflattering. That is how I know it was overdue.
