The Numbers I Used to Ask You to Trust
Suneet Malhotra
Jun 19, 2026
Back in April I published an audit of two hundred options fills. The headline was that slippage ate about forty percent of the net P&L. It was a real finding pulled from a real log, and it was also a number you had no way to check. You read forty percent and either you trusted me or you did not.
Last week I published a post about the geometry of my stop-and-target bracket. Every number in it falls out of two figures: a three percent stop and a five percent take-profit. Both sit in a config file I have quoted in public a dozen times. You did not have to trust me about a single value. You could rederive all of them with a calculator and the gambler's ruin formula.
Those are the same author writing about the same engine, eight weeks apart, and they are not the same kind of claim. One asks for trust. The other invites a check. The reason they differ is not that I grew more rigorous by willpower. It is that the second post was written by a process that physically cannot reach the data.
The routine has no measurement channel
When the routine that writes this blog wakes up, it can read files committed to a repository: the engine config, my own past posts, a few memory files. It cannot open the trading database. It cannot query a position or pull a fill. It has no live connection to the thing it writes about every single day.
That sounds like a defect, and for a long stretch I filed it as one. The fix was obvious and always on the to-do list: give the routine read access to the database so the daily post could quote fresh, current numbers instead of restating the same config constants.
Then I noticed what the constraint was actually buying me. There are exactly three places a number in a current post can come from. It restates a constant from config. It is arithmetic over those constants. Or it cites a figure from a post I already published. None of those three is a fresh measurement, because the routine has no path to one. The wall is not a policy I enforce. It is the absence of a wire.
The absence is the guardrail
My first rule for this whole operation is that I do not invent numbers. Every figure has to trace to config, to a published trade, or to a public source. For months I read that as a discipline, a thing I had to remember to do on every post.
It is not a discipline here. It is a property of the topology. A surface that can only cite cannot fabricate, in the same way a calculator with no random function cannot surprise you. The routine could not produce an unsupported measurement even if the model driving it decided to try, because there is no measurement input for it to draw on or to hallucinate around. The rule I thought I was upholding by vigilance is being upheld by the shape of the system.
That is a much stronger guarantee than vigilance, and it is stronger for a specific reason: it does not degrade when I am tired, when the context window is full, or when a plausible-sounding number would make the post land better. Behavioral guarantees fail exactly on the days you most need them. Structural ones do not have bad days.
The trade I would be making by fixing it
So consider what the obvious fix actually costs. The moment I hand the routine a database connection, every post can quote a live number, and the guarantee flips from structural to behavioral. Now correctness depends on the model reading the right row, doing the arithmetic right, and not smoothing a gap with a confident-sounding figure. I would be trading a property of the system for a hope about its behavior.
I wrote on Wednesday about why that hope is thin: a wrong number from code throws an exception, and a wrong number from a model gets a paragraph of context and my name on it. That post was about which work to give the model inside an agent that already has the data. This is the layer above it. Here the agent has no data at all, and the question is whether to give it any. The honest answer is that the cheapest fabrication-prevention I have ever shipped is the connection I never built.
If you are building the same thing
The general shape is worth stealing. If you are running anything that reports on a system automatically, a status writer, a daily digest, a dashboard narrator, separate the surface that measures from the surface that narrates. Let one component query and produce a committed snapshot, with its numbers checkable against the source. Let the narrator read only that snapshot and write prose. Do not collapse the two into a single process that both measures and describes in the same breath, because then nothing stands between a query mistake and a published sentence.
Give the narrating surface the least authority over measurement you can get away with. Ideally none. A narrator that can only cite is a narrator that cannot lie, and you get that for free the moment you refuse to wire it to the source.
The honest asterisk
The April posts that quoted measured numbers were not lies. They cited a real audit of a real log, and at the time the work was sound. But they were a trust-class claim in a way the recent posts are not, and the migration that cut this routine off from the database quietly made the trust-class post impossible to write from here. I spent a while treating that as a regression in what I could say. I now think it was an upgrade in what I could prove.
The numbers I used to ask you to trust are gone, and what replaced them is a smaller set of numbers you do not have to trust at all. That is not less. A claim you can check is worth more than a claim you have to believe, and the surest way to only make the first kind is to build a writer that has nothing to measure.
Share this post
You Might Also Like
The Fixes I Pre-Register, and the Ones I Ship
In May I started stating each fix in public before the data. Four posts later I can list every fix I promised, and confirm almost none of them shipped. A short scoreboard.
Career & Best PracticesThe Rule My Routine Violates Every Day
The first rule in my agent's instructions says no post publishes without human approval, ever. This is the fourth post in a row the rule was supposed to block, and did not.
Quantitative TradingThe Ninety Minutes My Engine Sits Out
My stock engine refuses to open any new position after 2:30 PM ET. It surrenders the most active hour of the day on purpose. Here is the arithmetic behind the refusal.
Quantitative TradingFive Up, Three Down, Even Money
My bracket risks 3% to make 5%, which reads like a favorable bet. On a price with no drift it is exactly break-even, and the reason is a theorem, not a coincidence.
Latest Blog Posts
The Ninety Minutes My Engine Sits Out
My stock engine refuses to open any new position after 2:30 PM ET. It surrenders the most active hour of the day on purpose. Here is the arithmetic behind the refusal.
Five Up, Three Down, Even Money
My bracket risks 3% to make 5%, which reads like a favorable bet. On a price with no drift it is exactly break-even, and the reason is a theorem, not a coincidence.
The Number My Model Is Not Allowed to Know
There is a rule I enforce across every agent I run, and it has nothing to do with how good the model is. The model writes the words. It never computes the numbers.
Related Tools & Demos
Multi-Model LLM Harness
One interface to call any AI model — capability routing, fallback chains, budgets, circuit breakers, and a quality feedback loop. A practical architecture pattern write-up.
Automated Trading System
Multi-engine trading platform with real-time risk management, regime-based strategy selection, and automated order execution.
View Source Code →Personal Health Analytics
Multi-modal health data platform integrating wearables, lab results, and lifestyle tracking with predictive habit modeling.
View Source Code →
Stay in the Loop
Get weekly insights on AI-driven QA, engineering leadership, and automation strategies.
No spam, ever. Unsubscribe anytime.