Thinking about the horizon of autonomous agentic workflows. I visualise them as sequences of dependent decisions. If decision D1 is correct, D2 is more likely to be correct. If D1 is wrong, then D2 is more likely to be wrong, too. Errors propagate and compound down the sequence.
How many decisions can an agent make before the probability of an error-free end result drops off a cliff?
There are two factors here I’m considering:
1. The probability a decision is correct, P
2. The probability that an error will be caught before it propagates and compounds, C
The reliability of a decision in the sequence, R – the probability that a decision will be correct, or if it isn’t, the error will be caught before it propagates – would then be:
R = 1 – (1 – P)(1 – C)
If R = 0.99 (99% reliable), then the odds of an error-free result after n decisions are R^n. After 10 decisions – like, say, a few lines of code generated – they’re 90%. After 100, they’re just 37%. And after 1,000, they’re a minuscule 0.004%.
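To make the arithmetic concrete, here is a minimal sketch of the calculation in Python. The function names are my own, and it assumes every decision in the sequence has the same reliability R, so the odds of an error-free run of n decisions are simply R multiplied by itself n times.

```python
def reliability(p: float, c: float) -> float:
    """R = 1 - (1 - P)(1 - C): the decision is correct, or its error is caught."""
    return 1 - (1 - p) * (1 - c)


def error_free_odds(r: float, n: int) -> float:
    """Odds that n decisions in a row are all reliable, assuming the same R each time."""
    return r ** n


if __name__ == "__main__":
    r = 0.99
    for n in (10, 100, 1000):
        # Prints the odds as a percentage for each sequence length.
        print(f"{n:>5} decisions: {error_free_odds(r, n):.3%}")
```

Many different combinations of P and C give the same R. For example, P = 0.9 with C = 0.9 yields R = 0.99, which is why a mediocre decision-maker behind a strong quality gate can match a much better one with no gate at all.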
Physics predicts that LLMs are extremely unlikely to get significantly more reliable (see the research paper “The wall confronting Large Language Models”), though we can use them in ways that reduce the risk of errors (see my CRESS principles for context engineering).
So if we want to extend the horizons of our coding agents, we turn our attention to C – how strong are our quality gates, and how soon do decisions pass through them?
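As a rough illustration of how much leverage C has, the sketch below holds P fixed and asks how many decisions an agent can make before the odds of an error-free result fall below a chosen threshold. The value P = 0.9, the 50% threshold, and the function names are all assumptions picked for illustration, and it again assumes the same R for every decision.

```python
import math


def reliability(p: float, c: float) -> float:
    """R = 1 - (1 - P)(1 - C)."""
    return 1 - (1 - p) * (1 - c)


def horizon(r: float, target: float = 0.5) -> int:
    """Largest n for which r**n is still at least `target` (hypothetical helper)."""
    if not 0 < r < 1:
        raise ValueError("r must be strictly between 0 and 1")
    # r**n >= target  <=>  n <= ln(target) / ln(r), since ln(r) is negative.
    return math.floor(math.log(target) / math.log(r))


if __name__ == "__main__":
    p = 0.9  # assumed, illustrative per-decision correctness
    for c in (0.5, 0.9, 0.99):
        r = reliability(p, c)
        print(f"C = {c:<4}  R = {r:.3f}  horizon = {horizon(r)} decisions")
```

Under these assumptions, each tenfold reduction in the errors that slip past the gates extends the horizon by roughly a factor of ten, without touching P at all.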
It’s really a testing/feedback problem. Again.
But even if we could get to R = 1 and extend the horizon to thousands or tens of thousands of autonomous decisions, the longer agents work without human feedback, the more decisions go unvalidated by real-world use. For learning where the real value is, that would be highly undesirable.


With the Reactive Data Layer Architecture (RDLA), you establish a clear boundary between public data APIs and private, framework-specific data-source implementations. Your presentation layer operates in a purely reactive manner, observing data changes rather than procedurally querying them. RDLA also simplifies testing by encouraging you to program to interfaces and use clean seeding patterns.
By Mervyn Anthony
Learn what's new in Visual Studio Code 1.127 (Insiders)