Huge pile-ups of error, say bots
The shiny claim: AI agents are a threat to millions of jobs. So pony up now for AI company public offerings.
Reality check: AI agents have a horrendous problem with error cascades. No one has come close to solving this drawback, despite application of numerous techniques.
A major question for policymakers is whether the time is really ripe for unleashing these agents across government. How reliable are they really? Should they be coupled to major military systems?
Ironically, one of the methods for easing this huge error hassle is: human supervision. AI bots in general can't see when an output looks suspicious, while trained humans often acquire a "feel" for something funny going on.
Even with a great deal of technical adjustment, agents typically show blunders in a third of decisions. Such an error rate is unsustainable. The hope is that recurrent or iterative AI will do the trick. These systems, which are coming into play now, design a second-generation AI system, which then designs a third-generation system, and so on, ad infinitum.
Part of the problem is that AI agents layer a decision system around a large language model. When an LLM, which uses a probabilistic next-token (roughly, next-word) method, makes a bad guess, that error tends not to balloon (though it can). But an agentic system can, for example, receive a small error from its LLM core and greatly magnify it.
Also, the chance that the agent won't make an error decreases with every step in its chain of processing. So a small possibility of error on the first step balloons into a high probability 20 steps along.
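To put rough numbers on that compounding, here is a minimal sketch in Python. It assumes, purely for illustration, that every step fails independently and with the same probability; real agents need not behave that way. The function names and the sample per-step rates (1%, 5%, 10%) are made up for the example and are not measurements of any actual system.

```python
def chain_success_probability(per_step_error: float, steps: int) -> float:
    """Chance that every step in the chain is error-free.

    Assumption (illustrative only): each step fails independently
    with the same probability `per_step_error`.
    """
    if not 0.0 <= per_step_error <= 1.0:
        raise ValueError("per_step_error must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return (1.0 - per_step_error) ** steps


def chain_failure_probability(per_step_error: float, steps: int) -> float:
    """Chance that at least one step in the chain goes wrong."""
    return 1.0 - chain_success_probability(per_step_error, steps)


if __name__ == "__main__":
    # Hypothetical per-step error rates, chosen only to show the trend.
    for p in (0.01, 0.05, 0.10):
        for n in (1, 5, 20):
            failure = chain_failure_probability(p, n)
            print(f"per-step error {p:.0%}, {n:2d} steps: "
                  f"chance of at least one error = {failure:.1%}")
```

Under these assumptions, a 5% chance of error per step leaves only about a 36% chance that a 20-step chain finishes cleanly.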
I asked various chatbots about this issue, and all are agreed: The agent error problem is severe.
In the first query below, Perplexity gives an overview of the agentic method, including a discussion of the error problem.
Perplexity on agentic error
https://tubealloys979.blogspot.com/2026/06/perplexity-on-agentic-error.html
The other chatbots were all asked to answer this question about agent error:
What is being done to curb the ballooning error cascades of agentic ai, and how effective are these methods (try to be quantitative as well as qualitative)
Claude on agent error
https://tubealloys979.blogspot.com/2026/06/claude-on-agentic-error.html
Grok on agent error
https://tubealloys979.blogspot.com/2026/06/grok-on-agentic-error.html
Gemini on agent error
https://tubealloys979.blogspot.com/2026/06/gemini-on-agentic-error.html
Deepseek on agent error
https://tubealloys979.blogspot.com/2026/06/deepseek-on-agentic-error.html
ChatGPT on agent error
https://chatgpt.com/share/6a26ea45-0630-83ea-babb-557be2105871

