Why Does AI Fail in Real Life? The Limits of Mathematical Optimization

  • Writer: Samuel Fernández Lorenzo
  • 6 days ago
  • 5 min read

Imagine you train a robot to be a perfect chef. You teach it thousands of recipes, show it impeccable techniques, and in cooking tests it achieves 95% accuracy. Then you take it to a real restaurant and... disaster strikes. It doesn't know what to do when an ingredient runs out, when a customer orders something off the menu, or when the kitchen is on fire (literally). Your robot lacks the capacity for improvisation and adaptation that a human possesses.


This, in essence, is what's happening with artificial intelligence in 2026. Why does this occur?



The Problem of Evaluations in AI


If you follow the AI world closely, you'll have heard of the famous benchmarks: those standardized tests where models demonstrate their capabilities. MMLU for general knowledge, SWE-bench for programming, GPQA for reasoning... The numbers are impressive. Models achieve ever-higher scores.


But a problem has already become apparent: those numbers aren't translating into real-world utility.

A model can score 95% on a lab test and fail spectacularly in 95% of real cases. Why? Because benchmarks are controlled environments, with multiple-choice questions, synthetic tasks, and without the chaos, ambiguity, and surprises that characterize the real world.

This is called benchmaxxing: training models to maximize scores on specific tests, without guaranteeing they'll work when you release them into the wild. It's like studying for an exam by memorizing answers without truly understanding the subject matter.


Structured vs. Unstructured Problems


To understand why this happens, we need to distinguish between two fundamental types of problems.


Structured problems are like chess, sudoku, or any Nintendo game. They have clear rules, a well-defined state space, and we know exactly which variables and relationships are involved. We know the system we're working with.

But real life rarely presents us with such friendly problems.


Unstructured problems are a completely different story. Imagine a business owner wondering what price to set for their product. Or a mayor trying to improve the quality of life in their city. Or any person trying to decide how to invest their money. In all these cases, not only do we not know the correct answer, but we're not even clear on what the problem space is in the first place.


In a structured problem, the uncertainty comes from not knowing which of the known options to choose. In an unstructured problem, the uncertainty is much deeper: we don't even know what options truly exist. There are genuinely unprecedented aspects, beyond all prediction.

The litmus test: if even a brute-force search doesn't make sense, because there is no well-defined space of options to enumerate in the first place, you're probably facing an unstructured problem.


A Simple but Revealing Example


Look at these three triangles. Each has numbers at its vertices and another in the center. But one of them has a question mark in the center. What number is missing?


[Image: numerical puzzle with three triangles]

This type of puzzle is fascinating because it cannot be solved by pure logical deduction. You have to imagine a pattern that connects the data. You could add the numbers within a triangle, or combine corresponding vertices across triangles, or use divisions, or... infinite possibilities.


None are explicitly permitted or prohibited. There's no predefined search space. You have to imagine it first, structure the problem yourself, before you can even search for a solution. Therefore, not even a brute-force search constitutes a general solution to this type of problem.

This is the core of an unstructured problem: the search space depends on your imagination. And only after having defined it does it make sense to apply some kind of systematic solution search.
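To see what "structuring the problem yourself" means in practice, here is a toy sketch in Python. The vertex numbers and the list of candidate rules are invented for illustration (they are not the actual puzzle's numbers); the point is that the "search space" is nothing more than a list of rules the programmer imagined in advance:

```python
# Hypothetical puzzle data: ((vertex numbers), center number).
triangles = [((2, 3, 4), 9), ((1, 5, 2), 8)]
unknown_vertices = (3, 6, 1)  # the triangle whose center is "?"

# The "search space": a handful of rules someone imagined beforehand.
candidate_rules = {
    "sum of vertices":     lambda v: v[0] + v[1] + v[2],
    "product of vertices": lambda v: v[0] * v[1] * v[2],
    "sum minus minimum":   lambda v: sum(v) - min(v),
}

# Brute force only searches *within* that imagined space.
for name, rule in candidate_rules.items():
    if all(rule(v) == center for v, center in triangles):
        print(f"'{name}' fits -> missing number: {rule(unknown_vertices)}")
```

Brute force happily checks every rule in the list, but it can never propose a rule that isn't already there. Expanding the list is an act of imagination, not of search.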


How AI Learns (and Why It Has Limits)


Here comes something that surprises many non-technical people: when an AI algorithm "learns," what it's really doing is solving an optimization problem, that is, a search within a structured space.


The process works like this:


  1. You have data (concrete examples).

  2. The programmer introduces an ansatz: a family of possible models within which to search. For example, "let's assume the relationship is linear" or "let's use a neural network with this particular architecture."

  3. A cost function is defined that measures how well each model explains the data.

  4. The algorithm searches for the model that minimizes that cost function.

That's the "learning." The optimal model is simply the minimum point of a mathematical function.


And here's the crux of the matter: this entire process requires the problem to already be structured. You need to define the space of possible models, the relevant variables, the cost function. You're solving an optimization problem within a framework you yourself have established.
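As a minimal sketch of those four steps, assuming (purely for illustration) a linear ansatz, a mean-squared-error cost, and invented toy data:

```python
import numpy as np

# Step 1 - data: invented examples, secretly generated by y = 3x + 1 plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=50)
y = 3 * x + 1 + rng.normal(scale=0.1, size=50)

# Step 2 - the ansatz: assume the model family y = a*x + b.
# Step 3 - the cost function: mean squared error over the data.
def cost(a, b):
    return np.mean((a * x + b - y) ** 2)

# Step 4 - "learning": gradient descent searches within that family.
a, b = 0.0, 0.0
learning_rate = 0.1
for _ in range(500):
    residual = a * x + b - y
    a -= learning_rate * np.mean(2 * residual * x)  # d(cost)/d(a)
    b -= learning_rate * np.mean(2 * residual)      # d(cost)/d(b)

print(f"learned a = {a:.2f}, b = {b:.2f}, final cost = {cost(a, b):.4f}")
```

Everything here (the variables, the model family, the cost) was fixed by the programmer before any "learning" happened. The algorithm only searches inside that frame.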


The Fundamental Limit of Optimization


Sure, an AI trained on an enormous database can try to "tame" unstructured problems approximately, by seeking similarities with past situations. And in fact, it does so with some success. Often with enough success to be sufficient in the vast majority of situations.

But the devil is in the details. The capacity for adaptation and imagination in complex environments will always have fundamental limitations.


It's no surprise that AI systems fail at real-life tasks, because fundamentally, an AI system is the result of a structured optimization. In short:

A structured optimization problem can never be a general solution to the unstructured problems that life presents us.

Unstructured problems represent a fundamentally different order of complexity. It's not just that they're "harder" than structured problems; they belong to a qualitatively distinct category.

No matter how large the model is, no matter how many parameters it has, no matter how much data you train it with... if the problem requires imagining a completely new search space, one that isn't implicit in any of your training data, the AI will fall short.


This is where the real problem comes in: AI promoters sell us the promise of a future solution. Influential figures in the sector (including critical voices like Yann LeCun) suggest that each current limitation will be resolved by the next breakthrough: the next training cycle, more data, the next innovative architecture... And of course, behind all these figures trails a whole mass of techno-believer acolytes whose critical capacity extends about as far as the length of a tweet.


But this technological optimism increasingly resembles an act of faith. It's almost a revival of the twentieth-century positivist faith of authors like Rudolf Carnap, here expressed as the belief that every problem can be reduced to an optimization problem, or to a very large .csv file ("God will be a .csv file", Elon Musk).


And what few are saying out loud is this: if the limit is fundamental, not technical, then more scale isn't the solution. There simply is no perfect algorithmic solution.


Want to delve deeper into algorithmic limits and the differences between structured and unstructured problems? I recommend reading the third part of Everything I Can Imagine: The Algorithm of Understanding.

