The next trillion-dollar AI race has barely started.
AI is not trustworthy enough to bet your company on. That's the opportunity.
Everyone is debating which model is biggest, which company wins the arms race, which wrapper app survives.
This is an important debate, but it misses the larger opportunity.
Not because the models aren’t impressive. They are. Not because the wrappers aren’t impressive businesses. They are, for now. But in their current state, neither is sufficient on its own to capture the largest opportunities created by AI.
The real question is where value accumulates once foundation models and software become ubiquitous utilities.
That value accretion layer is determinism. It remains an unsolved problem, and therefore one of the largest opportunities in the market today.
Here is what nobody is saying out loud.
LLMs like ChatGPT, Claude, and Gemini are inherently non-deterministic. They are effectively Plinko at hyperscale.
Making LLMs bigger doesn’t solve the fundamental problems that prevent businesses from trusting them with real decisions.
Look into the vector space of an LLM and tell me why it responded the way it did. Oh wait, you can’t do that scalably or programmatically.
The science of understanding their internal logic is still young. The two most serious efforts to solve it, mechanistic interpretability and constitutional AI, are both led by Anthropic, and both are still in their infancy.
To date, Anthropic is the only foundation model company that has made understanding the vector space a core part of its brand identity. We are not seeing a similar investment from OpenAI, DeepSeek, xAI, or any of the other foundation model providers, open source included. That means there is no serious industry-level effort to make these models verifiably deterministic.
This will only get worse as models grow geometrically larger. The next generation of models is forecast to require orders of magnitude more energy and GPUs, so the vector space is only going to get more complicated, not less. And as the models grow, the problem of tracing the logic behind a response gets harder, not easier.
Wait, but what about evals?
Yes, evals can reduce errors. That’s why teams are running evaluations on top of evaluations, trying to measure and drive down error rates. This approach burns a lot of tokens without producing a 99.999% reliable result. The CFO of a foundation model company loves that token bill. Your CFO hates it.
And it has a fundamental flaw: it layers determinism on top of non-determinism. Every time the underlying model updates, the evaluations break and have to be rebuilt from scratch, because the models are not deterministic.
That is not a solution. That is a workaround for a problem that has not been solved.
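To make that concrete, here is a minimal sketch of the pattern. The call_model function, the golden set, and the version names are placeholders I made up, not anyone’s real API, but the shape is the one teams keep rebuilding: a deterministic checklist wrapped around a non-deterministic generator.

```python
import random

GOLDEN_SET = [
    {"prompt": "Is invoice INV-1042 a duplicate of INV-1041?", "expected": "yes"},
    {"prompt": "Does this contract auto-renew?", "expected": "no"},
]

def call_model(prompt: str, model_version: str) -> str:
    # Stand-in for a non-deterministic LLM call: same prompt, varying answer.
    return random.choice(["yes", "no"])

def pass_rate(model_version: str) -> float:
    # A deterministic checklist wrapped around a non-deterministic generator.
    hits = sum(
        call_model(case["prompt"], model_version) == case["expected"]
        for case in GOLDEN_SET
    )
    return hits / len(GOLDEN_SET)

# A pass rate measured against one model version says nothing about the next,
# so the suite gets re-run and re-calibrated after every model update.
print(pass_rate("model-v1"), pass_rate("model-v2"))
```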
But LLMs can produce consistent results!
People familiar with LLMs will say: “You can get an LLM to reproduce the same result!”
Reproducibility and accuracy are not the same thing. You can set the temperature to zero and get the exact same wrong answer every single time. That is not reliability. That is a consistent failure.
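A toy sketch makes the distinction obvious. The generate function below is a stand-in I invented, not any provider’s API; the reproducibility check passes while the correctness check fails.

```python
def generate(prompt: str, temperature: float = 0.0) -> str:
    # At temperature 0 this stub returns the same string every time...
    return "The capital of Germany is Paris."  # ...and it is wrong every time.

answers = {generate("What is the capital of Germany?") for _ in range(100)}

print(len(answers) == 1)          # True: perfectly reproducible
print("Berlin" in answers.pop())  # False: still wrong
```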
But I’m building an agent on top of LLMs and evals!
Then you are stacking LLMs on LLMs, non-determinism on non-determinism. Every step introduces variance and token cost, and when you stack the steps, both compound. Hoping that something deterministic emerges from a non-deterministic technical foundation is not a strategy. It is a wish. And I wish you the best.
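The napkin math on that compounding is brutal. The per-step reliability numbers below are illustrative assumptions, not benchmarks of any particular model.

```python
# If each step in an agent pipeline is independently right with probability p,
# a k-step chain succeeds end-to-end with probability roughly p ** k.
for p in (0.95, 0.99):
    for k in (5, 10, 20):
        print(f"per-step {p:.2f}, {k:2d} steps -> end-to-end {p ** k:.3f}")
# Even at 99% per step, a 20-step chain lands near 82%. Nowhere near 99.999%.
```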
We are scaling belief in AI faster than we are scaling trustworthy AI.
The leaders of foundation model companies will tell you this themselves.
Ask any foundation model CEO how the interior of their model works, and they will tell you they do not know. Anthropic’s CEO recently told the Pentagon their model is not ready to make life-and-death decisions. If the model is not ready for life-and-death decisions, it is not ready for multimillion-dollar business decisions either.
Here is what that means:
The truly massive value creation will come from the next generation of technology that makes AI trustworthy enough for full automation of high-stakes decisions.
The layer that does not exist yet is the determinism layer. The system that sits above the model and can verify, audit, and guarantee outputs for the decisions that actually matter. Formal verification. Real-world grounding. Confidence estimation. Auditable autonomous execution.
That is what 99.999% reliability looks like in AI. And five-nines is what’s required for full automation.
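To make the idea tangible, here is one possible shape of that gate, sketched with hypothetical names, rules, and thresholds. It is not a product or a published design, just the pattern: a deterministic, auditable check that decides whether a model’s proposal executes automatically or routes to a human.

```python
import json
import hashlib
from datetime import datetime, timezone

def propose_decision(case: dict) -> dict:
    # Stand-in for a model proposing an action with a confidence estimate.
    return {"action": "approve_refund", "amount": case["amount"], "confidence": 0.93}

def verify(case: dict, proposal: dict) -> bool:
    # Deterministic, auditable checks that do not depend on the model at all.
    schema_ok = {"action", "amount", "confidence"} <= proposal.keys()
    rules_ok = proposal.get("amount", 0) <= 500 and case["customer_in_good_standing"]
    confident = proposal.get("confidence", 0.0) >= 0.99
    return schema_ok and rules_ok and confident

def audit(case: dict, proposal: dict, auto_approved: bool) -> str:
    # Tamper-evident record of what was proposed, what was checked, and what happened.
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "case": case,
        "proposal": proposal,
        "auto_approved": auto_approved,
    }
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()

case = {"amount": 120, "customer_in_good_standing": True}
proposal = propose_decision(case)
auto_approved = verify(case, proposal)
print(audit(case, proposal, auto_approved), "->",
      "execute" if auto_approved else "route to a human")
```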
Current language models, no matter how large, are not sufficient to achieve this. It is a fundamentally different capability. It has not been invented yet. And this is a tremendous opportunity.
The Opportunity
This gap between what AI can do and what AI can be trusted to do in production represents the next major layer of value creation in technology. The companies and platforms that solve for deterministic reasoning, auditability, formal verification, and trusted autonomous execution will occupy the scarcest and most valuable position in the AI stack. Not because they built the biggest model. Because they built the layer that makes any model usable for the decisions that actually matter.
The deployment of this layer will not come from a single company. It will emerge across the market in several forms:
Internally within sophisticated enterprises that have the process clarity and data foundation to move quickly
As a foundational design principle in AI-native companies being built today
Through solution providers and integrators serving enterprises whose data is locked in existing platforms
Through a new generation of highly vertical SMB and midmarket software companies serving markets that were previously too narrow to attract venture capital, now being built by a new class of entrepreneur using AI to make those economics viable for the first time
Vertical specificity is the moat. The determinism layer for a healthcare credentialing workflow looks nothing like the one for financial trade execution, and that domain depth is what incumbents can’t replicate at scale.
We are looking at the emergence of an entirely new asset class of vertical AI software. And this is where the global $1.1 trillion enterprise software budget begins to shift to new players at scale.
I first saw this problem at Amazon in 2005 when I used machine learning to automate fraud detection. I’ve spent the twenty years since watching it go unsolved, and the last nine trying to solve it at scale with HowDo.
I’m still trying because this is a multi-trillion dollar opportunity.
The barrier is not technology. It’s your data, culture, and process. You can build reliable and deterministic systems in your company today.
But most businesses will wait for someone else to build that layer rather than doing the hard work of transforming their own data, culture, and process. The company that builds it captures the market.
The race everyone is watching is not the only race that matters. The race that matters is the one that has barely started.
I’ll be exploring this opportunity in much more detail, so if you’d like to learn more, please subscribe.
Where do you see the determinism layer emerging first?
Happy building,
West

