The Future of Private, Small, Specialized AI Models — and Why They May Be the Real Moat

Just as our society is powered by both generalists and specialists, the AI world will develop the same structure

Larry Ellison recently said something that stuck with us:

"All the large language models—OpenAI, Anthropic, Meta, Google, xAI—they're all trained on the same data. It's all public data from the internet. So they're all basically the same. And that's why they're becoming commoditized so quickly." — Larry Ellison

We believe this deserves more attention than it’s getting. Right now the AI race looks like everyone is throwing every technique they can find at the wall — novel architectures, reinforcement learning tricks, clever post-training recipes — all to squeeze out a few more points on reasoning benchmarks, coding benchmarks, needle-in-the-haystack tests. And sure, those gains matter. But zoom out a bit and you realize all these models are learning from the same pile of public internet data. The differences are real but they’re at the margins.

The Same Textbook Problem

Here’s how we think about it. Imagine a group of brilliant students all studying from the same textbooks. Each gets private tutoring after school, and they may hit different SAT or GRE scores because of slightly different preparation methods and natural aptitude. But their knowledge is converging. They didn’t come from diverse backgrounds with wildly different upbringings or exposure to different disciplines — they all read the same material. The graduates are, more or less, the same product.

That’s where we are with general-purpose LLMs today. GPT, Claude, Gemini, Llama, Grok — they all draw from the same well of public internet data. The post-training and alignment strategies differ, the inference optimizations vary, but the foundational knowledge is converging. It’s commoditizing fast.

The Technical College Analogy

Now imagine something different. Instead of putting every student through the same general curriculum, you teach them specialized skills: some learn welding, some learn woodworking, some learn laser cutting. They can enter technical fields quickly and become highly productive right away. They’re not writing dissertations, but they’re building things that matter.

And here’s the thing — training these students at a technical college is a lot cheaper than sending them through four years of undergrad, then grad school just to produce a Master of General Studies who’s decent at many things but great at nothing in particular.

This maps directly onto AI. Training a frontier model costs hundreds of millions, sometimes billions of dollars. Massive compute, months of training, essentially all public data on the internet. But fine-tuning a small model on a focused, proprietary dataset? That’s an order of magnitude cheaper and faster — and for many real-world use cases, it’s actually more effective.
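A back-of-the-envelope calculation shows why adapter-style fine-tuning (e.g., LoRA) is so much cheaper than full training: you only update small low-rank factors attached to the attention projections instead of every weight. The sketch below uses illustrative numbers for a hypothetical 7B-parameter model, not measurements of any real system.

```python
def lora_trainable_params(d_model: int, n_layers: int, rank: int) -> int:
    """Trainable parameters when attaching rank-r adapters to the four
    attention projections (Q, K, V, O) of every layer: each d x d weight
    gets two low-rank factors of shapes (d x r) and (r x d)."""
    per_projection = 2 * d_model * rank
    return n_layers * 4 * per_projection

# Hypothetical 7B-class model: hidden size 4096, 32 layers.
full = 7_000_000_000
lora = lora_trainable_params(d_model=4096, n_layers=32, rank=8)

print(f"full fine-tune updates {full:,} params")
print(f"rank-8 LoRA updates    {lora:,} params "
      f"({100 * lora / full:.3f}% of the model)")
```

At rank 8 this works out to roughly 8.4 million trainable parameters — about 0.1% of the model — which is the arithmetic behind the "order of magnitude cheaper" claim above.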

A Concrete Example: Healthcare

Let’s make this tangible. Imagine a medical system sitting on years of patient records, doctor notes, treatment plans, and pharmacy data. A general-purpose LLM trained on generic medical knowledge from the internet can answer textbook questions, sure. But its context window is limited, and more importantly, its attention is spread thin across everything it’s ever seen. It doesn’t understand the specific patterns of your patient population — the regional environment, the local food consumption habits, the cultural factors that influence treatment adherence, the economic realities that shape which medications get prescribed.

A small model trained on that system’s actual data — the real doctor notes, the real treatment outcomes, the real pharmacy records — can pick up on patterns that a general model simply cannot. It’s not smarter in some abstract sense. It just knows your world better because it was trained on your world.

It’s Not About Replacing Big Models

Now, we want to be clear: we are not arguing that the future is all small models and that general-purpose LLMs go away. That’s not the point. General LLMs are incredibly capable, and they’re going to keep handling a huge portion of the work — content generation, open-ended reasoning, natural language interfaces, the kind of flexible thinking you need a broad model for.

But what we’re seeing in practice is that agentic systems are becoming the dominant architecture for serious enterprise AI work. And modern agentic systems increasingly look like swarms — an architect agent that plans and coordinates, and specialist agents that execute on specific tasks. The general LLM is great as the architect and great for the generalist work. But the specialist agents? They need specialized, expert models to do their jobs well. A general model trying to do everything is like asking your architect to also do the plumbing and electrical work. They could probably figure it out, but you’d get much better results with someone who does it every day.

The real power comes from these models working together. The general LLM handles orchestration, reasoning, and user-facing interaction. The small specialist models handle the domain-specific heavy lifting — extracting structured data from messy documents, classifying things according to company-specific rules, generating outputs in industry-specific formats. That combination is where you get the biggest bang for the buck.
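The architect/specialist pattern described above can be sketched in a few lines. Here the model calls are stubbed with plain functions; in a real system each specialist would wrap a small fine-tuned model and the architect would wrap a general LLM doing planning and routing. All names and the fixed two-step plan are illustrative, not a real framework.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Task:
    kind: str      # e.g. "extract", "classify"
    payload: str

# Specialist agents: narrow, swappable, each backed by its own model.
def extract_specialist(payload: str) -> str:
    return f"fields extracted from: {payload}"

def classify_specialist(payload: str) -> str:
    return f"label for: {payload}"

SPECIALISTS: dict[str, Callable[[str], str]] = {
    "extract": extract_specialist,
    "classify": classify_specialist,
}

def architect(request: str) -> list[str]:
    """The general LLM's job: decompose a request into typed tasks,
    route each to the right specialist, and assemble the results."""
    plan = [Task("extract", request), Task("classify", request)]  # stub plan
    return [SPECIALISTS[task.kind](task.payload) for task in plan]

print(architect("claims document"))
```

The design point is the registry: specialists are interchangeable behind a task type, so swapping a generic model for a domain-tuned one changes nothing in the orchestration layer.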

Why This Matters for SaaS Companies

For SaaS companies whose products are Systems of Record, this is especially relevant. These platforms sit on massive amounts of proprietary data — transaction histories, workflow patterns, client communications, operational metrics — data that simply doesn’t exist on the public internet. That data is the raw material for building specialist models that no competitor can replicate.

A general LLM can draft you a generic email. But it can’t understand the nuances of a specific client’s underwriting guidelines, or how a particular manufacturing line flags defects, or the exact format a compliance team needs for regulatory filings. A specialist model trained on that proprietary data can. And the training cost is a fraction of what it takes to build a frontier model.

Where the Moat Actually Is

If general-purpose LLMs are commoditizing — and they are — then the moat isn’t who has the biggest model. The moat is how fast you can take proprietary data, train or fine-tune specialist models on it, and wire those models into agentic workflows that actually get work done. It’s the combination of data access, model development speed, and the architecture to make specialists and generalists work together effectively.

We’re touching on territory that overlaps with foundation models here, but we’d argue the concepts are still different. Foundation models aim to be general-purpose bases. What we’re talking about is purpose-built models for narrow domains, trained on data nobody else has.

Looking Ahead

The era of “bigger is always better” is fading. What’s replacing it is a more practical reality: the companies that win the next phase of AI aren’t necessarily those with the largest models. They’re the ones with the best proprietary data, the fastest fine-tuning pipelines, and the most thoughtful agentic architectures that let specialist and generalist models work together.

For those of us building in this space, the takeaway is pretty straightforward: the moat isn’t the model. It’s the data, the domain expertise, and how fast you can turn those into specialist models that plug into agentic systems and make real workflows more productive. That’s where the value is.