The Efficiency Paradox: Why Smaller, Smarter AI May Outcompete the Giants
When the asteroid struck the Yucatán Peninsula 66 million years ago, it wasn’t size that determined survival. The largest, most dominant creatures on Earth, dinosaurs that had ruled for 165 million years, vanished in a geological instant. Meanwhile, small, adaptable mammals that could adjust their behavior, alter their diets, and modify their metabolic strategies inherited the planet. The lesson was clear: in times of rapid environmental change, adaptability trumps raw power.
We may be watching that same pattern unfold in artificial intelligence.
Sara Hooker spent years at the heart of the scaling revolution. As VP of Research at Cohere, she watched the industry pour billions into ever-larger language models, each new generation requiring exponentially more computing power than the last. GPT-3 has 175 billion parameters. GPT-4’s architecture remains secret, but estimates suggest it’s an order of magnitude larger. The assumption driving this arms race is simple: bigger is better. More compute equals more capability.
Now Hooker has walked away from that world to start Adaptable Intelligence, a company betting on the opposite thesis. She’s wagering that the future belongs not to the largest models, but to the smartest ones—systems that can learn continuously, restructure themselves on the fly, and evolve without the metabolic costs of constant retraining.
“The question isn’t whether we can build bigger models,” Hooker argues in her recent paper critiquing compute-based governance frameworks. “It’s whether we should—and whether size is even the right metric for measuring intelligence.”
The Metabolic Crisis of Scale
The scaling paradigm has a problem that ecologists would recognize immediately: it’s metabolically unsustainable. Training GPT-3 reportedly consumed 1,287 MWh of electricity and produced roughly 552 tons of CO2. GPT-4’s training costs remain undisclosed, but industry insiders whisper figures in the tens of millions of dollars for compute alone. And unlike biological organisms that grow once and then maintain themselves relatively cheaply, current AI models must be completely retrained to incorporate new information.
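For a sense of scale, here is a back-of-envelope check on those published figures in Python; the electricity price is an illustrative assumption, not part of the reporting:

```python
# Back-of-envelope on the reported GPT-3 training figures.
# The $0.10/kWh electricity price is an assumption for illustration.
energy_mwh = 1287            # reported training energy, MWh
co2_tons = 552               # reported emissions, metric tons of CO2

carbon_intensity = co2_tons / energy_mwh        # implied t CO2 per MWh
electricity_cost = energy_mwh * 1000 * 0.10     # MWh -> kWh at $0.10/kWh

print(f"implied carbon intensity: {carbon_intensity:.2f} t CO2/MWh")
print(f"electricity alone: ${electricity_cost:,.0f}")
```

The implied grid intensity, about 0.43 tons of CO2 per megawatt-hour, is consistent with a fossil-heavy grid mix, and the electricity itself comes to roughly $130,000. The tens of millions quoted for frontier runs are dominated by hardware and engineering, not power.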
Imagine if every time a human needed to learn a new fact, their brain had to be rebuilt from scratch. It’s an absurd proposition in biology, yet it’s exactly how we’ve been building artificial intelligence.
The economic implications are staggering. Only a handful of organizations on Earth can afford to train frontier models. This concentration of capability mirrors the dynamics of megafauna in prehistoric ecosystems—only environments rich enough in resources could support creatures of that scale. But just as those ecosystems proved fragile, the computational monoculture emerging around massive models may be more vulnerable than it appears.
The Mammals in the Room
While the giants battle for supremacy, something different is emerging from research labs. At MIT, a team has developed Self-Adapting Language Models (SEAL), a framework that challenges a core assumption about how we build AI. Rather than freezing knowledge at a training cutoff date, SEAL systems continuously update their internal representations as they encounter new information. The model learns not just during an initial training phase, but throughout its operational life.
The approach has a striking parallel in neuroscience. Human brains don’t work like static databases. They use a process called memory reconsolidation: every time we recall information, we have the opportunity to update it, strengthen it, or revise it based on new context. SEAL implements a similar principle, a meta-learning framework that lets the model recognize when its existing knowledge is insufficient and autonomously integrate new information.
The efficiency gains are remarkable. Instead of retraining a 175-billion-parameter model from scratch to update its knowledge, a process that can cost millions of dollars and weeks of compute time, a SEAL-style system applies small, targeted updates to only the relevant portions of its weights. It’s the difference between renovating a single room and rebuilding an entire house.
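To make that asymmetry concrete, here is a runnable toy in Python. It is not the MIT code: SEAL has the model generate its own finetuning data (its “self-edits”) and apply lightweight weight updates, while the rank-one edit below, closer in spirit to targeted model-editing work, simply shows how little of a model a focused update needs to touch. The layer width is an arbitrary assumption.

```python
import numpy as np

# Toy illustration of a targeted low-rank update, not the MIT SEAL code.
# A rank-1 correction stores one new input->output association in a single
# weight matrix without rebuilding anything else.

rng = np.random.default_rng(0)
d = 1024                                   # hidden width of one layer (assumed)
W = rng.standard_normal((d, d)) / np.sqrt(d)

x = rng.standard_normal(d)                 # "key" for the new fact
y = rng.standard_normal(d)                 # desired "value" for that key
delta = np.outer(y - W @ x, x) / (x @ x)   # rank-1 edit: (y - Wx) x^T / ||x||^2
W_edited = W + delta

print(np.allclose(W_edited @ x, y))        # True: the association is stored
print(f"full matrix: {d*d:,} entries; rank-1 edit: ~{2*d:,} numbers")
```

Storing the new association touches on the order of 2d numbers instead of d squared: the single renovated room rather than the rebuilt house.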
Liquid Architecture and the Shape-Shifting Mind
If SEAL represents an evolution in how models learn, Liquid AI frameworks represent something more radical: architecture that refuses to stay fixed. Traditional neural networks have a structure determined at design time: the number of layers, the connections between neurons, the overall topology of the system. That structure might be optimized during training, but it remains fundamentally static during deployment.
Liquid neural networks throw out that assumption entirely. Inspired by biological brains that continuously rewire themselves in response to experience, liquid architectures use continuous-time dynamics in which each neuron’s response properties shift with its input; related work on adaptive computation graphs goes further, growing new connections, pruning unnecessary ones, or restructuring the network’s organization based on the problems it encounters.
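In code, the core mechanism is a neuron whose effective time constant depends on what it is currently seeing. Below is a minimal Euler-integrated sketch, loosely following the liquid time-constant formulation of Hasani and colleagues; the weights are random placeholders rather than a trained network.

```python
import numpy as np

# Minimal liquid time-constant (LTC) cell, integrated with explicit Euler.
# Loosely follows dx/dt = -x/tau + f(x, u) * (A - x): the gate f depends on
# the input, so the effective time constant shifts from moment to moment.
# Random weights stand in for a trained network.

rng = np.random.default_rng(1)
n_hidden, n_in = 8, 3
W = rng.standard_normal((n_hidden, n_hidden)) * 0.1   # recurrent weights
U = rng.standard_normal((n_hidden, n_in)) * 0.5       # input weights
b = np.zeros(n_hidden)
tau, A = 1.0, np.ones(n_hidden)                       # base time constant, bias state

def ltc_step(x, u, dt=0.05):
    f = 1.0 / (1.0 + np.exp(-(W @ x + U @ u + b)))    # positive, bounded gate
    dx = -(1.0 / tau + f) * x + f * A                 # decay rate 1/tau + f varies with input
    return x + dt * dx                                # one Euler step

x = np.zeros(n_hidden)
for t in range(200):                                  # drive the cell with a slow sine
    x = ltc_step(x, np.sin(0.1 * t) * np.ones(n_in))
print(x.round(3))
```

Because the gate f depends on the input, the cell’s effective time constant changes from moment to moment; that continuous reshaping of the dynamics is what “liquid” refers to.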
The biological analogue is striking. Consider the brain of a London taxi driver. Studies show that their posterior hippocampus—the region involved in spatial navigation—is significantly larger than average. This isn’t something they were born with; it developed through years of navigating the city’s intricate street network. Their brains physically adapted to the cognitive demands of their environment.
Liquid AI aims for the same kind of structural plasticity in artificial systems. A model deployed to analyze medical imaging might develop more elaborate visual processing pathways. One focused on language translation might strengthen cross-linguistic mapping structures. The architecture becomes a reflection of the actual problems it needs to solve, rather than a best-guess structure imposed during design.
The Competitive Landscape Shifts
The strategic implications are profound. If adaptability can match or exceed the capabilities of brute-force scaling, the competitive landscape of AI transforms overnight. Suddenly, organizations that can’t afford to spend $50 million training a frontier model have another path forward. Universities, research labs, and startups that have been priced out of the scaling race could leapfrog the incumbents with smaller, smarter systems.
The venture capital world is starting to notice. Hooker’s Adaptable Intelligence isn’t alone in betting against scale. A cluster of startups is exploring efficient architectures, continual learning systems, and models that achieve strong performance with a fraction of the parameters. They’re essentially placing an evolutionary bet: in the rapidly changing environment of AI deployment, adaptability will prove more valuable than size.
The bet looks increasingly sound when you examine what’s happening in deployment contexts. Large models are expensive to run and slow to respond. They require specialized hardware and massive bandwidth. They’re difficult to customize for specific domains without fine-tuning that approaches the cost of training a smaller model from scratch. And when they encounter new information or make mistakes, they can’t correct themselves without human intervention.
Adaptive models sidestep many of these constraints. They can run on modest hardware, update themselves autonomously, and specialize for particular domains without expensive retraining. They’re closer to what intelligence looks like in nature: responsive, efficient, and continuously evolving.
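The claim about cheap specialization is easy to quantify. Here is a rough parameter count for a LoRA-style low-rank adapter at roughly GPT-3 scale; the layer count, width, rank, and number of adapted matrices are all illustrative assumptions.

```python
# Rough parameter count: full fine-tune vs. a LoRA-style low-rank adapter.
# Layer count, width, rank, and "two adapted matrices per layer" are all
# illustrative assumptions at roughly GPT-3 scale.
n_layers, d_model, rank = 96, 12288, 8
full_params = 175e9                               # every weight trainable

# Each adapter is a d x r "down" matrix plus an r x d "up" matrix:
adapter_params = n_layers * 2 * (2 * d_model * rank)

print(f"adapter params: {adapter_params/1e6:.1f}M "
      f"({adapter_params/full_params:.4%} of the full model)")
```

A few tens of millions of trainable parameters, on the order of 0.02 percent of the model, is a footprint that fits comfortably on a single commodity GPU.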
The Dinosaur’s Dilemma
The analogy to evolutionary history isn’t perfect, but it’s instructive. Dinosaurs didn’t disappear because mammals were inherently superior. They disappeared because the environment changed suddenly and dramatically, and mammals had the metabolic and behavioral flexibility to adapt. Their smaller size meant lower energy requirements. Their more sophisticated thermoregulation let them function in a wider range of temperatures. Their varied diets allowed them to exploit new food sources as ecosystems collapsed and reformed.
The AI industry isn’t facing an asteroid strike, but it is confronting multiple pressures that favor adaptability over scale. Energy costs and environmental concerns are making massive training runs increasingly untenable. Regulatory frameworks are beginning to scrutinize the environmental and social costs of frontier models. The practical limitations of deploying models that require data center infrastructure for every inference are becoming apparent.
Meanwhile, the problems we need AI to solve are becoming more dynamic. Medical knowledge evolves daily. Financial markets shift by the second. Scientific understanding advances in unpredictable leaps. We need AI systems that can keep pace with those changes, not models that crystallize knowledge from six months ago and then remain frozen until their next expensive retraining cycle.
The Synthesis Question
Perhaps the most interesting possibility is that this isn’t a binary choice. Evolution rarely produces clean winners and losers; it produces ecosystems with niches. We might be heading toward an AI landscape where massive, broadly capable foundation models coexist with smaller, specialized adaptive systems that can modify themselves for particular contexts.
The foundation models could provide general capabilities and serve as initialization points, while adaptive layers allow rapid specialization and continuous learning without rebuilding the entire stack. It’s analogous to how mammals share a common body plan but have radiated into an extraordinary diversity of forms—each optimized for its particular ecological niche through adaptive modification.
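A minimal sketch of that division of labor, assuming a frozen base transform with a small trainable adapter alongside it; the shapes and the closed-form update are illustrative, not any particular system’s API.

```python
import numpy as np

# Frozen foundation weights plus a small trainable adapter: the base provides
# general capability, the adapter provides cheap, local specialization.
# Shapes and the update rule are illustrative assumptions only.

rng = np.random.default_rng(2)
d, r = 512, 4
W_base = rng.standard_normal((d, d)) / np.sqrt(d)   # frozen, never updated
A = np.zeros((d, r))                                # adapter "up"   (trainable)
B = rng.standard_normal((r, d)) * 0.1               # adapter "down" (trainable)

def forward(x):
    return W_base @ x + A @ (B @ x)                 # base path + adapter path

def adapt(x, target):
    """Fit the adapter to one example in closed form; W_base never moves.
    (A deployed system would take many small gradient steps instead.)"""
    global A
    z = B @ x
    A -= np.outer(forward(x) - target, z) / (z @ z) # rank-1 least-squares fix

x, y = rng.standard_normal(d), rng.standard_normal(d)
adapt(x, y)
print(np.allclose(forward(x), y))                   # True, with W_base untouched
```

The foundation weights stay shared and static; all of the specialization lives in a sliver of adapter parameters that can be retrained, swapped, or updated on the fly.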
What’s becoming clear is that the scaling paradigm, taken to its logical extreme, leads to a dead end. You can’t keep doubling model size and training compute indefinitely. At some point, the costs—financial, environmental, and practical—become prohibitive. The question isn’t whether we’ll hit those limits, but whether we’ll develop alternatives before we do.
The researchers working on adaptive systems, liquid architectures, and continual learning frameworks are essentially placing a bet on the laws of physics and economics. They’re betting that intelligence, in the long run, is about efficiency rather than raw power. About doing more with less. About adapting rather than overwhelming.
In evolutionary terms, they’re betting that the mammals will inherit the Earth.
Sixty-six million years of history suggests that’s not a bad bet to make.