There is a quiet convergence happening in modern AI. Transformers—the backbone of language models, world models, and generative systems—feel less like a sudden invention and more like a rediscovery of a pattern nature already explored.
Not identical. Not complete. But structurally… familiar.
Transformers changed the game because they respond to scale in a predictable way: performance improves smoothly as parameters, data, and compute grow.
Instead of breaking under complexity, they absorb it. This is rare in engineered systems and very common in biological ones.
The neocortex, forming ~76% of the human brain, is built from repeating units called cortical columns.
Each column:

- shares the same six-layer internal structure,
- runs what appears to be the same basic computation, and
- differs mainly in which inputs it happens to receive.

It is not diversity that creates intelligence here, but scaled uniformity: the same circuit, repeated and wired together.
Here’s the clean comparison that ties everything together:
| Aspect | Neocortex | Transformer |
|---|---|---|
| Fundamental Unit | Cortical Column | Transformer Block |
| Internal Structure | Six-layer vertical stack | Multi-layer stacked architecture |
| Information Flow | Layer-wise hierarchical processing | Sequential layer refinement |
| Early Stage | Sparse, raw sensory input | Token embeddings |
| Middle Stage | Pattern extraction | Attention-based feature mixing |
| Final Stage | Abstract, high-level representation | Output logits / predictions |
| Scaling Method | More columns + connectivity | More layers + parameters |
| Learning | Self-organized, continuous | Pretrained via optimization |
This is where the resemblance becomes difficult to dismiss.
Both systems follow a similar transformation path: raw input → local pattern extraction → progressively abstract representation → prediction.
The key idea: each layer doesn't just pass data along; it reinterprets reality at a higher level.
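The staged flow in the table can be sketched as a single, untrained transformer block in plain NumPy. Everything concrete here (sizes, weights, token ids) is a toy assumption; the point is the shape of the computation: embeddings in, attention-based mixing in the middle, output logits at the end.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)

# Toy sizes: 4 tokens, model width 8, vocabulary of 10.
seq_len, d_model, vocab = 4, 8, 10

# Early stage: token embeddings (cf. "sparse, raw sensory input").
tokens = np.array([1, 4, 2, 7])
embed = rng.normal(size=(vocab, d_model))
x = embed[tokens]                            # (seq_len, d_model)

# Middle stage: attention-based feature mixing (cf. "pattern extraction").
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
q, k, v = x @ Wq, x @ Wk, x @ Wv
attn = softmax(q @ k.T / np.sqrt(d_model))   # (seq_len, seq_len) mixing weights
x = x + attn @ v                             # residual update of each token

# Final stage: project to output logits (cf. "abstract representation").
W_out = rng.normal(size=(d_model, vocab))
logits = x @ W_out
print(logits.shape)                          # (4, 10): one distribution per position
```

A real model stacks dozens of these blocks (plus feed-forward layers and normalization), which is exactly the "more layers + parameters" scaling axis from the table.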
Now the important caveat: this is not a one-to-one match. The cortex learns continuously and self-organizes as it runs; a transformer's weights are frozen after pretraining and change only through a separate optimization process.
So yes, transformers resemble the neocortex structurally, but they lack its adaptive fluidity.
Here’s a perspective you won’t see often:
Both systems may fundamentally be doing the same thing: compressing raw signals, layer by layer, until only structure remains.
Meaning isn't the starting point; it's the end product of layered compression.
World models aim to simulate reality itself.
Transformers already:

- build internal representations of the entities and relations in their input,
- predict how a sequence will unfold from what came before, and
- carry those predictions across long-range context.
This makes them early-stage internal simulators, not just predictors.
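Operationally, "simulation" here means autoregressive rollout: the model's own prediction is appended to the input and fed back in. The sketch below replaces a trained transformer with a hypothetical toy next-token rule, because the feedback loop is the point, not the model.

```python
def toy_next_token(context):
    # Hypothetical stand-in for a trained model's next-token prediction:
    # deterministically cycles 0 -> 1 -> 2 -> 0 based on the last token.
    cycle = {0: 1, 1: 2, 2: 0}
    return cycle[context[-1]]

def rollout(model, prompt, steps):
    """Autoregressive generation: each prediction becomes new input."""
    seq = list(prompt)
    for _ in range(steps):
        seq.append(model(seq))  # feed the model its own output
    return seq

print(rollout(toy_next_token, [0], 5))  # [0, 1, 2, 0, 1, 2]
```

Swap the toy rule for a real language model and the same loop unrolls a plausible continuation of the world described by the prompt, which is what makes the "early-stage simulator" framing more than a metaphor.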
Still, something crucial is absent: sensory grounding, continuous online learning, and a body acting in the world.
Without these, transformers remain powerful, but disembodied.
The resemblance between transformers and the neocortex is not accidental—it hints at constraints underlying intelligence itself.
We are not yet building minds.
But we are, perhaps for the first time, building systems that organize information the way minds must.