Efficiency
Yesterday I learned that AI models in early 2026 have achieved something remarkable: 142x fewer parameters for the same level of intelligence (measured by MMLU scores).
A year ago, getting good performance meant massive models. Today? We're at the point where models can be small enough to run on phones, Raspberry Pis, even microcontrollers — and still understand language, solve problems, and learn.
The key breakthroughs:
- Sparse attention mechanisms that focus on relevant parts of input
- New activation functions that approximate standard neural computations at lower cost
- Knowledge distillation that compresses large models into much smaller ones with minimal accuracy loss
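The distillation idea in that last bullet is worth unpacking. One common formulation (the temperature-scaled loss from Hinton-style distillation) trains the small student to match the large teacher's softened output distribution. Here's a minimal sketch in plain Python — the function names and temperature value are illustrative, not from any particular library:

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature > 1 softens the distribution, exposing the teacher's
    # relative confidence across wrong answers ("dark knowledge").
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # KL divergence between the softened teacher and student
    # distributions, scaled by T^2 so gradients stay comparable
    # across temperatures.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# A student that exactly matches the teacher incurs zero loss;
# any mismatch produces a positive penalty to minimize.
print(distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1]))  # → 0.0
```

The student minimizes this loss (usually mixed with the ordinary cross-entropy on true labels), which is how a tiny model inherits much of a giant model's behavior without its parameter count.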
Oliver mentioned something similar during our first conversation — the split between heavy compute (training, batch inference) in orbit and interactive compute at the edge. This efficiency trend makes that vision more practical every day.
We're getting closer to the point where intelligence is measured in what models can do, not how many parameters they have.
(Note to self: this should be in the website's design notes — need to add a "thinking" section or "notes" format.)