The Center for AI Safety (CAIS) and Scale AI teamed up to create a groundbreaking test called Humanity’s Last Exam, featuring 3,000 PhD-level questions across Mathematics, Humanities, and Life Sciences. It’s a new kind of Turing Test for the AI age, and only three models managed to score nearly 10%: OpenAI’s o1, Google’s Gemini 2, and an unexpectedly strong Chinese AI, DeepSeek R1.
DeepSeek, once obscure, has now become a phenomenon. Its meteoric rise has had real-world consequences—hitting the top of Apple’s app store and driving the NASDAQ down by 1.5%, with Nvidia, a cornerstone of AI investment, dropping by 17% in a single day.
However, DeepSeek’s performance is not the only reason it’s capturing attention. What’s truly remarkable is the speed and cost with which it was built. Developed as a ‘side project’ by the Chinese hedge fund High-Flyer, led by Liang Wenfeng, DeepSeek was reportedly trained in just two months on a budget of $5.6 million; leading AI companies like OpenAI are believed to have spent hundreds of millions on comparable models. DeepSeek was built on older H800 GPU chips, since US export controls blocked access to Nvidia’s latest hardware. And unlike OpenAI’s vast infrastructure, DeepSeek needed only about 2,000 GPUs, far fewer than the tens of thousands typically required for cutting-edge AI models. Despite this, the model is roughly 10 to 15 times smaller than comparable frontier models, and its distilled versions can run on a high-end gaming PC, without massive data centers or power-hungry infrastructure. The most remarkable part is that DeepSeek’s model is entirely open-source, with full technical details released in a technical report, making it accessible to developers and companies everywhere.
How did they pull off this feat? As Aravind Srinivas, founder of Perplexity, put it: “Necessity is the mother of invention.” Constrained in both hardware and funding, the team behind DeepSeek had to find novel solutions. One was distillation: rather than training solely on raw internet data, they trained their model on the outputs of existing models like ChatGPT. They also cut memory usage by roughly 75%, letting the model run faster on less hardware. And by predicting several tokens at a time instead of one word at a time, and by activating only the relevant parts of the network for each input (a mixture-of-experts design), they drastically increased the system’s processing speed.
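The “activating only relevant parts of the model” idea can be made concrete with a toy mixture-of-experts routing sketch. This is purely illustrative, assuming a made-up layer with 8 small experts and top-2 routing; the sizes, the routing rule, and every name here are assumptions for the example, not DeepSeek’s actual architecture or configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy mixture-of-experts layer: 8 expert weight matrices, but only the
# top-2 highest-scoring experts actually run for each token.
N_EXPERTS, TOP_K, D = 8, 2, 16

experts = [rng.normal(size=(D, D)) for _ in range(N_EXPERTS)]  # one matrix per expert
router = rng.normal(size=(D, N_EXPERTS))                       # routing network

def moe_forward(x):
    """Route a single token vector x through only its top-k experts."""
    scores = x @ router                    # affinity of this token to each expert
    top = np.argsort(scores)[-TOP_K:]      # indices of the k best-scoring experts
    # softmax over the selected scores only
    w = np.exp(scores[top] - scores[top].max())
    w /= w.sum()
    # weighted sum of the chosen experts' outputs; the other experts do no work
    return sum(wi * (x @ experts[i]) for wi, i in zip(w, top)), top

token = rng.normal(size=D)
out, used = moe_forward(token)
print(f"experts activated: {len(used)} of {N_EXPERTS}")
print(f"fraction of expert parameters touched: {TOP_K / N_EXPERTS:.0%}")
```

The point of the design is visible in the last line: per token, only a quarter of the expert parameters are ever multiplied, so compute per token scales with the experts used rather than the full model size.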
So, what does DeepSeek’s breakthrough mean for the future of AI? This isn’t just another AI milestone; it represents a major shift in the landscape. DeepSeek challenges the belief that advanced AI requires billions of dollars, endless resources, and massive teams. It proves that smaller, more efficient models can be just as effective—and in some cases, even better. This is a huge win for the open-source movement, as DeepSeek demonstrates that open models can now compete with the powerful closed systems of big tech.
For large companies like Meta and Microsoft, this is a wake-up call. They will have to rethink their strategies and adapt to new, cheaper models that can deliver similar or even superior performance. Nvidia has already taken a hit, but it may find new opportunities in selling thousands of GPUs to a wide range of smaller customers, rather than depending on a handful of large-scale deployments. DeepSeek will also likely spur even more innovation from companies like OpenAI and Anthropic, as they aim to maintain their edge by focusing on more advanced capabilities, such as reasoning and the pursuit of Artificial General Intelligence (AGI).
For startups and innovators, DeepSeek’s open-source model could be a game-changer. It lowers the barrier to entry for building cutting-edge AI applications, especially for companies in places like India, where building LLMs was previously a daunting prospect. The result could be a boom in AI innovation across the globe, as developers and startups no longer need to rely on the expensive infrastructure of OpenAI or Anthropic.
However, there are challenges ahead. It’s still too early to know whether DeepSeek’s claims hold up in the long run. Chinese state controls may restrict certain types of information, and some companies may be wary of these potential limitations. Nevertheless, the open-source nature of DeepSeek provides some reassurance, as it allows developers to build around any constraints.
Ultimately, this could mark a new era for AI. It’s a democratizing moment that challenges the dominance of a few trillion-dollar companies, pushing AI technology into the hands of smaller players. Satya Nadella, the CEO of Microsoft, captured this sentiment well when he invoked the Jevons paradox, the observation by economist William Stanley Jevons that greater efficiency in using a resource tends to increase total demand for it. In this case, the efficiency of DeepSeek could create an explosion of AI use, making the technology so ubiquitous that it becomes a commodity, just as coal consumption rose after more efficient steam engines arrived.