AI's Sputnik moment

Has DeepSeek ignited an AI arms race that will reshape productivity and investment worldwide?

A lot of you have been sending in questions about DeepSeek after I briefly covered it last week, only for Nvidia's share price to crash and an obscure Chinese company to cop the blame, so here's a more comprehensive update for you.

A few weeks ago, a Chinese company called DeepSeek released its latest AI model, DeepSeek-V3, to the world. Last week it followed that up with DeepSeek-R1, which included a 'train of thought' mode (DeepThink) previously exclusive to paid versions sold by the likes of US-based competitors OpenAI and Anthropic.

DeepSeek itself appears to be an offshoot of a quant company, which "happened to own a lot GPU for trading/mining purpose[s], and DeepSeek is their side project for squeezing those GPUs".

More on the background here for those interested. No doubt the Chinese government is now also deeply invested given the recent attention, so all your data are belong to us etc. Don't tell it anything sensitive!

There are also accusations flying from OpenAI that DeepSeek researchers were "exfiltrating a large amount of data using the OpenAI application programming interface [API]". That should sound very familiar, given it's basically what OpenAI itself did to get started (a group of news organisations including the NYT are currently suing OpenAI for data exfiltration):

Necessity is the mother of invention

All of that aside, DeepSeek-R1 is a great model and I've been using it for most of my mundane needs for the past week. It's definitely the best model available for free. What's especially remarkable about DeepSeek is that it's cheap, with the cost to run an API query ~95% less than its rivals'. It was also much cheaper to train the model itself:

"Assuming the rental price of the H800 GPU is $2 per GPU hour, our total training costs amount to only $5.576M. Note that the aforementioned costs include only the official training of DeepSeek-V3, excluding the costs associated with prior research and ablation experiments on architectures, algorithms, or data."

That's just for the final run and doesn't include all the other costs of running the business. But compared to what OpenAI and Anthropic spent on their final runs, it's a decent improvement (we don't have precise data, but Anthropic's CEO said Claude 3.5 Sonnet cost them "a few $10M's to train (I won't give an exact number)").
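For those who like to check the arithmetic, here's a rough sketch of what the quoted figures imply. The $2 rental rate and the $5.576M total come straight from DeepSeek's statement above; the function names and the alternative rental rates in the loop are my own illustrative assumptions, not anything DeepSeek published.

```python
# Back-of-the-envelope check on the quoted training cost.
# Quoted figures: $2 per H800 GPU hour, $5.576M total for the final run.
RENTAL_USD_PER_GPU_HOUR = 2.0
TOTAL_TRAINING_COST_USD = 5.576e6


def implied_gpu_hours(total_cost_usd: float, usd_per_gpu_hour: float) -> float:
    """GPU hours implied by a total cost and an hourly rental rate."""
    return total_cost_usd / usd_per_gpu_hour


def training_cost(gpu_hours: float, usd_per_gpu_hour: float) -> float:
    """Total rental cost for a given number of GPU hours."""
    return gpu_hours * usd_per_gpu_hour


gpu_hours = implied_gpu_hours(TOTAL_TRAINING_COST_USD, RENTAL_USD_PER_GPU_HOUR)
print(f"Implied H800 GPU hours: {gpu_hours:,.0f}")

# Sensitivity check: the rates other than $2 are hypothetical,
# chosen only to show how much the headline number leans on that assumption.
for rate in (1.0, 2.0, 4.0):
    cost = training_cost(gpu_hours, rate)
    print(f"At ${rate:.2f}/GPU hour: ${cost / 1e6:.3f}M")
```

The takeaway is that the headline number is just GPU hours multiplied by an assumed rental price, so it moves one-for-one with whatever rate you plug in.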

So really, the big savings are in operational costs. And how do they achieve those? In graphical form, here's how:

Basically, because the latest and greatest Nvidia chips were difficult to come by through legal means, DeepSeek had to use what it had available – a huge array of lower-capacity Nvidia H800 chips – and come up with a solution. Essentially, it rethought the whole process and found ways to economise at various points along the training path. Those then compounded to create huge savings in the end model, R1.

The US government's export controls on Nvidia chips, at least for proponents like OpenAI, may have backfired (more on the winners and losers below). As AGI-sceptic Gary Marcus wrote:

"[T]he truth is that so far those controls have not been terribly successful; and may even have had paradoxical effects. China just largely caught up despite existing controls, and indeed were perhaps forced harder towards important efficiency gains precisely because necessity is the mother of invention."

A new Space Race

On Monday, venture capitalist Marc Andreessen tweeted that "DeepSeek R1 is AI's Sputnik moment". For those not familiar, Sputnik 1 was the first artificial satellite successfully launched into orbit. Launched in 1957 by the USSR, at the time a fierce geopolitical rival of the US, it lit a rocket (excuse the pun) under the US and effectively triggered the Space Race:

"[President] Eisenhower greatly underestimated the reaction of the American public, who were shocked by the launch of Sputnik and by the televised failure of the Vanguard Test Vehicle 3 launch attempt. The sense of anxiety was inflamed by Democratic politicians, who portrayed the United States as woefully behind. One of the many books that suddenly appeared for the lay-audience noted seven points of 'impact' upon the nation: Western leadership, Western strategy and tactics, missile production, applied research, basic research, education, and democratic culture. As public and the government became interested in space and related science and technology, the phenomenon was sometimes dubbed the 'Sputnik craze'."

China's launch of DeepSeek may well have triggered an AI craze, if we weren't already in one. On Tuesday, Donald Trump responded in a way that would appear to confirm Andreessen's comparison:

Former IMF chief economist Olivier Blanchard described the arrival of DeepSeek as:

"Probably the largest positive one day change in the present discounted value of total factor productivity growth in the history of the world".

I doubt that. But perhaps Blanchard will be correct in the post-internet period, where productivity growth has been especially difficult to achieve.

Essentially, AI is rapidly becoming a commodity priced close to marginal cost, cheap enough that entrepreneurs can build on it for just about everything, improving productivity in areas we previously thought impenetrable.

Possible winners and losers

Much more efficient and easily trained AI is bad news for the incumbent AI companies themselves, which were probably hoping to become something like an Apple, Google, Meta, or Microsoft, i.e. quasi-monopolists with trillion-dollar market caps. But as I've been writing about for years, there never was a moat for these AI companies:

"[I]n the arms race of AI, there are no barriers to slow down potential competitors. The technology itself is not new, and every day people are coming up with new ways to do 'with $100 and 13B params that we struggle with at $10M and 540B'."

The biggest winners will be consumers (⬆️ consumer surplus), followed by companies that can build on AI, including the tech giants I mentioned in the previous paragraph. That's perhaps why Microsoft's CEO Satya Nadella appeared to be stoked at the release of DeepSeek-R1:

"Jevons paradox strikes again! As AI gets more efficient and accessible, we will see its use skyrocket, turning it into a commodity we just can't get enough of."

The Jevons paradox, for those unfamiliar, describes the process by which efficiency gains can cause people to use more of the product that was made more efficient, not less. For example, if a driver replaces a gas-guzzling car with a hybrid, the cheaper running costs might lead them to drive further and more frequently.

Applied to AI, cheaper running costs will make software development more efficient, so people will demand more and better software, which will in turn increase demand for more AI.
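To make the mechanism concrete, here's a toy sketch of when the paradox bites. It assumes demand for AI services follows a simple constant-elasticity curve, which is my simplification and not something Nadella or anyone else quoted here has claimed. The function name, parameters and numbers are all hypothetical.

```python
def resource_use(efficiency: float, elasticity: float,
                 base_cost: float = 1.0, scale: float = 100.0) -> float:
    """Total underlying resource consumed (think GPU hours) in a toy model.

    Assumptions (illustrative only):
      - price per unit of service falls in proportion to efficiency
      - quantity demanded = scale * price ** (-elasticity)
      - each unit of service needs 1 / efficiency units of resource
    """
    price_per_unit = base_cost / efficiency
    quantity_demanded = scale * price_per_unit ** (-elasticity)
    return quantity_demanded / efficiency


# Compare resource use before and after efficiency doubles,
# for inelastic, unit-elastic and elastic demand.
for elasticity in (0.5, 1.0, 2.0):
    before = resource_use(efficiency=1.0, elasticity=elasticity)
    after = resource_use(efficiency=2.0, elasticity=elasticity)
    trend = "rises" if after > before else "falls" if after < before else "is unchanged"
    print(f"elasticity={elasticity}: resource use {trend} "
          f"({before:.1f} -> {after:.1f})")
```

In this toy setup, total resource use only goes up when the elasticity is above one, i.e. when demand is very responsive to price. Whether that's true of AI is exactly what the Jevons optimists and the Nvidia sellers disagree on.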

From that perspective, the market backlash against Nvidia might have been a bit extreme:

Nvidia is best thought of as a company that sells picks and shovels to gold miners. If you think the gold rush will run out of steam, then so too will Nvidia.

However, if you take Nadella's view, more economical AI will only increase demand for gold – meaning Nvidia will have plenty of buyers for its shovels, supporting its lofty valuations.

AI journalist Tim Lee even suggested that the Nvidia rout had more to do with Trump than China:

"DeepSeek's models were trained using Nvidia chips, so it's not obvious why DeepSeek's success would be bad news for Nvidia. And it's even harder to explain why it took a week for Wall Street to react to the January 20 release of R1.

A more plausible explanation is that someone tipped traders off to Donald Trump's plans to slap tariffs on chips made in Taiwan—which Trump announced later in the day. I can't prove this theory, but I think it fits the facts better than the DeepSeek theory. Interestingly, we didn't see a second selloff in Nvidia or TSMC shares after Trump's announcement, suggesting that markets had already 'priced in' the news."

There are other theories out there, too. Sticking with the gold metaphor, perhaps DeepSeek has shown that we don't need to dig as deep for gold, and we can use lighter, cheaper shovels than those being made by Nvidia. Those shovels might even be made by Apple, whose chips running DeepSeek-R1 are "4x more cost efficient per unit of memory than AMD MI300X and 12x more cost efficient than NVIDIA H100".

Apple's share price, for what it's worth, is up 7.7% this week.

Basically, we're living in very uncertain times, and no one knows who or what is going to profit from the AI Space Race. Given that stock prices reflect the present value of expected future cash flows, expect plenty of volatility as people try to figure it all out!
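If you want to see why small changes in expectations can move prices a lot, here's a minimal discounted cash flow sketch. Every number in it (the starting cash flow, growth rates, discount rate and ten-year horizon) is a made-up assumption for illustration, not an estimate for Nvidia, Apple or anyone else.

```python
def present_value(cash_flows: list[float], discount_rate: float) -> float:
    """Discount a list of yearly cash flows (first one arrives in year 1)."""
    return sum(
        cash_flow / (1 + discount_rate) ** year
        for year, cash_flow in enumerate(cash_flows, start=1)
    )


def growing_cash_flows(first: float, growth: float, years: int) -> list[float]:
    """Cash flows that start at `first` and grow at a constant yearly rate."""
    return [first * (1 + growth) ** t for t in range(years)]


# Hypothetical inputs: same company, same discount rate,
# but the market changes its mind about future growth.
DISCOUNT_RATE = 0.10
for growth in (0.05, 0.20, 0.35):
    flows = growing_cash_flows(first=100.0, growth=growth, years=10)
    value = present_value(flows, DISCOUNT_RATE)
    print(f"Assumed growth {growth:.0%}: present value {value:,.0f}")
```

The gap between the scenarios is the point: when nobody agrees on the growth rate, the "right" price can swing wildly from one day to the next.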

Have a great day.