Monday, January 27, 2025

Did the Leaders in AI Get It All Wrong?

By Jeff Brown, Editor, The Bleeding Edge

It's absolute chaos right now in Silicon Valley.
Something extraordinary happened over the weekend… and it caught an entire industry completely off guard.
This latest development is forcing that industry to question its entire approach to developing artificial intelligence (AI).
Many are now in full-blown panic mode, trying to figure out the implications…
The implications… of DeepSeek.
DeepSeek's Bombshell
The bombshell announcement over the weekend was the release of brand-new AI research out of DeepSeek, a lab backed by the Hangzhou, China-based hedge fund High-Flyer Capital Management.
DeepSeek released a new large language model (LLM) called DeepSeek-R1. For those interested, you can find the research here.
What shocked the industry and the tech markets today – aside from it not coming from Silicon Valley – is the incredible performance of DeepSeek-R1, as shown below, compared to OpenAI's o1 foundational model.
Benchmark performance of DeepSeek-R1 | Source: DeepSeek
As we can see above, DeepSeek-R1 performed on par with OpenAI o1 – OpenAI's current production model. On three of the metrics shown above, it was slightly better. And on the other three, slightly lower. Either way, the results were impressive.
But the story behind DeepSeek is far more interesting…
Did DeepSeek Just Kill Big Tech?
The founder of DeepSeek, Liang Wenfeng, started High-Flyer Capital Management (HFCM) back in 2015.
HFCM specializes in using machine learning to identify patterns in stock price movements. The firm has been very successful, and Liang became a billionaire in the process.
Because the firm was AI-centric, he and his team became very skilled at working with NVIDIA chips.
In 2021, HFCM acquired 10,000 NVIDIA A100 GPUs – before U.S. export restrictions took effect – and then in 2023, Liang launched DeepSeek.
The goal of Liang and DeepSeek was to develop human-level AI. But the team at DeepSeek had a major challenge, due to U.S. export restrictions: It couldn't access the same kind of bleeding-edge semiconductors as companies like OpenAI, Microsoft, Meta, xAI, and Anthropic.
DeepSeek had no choice but to get creative.
And here's why there is absolute chaos right now…
DeepSeek was able to develop DeepSeek-R1, a language model capable of reasoning on par with OpenAI o1, for a reported cost of roughly $5.5 million.
For context to this number, consider the amount of capital U.S.-based AI companies have raised for the development of their own foundational models:
  • OpenAI has raised almost $24 billion to date
  • Anthropic has raised $15.75 billion
  • xAI has raised more than $12 billion
  • Perplexity has raised almost $1 billion
  • And the industry as a whole has spent hundreds of billions building out AI factories in the race to build an artificial general intelligence (AGI).
But wait a minute…
DeepSeek was able to create a best-in-class AI for $5.5 million?!?
That's why many in Silicon Valley are in panic mode right now. Did they get it all wrong? Did a small, China-based company outsmart them all? Are they using the wrong architecture? Is the entire logic and approach to AI development fundamentally wrong?!
And worse: Is Big Tech dead?
If we look at a chart of NVIDIA (NVDA), we might think so…
1-month chart of NVIDIA (NVDA) | Source: Bloomberg
NVIDIA was hammered overnight with the news and is trading down more than 16% today. That's incredible, considering it's a $3 trillion company.
The reason is simple.
If a small, unknown company in China can develop a best-in-class AI model for just $5.5 million, then maybe the industry doesn't need all of those GPUs – the kind NVIDIA and AMD sell. Hence, the sudden sell-off.
Maybe it doesn't need the hyperscale data centers… the outsized energy requirements… the whole shebang?!?
The implications of this have heads spinning right now in Silicon Valley. And it's not just the software engineers going crazy – it's the management teams of the companies that have invested so heavily in AI, as well as the venture capital firms that have been funding these absolutely insane levels of investment.
So what was the secret? How did DeepSeek do it?

What's Novel About DeepSeek's Approach?
It's technical, but at a high level, here's the approach the team at DeepSeek took:
  • DeepSeek trained much of the model using 8-bit floating point (FP8) numbers rather than the 16-bit and 32-bit formats most labs rely on. This isn't as precise, but it is far more efficient with computational resources.
  • The model uses multi-token prediction rather than predicting a single token at a time. We can think of tokens as the model's outputs (i.e., the pieces of the answer to a prompt). Predicting multiple tokens per step roughly doubles inference efficiency while remaining almost as accurate as single-token prediction.
  • They used a mixture-of-experts (MoE) architecture with some innovation around load balancing. This allows them to have a massive model… but only a small portion of that model is active at any time, depending on the task it is trying to solve. Doing so reduces computational resources and improves efficiency (see the routing sketch after this list).
  • DeepSeek uses a form of reinforcement learning called Group Relative Policy Optimization (GRPO) to improve reasoning capabilities. This technique samples a group of responses to the same prompt and scores each one relative to the others, using those relative scores as the training signal – which greatly reduces computational overhead compared to the prevailing approach of training a separate "critic" model (see the scoring sketch after this list).
  • Compression of the key-value (KV) cache – the stored representations of previous tokens that the model refers back to – results in more than 90% compression and dramatically lower memory requirements.
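To make the mixture-of-experts idea a bit more concrete, here's a deliberately tiny Python sketch. The expert count, the top-2 routing, and the random scoring are all invented for illustration – DeepSeek's production architecture is vastly larger and adds its own load-balancing scheme – but the core idea is the same: only a small slice of the model does any work for each token.

```python
# Toy illustration of mixture-of-experts (MoE) routing. The expert count,
# the top-2 setting, and the random scoring are hypothetical stand-ins;
# the point is only that a router activates a small subset of "experts"
# per token, so most of the model's parameters sit idle on any given step.

import random

NUM_EXPERTS = 8   # total experts in the layer (a real model can have hundreds)
TOP_K = 2         # experts actually activated for each token

def route_token(token: str) -> list[int]:
    # Stand-in for a learned gating network: score every expert, keep the top-k.
    scores = [random.random() for _ in range(NUM_EXPERTS)]
    ranked = sorted(range(NUM_EXPERTS), key=lambda e: scores[e], reverse=True)
    return ranked[:TOP_K]   # only these experts do any work for this token

for tok in ["The", "stock", "fell", "today"]:
    print(tok, "->", route_token(tok))   # e.g. "stock -> [5, 2]": 2 of 8 experts active
```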
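And here's an equally small sketch of the group-relative scoring idea behind GRPO. Everything below is made-up data meant only to show the mechanic: a group of responses to the same prompt is scored against its own average, so there's no need to train a separate "critic" model. In a real trainer, the responses would be sampled from the model and graded by a reward rule (for example, whether a math answer matches the reference).

```python
# Minimal sketch of the group-relative advantage idea behind GRPO
# (Group Relative Policy Optimization). The rewards below are invented;
# a real training loop would sample several responses per prompt, grade
# them, and feed these advantages into a policy-gradient update.

import statistics

def group_relative_advantages(rewards: list[float]) -> list[float]:
    """Score each response relative to its own group -- no separate critic model needed."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0   # guard against a zero-spread group
    return [(r - mean) / std for r in rewards]

# Four hypothetical answers to the same math prompt: 1.0 = correct, 0.0 = incorrect.
rewards = [1.0, 0.0, 0.0, 1.0]
print(group_relative_advantages(rewards))   # [1.0, -1.0, -1.0, 1.0]
# Correct answers score above the group average (positive advantage), so the
# model is nudged toward producing more responses like them.
```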
It's not necessary to understand all of the technical details as to how DeepSeek accomplished the feat for so little. What's key to understand is the result – DeepSeek is about 45 times more efficient than OpenAI o1.
And this means that it is dramatically cheaper to use:
Pricing (per 1M tokens)    OpenAI o1    DeepSeek-R1
Input tokens               $15.00       $0.14
Cached input tokens        $7.50        $0.55
Output tokens              $60.00       $2.19
Tokens are the small chunks of text – words, parts of words, pieces of software code, and so on – that large language models take in as inputs and produce as outputs. And as the quick calculation below shows, DeepSeek's pricing is more than 95% cheaper than OpenAI's o1 pricing.
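To put those per-token prices in perspective, here's a back-of-the-envelope calculation in Python. The request size – 2,000 input tokens and 1,000 output tokens – is purely an illustrative assumption, and cached-input pricing is ignored to keep things simple.

```python
# Rough cost comparison using the published per-1M-token prices from the
# table above. The request size (2,000 input tokens / 1,000 output tokens)
# is a hypothetical example; cached-input pricing is ignored for simplicity.

PRICES_PER_MILLION_TOKENS = {        # USD per 1M tokens
    "OpenAI o1":   {"input": 15.00, "output": 60.00},
    "DeepSeek-R1": {"input": 0.14,  "output": 2.19},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    p = PRICES_PER_MILLION_TOKENS[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

for model in PRICES_PER_MILLION_TOKENS:
    print(f"{model}: ${request_cost(model, 2_000, 1_000):.5f} per request")

# OpenAI o1:   $0.09000 per request
# DeepSeek-R1: $0.00247 per request -- roughly 97% cheaper for this example
```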
And that's not all. The model has been released as open source (more on that in a moment).
This raises the question again: Why are companies spending hundreds of billions of dollars to build these foundational models… when the software is open-sourced and can be built for a few million?
And this is where there are some large caveats that the market is missing entirely.
The Story No One Is Telling You About DeepSeek
As I've been tracking the finance and tech community's response to this news over the weekend, here's the part of the story that isn't being communicated…
  • DeepSeek-V3 – the base model that R1 is built on – may have only cost around $5.5 million to train, and it is more efficient to run, but the company absolutely spent a lot more money developing earlier versions as it ramped up the technology. I suspect it spent tens of millions to get to V3 – none of which gets mentioned. Still, yes, it's a lot less than hundreds of millions. But it's not as little as is being proclaimed.
  • DeepSeek also benefited from other openly released models, such as Alibaba's Qwen 2.5 and Meta's Llama 3, which were used in some of the development of DeepSeek's models.
  • DeepSeek is not actually open source. The company has released the model's weights so that developers can work with DeepSeek, but it has not released any details about the training data or the training code. Said another way, there is no real transparency.
Many were also quick to test DeepSeek-R1 for any influence by the Chinese Communist Party (CCP). After all, any meddling here could imply a far greater concern.
Sadly, when the model is queried about Tiananmen Square, China's human rights abuses against the Uyghurs in Xinjiang, or China's efforts to erase Tibetan culture, we simply don't get an answer.
DeepSeek demonstrates the same kind of political bias that both Meta's and Google's models have shown.
And there's an important point we'll have to save for another day regarding the accuracy of the model, as well.
The point is that sacrifices were made in the name of efficiency for DeepSeek… and many of those sacrifices had to do with the accuracy of the outputs.
We haven't yet seen how DeepSeek performs on the most difficult AI benchmarks – that's the most critical piece on the road to building an AGI.
But none of this has stopped people from downloading DeepSeek – the AI Assistant smartphone app. In a matter of days, DeepSeek overtook ChatGPT at the top of the App Store.
This all happened basically overnight. And I can't help but find the timing suspicious.
Is It a Coincidence?
On January 17, the U.S. Supreme Court upheld the ban on TikTok in the U.S. – a China-based application that has been deemed a foreign adversary-controlled application.
TikTok has not only been a tool for China to conduct massive psychological operations on Western civilization, it is also the single largest intelligence-gathering tool of the CCP, collecting data from U.S. consumers' phones and sending that information back to Beijing.
While President Trump has paused the ban for 75 days to pursue a resolution that can protect U.S. national security, the working assumption of the Chinese government has been that TikTok would be banned.
Is it that much of a leap to think that the CCP funded the development of a new, wildly popular app that would be downloaded as a new data surveillance tool to replace TikTok? While I'm just connecting the dots and speculating, this would make perfect sense to me.
And China – which has been way behind in the development of artificial intelligence – wants two things to happen:
  • It wants to catch up, even if that means the theft of Western intellectual property and AI models – a well-known practice with a long history
  • It wants to find a way to slow down U.S. development in artificial intelligence
Since we don't have access to the data used to train DeepSeek or to the training code – despite the model being called "open source" – it's likely that DeepSeek wasn't just the result of great ingenuity.
And what better way to disrupt the capital formation in Silicon Valley around artificial intelligence than to present a model that was "built for just $5.5 million", suggesting that Project Stargate and all the efforts of Amazon, OpenAI, Microsoft, xAI, Anthropic, Apple, Perplexity, and so many others are the "wrong," wasteful approach.
China just threw a curveball at Silicon Valley and its backers, hoping they'll hit "pause" and question their approach to AI.
But let's think about that for a moment…
Will it work? Will they slow down? Will they stop buying NVIDIA and AMD semiconductors?
Will the U.S. pivot… reduce investment… and adopt a more frugal, DeepSeek-style approach to research and development instead?
I think not.
The sheer threat that a small Chinese company might catch up to the best-in-class models out of the U.S. will only light a fire under the U.S. government and Big Tech to lean in even more. It is a matter of national security. And the more legitimate a threat becomes, the more focus the technology will receive.
Whether the radical curveball came from DeepSeek or some other company… whether it's something to fear or not…
This is a no-holds-barred race for AI supremacy. Anything goes. This technology will not only unleash the greatest productivity boom in history… it will also provide both offensive and defensive capabilities against adversaries.
Never before has so much focus been on a single technology. With so much mindshare, capital, and technical talent concentrated here, it's inevitable that we're going to see spikes of innovation and big leaps forward.
It's all evolving quickly, in real time.
And we shouldn't forget that in the weeks ahead, we'll be seeing the latest releases from Anthropic, xAI, Google, and OpenAI, at a minimum.
As a reminder, the end game isn't just a fabulous AI assistant (with agentic AI). It's artificial general intelligence.
And the more capable AI becomes, the more utilization is going to skyrocket. The focus will soon shift from training to inference, which will benefit AMD, Cerebras, and Groq even more due to their unique semiconductor architectures.
The growth in AI adoption will, by far, outweigh any efficiency gains in training and running large language models.
It's not like we haven't seen this story before.
Computing used to be centered around mainframe computers. And as semiconductors got more powerful and software code improved, computers not only got smaller, they got a lot cheaper – cheap enough to have several in every home and one in every hand (smartphones).
Did that shrink the market for computing… or grow it?
It grew it… exponentially.
And that's precisely what's going to happen with AI.
Jeff
P.S. I've put boots on the ground at the sites where excavators are clearing land for the next hyperscale data centers across the country. I've seen the chips, tested the software, and have been researching this space very closely for years now.
The AI buildout is a trend I've been tracking very closely, and one we're heavily invested in. Do I believe this is a buying opportunity for many of the stocks getting hammered right now? Absolutely.
To follow along as we track DeepSeek in the days and weeks to come, I encourage all of my readers to join our flagship investment advisory, The Near Future Report.



