AI Industry Stunned as DeepSeek Shocks Everyone with DeepSeek R1

So, if you haven’t heard yet, there may be a new AI king on the horizon, and that’s the AI assistant DeepSeek. This tool has taken over social media, and as someone who runs a YouTube channel dedicated to AI content, I wanted to give you my thoughts on this trend. A lot of people are missing a few important points, and this story runs a little deeper than you might think.

By now most people have heard about the AI assistant DeepSeek that has taken over the internet. The system is said to be more efficient than ChatGPT and, even more impressively, about 95% cheaper. The app has actually surpassed ChatGPT as the most downloaded free app in the iPhone App Store. That is quite an achievement considering ChatGPT has dominated the market since its release in November 2022. If a company could surpass them, it would definitely make headlines, and that is exactly what has happened.

Some may think this is just the beginning, and I agree. This is a serious problem for OpenAI, and let me explain why. If you look at the data, you’ll see that, literally in the last day, DeepSeek challenged OpenAI for market dominance in the AI services space for the first time. This is a big deal because one of the points many of us in the AI community make is that while other companies can sometimes obtain cutting-edge models, and may even create them themselves, they don’t have anywhere near the brand recognition of ChatGPT. After today’s events, however, DeepSeek’s user base appears to be growing rapidly.

For those of you who think this is just a one-day trend: if we look back at the last seven days, we can see that the search volume has been consistently high. Of course, this is just Google Trends, so take this data point with a grain of salt, but it should show you the scope of how big this situation truly is. This entire situation has been absolutely incredible, because if a company can make exactly what you’re making at roughly 5% of the cost, and make it even faster, that is going to have some serious forward-looking implications.

Data Privacy vs. AI Innovation: The DeepSeek Dilemma

Some of OpenAI’s employees, of course, are not too happy about this. For example, one OpenAI employee, Steven Heidel, said, “Americans sure love giving their data away to the CCP in exchange for free stuff.” A community note clarified that DeepSeek can be run locally without an internet connection, unlike OpenAI’s models. This is quite true, although I doubt most people have the know-how or the will to do that. In his defense, a tweet that gained 2.3 million views earlier today spoke about how DeepSeek AI collects IP addresses, keystroke patterns, device info, and more, storing it in China, where all that data is vulnerable to arbitrary requests from the Chinese state. From their privacy policy, we can see that they are indeed collecting data:

“Automatically collected information: We automatically collect information from you when you use the services, including internet or other network activity information such as your IP address.”

This statement tends to scare a lot of people because it says, “The personal information we collect from you may be stored on a server located outside of the country where you live, and we store the information we collect in secure servers located in the People’s Republic of China.” I know a lot of people aren’t too happy about this. Whether or not you think they’re storing our data in a particular way, we cannot deny that this is a monumental achievement in the AI industry. When you take a look at the benchmarks that the system has been able to achieve, it’s clear that there is some kind of disruption going on, and this is some kind of innovation we must pay attention to.

The craziest thing about all of this is that it cost a fraction of what other companies are spending—billions and billions of dollars every single year—to pump out models that are merely on par with the competition, for marginal gains. When we take a closer look at these benchmarks, even the distilled models managed to outperform some of the standard models like GPT-4o. This means many people could realistically run a 70B model at the level of GPT-4o, which means someone can literally run a ChatGPT-class model—a really smart model—on their home device, not connected to the internet, with pretty low latency and all the privacy in the world. If that isn’t a big problem for OpenAI, I don’t know what is.
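To make the “run it at home, offline” point concrete, here is a minimal sketch of querying a locally hosted model through an OpenAI-style chat endpoint. Everything specific here is an assumption for illustration: the port (11434 is what Ollama uses by default), the model name `deepseek-r1:70b`, and the endpoint path all depend on which local runtime you choose; none of it comes from DeepSeek itself.

```python
import json
import urllib.request

def build_chat_request(model: str, prompt: str,
                       base_url: str = "http://localhost:11434/v1"):
    """Assemble an OpenAI-style chat-completions request for a local server.

    Several local runtimes (Ollama, llama.cpp's server, etc.) imitate this
    API shape; adjust base_url and model for your own setup.
    """
    url = f"{base_url}/chat/completions"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return url, json.dumps(payload).encode("utf-8")

def ask_local_model(model: str, prompt: str) -> str:
    """Send the request to the local server and return the reply text.

    The request never leaves your machine, which is the privacy argument:
    no cloud API, no data stored on a remote server.
    """
    url, body = build_chat_request(model, prompt)
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]

# Example (requires a local model server already running):
#   answer = ask_local_model("deepseek-r1:70b", "Summarize why GPUs matter.")
```

Whether most people will actually do this is another question, but the point stands: the full loop runs on one machine, offline.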

It’s actually quite crazy, because this has had some incredible effects worldwide. For example, Nvidia’s valuation was slashed by roughly 17% today. That’s hundreds of billions of dollars in value for that company. Nvidia has been a multi-trillion-dollar company, and now the market is reacting to this news unfavorably. If you can train a model that is just as effective as one that took billions of dollars, people start wondering whether we’re about to move to a new paradigm where we don’t need as many GPUs; and Nvidia, of course, is the world’s biggest GPU supplier.

The Semiconductor Slump: Unpacking the Industry-Wide Impact and AI Market Forces

This was a surprising thing for me, but I’ll have a little more to say about it later, because I think there are some market forces most people aren’t considering. This actually stunned the industry, because it wasn’t just Nvidia that was in peril; it was a chip bloodbath. Looking at the heatmap for semiconductors and related devices, AMD was down 5%, Broadcom (AVGO) down 15%, and Nvidia down 14% at the time I took the screenshot, with other semiconductor names down as well. Wiping hundreds of billions of dollars off an entire market is really significant. This is news that impacted the entire industry, and for once that probably isn’t clickbait.

We should also look at what other individuals in the AI space are saying. Someone I’m actually going to give credit to here is Gary Marcus. If you aren’t familiar with him, he is a controversial figure in the AI community. I’m 50/50 on this guy, because he makes incredible points about AI companies, but sometimes the way he frames them can come across as if he’s hating a little bit. In a blog post, he actually called this a year early, and he has been early in predicting a variety of things within the AI industry. One year ago today, he said, “OpenAI actually lacks profits, and the systems they have built are hugely expensive to operate because they require massive amounts of compute. At the same time, the general principles for building them have become fairly well-known in the industry. Large language models such as ChatGPT may quickly become commodities, which means we can expect price wars, and profits may remain elusive or, at best, modest.”

He’s basically saying the secret of ChatGPT is out, and we have tons of open-source models, so what is the defining thing that’s going to keep OpenAI afloat and all these other American AI companies if someone can go on the App Store, download something that is faster, cheaper, and better? He talks about how they require massive amounts of compute for training, estimated in the tens or hundreds of millions of dollars for GPT-4, and that this is going to be a problem, with price wars and profits being elusive at best.

He wasn’t the only one who had something to say. The AI czar David Sacks said that DeepSeek R1 shows the AI race will be very competitive, and that President Trump was right to rescind the Biden EO, which hamstrung American companies without asking whether China would do the same (which it obviously wouldn’t). He’s confident in the U.S. but states that we can’t be complacent. He’s basically saying that we need to make sure we win this race, because if we don’t, China can race ahead; the AI landscape is becoming increasingly competitive, with China’s rapid advancement exemplified by DeepSeek R1.

He supports the decision to rescind Biden’s executive order, implying that it imposed unilateral constraints on U.S. AI companies without guaranteeing reciprocal action from China. He basically says we need to lock in here, because if we don’t, it’s going to be a terrible scenario. On top of that, the President of the United States was actually surprised. He said the release of DeepSeek AI from a Chinese company should be a wake-up call for our industries, that we should be laser-focused on competing to win, and that we have the best scientists in the world. This is very unusual: we always have the ideas, we are always first, so what on earth has happened this time?

Wake-Up Call: The U.S. Must Reclaim AI Dominance Amidst Rising Chinese Competition

I think the U.S. maybe just got too complacent. The dynamics in these two countries are remarkably different. In China, you have people who are willing to work incredibly hard. Not that they aren’t working hard in Silicon Valley, but those salaries are definitely a lot higher, and those jobs definitely seem a lot more comfortable. Someone actually spoke about this a few days ago in a leak concerning Meta (I’m not sure I have the full details, but it was pretty crazy). They were basically saying that these AI companies are floundering at the moment; they’re worried because this has thrown a complete spanner in the works.

Take a listen to what Trump said because what he said is important—it’s the President: “Last week, I signed an order revoking Joe Biden’s destructive artificial intelligence regulations so that AI companies can once again focus on being the best, not just being the most woke. Today and over the last couple of days, I’ve been reading about China and some of the companies in China. One, in particular, is coming up with a faster method of AI and a much less expensive method. It’s good because you don’t have to spend as much money. I view that as a positive, as an asset. So, I really think if it’s fact and if it’s true—and nobody really knows if it is—but I view that as a positive because you’ll be doing that too. So, you won’t be spending as much, and you’ll get the same result, hopefully.”

He continued: “The release of DeepSeek AI from a Chinese company should be a wake-up call for our industries that we need to be laser-focused on competing to win, because we have the greatest scientists in the world. Even Chinese leadership told me, they said, you have the most brilliant scientists in the world, in Seattle and various places, but Silicon Valley—they said there’s nobody like those people. This is very unusual, when you hear DeepSeek, when you hear somebody come up with something. We always have the ideas; we’re always first. So I would say that’s a positive. That could be very much a positive development, so instead of spending billions and billions, you’ll spend less and come up with, hopefully, the same solution. Under the Trump administration, we’re going to unleash our tech companies, and we’re going to dominate the future like never before.”

The crazy thing about this is that Eric Schmidt, the former Google CEO, actually said a few months ago that it’s critical that we win this race. So regardless of whether you’re thinking about AI from OpenAI or DeepSeek, you have to understand why he said that: this is not a situation of “okay, China just created another chat model that is better than OpenAI’s.” They’ve actually done something really incredible. This is a wake-up call for the United States, because it initially thought it was maybe six to twelve months ahead, at times even two years ahead, but if China is now on par, that means the United States needs to speed up the pace of the AI it’s building.

You have to understand that AGI, or ASI (artificial superintelligence), is absolutely game-changing. It is going to be the number one military asset of future societies, and for whichever country or state you belong to, I cannot overstate how important this is: ASI is going to be the ultimate tool these countries use for their national defense. This is why he said it’s critical that we win this race, and most people don’t realize that. We are now literally racing towards superintelligence and AGI.

In Schmidt’s words: “I’ve done this for 50 years, and I’ve never seen innovation at this scale. This is literally a remarkable human achievement of intelligence, and the things that we can do and the advances in science are at an unprecedented level. There’s a point, maybe in the next year or two, where the systems can begin to do their own research; they’re called AI scientists, as opposed to human scientists. So you go from having a thousand human scientists to a million AI scientists. I think that increases the slope. When you’re moving at this pace, it’s very, very hard for your competitors to catch up—that’s the race. It is crucial that America wins this race globally and, in particular, ahead of China.”

So, this is something I think we need to pay attention to as a potential side effect of the current race. I think this year is probably going to move the quickest, as companies start deploying more rapidly and iterating even more efficiently than before. One of the things I wanted to know was whether this is actually real. I was doing some digging and came across a tweet describing how researchers replicated the DeepSeek R1-Zero and DeepSeek R1 training on a 7B model with only 8,000 examples, and they said the results are surprisingly strong.

The reason I’m talking about this research is that the DeepSeek method seems to be a lot more efficient than OpenAI’s method, and if that’s the case, similar methods should work on smaller models, which is essentially what they did. They used Qwen 2.5-Math as a base model and then performed reinforcement learning on it directly: no reward model, no fine-tuning, just 8K math examples with verifiable answers. They achieved around 33% pass@1 on one hard benchmark, 62% on another, and 77.2% on a third, outperforming the base model. They’re basically stating that this actually works, and that the self-reflection behavior discussed in the DeepSeek paper actually emerges.
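The “no reward model, just verification” idea can be sketched as a rule-based reward: extract the final answer from a completion (here using the \boxed{...} convention common in math benchmarks) and score 1 if it matches the known solution, 0 otherwise. This is my simplified illustration of verifiable-reward RL, not the replication’s actual code; the helper names and the boxed-answer convention are assumptions.

```python
import re

def extract_boxed_answer(completion: str):
    """Pull the final answer out of the last \\boxed{...} span.

    Returns None when the completion never produces a boxed answer,
    which simply earns zero reward.
    """
    matches = re.findall(r"\\boxed\{([^{}]*)\}", completion)
    return matches[-1].strip() if matches else None

def rule_based_reward(completion: str, gold_answer: str) -> float:
    """Binary verifiable reward: no learned reward model involved.

    A policy-gradient method (e.g. a GRPO/PPO-style update) would
    reinforce sampled completions that score 1.0 here.
    """
    return 1.0 if extract_boxed_answer(completion) == gold_answer.strip() else 0.0
```

In an RL loop over the roughly 8K problems, each sampled solution would be scored this way and correct-verifying completions get reinforced; the idea is that longer, more careful reasoning tends to verify more often, which is one intuition for why self-reflection can emerge.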

When we take a look at these results, they may look a little confusing, but basically they show that the technique DeepSeek employed does work with these models. It’s pretty crazy. Another thing I think most people missed—I mentioned this in a previous video—is the fact that DeepSeek was just a side project for a quant firm that essentially had some spare GPUs. I think this is a giant wake-up call. If China can do something like this with spare GPUs, it’s going to be a real shock to the industry.

The crazy thing about all of this is that it’s still not over—they are not done. They recently dropped Janus, which is essentially a multimodal model that produces images in stunning resolution at a very cheap rate. I think these AI companies are definitely concerned. The AI industry is one that is rife with innovation—innovation so fast you almost can’t keep up with it. Trust me, some days I have eight video topics I need to upload, and it is almost impossible for anyone to keep up with AI news, myself included.

This is something that is going to become more and more prevalent as time goes on. I want to talk about how similar things are going on in the electric vehicle market. Many people may know that Chinese electric cars cost a fraction of Western prices while being remarkable in terms of technology, usability, efficiency, and all the gizmos and gadgets you get. However, over in the EU and the West, we have to pay a huge markup on these cars due to tariffs. What happens when AI software reaches the stage where it’s just a commodity because China has driven those costs down?

You have to remember that users are always going to do what’s best for themselves, and it’s going to be increasingly hard to stop people from downloading open-source software, especially if it’s remarkably more effective than the software they’re currently using. Nvidia, the big dog in the AI industry, has officially spoken. They’ve said that DeepSeek is an excellent AI advancement and a perfect example of test-time scaling, and that DeepSeek’s work illustrates how new models can be created using that technique, leveraging widely available models and compute that is fully export-control compliant.

You have to understand that for Nvidia, this isn’t necessarily bearish information. There are rumors, which I’ll probably get into in another video, that DeepSeek actually has significantly more GPUs than it is letting on. If that information comes to light, the market may adjust its expectations about these models. Overall, it shows that Nvidia has been paying attention and isn’t backing down; they are fully invested in the space, essentially saying, “Look, this is innovation, innovation is great, and we’re going to continue selling these GPUs.”

The craziest thing about all of this is that labs are starting to freak out. I actually saw a post from inside Meta. Meta is a company that invested heavily in AI and has reaped the rewards; its stock has been soaring through the roof. The problem is that DeepSeek may have just stolen Meta’s entire pie. Most people don’t realize there is an entire open-source side of this industry. If you’re watching this as a normal person who’s just interested in AI, know that Meta has a huge hold on the open-source scene because it releases open-source models all the time; they’re pretty small and very nifty to run on a variety of devices.

But you have to understand that Meta is a trillion-dollar company; you can’t have Chinese companies making things super cheap and super efficient for a fraction of the cost, because that’s going to eat into their bottom line. Meta was hoping that Llama 4, the next iteration of its open-source model, was going to be the thing that crushes the open-source competition, making them the industry leaders and letting them make a ton of money from partnerships and other collaborations.

However, a post from a few days ago, which I didn’t initially want to comment on because it was pure speculation (though I’ll show you that it actually is true), talks about how DeepSeek V3 rendered Llama 4 already behind in benchmarks. And V3 was DeepSeek’s prior model; this wasn’t even the R1 model everyone is freaking out about now. It says, “Adding insult to injury was the fact that it managed to put Llama 4 behind in benchmarks, and it was an unknown Chinese company with a $5.5 million training budget.”

It also says, “The engineers are moving frantically to dissect DeepSeek and copy anything and everything we can from it, and I’m not even exaggerating. Management is worried about justifying the massive cost of a generative AI org. How would they face leadership when every single leader of a gen AI org is making more than what it costs to train DeepSeek entirely, and we have dozens of such leaders?”

Most people don’t realize just how much people in the AI industry are getting paid. Honestly, I think it’s worth it because these people are ridiculously smart, and AI talent is truly scarce. These companies have billions of dollars, so they’re not going to skimp out on salaries when they’re looking to ensure they have the brightest minds for securing the AI future.

The crazy thing now is that they’re thinking, “Wait a minute, we pay 12 people maybe millions of dollars a year collectively, and now we look at that and ask: what on earth are we doing when we can literally get a model for a fraction of the cost of that training run?” So this is something that is freaking them out, and the crazy thing is that they didn’t even see DeepSeek R1 coming, and R1 made things even scarier.

DeepSeek’s AI Breakthrough Shakes Up the Industry: Meta Scrambles as Nvidia Praises Cost-Effective Innovation

It says, “I can’t reveal confidential information, but it should be public info soon anyway.” And indeed, an article that came out today says, “Meta is reportedly scrambling war rooms of engineers to figure out how DeepSeek’s AI is beating everyone at a fraction of the price.” This is something I think is a serious, serious issue. You can see Mark Zuckerberg has a stern face in the photo; of course, that’s probably what he looks like right now. You have to understand that Meta is a trillion-dollar company, and they staked their entire future on this.

Meta isn’t completely screwed, because it does have the distribution, but I personally think the entire AI industry is about to change in a way you don’t expect. As for Nvidia’s statement, an Nvidia spokesperson apparently told CNN on Monday that DeepSeek is an excellent AI advancement and a perfect example of test-time scaling, and that DeepSeek’s work illustrates how new models can be created using that technique, leveraging widely available models and compute that is fully export-control compliant.

Basically, what they’re saying is that this is good for them, because even though the approach is more efficient, you still need more compute. What Nvidia is saying is actually true. Remember how I said DeepSeek was made incredibly good at a fraction of the cost? Because of that, people were thinking, “Wait a minute, why are we buying so many chips if you can achieve the same thing for a fraction of the cost?”
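For a concrete picture of what “test-time scaling” means, here’s a toy sketch of one common form, self-consistency: sample the same question many times and take a majority vote over the final answers, so more samples means more inference compute and usually a more reliable answer. The sampled answers below are stub data I made up to keep the example self-contained; in a real system each entry would be one stochastic model completion.

```python
from collections import Counter

# Stub data: ten imagined completions for the same question. A stochastic
# model (temperature > 0) gives different final answers on each sample.
SIMULATED_SAMPLES = ["4", "4", "5", "4", "3", "4", "4", "22", "4", "4"]

def majority_vote(sampled_answers):
    """Self-consistency aggregation: return the most common final answer.

    This is test-time scaling in its simplest form: the model is unchanged,
    but spending N times the inference compute (N samples) typically makes
    the winning answer more reliable than a single greedy sample.
    """
    votes = Counter(sampled_answers)
    return votes.most_common(1)[0][0]

# The majority answer stabilizes even though any individual sample
# might be wrong:
final_answer = majority_vote(SIMULATED_SAMPLES)  # "4" wins 7 votes out of 10
```

This is only one flavor of test-time scaling; R1’s long chain-of-thought reasoning is another way of spending extra compute at inference time, and Nvidia’s point is that all of them still consume GPU cycles.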

But here’s the thing. I’m going to show you a tweet from Andrej Karpathy, someone who is incredibly smart and was a founding member of OpenAI, working on the early GPT research. He said, “I don’t have too much to add on top of this earlier post on V3, and I think it applies to R1 as well, which is the more recent thinking equivalent. But I will say that deep learning has a legendary ravenous appetite for compute, like no other algorithm that has ever been developed in AI.”

He continues, “You may not always be utilizing it fully, but I would never bet against compute as the upper bound for achievable intelligence in the long run—not just for an individual final training run, but for the entire innovation and experimentation engine that silently underlies all the algorithmic progress.” He’s basically saying that while it’s crazy they managed to do this on a shoestring budget, he would never bet against the fact that you’re still going to need loads and loads of compute if you truly want to achieve artificial superintelligence; there’s probably simply no way around it.

Source: https://www.youtube.com/watch?v=Te6fBKg-ovE
