NVIDIA (NVDA.US) shares fell more than 10.9% to $126.97 on Monday morning, following the launch of DeepSeek, a Chinese generative AI program. Despite the dip, Nvidia's stock still closed higher for the week.
DeepSeek, backed by Chinese quant firm High-Flyer, has made waves in the AI community by open-sourcing its R1 large language model (LLM) and publishing a paper detailing how advanced LLMs can be developed on significantly smaller budgets. The company has reportedly had access to about 50,000 Nvidia H100 AI GPUs, though the extent of their utilization remains unclear.
The most striking aspect of DeepSeek R1 is its ability to achieve competitive results compared to well-funded rivals like OpenAI's ChatGPT and Meta's Llama, despite operating with more limited resources. This development raises questions about the necessity of the massive capital expenditures undertaken by major tech companies for AI advancements.
The development is particularly significant given recent U.S. export controls on chip technology to China and President Donald Trump's announcement of $500 billion in private spending for U.S. AI infrastructure.
Among the "Magnificent 7" stocks, Microsoft, Meta, and Alphabet have reported substantial increases in AI-related spending, a trend expected to continue in their upcoming December-quarter earnings reports.
Yardeni Research suggests that the Magnificent 7 could potentially benefit from DeepSeek's approach: "It might be good news for the Mag-7 that can learn from DeepSeek to design AI systems with cheaper GPUs. That would reduce their capital spending and boost their profits. It might not be a happy development for Nvidia."
Nvidia, which has seen exponential growth from increased AI investment over the past two years, may face challenges if AI developers shift towards more efficient models requiring less advanced chip technology.
However, JPMorgan analyst Joshua Meyers offers a nuanced perspective, stating that concerns over higher AI budgets are "overdone." He suggests that DeepSeek's efficiency might be born out of necessity, given Chinese firms' limited access to advanced U.S. chip technology.
“If DeepSeek can reduce the cost of inference, then others will have to as well, and demand will hopefully more than make up for that over time,” Meyers wrote in a short note dated Saturday.
Comments (56)
Lol! Can anyone trust CCP propaganda cost numbers? Nothing brilliant innovating on American foundational models like ChatGPT…
No, DeepSeek won’t
Yes
I don't think so
Is it Joever chat ?
Likely and unlikely at the same time. I find ChatGPT has better usability and database storage. Plus, ChatGPT has voice input that recognises your speech and automatically transcribes what you say.
But it’s open source, there is nothing to hide. Nvda holders should seriously be concerned. Hopefully the market will grow even faster now training cost is down, so NVDA can still grow with demand for inference compute.
Likewise, this comment is soaked heavily with western propaganda
Arrogance will reward you with huge investment loss congrats
Cannot be, they open-sourced their methods. That’s how confident they are. Bye-bye, Nvidia bag holders. Compounded by the fact that the 5000 series aren’t as powerful as their 4000S equivalents. Not many people will be keen on an upgrade.
Lol! You still believe in American propaganda
Sorry! Nothing coming out of CCP China can be believed! Please sell ur Nvidia stock & buy Chinese stocks! Lol!
It’s open source. You don’t have to take Chinas word. It’s available to everyone
You are right, it’s unbelievable, incredible, ly good!
First they say China is spying on people. Then when China launches something, people start taking their money out preparing to invest on that next new something. The whole cycle just keeps repeating. Not forgetting it's AI. China will need a lot of people's data to feed that system.
Why would it disrupt. If anything Deepseek has proven that H800 GPUs are still very valuable for training large scale models
I guess the logic is that the demand for training compute would be dramatically reduced if the big techs follow DeepSeek’s path to make training much more efficient. In the long term, I think the real demand is inference compute and that’s the thesis for holding nvda. Although I won’t be surprised by a short term downfall, hopefully not too bad.
If this approach works, fewer Nvidia cards will be needed for the same performance. But because costs drop so much, many more AI companies and workloads will appear afterwards. Or will higher-performance AI emerge and push up demand for Nvidia chips?
I see the point and echo your view on inference. For training , in my view efficient training will allow companies to iterate on designs faster and/or come up with even larger scale models, so I doubt it will have any measurable impact to gpu demand
Hope that’s the case.
Just trust it: labor is cheaper, utilities are cheaper, and the workforce is more efficient. For doing the same thing, China can be cheaper by 50%.
True but has little relevance to this case though
can we ask DeepSeek what it says about Tiananmen Square
Chinese tech is way ahead so anythings possible
stupig!
Tried DeepSeek. Cannot answer my simple question of whether SG market is open today. Ask me to check online for real-time data.
Lol! Wishing your whole family a happy New Year! May all your wishes come true! 🙃🙃😃😃
Yeah, they should really try it before being scared. Its information is only updated up to October 2023, and it cannot access live data.
Yes. Nvidia and AI are bubbles for sure
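From a technical standpoint, DeepSeek's underlying model is still a Transformer.
1) They were the first in the world to systematically deploy FP8 (8-bit floating point) quantization in large-scale LLM training, which greatly reduces the GPU memory that training requires and also speeds up the training process;
2) To use FP8 matrix multiplication correctly, they optimized and improved the way CUDA kernels are invoked, and even gave NVDA a number of design suggestions for Tensor Cores;
3) They developed their own training framework, DualPipe, which implements 16/64-way pipeline and expert (MoE) parallelism, greatly easing the conflict between communication and computation in parallel training and resolving the scheduling bottleneck.
In the end, DeepSeek managed to train on a cluster of 2,048 H800s.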
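As a rough illustration of the FP8 memory point in 1), here is a back-of-the-envelope sketch. It counts weight storage only, for an assumed model of about 600B parameters (the size this comment mentions later). The function name `weight_memory_gib` and the format table are hypothetical. Real mixed-precision training also keeps activations, gradients, optimizer state, and scaling factors, often at higher precision, so actual savings would be smaller than this suggests.

```python
# Bytes needed to store one value in each numeric format.
BYTES_PER_VALUE = {"fp32": 4, "bf16": 2, "fp8": 1}


def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Return the GiB needed to hold `num_params` weights stored as `dtype`."""
    return num_params * BYTES_PER_VALUE[dtype] / 2**30


if __name__ == "__main__":
    num_params = 600 * 10**9  # assumed model size, for illustration only
    for dtype in BYTES_PER_VALUE:
        print(f"{dtype}: {weight_memory_gib(num_params, dtype):,.0f} GiB")
```

Halving the bytes per value halves the weight footprint, which is the basic reason an 8-bit format eases memory pressure compared with 16-bit training.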
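Second, most of the improvements in the paper are incremental rather than revolutionary:
1) MTP (multi-token prediction) actually comes from the 2023 paper YaRN, and in the end DeepSeek V3 only implemented MTP with N=1, i.e. predicting one more token than a conventional GPT;
2) The Aux-Loss-Free Load Balancing technique introduced for MoE really just adds a bias term b_{i} in front of the conventional expert-assignment algorithm;
3) Another innovation in DeepSeek's MoE is the addition of "shared experts", together with a guarantee that during training the experts for each token are spread across at most 4 nodes, to reduce communication bottlenecks;
4) Its original Multi-head Latent Attention essentially projects QKV through a linear transformation down to a lower-dimensional latent space that is stored in the cache, improving storage speed; this helps accelerate inference;
5) Drawing on their experience in quantitative trading, they creatively keep certain moving averages (such as Adam parameter states) on the CPU to reduce parallelism overhead, and so on.
It shows how low the cost of training a 600B model can go under extreme precision and optimization settings.
But it is not enough to upend Silicon Valley; it is a very good milestone.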
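The bias-term idea in point 2) can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not DeepSeek's implementation. The function names, the uniform random scores, the fixed popularity skew, and the step size are all hypothetical. Each token picks its top-k experts by score plus a per-expert bias, and after every batch the bias is nudged down for overloaded experts and up for underloaded ones.

```python
import random


def route_top_k(scores, bias, k):
    """Select k experts for one token, ranking by score + bias."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i] + bias[i], reverse=True)
    return ranked[:k]


def update_bias(bias, load, step):
    """Lower the bias of overloaded experts and raise it for underloaded ones."""
    mean_load = sum(load) / len(load)
    new_bias = []
    for b, n in zip(bias, load):
        if n > mean_load:
            new_bias.append(b - step)
        elif n < mean_load:
            new_bias.append(b + step)
        else:
            new_bias.append(b)
    return new_bias


def simulate(num_experts=8, k=2, tokens_per_batch=512, batches=200, step=0.01, seed=0):
    """Return (first-batch load, last-batch load, final bias) for a skewed router."""
    rng = random.Random(seed)
    # Assumed skew: lower-indexed experts get systematically higher scores.
    skew = [0.3 * (num_experts - i) / num_experts for i in range(num_experts)]
    bias = [0.0] * num_experts
    first_load = last_load = None
    for _ in range(batches):
        load = [0] * num_experts
        for _ in range(tokens_per_batch):
            scores = [rng.random() + skew[i] for i in range(num_experts)]
            for expert in route_top_k(scores, bias, k):
                load[expert] += 1
        if first_load is None:
            first_load = load
        last_load = load
        bias = update_bias(bias, load, step)
    return first_load, last_load, bias


if __name__ == "__main__":
    first, last, bias = simulate()
    print("first batch load:", first)
    print("last batch load: ", last)
    print("final bias:      ", [round(b, 2) for b in bias])
```

In this sketch the bias only influences which experts are selected, and no extra loss term is involved. How the selected experts' outputs are weighted is left out.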
the Murican bubbly artificially hyped AI sector is done.
made in china. as we all know this is not the first time nvidia is attacked.
Great news! Efficient and costs less!
deepseek still hiding some dark past of china
So could you summarize for us tech novices what you think the impact on NVDA will be?
Deepsink... Discussion with several AI folks, this is all bs. As for TSLA, they have the data and inference models that Deepsink will never have.
Deepseek is open source, anyone can download and play with it.
I just type 'xi jingping' and deepseek says sorry to me
wall street use this excuse to crash the market
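Its emergence shows that training large models does not require so much compute, which will lead the market to believe that NVDA's orders will decline. That is based on an analysis of the paper published on GitHub. But according to word of mouth, its founder Liang Wenfeng originally worked in quantitative trading at a company called High-Flyer Quant. He later pivoted to AI, spent RMB 1 billion in 2021 on Nvidia A100s, and built a compute center the size of 10 basketball courts. DeepSeek was formally founded in July 2023. It is unclear whether A100s were used in early training and validation, in violation of the U.S. AI export ban.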
A propaganda tool of CCP! Sad that bears use this to crash their own quality technology companies…
The CCP has very little to do with this. Disinfect yourself of the non stop western propaganda. The CCP is pro China not anti you.
lolx
indeed, learning LLM in the Black Boxes is not easy. Months are needed, much data and pattern recognition. i don't believe everything? Buy and selling news ... ?
You think too highly of the Americans, buddy.
oh yeah the Chinese are known for telling the truth, I've seen their buildings, I'll pass on their swap meet tech
Might be just the first copycat.
I tried DeepSeek; it gives the same leftist propaganda that ChatGPT does, nothing special
I'm buying NVDA!
Will it affect Nvidia? It already did. NVDA is down 11.34% ($16.17) in the premarket.