This is a summary of a BG2 podcast conversation between Bill Gurley, Brad Gerstner, and SemiAnalysis's Dylan Patel, covering how AI development is reshaping the semiconductor industry and what trends to expect next.
NVIDIA's comprehensive victory: software, hardware, and system integration
The conversation opens with NVIDIA's commanding lead in AI. Dylan points out:
• NVIDIA holds roughly 98% of the AI training market if Google's in-house TPUs are excluded.
• Even counting Google's TPUs, NVIDIA still holds about 70% of the market (a quick back-of-the-envelope check of what these two figures imply together follows below).
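As a sanity check on those two numbers, the implied TPU slice can be backed out with a little arithmetic. This is an illustration added here, assuming both percentages refer to the same base (compute or revenue), which the conversation does not specify:

```python
# Back out the implied TPU share from the two quoted figures:
#   - NVIDIA ~98% of the market *excluding* TPUs
#   - NVIDIA ~70% of the market *including* TPUs
# Assumption: both percentages are measured against the same base.
nvidia_share_ex_tpu = 0.98   # NVIDIA's slice of the non-TPU market
nvidia_share_total = 0.70    # NVIDIA's slice of the whole market

# nvidia_share_total = nvidia_share_ex_tpu * (1 - tpu_share)
non_tpu_fraction = nvidia_share_total / nvidia_share_ex_tpu
tpu_share = 1 - non_tpu_fraction
others_share = non_tpu_fraction - nvidia_share_total  # AMD, custom ASICs, etc.

print(f"Implied TPU share: {tpu_share:.1%}")           # ~28.6%
print(f"Implied 'others' share: {others_share:.1%}")   # ~1.4%
```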
Why is NVIDIA so powerful?
1. Software:
NVIDIA's CUDA platform gives AI developers a mature, well-documented development environment, dramatically lowering the barrier to entry and sustaining a large developer community (a small illustration follows this list).
2. Hardware:
NVIDIA consistently ships at the leading edge, iterating quickly on products built for AI's high-performance computing needs.
3. Network:
The acquisition of Mellanox gave NVIDIA high-speed networking capabilities, so it can sell complete AI systems rather than just chips.
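To make the software point concrete, here is a minimal illustration (added here, not from the conversation) of why the CUDA ecosystem is so sticky: high-level frameworks such as PyTorch hide CUDA behind a one-line device switch, so most AI developers get NVIDIA acceleration without ever writing GPU code themselves.

```python
# Minimal PyTorch sketch: CUDA sits underneath a one-line device switch.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

model = torch.nn.Linear(4096, 4096).to(device)  # weights moved to GPU memory
x = torch.randn(8, 4096, device=device)         # a batch of activations

with torch.no_grad():
    y = model(x)                                # matmul dispatched to NVIDIA's CUDA libraries

print(y.shape, y.device)
```

Everything below that device flag (kernels, math libraries, drivers) is NVIDIA's software stack, which is the moat competitors have to replicate.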
Competitive System Design: NVIDIA vs. Google
In terms of system-level design, Google started earlier:
• Google was building system-level TPU machines with Broadcom as early as 2018 (TPU v3).
• NVIDIA's comparable system-level product, Blackwell, is only expected to launch in 2024.
Although Google moved earlier on system design, NVIDIA still leads the market on the strength of its software ecosystem and hardware execution. Google's TPUs mainly serve its own workloads (such as Search and advertising), and their external commercialization remains limited.
The Future of AI Scaling: Room to Keep Expanding
On the question of whether AI model scaling has hit a wall, Dylan takes an optimistic view:
1. Synthetic data generation:
Using models to generate large volumes of training data works around the limits of real-world data and can push model performance further.
2. Inference-time compute:
Shifting part of the scaling effort from training to the inference stage (spending more compute per query at test time) eases training-cost pressure and makes models more flexible (a toy sketch follows this list).
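As a rough illustration of both ideas (added here, not taken from the conversation), the sketch below spends extra compute at inference by sampling several candidate answers and keeping the highest-scoring one; the kept samples can then be recycled as synthetic training data. `generate` and `score` are hypothetical stand-ins for a real model call and a verifier or reward model.

```python
import random

def generate(prompt: str) -> str:
    # Hypothetical stand-in for sampling one answer from an LLM.
    return f"candidate-{random.randint(0, 999)} for: {prompt}"

def score(prompt: str, answer: str) -> float:
    # Hypothetical stand-in for a verifier / reward model.
    return random.random()

def best_of_n(prompt: str, n: int = 8) -> str:
    # More samples = more inference-time compute = better expected answers.
    candidates = [generate(prompt) for _ in range(n)]
    scored = sorted(((score(prompt, a), a) for a in candidates), reverse=True)
    return scored[0][1]

print(best_of_n("Prove that the sum of two even numbers is even."))
# Keeping only the top-scored (prompt, answer) pairs as new training examples
# is one simple form of synthetic data generation.
```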
The fact that the largest companies keep investing in ever-bigger AI compute clusters suggests the scaling runway is still long, and demand for high-performance computing will keep growing.
Competitive Landscape and Alternative Solutions: Challenges from AMD and Google
Although NVIDIA dominates the AI chip market, other companies are also trying to challenge its position:
1. AMD:
• Strong hardware: the MI300 can rival NVIDIA's H100 on raw performance.
• But AMD still lags on software and system design, and its ROCm ecosystem has yet to mature (see the backend-check sketch after this list).
2. Google:
• TPUs have genuine technical advantages, particularly in energy efficiency.
• But they are used mainly for Google's internal workloads, and external commercialization remains limited.
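On the ROCm point, a small illustration (added here, not from the conversation): PyTorch's ROCm builds reuse the `torch.cuda` API via HIP, so framework-level code is largely portable across vendors; the maturity gap shows up lower in the stack, in kernels, libraries, and multi-GPU systems software. The sketch below checks which backend a PyTorch build is using, assuming a recent PyTorch where `torch.version.hip` is populated on ROCm builds.

```python
# Sketch: the same torch.cuda API is served by either CUDA or ROCm/HIP.
import torch

if torch.cuda.is_available():
    # torch.version.hip is a string on ROCm builds and None on CUDA builds.
    backend = "ROCm/HIP" if getattr(torch.version, "hip", None) else "CUDA"
    print(f"Backend: {backend}, device: {torch.cuda.get_device_name(0)}")
else:
    print("No GPU backend available; running on CPU.")
```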
Revolution in the memory market: HBM demand surges
As AI models' demand for inference compute grows, demand for High Bandwidth Memory (HBM) has risen sharply (a rough estimate of why follows the bullets):
• SK Hynix has become the main supplier.
• Samsung's market share is under downward pressure.
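To see why inference in particular pulls on memory bandwidth, here is a standard back-of-the-envelope estimate, added for illustration with assumed numbers (a 70B-parameter model served in FP16 at batch size 1): during autoregressive decoding, each generated token requires streaming essentially all model weights from memory, so token throughput is bounded by memory bandwidth rather than raw FLOPs.

```python
# Rough, assumption-laden estimate of memory bandwidth needed to serve a model,
# illustrating why inference demand pulls HBM demand along with it.
params = 70e9          # assumed model size: 70B parameters
bytes_per_param = 2    # FP16/BF16 weights
target_tok_per_s = 50  # desired tokens/second for a single stream (batch = 1)

weight_bytes = params * bytes_per_param        # ~140 GB of weights
required_bw = weight_bytes * target_tok_per_s  # bytes/s streamed from memory

print(f"Weights: {weight_bytes / 1e9:.0f} GB")
print(f"Required bandwidth: {required_bw / 1e12:.1f} TB/s")
# ~7 TB/s in this toy case -- on the order of a top-end accelerator's entire
# HBM bandwidth, which is why serving at scale means many HBM-heavy GPUs and
# why batching, quantization, and KV-cache tricks matter so much.
```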
Summary: NVIDIA continues to dominate the AI chip market through hardware-software integration and its ecosystem, but competitors such as AMD and Google are positioning themselves actively. AI model scaling still has room to run, and demand for high-performance compute and memory will keep rising; these are the key points for investors to watch in 2025.