Blackwell is too popular, which causes tension in customer relationships.
Huang Renxun's remarks have reignited the capital markets.
After a brief morning slump, US stocks staged a dramatic V-shaped reversal, with all three major indices closing higher. The Nasdaq rose 2.17%, its largest single-day gain since August 16.
The reversal was driven not only by the market's self-correction but also by the strong performance of technology giants such as Nvidia and by subtle shifts in market expectations for Fed interest rate cuts.
Nvidia dipped slightly below $107 in early trading, then climbed steadily after Huang Renxun's speech, setting successive intraday highs and briefly surpassing $117 near the close.
Nvidia ultimately closed up 8.15%, its largest gain in six weeks, adding an astonishing $216.1 billion (approximately RMB 1.54 trillion) to its market cap in a single day.
Boosted by Nvidia, the semiconductor sector rose broadly, with ARM up more than 10%, Broadcom more than 6%, and Taiwan Semiconductor and Micron more than 4%.
Huang Renxun: Blackwell is in high demand, and Nvidia could shift orders away from Taiwan Semiconductor Manufacturing Company (TSMC) if necessary!
CEO Huang Renxun said at a Goldman Sachs conference that Nvidia's products have become the hottest commodities in the technology industry and that customers are competing for limited supply. In particular, the limited supply of Blackwell, the company's latest AI chip, has frustrated some customers.
He also hinted that if necessary, NVIDIA would reduce dependence on TSMC and turn to other chip manufacturers for supply.
He told the audience, "Our product demand is very strong, and everyone wants to be the first to receive the goods and receive the most products. Today, we may have more emotional customers, which is understandable. The relationship is tense, but we are doing our best."
Huang Renxun stated that the company's latest chip, "Blackwell" (known as the "most powerful AI chip"), is particularly popular, and suppliers are working hard to meet demand.
When asked if the significant AI spending brings investment returns to customers, Huang Renxun said that businesses have no choice but to accept "accelerated computing".
He explained that NVIDIA's technology not only accelerates traditional workloads, such as data processing, but also handles AI tasks that old technologies cannot cope with.
Huang Renxun also stated that NVIDIA heavily relies on Taiwan Semiconductor for chip production because Taiwan Semiconductor is leading in chip manufacturing.
That said, he added that Nvidia can turn to other suppliers if necessary: it has a backup plan in place and has developed most of the technology in-house.
However, he said that such a change could lead to a decrease in the quality of their chips. "TSMC's agility and their ability to respond to our needs are incredible."
In addition, the US government is considering allowing Nvidia to export advanced chips to Saudi Arabia. Nvidia's stock price soared after the news was released, and the stock has more than doubled this year.
Attached: Full text of the dialogue between Jensen Huang and Goldman Sachs (Chinese version)
1. First, talk about some of your thoughts when you founded the company 31 years ago. Since then, you have transformed the company from a GPU company focused on gaming to one that provides a wide range of hardware and software for the datacenter industry. Can you talk about this journey? What were you thinking when you started? How did it evolve? What are your key priorities for the future, and how do you view the future world?
Huang Renxun: I want to say that one thing we did right is that we anticipated that there would be another form of computing in the future that could enhance general computing and solve problems that general tools can never solve. This processor would initially do things that are extremely difficult for CPUs, like computer graphics processing.
But we gradually expanded into other areas. The first area we chose was, of course, image processing, which is complementary to computer graphics processing. We expanded it to physical simulation because in the video game field we chose, you not only want it to be beautiful, but you also want it to be dynamic, to create a virtual world. We gradually expanded and introduced it to scientific computing. One of the first applications was molecular dynamics simulation, and another was seismic processing, which is essentially inverse physics. Seismic processing is very similar to CT reconstruction and is another form of inverse physics. So, step by step, we solved problems, expanded into adjacent industries, and ultimately solved these problems.
The core philosophy we have always adhered to is that accelerated computing can solve interesting problems. Our architecture remains consistent, which means that software developed today can run on the large installed base already in the field, and software developed in the past can be accelerated by new technology. This mindset of architectural compatibility, building a large installed base, and developing together with the ecosystem has been with us since 1993 and continues to this day. This is why NVIDIA's CUDA has such a huge installed base: we have always protected it. Protecting the investment of software developers has always been our top priority.
Looking ahead, some of the problems we solved along the way, including learning how to be a founder, how to be a CEO, how to run a business, and how to build a company, were all new skills. It is a bit like inventing the modern computer gaming industry. People may not know this, but NVIDIA has the largest installed base of any video game architecture in the world. GeForce has about 300 million players, and it is still growing rapidly and very active. So I think every time we enter a new market, we need to learn new algorithms and market dynamics and create new ecosystems.
The reason we need to do this is that we are different from general-purpose computing. Once a general-purpose computer is built, everything will eventually run on it. But we build accelerated computers, which means you need to ask yourself what you want to accelerate. There is no such thing as a universal accelerator.
2. Let's talk in depth about the difference between general-purpose computing and accelerated computing.
Huang Renxun: If you look at software today, the programs you write contain a lot of file input and output, parts that set up data structures, and some magical algorithmic kernels. These algorithms differ depending on whether they are used for computer graphics, image processing, or something else. They may deal with fluids, particles, inverse physics, or images. If you create a processor designed specifically for these algorithms and let it complement the CPU, which handles the tasks it is good at, then in theory you can greatly accelerate the application. The reason is that usually 5% to 10% of the code takes up 99.99% of the running time.
Therefore, if you offload that 5% of the code to our accelerator, you can technically speed up the application by 100 times. This is not uncommon. We often accelerate image processing by 500 times. Now we are doing data processing. Data processing is one of my favorite applications because almost everything related to machine learning is evolving. It can be SQL data processing, Spark-like data processing, or vector-database-like processing, handling unstructured data or structured data, that is, data frames.
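The relationship between the share of running time that is offloaded and the overall speedup can be sketched with Amdahl's law. This is the standard textbook formula, not something Huang names in the interview, and the function names below are illustrative assumptions. The sketch takes the 99.99% runtime share quoted above as its input.

```python
def amdahl_speedup(accelerated_fraction: float, kernel_speedup: float) -> float:
    """Overall speedup when `accelerated_fraction` of the original running
    time is made `kernel_speedup` times faster (Amdahl's law)."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    # Time that stays on the CPU plus the shortened accelerated time,
    # both expressed as a fraction of the original running time.
    new_time = (1.0 - accelerated_fraction) + accelerated_fraction / kernel_speedup
    return 1.0 / new_time


def speedup_ceiling(accelerated_fraction: float) -> float:
    """Upper bound on the speedup if the accelerated part took no time at all."""
    if accelerated_fraction >= 1.0:
        return float("inf")
    return 1.0 / (1.0 - accelerated_fraction)


if __name__ == "__main__":
    hot_share = 0.9999  # share of running time in the hot code, as quoted above
    print(f"ceiling: {speedup_ceiling(hot_share):,.0f}x")
    for s in (20.0, 101.0, 500.0):
        print(f"kernel {s:>5.0f}x faster -> app {amdahl_speedup(hot_share, s):6.1f}x faster")
```

Under this model, a 100x application speedup needs the hot code to run roughly 101 times faster, and the part left on the CPU caps the gain at about 10,000x no matter how fast the accelerator is.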
We greatly accelerate these, but to do so you need to create a top-level library. In computer graphics, we were fortunate to have Silicon Graphics' OpenGL and Microsoft's DirectX, but beyond these there were no libraries that truly existed. So, just as SQL is a library for storage computing, we created one of our most famous libraries, which is the world's first neural network computing library.
We have cuDNN (a library for neural network computing), cuOpt (a library for combinatorial optimization), cuQuantum (a library for quantum simulation and emulation), and many other libraries, such as cuDF for data frame processing, which offers SQL-like functionality. All of these libraries had to be invented, and they rearrange the algorithms in an application so that our accelerator can run them. If you use these libraries, you can achieve a 100x speedup or more, which is amazing.
So the concept is very simple and makes sense, but the question is how you invent these algorithms and get the video game industry to use them, write these algorithms and get the entire seismic processing and energy industry to use them, write new algorithms and get the entire AI industry to use them. Do you understand what I mean? For every one of these libraries, we first have to do the computer science research, and then we have to go through the process of developing the ecosystem.
We have to convince everyone to use these libraries, and then consider the types of computers they run on, since every computer is different. So we have moved step by step into one field after another. We created a very rich library for autonomous vehicles, an outstanding library for robot development, an incredible library for virtual screening, both physics-based and neural-network-based, and an amazing library for climate technology.
We have to go out, make friends, and create markets. In fact, Nvidia is truly good at creating new markets. We have been at this for so long now that Nvidia's accelerated computing seems to be everywhere, but we really had to go step by step, developing markets one industry at a time.
3. Many investors in the field are very concerned about the data center market. Can you share your views on medium- and long-term opportunities? Obviously, your industry is driving what you call the 'next industrial revolution'. How do you view the current state of the data center market and the future challenges?
Huang Renxun: Two things are happening simultaneously, and they are often conflated, so it helps to discuss them separately. First, let's assume that there is no AI. In a world without AI, general-purpose computing has come to a standstill. As we all know, some principles in semiconductor physics, such as Moore's Law and Dennard scaling, have come to an end. We no longer see CPU performance doubling every year. We would be very lucky to see performance double in ten years. Moore's Law used to mean a tenfold performance increase in five years and a hundredfold increase in ten years.
But now these have come to an end, so we have to accelerate everything that can be accelerated. If you are doing SQL processing, speed it up; if you are doing any data processing, speed it up; if you are creating an internet company with a recommendation system, it must be accelerated. The largest recommendation system engines today are all accelerated. A few years ago, these were still running on CPUs, and now they are all accelerated. Therefore, the first dynamic is that the global trillion-dollar general data centers will be modernized and transformed into accelerated computing data centers. This is inevitable.
In addition, because Nvidia's accelerated computing has brought such tremendous cost reductions, computational power has grown not at a rate of 100 times, but at a rate of 1 million times in the past decade. So the question is, if your plane can be a million times faster, what would you do differently?
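To compare the growth figures quoted above on one scale, each can be converted to an equivalent compound annual rate. This is a back-of-the-envelope sketch that assumes smooth compounding, which the interview does not claim. The function name is illustrative.

```python
def annual_growth_factor(total_factor: float, years: float) -> float:
    """Constant yearly multiplier that compounds to `total_factor` over `years`."""
    if total_factor <= 0.0 or years <= 0.0:
        raise ValueError("total_factor and years must be positive")
    return total_factor ** (1.0 / years)


if __name__ == "__main__":
    # (label, total improvement, period in years), all taken from the text above
    claims = [
        ("Moore's Law era, 10x in 5 years", 10.0, 5.0),
        ("Moore's Law era, 100x in 10 years", 100.0, 10.0),
        ("CPUs today, 2x in 10 years", 2.0, 10.0),
        ("Accelerated computing, 1,000,000x in 10 years", 1_000_000.0, 10.0),
    ]
    for label, factor, years in claims:
        yearly = annual_growth_factor(factor, years)
        print(f"{label}: x{yearly:.2f} per year ({(yearly - 1.0) * 100:.0f}% growth)")
```

The two Moore's Law figures describe the same pace, roughly 1.58x per year. Doubling in ten years works out to about 7% per year, while a millionfold gain in a decade implies nearly 4x every year.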
So people suddenly realized, 'Why don't we let computers write software instead of imagining these functions ourselves, or designing the algorithms ourselves?' We just need to give all the data, all the predictive data to the computer and let it find the algorithms - that is machine learning, generative AI. Therefore, we have applied it on a large scale in many different data fields, where computers not only know how to process data, but also understand the meaning of the data. Because it understands multiple data patterns at the same time, it can perform data translation.
Therefore, we can convert from English to images, from images to English, from English to proteins, and from proteins to chemicals. Because it understands all the data, it can perform all these translation processes, which we call generative AI. It can convert a large amount of text into a small amount of text, or expand a small amount of text into a large amount of text, and so on. We are now in the era of this computer revolution.
And now what's surprising is that the first wave of datacenters worth trillions of dollars will be accelerated, and we've also invented this new type of software called Generative AI. Generative AI is not just a tool, it's a skill. It's because of this that a new industry is being created.
Why is this? If you look at the entire IT industry until now, we have been making tools and instruments for people to use. For the first time, we are creating skills that enhance human abilities. That is why people believe AI will expand beyond the tens of trillions of dollars of data centers and the IT industry and enter the world of skills.
So, what is a skill? A digital chauffeur, meaning an autonomous car, is a skill. So is a digital assembly line worker, a robot, a digital customer service agent, a chatbot, or a digital employee that plans Nvidia's supply chain. That could be a digital agent for SAP. Our company makes heavy use of ServiceNow, and we now have digital employee services. So we now have these digital humans, and this is the AI wave we are currently in.
4. There is an ongoing debate in the financial market about whether the investment return is sufficient as we continue to build AI infrastructure. How do you assess the return on investment that clients have received in this cycle? If you look back on history, look back on PCs and cloud computing, how was the return on investment in similar adoption cycles? What are the differences compared to now?
Huang Renxun: That's a very good question. Let's take a look. Before cloud computing, the biggest trend was virtualization, if you remember. Virtualization basically meant that we virtualized all the hardware in the data center into a virtual data center, so that we could move workloads across data centers without tying them to specific computers. The result was higher data center utilization, and we saw data center costs fall by a factor of 2 to 2.5, almost overnight.
Then, we put these virtual computers into the cloud, and as a result, not just one company, but many companies can share the same resources, costs drop again, and utilization rates rise again.
All the progress of these years has obscured the underlying fundamental change, which is the end of Moore's Law. We have gained a doubling, or even more, of cost reduction from the increase in utilization, but this has also reached the limit of transistors and CPU performance.
Furthermore, all of these improvements in utilization have reached their limits, which is why we are now seeing cost inflation in data centers and computing. As a result, the first thing that is happening is accelerated computing. So, when you are processing data, for example with Spark, one of the most widely used data processing engines in the world today, accelerating it with NVIDIA accelerators can yield a 20x speedup. This means you save 10 times the cost.
Of course, your computing costs will rise somewhat because you need to pay for Nvidia's GPUs. Your computing costs may double, but you reduce computing time by 20 times, so you ultimately save 10 times the cost. Such a return on investment is not uncommon for accelerated computing. So I suggest you accelerate any work that can be accelerated and use GPU acceleration to achieve a return on investment immediately.
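The arithmetic behind the "save 10 times the cost" figure can be written out as a short sketch. It assumes the cost of a job is simply the hourly price of the machine multiplied by the hours it runs, and it uses the 2x price and 20x speedup quoted above. The function names are illustrative.

```python
def relative_job_cost(hourly_cost_ratio: float, speedup: float) -> float:
    """Cost of the accelerated job as a fraction of the original job cost.

    hourly_cost_ratio: accelerated machine's hourly price / CPU-only hourly price
    speedup:           how many times faster the job finishes
    """
    if hourly_cost_ratio <= 0.0 or speedup <= 0.0:
        raise ValueError("both arguments must be positive")
    # Job cost = hourly price x hours. The price rises by hourly_cost_ratio
    # while the hours shrink by speedup.
    return hourly_cost_ratio / speedup


def savings_factor(hourly_cost_ratio: float, speedup: float) -> float:
    """How many times cheaper the accelerated job is than the original."""
    return 1.0 / relative_job_cost(hourly_cost_ratio, speedup)


if __name__ == "__main__":
    # Figures from the text: compute costs double, run time drops 20x.
    print(f"relative cost: {relative_job_cost(2.0, 20.0):.2f} of the original")
    print(f"savings: {savings_factor(2.0, 20.0):.0f}x")
```

Under this simple model, acceleration pays off whenever the speedup exceeds the increase in hourly price. Paying 2x per hour for a 20x shorter run leaves a bill one tenth the size.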
Beyond that, generative AI today is in its first wave, in which infrastructure players such as ourselves and all the cloud service providers put infrastructure in the cloud so that developers can use these machines to train models, fine-tune models, add guardrails to models, and so on. Because demand is so high, for every $1 spent with us, cloud service providers can earn $5 in rental income. This is happening globally, and demand for it is extremely high.
We have seen some applications, including well-known ones such as OpenAI's ChatGPT, GitHub's Copilot, and the code generators we use internally, that deliver incredible productivity improvements. Every software engineer in our company now uses code generators, whether the one we created for CUDA, the one for USD (another language used in our company), or the generators for Verilog, C, and C++.
Therefore, I believe the days when every line of code was written by software engineers are over. In the future, every software engineer will have a digital engineer by their side, available 24/7 to assist with work. That's the future. So, when I look at NVIDIA, we have 32,000 employees, but there will be many more digital engineers around them, possibly 100 times more digital engineers.
5. Many industries are embracing these changes. Which use cases and industries are you most excited about?
Huang Renxun: In our company, we use AI in computer graphics. Without artificial intelligence, we can no longer do computer graphics. We compute only one pixel and then infer the other 32. In other words, we "imagine" the other 32 pixels to some extent, and they are visually stable and look photorealistic. The image quality and performance are both excellent.
Calculating one pixel requires a lot of energy, while inferring the other 32 pixels requires very little energy and can be done very quickly. So, AI is not just about training models, that's just the first step. What's more important is how you use the models. When you use models, you save a lot of energy and time.
Without AI, we would not be able to serve the autonomous driving industry. Without AI, our work in robotics and digital biology would also be impossible. Now almost every techbio company is built around Nvidia, using our data processing tools for new protein generation, small molecule generation, virtual screening, and other areas that will be completely reshaped by artificial intelligence.
6. Let's talk about competition and your competitive barriers. Currently, there are many public and private companies that hope to break your leadership position. How do you view your competitive barriers?
Huang Renxun: First of all, I think there are a few things that set us apart. The first point to remember is that AI is not just about chips. AI is about the entire infrastructure. Today's computers are not built by manufacturing a chip that people buy and put into a computer. That model belongs to the 1990s. Today's computers are developed as supercomputing clusters, infrastructure, or supercomputers. It is not just a chip, and it is not quite a single computer.
So, in fact, we are building the entire data center. If you look at one of our supercomputer clusters, you will find that the software required to manage the system is very complex. There is no "Microsoft Windows" that can be used directly on these systems. We develop this customized software ourselves for these superclusters. It is therefore natural for the same company to design the chips, build the supercomputers, and develop the complex software, which ensures optimization, performance, and efficiency.
Secondly, AI is fundamentally an algorithm. We are very good at understanding how algorithms work and how the computing stack distributes computations, and how to run on millions of processors for days, maintaining the stability of the computer, energy efficiency, and the ability to complete tasks quickly. We are very good at this.
Finally, the key to AI computing is the installed base. It is important to have a unified architecture that spans across all cloud computing platforms and on-premise deployments. Whether you are building supercomputer clusters in the cloud or running AI models on a device, there should be the same architecture to run all the same software. This is called the installed base. And this consistency in architecture since 1993 is one of the key reasons why we have achieved what we have today.
Therefore, if you want to start an AI company today, the most obvious choice is to use Nvidia's architecture because we are already present on all cloud platforms. No matter which device you choose, as long as it has the Nvidia logo, you can run the same software directly.
7. Blackwell is 4 times faster in training and 30 times faster in inference speed compared to its predecessor product, Hopper. With such a fast pace of innovation, can you maintain this rhythm? Can your partners keep up with your pace of innovation?
Huang Renxun: Our fundamental innovation approach is to ensure that we constantly drive architectural innovation. The innovation cycle for each chip is about two years, at best. Each year, we also perform midterm upgrades, but the overall architectural innovation is about once every two years, which is already very fast.
We have seven different chips that collectively act on the entire system. We can introduce a new AI supercomputing cluster every year, which is more powerful than the previous generation. This is because we have multiple parts that can be optimized. Therefore, we can deliver higher performance very quickly, and this performance improvement directly translates into a decrease in total cost of ownership (TCO).
Blackwell's performance improvement means that a customer with 1 gigawatt of power can earn three times the revenue. Performance translates directly into throughput, and throughput translates into revenue. If you have 1 gigawatt of power available, you can earn three times the revenue.
Therefore, the return on this performance improvement is unparalleled, and a 3x revenue gap cannot be made up by cutting chip costs.
8. How do you view your dependence on the Asian supply chain?
Huang Renxun: The supply chain in Asia is very complex and highly interconnected. An Nvidia GPU is not just a chip. It is a complex system made up of thousands of components, similar in structure to an electric car. The supply chain network in Asia is therefore extensive and complex. We strive to design diversity and redundancy into every link, so that if problems arise, we can quickly shift production elsewhere. Overall, even if the supply chain is disrupted, we are able to adapt and ensure continuity of supply.
We are currently manufacturing at Taiwan Semiconductor because it is the best in the world, not just a little better, but much better. We have a long history of cooperation with them, and their flexibility and scale capabilities are very impressive.
Last year, our revenue saw significant growth, thanks to the fast response of the supply chain. Taiwan Semiconductor's agility and their ability to meet our needs are remarkable. In less than a year, we have significantly increased our production capacity, and we will continue to expand next year, and further expand the following year. Therefore, their agility and capability are excellent. However, if needed, we can certainly turn to other suppliers.
9. Your company is in a very advantageous market position. We have discussed many excellent topics. What are you most worried about?
Huang Renxun: Currently, our company collaborates with every AI company globally and every datacenter. I don't know of any cloud computing service provider or computer manufacturer that we do not collaborate with. Therefore, with such an expansion in scale, we bear a huge responsibility. Our customers are very emotional because our products directly impact their income and competitiveness. The demand is high, and the pressure to meet these demands is also significant.
We are currently in full production of Blackwell and plan to begin shipping in the fourth quarter and keep ramping up from there. Demand is so high that everyone wants to get the product as soon as possible and secure the largest share. Such a tense and intense atmosphere is unprecedented.