
Jensen Huang: Blackwell Is So Sought-After That Customers Are Getting Frustrated; Nvidia Shares Turn Higher, Up More Than 6.5%

wallstreetcn ·  Sep 12 02:40

Nvidia CEO Jensen Huang said that supply of the company's Blackwell AI chips can only grow so fast, which has frustrated some customers. He also hinted that, if necessary, Nvidia could reduce its reliance on TSMC and turn to other chip manufacturers. In addition, the U.S. government is reportedly considering allowing Nvidia to export advanced chips to Saudi Arabia.

Jensen Huang, CEO of Nvidia, the company at the center of the AI boom, said on Wednesday that Nvidia's products have become the most sought-after commodities in the technology industry. Customers are competing for limited supply, and the constrained pace at which TSMC can ramp production of the Blackwell AI chip in particular has frustrated some of them. He also hinted that, if necessary, Nvidia could reduce its dependence on TSMC and turn to other chip manufacturing suppliers.

He told the audience at the Goldman Sachs technology conference in San Francisco:

"The demand for our products is so high that everyone wants to be the first to get them and get the most share. Today we may have more emotional customers, which is understandable. The relationship is very tense, but we are doing our best."

Huang told the audience that demand for the company's latest-generation AI chip, Blackwell, is intense. Nvidia outsources Blackwell's production, and Huang said its suppliers are doing their best to keep up with demand and are making progress.

However, most of Nvidia's revenue depends on a handful of customers, such as the data center operators Microsoft and Meta Platforms Inc. Asked whether the huge AI spending is delivering a return on investment for customers, Huang said companies have no choice but to embrace "accelerated computing." He explained that Nvidia's technology not only speeds up traditional workloads such as data processing, but also handles AI tasks that older technologies cannot cope with.

Huang also said that Nvidia relies heavily on TSMC for chip production because TSMC leads the industry in chip manufacturing.

He added, however, that Nvidia has developed most of its technology in-house, which would allow the company to shift orders to other suppliers — though he cautioned that such a change could lead to a decline in chip quality.

"TSMC's agility and their ability to respond to our needs are truly incredible. We chose them because they are excellent, but if necessary, of course, we can turn to other suppliers."

In addition, reports said the U.S. government is considering allowing Nvidia to export advanced chips to Saudi Arabia, which could help the country train and run the most powerful AI models. People working with the Saudi Data and AI Authority said Saudi Arabia is working to comply with U.S. security requirements in order to speed up the process of obtaining these chips.

After the interview, Nvidia's stock turned from an intraday decline to a gain, rising more than 6.5% during trading to $115.18 and helping pull the Nasdaq from a 1.6% intraday loss to a 1.46% gain. Nvidia's shares have more than doubled this year, after rising 239% in 2023.


The following is an excerpt from the interview with Jensen Huang:

1. First, tell us about your thinking when you founded the company 31 years ago. Since then, you have transformed it from a GPU company focused on gaming into one that provides a wide range of hardware and software for the data center industry. Can you talk about that journey? What were you thinking at the start? How did it evolve? What are your key priorities for the future, and how do you view the world ahead?

Jensen Huang: I think one thing we did right was foreseeing another form of computing that could augment general-purpose computing and solve problems general-purpose tools can never solve. That processor initially took on something extremely difficult for CPUs: computer graphics.

But we gradually expanded into other areas. The first we chose was, of course, image processing, which is complementary to computer graphics. We expanded into physical simulation because in video games, the field we had chosen, you want the world to be not only beautiful but also dynamic, so that it can serve as a virtual world. Step by step, we brought it to scientific computing: one of the first applications was molecular dynamics simulation, and another was seismic processing, which, like CT reconstruction, is essentially a form of inverse physics. So, step by step, we solved problems and expanded into adjacent industries.

The core philosophy we have always adhered to is that accelerating computing can solve interesting problems. Our architecture remains consistent, which means that software developed today can run on a large installed base you leave behind, and software developed in the past can be accelerated with new technology. This mindset of architecture compatibility, creating a large installed base, and developing together with the ecosystem has been with us since 1993 and continues to this day. This is why NVIDIA's CUDA has such a huge installed base because we have been protecting it. Protecting the investment of software developers has always been our top priority.

Looking to the future, some of the problems we solved along the way — including learning how to be a founder, how to be a CEO, how to run a business, and how to build a company — were new skills. It's a bit like inventing the modern computer gaming industry. People may not realize it, but Nvidia has the world's largest installed base for a video game architecture: GeForce has about 300 million players, and it is still growing fast and very active. So every time we enter a new market, we need to learn new algorithms and market dynamics, and create new ecosystems.

The reason we need to do this is that, unlike a general-purpose computer — where, once it is built, everything eventually runs on it — we build accelerated computers, which means you have to ask yourself: what do you want to accelerate? There is no such thing as a universal accelerator.

2. Let's talk in depth about the difference between general-purpose computing and accelerated computing.

Jensen Huang: If you look at software today, the programs you write contain a lot of file input and output, sections that set up data structures, and then some magical algorithmic cores. Those cores differ depending on whether they are for computer graphics, image processing, or something else — they can be in the fluid, particle, inverse-physics, or image domain. So these algorithms are all different. If you create a processor specialized for those algorithms and let it complement the CPU, which handles the tasks it is good at, you can, in theory, dramatically accelerate the application. The reason is that usually 5% to 10% of the code takes up 99.99% of the running time.

Therefore, if you offload that 5% of code to our accelerator, you can technically speed up the application by 100 times. This is not uncommon. We often accelerate image processing by 500 times. Now we are doing data processing. Data processing is one of my favorite applications because almost everything related to machine learning is evolving. It can be SQL data processing, Spark-like data processing, or vector database-like processing, handling unstructured or structured data, which are data frames.
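The 100x and 500x figures follow from Amdahl's law: when a small hot fraction of runtime is offloaded, the overall speedup approaches the accelerator's speedup on that fraction. A back-of-the-envelope sketch (the function name and numbers here are illustrative, not Nvidia's):

```python
def amdahl_speedup(hot_fraction_of_runtime: float, accel_factor: float) -> float:
    """Overall application speedup when the hot fraction of total runtime
    is accelerated by accel_factor and the rest runs unchanged."""
    return 1.0 / ((1.0 - hot_fraction_of_runtime)
                  + hot_fraction_of_runtime / accel_factor)

# If 5% of the code accounts for 99.99% of runtime and the accelerator
# runs that part 500x faster, the whole application approaches that ceiling:
print(round(amdahl_speedup(0.9999, 500), 1))  # ≈ 476.2
```

The takeaway matches the interview: because the hot kernel dominates runtime so completely, accelerating just that sliver of code speeds up the entire application by orders of magnitude.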

We accelerate these enormously, but to do so you need to create the libraries on top. In computer graphics, we were fortunate to have Silicon Graphics' OpenGL and Microsoft's DirectX, but beyond those, no such libraries really existed. So, for example, one of our most famous libraries plays a role analogous to SQL: just as SQL is a library for in-storage computing, we created the world's first library for neural network computing.

We have cuDNN (a library for neural network computing), cuOpt (a library for combinatorial optimization), cuQuantum (a library for quantum circuit simulation), and many others, such as cuDF for data frame processing, with functionality similar to SQL. All of these libraries had to be invented; they rearrange the algorithms in an application so that our accelerators can run them. If you use these libraries, you can achieve 100x acceleration and beyond, which is amazing.

So the concept is simple and compelling, but the problem is: how do you invent those algorithms and get the video game industry to use them, write the algorithms and get the entire seismic processing and energy industry to use them, write new algorithms and get the entire AI industry to use them? Do you see what I mean? For every one of these libraries, we first have to do the computer science research, and then we have to go through the process of developing the ecosystem.

We have to convince everyone to use these libraries, and then think about the kinds of computers they run on — every computer is different. So we moved step by step into one field after another. We created a very rich library for autonomous vehicles, an outstanding library for robotics, and incredible libraries for virtual screening — both physics-based and neural-network-based — as well as an amazing library for climate technology.

Therefore, we have to make friends and create markets. It turns out that what Nvidia is really good at is creating new markets. We have been doing this for so long now that Nvidia's accelerated computing seems to be everywhere, but we really have to complete it step by step, developing the market one industry at a time.

3. Many investors in the field are very concerned about the data center market. Can you share your views on medium- and long-term opportunities? Obviously, your industry is driving what you call the 'next industrial revolution'. How do you view the current state of the data center market and the future challenges?

Jensen Huang: Two things are happening at the same time, and they are often conflated; it helps to discuss them separately. First, assume there were no AI. Even in a world without AI, general-purpose computing has run out of steam. As we all know, certain principles of semiconductor physics — Moore's Law and Dennard scaling — have come to an end. We no longer see CPU performance doubling every year; we are lucky to see it double in ten. Moore's Law used to mean a tenfold performance increase every five years and a hundredfold every ten.

But now these have come to an end, so we have to accelerate everything that can be accelerated. If you are doing SQL processing, accelerate it; if you are doing any kind of data processing, accelerate it; if you are building an internet company with a recommendation system, it must be accelerated. The largest recommendation engines today are all accelerated. A few years ago these still ran on CPUs; now they are all accelerated. So the first dynamic is that the world's trillion dollars' worth of general-purpose data centers will be modernized into accelerated computing data centers. That is inevitable.

Moreover, because Nvidia's accelerated computing has driven costs down so dramatically, computing power has grown not a hundredfold but roughly a millionfold over the past decade. So the question becomes: if your plane could fly a million times faster, what would you do differently?

So people suddenly realized: why don't we let the computer write the software, instead of imagining the features and designing the algorithms ourselves? We just give all the data — all the example data — to the computer and let it find the algorithm. That is machine learning, generative AI. We have applied it at scale across many different data domains, where the computer not only knows how to process the data but understands its meaning. And because it understands multiple modalities of data at the same time, it can translate between them.

Therefore, we can convert from English to images, from images to English, from English to proteins, and from proteins to chemicals. Because it understands all the data, it can perform all these translation processes, which we call generative AI. It can convert a large amount of text into a small amount of text, or expand a small amount of text into a large amount of text, and so on. We are now in the era of this computer revolution.

What is amazing now is that the first batch of data centers worth tens of trillions of dollars will be accelerated, and we have also invented this new type of software called generative AI. Generative AI is not just a tool, it is a skill. It is because of this reason that new industries are being created.

Why is that? Until now, the entire IT industry has made tools and instruments for people to use. For the first time, we are creating skills that augment human abilities. That is why people believe AI will go beyond the tens of trillions of dollars of value in data centers and the IT industry, and into the world of skills.

So, what is a skill? Digital currency is a skill; autonomous cars are a skill; so are digital assembly-line workers, robots, digital customer service chatbots, even digitally planning Nvidia's supply chain — that could be a digital SAP agent. Our company is a heavy user of ServiceNow, and we now have digital employee services. So we now have these digital humans — that is the AI wave we are in.

4. There is an ongoing debate in the financial market about whether the investment return is sufficient as we continue to build AI infrastructure. How do you assess the return on investment that clients have received in this cycle? If you look back on history, look back on PCs and cloud computing, how was the return on investment in similar adoption cycles? What are the differences compared to now?

Jensen Huang: That's a very good question. Let's take a look. Before cloud computing, the biggest trend was virtualization, if you remember. Virtualization basically meant we virtualized all the hardware in the data center into a virtual data center, and could then move workloads across the data center without tying them to a specific machine. The result was a 2x to 2.5x reduction in data center costs, almost overnight.

Then, we put these virtual computers into the cloud, and as a result, not just one company, but many companies can share the same resources, costs drop again, and utilization rates rise again.

All the progress of those years obscured the fundamental change happening underneath: the end of Moore's Law. We squeezed a 2x or greater cost reduction out of rising utilization, but that, too, ran into the limits of transistors and CPU performance.

And all of those utilization gains have now reached their limits, which is why we are seeing inflation in data center and computing costs. So the first thing happening is the acceleration of computation. When you process data with Spark, for example — one of the most widely used data processing engines in the world today — accelerating it with Nvidia accelerators can deliver a 20x speedup. That means you save 10 times the cost.

Of course, your computational costs will increase slightly because you'll need to pay for NVIDIA GPUs, and computational costs may double, but you'll reduce computation time by 20 times. So ultimately, you'll save 10 times the cost. And such a return on investment is not uncommon for accelerated computing. So, I would recommend accelerating any work that can be accelerated and using GPUs for acceleration to immediately gain ROI.
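The arithmetic behind "pay roughly twice as much per hour, finish 20x sooner, save 10x overall" is simple enough to sketch (the function name and numbers are illustrative, taken from Huang's example above):

```python
def net_cost_ratio(speedup: float, hourly_cost_multiplier: float) -> float:
    """Total accelerated cost relative to the CPU baseline: the job runs
    speedup-times shorter but each hour costs hourly_cost_multiplier-times more."""
    return hourly_cost_multiplier / speedup

# Huang's Spark example: 20x faster, hardware roughly 2x the hourly cost
# -> the total bill is about one tenth of the original.
print(net_cost_ratio(20, 2))  # 0.1
```

The design point is that total cost is (time × price per hour), so any speedup larger than the price premium yields a net saving; here 20 / 2 leaves a 10x reduction.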

Beyond that, generative AI is the first wave of AI under discussion today. Infrastructure players — ourselves and all the cloud service providers — put infrastructure in the cloud so developers can use these machines to train models, fine-tune models, guard models, and so on. Because demand is so high, for every $1 spent with us, a cloud service provider can earn $5 in rental revenue. This is happening all over the world, and demand is extremely high.

We have seen applications — including well-known ones like OpenAI's ChatGPT and GitHub's Copilot, as well as the code generators we use internally — deliver incredible productivity gains. Every software engineer in our company now uses a code generator, whether it is the one we built for CUDA, the one for USD (another language we use in the company), or the generators for Verilog, C, and C++.

Therefore, I believe the days when every line of code was written by software engineers are over. In the future, every software engineer will have a digital engineer by their side, available 24/7 to assist with work. That's the future. So, when I look at NVIDIA, we have 32,000 employees, but there will be many more digital engineers around them, possibly 100 times more digital engineers.

5. Many industries are embracing these changes. Which use cases and industries are you most excited about?

Jensen Huang: In our own company, we use AI in computer graphics. We can no longer do computer graphics without artificial intelligence. We compute only one pixel and then infer the other 32. In other words, we "imagine" the other 32 pixels, and they are visually stable and photorealistic, with excellent image quality and performance.

Calculating one pixel requires a lot of energy, while inferring the other 32 pixels requires very little energy and can be done very quickly. So, AI is not just about training models, that's just the first step. What's more important is how you use the models. When you use models, you save a lot of energy and time.
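The energy claim can be sketched with hypothetical per-pixel costs (all numbers below are made up for illustration; Nvidia does not publish the ratios in this form). The point is that rendering one pixel and cheaply inferring the rest collapses the blended cost per pixel:

```python
def tile_energy(rendered_cost: float, inferred_cost: float,
                inferred_per_rendered: int = 32) -> float:
    """Energy for a 33-pixel tile: one fully rendered pixel
    plus 32 inferred ones, in arbitrary units."""
    return rendered_cost + inferred_per_rendered * inferred_cost

# Hypothetical costs: rendering a pixel is 100x more expensive than
# inferring one. Compare rendering all 33 pixels vs. render-1-infer-32.
full_render = 33 * 10.0              # every pixel rendered at cost 10
hybrid = tile_energy(10.0, 0.1)      # one rendered, 32 inferred at cost 0.1
print(round(full_render / hybrid, 1))  # 25.0 — ~25x less energy per tile
```

Under these assumed numbers, inference-heavy rendering uses about 1/25 of the energy of brute-force rendering, which is the shape of the trade-off Huang describes.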

Without AI, we could not serve the autonomous driving industry; nor would our work in robotics and digital biology be possible. Almost every biotech company now works with Nvidia, using our data processing tools for generating new proteins, small-molecule generation, virtual screening, and other areas that will be completely reshaped by artificial intelligence.

6. Let's talk about competition and your competitive barriers. Currently, there are many public and private companies that hope to break your leadership position. How do you view your competitive barriers?

Jensen Huang: First of all, I think several things make us different. The first thing to remember is that AI is not just about a chip; AI is about the entire infrastructure. Today's computing is not a matter of manufacturing a chip that people buy and put in a computer — that model belongs to the 1990s. Today's computers are developed as superclusters, as infrastructure, as supercomputers. They are not just a chip, and they are not just a computer.

So we are in fact building the entire data center. If you look at one of our superclusters, you will find that the software required to manage the system is enormously complex. There is no "Microsoft Windows" you can just drop onto these systems; we develop that custom software ourselves for these superclusters. So it is natural that the same company designs the chips, builds the supercomputers, and develops the complex software, ensuring optimization, performance, and efficiency.

Secondly, AI is fundamentally an algorithm. We are very good at understanding how algorithms work and how the computing stack distributes computations, and how to run on millions of processors for days, maintaining the stability of the computer, energy efficiency, and the ability to complete tasks quickly. We are very good at this.

Finally, the key to AI computing is the installed base. It is important to have a unified architecture that spans across all cloud computing platforms and on-premise deployments. Whether you are building supercomputer clusters in the cloud or running AI models on a device, there should be the same architecture to run all the same software. This is called the installed base. And this consistency in architecture since 1993 is one of the key reasons why we have achieved what we have today.

Therefore, if you want to start an AI company today, the most obvious choice is to use Nvidia's architecture because we are already present on all cloud platforms. No matter which device you choose, as long as it has the Nvidia logo, you can run the same software directly.

7. Blackwell is 4 times faster in training and 30 times faster in inference speed compared to its predecessor product, Hopper. With such a fast pace of innovation, can you maintain this rhythm? Can your partners keep up with your pace of innovation?

Jensen Huang: Our fundamental approach is to keep driving architectural innovation. The innovation cycle for each chip is about two years, at best. Each year we also do a mid-cycle upgrade, but a full architectural refresh comes roughly every two years, which is already very fast.

We have seven different chips that collectively act on the entire system. We can introduce a new AI supercomputing cluster every year, which is more powerful than the previous generation. This is because we have multiple parts that can be optimized. Therefore, we can deliver higher performance very quickly, and this performance improvement directly translates into a decrease in total cost of ownership (TCO).

Blackwell's performance gains mean that a customer with 1 gigawatt of power can generate three times the revenue: performance translates directly into throughput, and throughput translates into revenue.

The return on that performance improvement is therefore unmatched, and a 3x revenue gap is not something you can make up for by cutting chip prices.

8. How do you view your dependence on the Asian supply chain?

Jensen Huang: The supply chain in Asia is very complex and highly interconnected. An Nvidia GPU is not just a chip; it is a complex system composed of thousands of components, not unlike an electric vehicle. So the supply chain network in Asia is extensive and intricate. We strive to design diversity and redundancy into every stage, so that if problems occur, we can quickly shift production elsewhere. Overall, even if the supply chain is disrupted, we have the ability to adjust and ensure continuity of supply.

We currently manufacture at TSMC because it is the best in the world — not just a little better, but much better. We have a long history of working with them, and their flexibility and ability to scale are very impressive.

Last year our revenue grew substantially, which owes much to how quickly the supply chain responded. TSMC's agility and their ability to meet our needs are remarkable. In less than a year we significantly increased production capacity, and we will continue to expand next year and the year after. So their agility and capability are excellent. But if necessary, of course, we can turn to other suppliers.

9. Your company is in a very advantageous market position, and we have discussed many great topics. What are you most worried about?

Jensen Huang: Today our company works with every AI company in the world and with every data center. I don't know of a single cloud service provider or computer maker we don't work with. With that scale comes enormous responsibility. Our customers are emotional, because our products directly affect their revenue and their competitiveness. Demand is high, and the pressure to meet it is significant.

We are currently in full production of Blackwell and plan to start deliveries and further expansion in the fourth quarter. The demand is so high that everyone hopes to get the product as soon as possible to secure the largest share. This unprecedented tense and intense atmosphere is truly remarkable.

While it is very exciting to create the next generation of computer technology and see the innovation of various applications, we feel a huge responsibility and significant pressure. But we strive to do our best. We have adapted to this intensity and will continue to work hard.

Disclaimer: This content is for informational and educational purposes only and does not constitute a recommendation or endorsement of any specific investment or investment strategy.