Jensen Huang's latest 10,000-word interview: AGI is coming soon, and AI will completely change productivity.

wallstreetcn ·  Oct 14 07:52

The machine-learning flywheel matters most; a powerful GPU alone does not guarantee a company's success in AI.

On October 4th, NVIDIA CEO Jensen Huang appeared on the interview program Bg2 Pod, engaging in a wide-ranging conversation with hosts Brad Gerstner and Clark Tang.

They discussed scaling intelligence toward AGI, NVIDIA's competitive advantages, the importance of inference and training, future market dynamics in AI, AI's impact on various industries, Elon Musk's Memphis supercluster and xAI, OpenAI, and more.

Huang emphasized the rapid evolution of AI technology, especially breakthroughs on the path to artificial general intelligence (AGI). He said AGI assistants will soon appear in some form and become more sophisticated over time.

Huang also discussed NVIDIA's leadership in the computing revolution, pointing out that by reducing computing costs and innovating on hardware architecture, NVIDIA has a significant edge in driving machine learning and AI applications. He highlighted NVIDIA's 'moat': an ecosystem of software and hardware accumulated over more than a decade, which competitors cannot overcome with a single chip improvement.

Furthermore, Huang praised xAI and Musk's team for standing up the Memphis supercluster of one hundred thousand GPUs in just 19 days, calling it an 'unprecedented' achievement. The cluster is easily one of the fastest supercomputers in the world and will play a crucial role in AI training and inference.

Discussing AI's impact on productivity, Huang was optimistic: AI will greatly improve companies' efficiency and bring more growth opportunities rather than mass unemployment. He also called on the industry to focus more on AI safety, to ensure the technology is developed and used for the benefit of society.

The key points of the full conversation are summarized as follows:

  • (AGI Assistant) will soon appear in some form... At first, it will be very useful but not perfect. Then over time, it will become more and more perfect.
  • We reduced the marginal cost of computation by 100,000 times in 10 years. Our entire stack is growing, the whole stack is innovating.
  • People think that the reason for designing better chips is that they have more flops, more bits and bytes... But machine learning is not just software; it's about the entire data pipeline.
  • The flywheel of machine learning is the most important. You have to consider how to make this flywheel faster.
  • Having powerful GPUs alone does not guarantee a company's success in the field of AI.
  • Musk's understanding of large system engineering, construction, and resource allocation is unique... One hundred thousand GPUs as a cluster... Completed in 19 days.
  • AI will not change every job, but it will have a significant impact on how people work. When companies use AI to improve productivity, it usually results in better profits or growth.

The evolution of AGI and AI assistants

Brad Gerstner:

The theme this year is scaling intelligence toward AGI. When we did this two years ago, it was the AI era, two months before ChatGPT. Considering everything that has changed since, it's truly incredible. So I think we can start with a thought experiment and a prediction.

If I simplistically imagine AGI as a personal assistant in my pocket, a conversational assistant that knows everything about me, has a perfect memory of me, can communicate with me, and can book hotels or make doctor's appointments for me. Given the speed of change in the world today, when do you think we will have that personal assistant?

Huang Renxun:

It will soon emerge in some form, and over time the assistant will only get better and better. That's the marvel of the technology we know. So I think initially it will be very useful but not perfect, and over time it will become more and more perfect, like all technologies.

Brad Gerstner:

When we look at the rate of change — I think Musk said the only thing that truly matters is the rate of change — we do feel it has accelerated dramatically. This is the fastest rate of change we've seen, and we've been working at AI for ten years or longer. Is this the fastest rate of change you have seen in your career?

Huang Renxun:

This is because we reinvented computing. Much of this happened because we reduced the marginal cost of computing by 100,000 times in 10 years; Moore's Law would have given roughly 100 times. We achieved this in multiple ways. First, we introduced accelerated computing, moving work that is inefficient on the CPU onto the GPU. We achieved it by inventing new numerical precisions, by new architectures, by inventing Tensor Cores, by building NVLink systematically, with extremely fast memory, scaling up with NVLink, and working across the entire stack. Essentially, everything I described about how NVIDIA works has produced a pace of innovation beyond Moore's Law.
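To put numbers on that comparison, here is a back-of-the-envelope sketch (ours, assuming smooth annual compounding of the two figures Huang cites):

```python
# Compare the implied annual improvement rates, assuming smooth compounding:
# Huang's 100,000x over 10 years vs. a Moore's-Law-style ~100x over 10 years.
nvidia_rate = 100_000 ** (1 / 10)  # ~3.16x per year
moore_rate = 100 ** (1 / 10)       # ~1.58x per year (roughly doubling every 18 months)

print(f"Implied full-stack rate: {nvidia_rate:.2f}x/year")
print(f"Implied Moore's-Law rate: {moore_rate:.2f}x/year")
```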

What's truly amazing is that since then we've shifted from hand coding to machine learning, and the magic of machine learning is that it learns very fast; that has been proven. So when we redefined how computation is distributed, we did all kinds of parallelism: tensor parallelism, various forms of pipeline parallelism. We became good at inventing new algorithms and training methods on top of that, and all of these technologies, all of these inventions, compound on one another.
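To make the tensor-parallelism idea concrete, here is a minimal sketch (ours, not NVIDIA's implementation): a single large matrix multiply is split column-wise across hypothetical "devices", computed shard by shard, and reassembled.

```python
import numpy as np

# Toy tensor parallelism: shard one large matmul column-wise across "devices",
# compute the shards independently, then concatenate the partial outputs.
x = np.random.randn(8, 512)            # activations
W = np.random.randn(512, 1024)         # weight matrix to be sharded

shards = np.split(W, 4, axis=1)        # 4-way column split, one shard per device
partial = [x @ w for w in shards]      # each device computes its own output slice
y = np.concatenate(partial, axis=1)    # gather the slices back together

assert np.allclose(y, x @ W)           # same answer as the unsharded matmul
```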

Looking back at how the Moore's Law era worked, software was static: pre-compiled, shrink-wrapped, placed in a store. It stayed static while the hardware underneath improved at Moore's-Law speed. Now our entire stack is growing, the entire stack is innovating. So I think we're suddenly seeing scaling.

Of course, this is extraordinary. But what we used to talk about was pre-trained models and scaling at that level: doubling the model size and correspondingly doubling the data size, so the required compute roughly quadrupled every year. That was a big deal. But now we see scaling in post-training, and we see scaling in inference. People used to think pre-training was hard and inference was easy; now everything is hard. That makes sense — the idea that all human thinking is one-shot is a bit absurd. So there had to be concepts of fast thinking and slow thinking, reasoning, reflection, iteration, simulation. Now it's emerging.

NVIDIA's competitive moat

Clark Tang:

I think one of the most misunderstood things about NVIDIA is how deep the NVIDIA moat really is. There's a notion that if someone invents a better chip, they win. But the fact is you've spent ten years building a complete stack, from GPU to CPU to networking, and especially the software and libraries that let applications run on NVIDIA. So, when you think about NVIDIA's moat today, do you think it is bigger or smaller than it was three or four years ago?

Huang Renxun:

Well, I appreciate your recognition of how computing has changed. People believed (and many still do) that you design a better chip by giving it more flops, more bits and bytes. Do you understand what I mean? You see it in their keynote slides, full of flops and bar charts. All of that is fine — horsepower does matter. These things fundamentally matter.

However, that is the old way of thinking — the world in which software is some static application running on Windows. Static software means the best way to improve the system is to make chips faster. But we realized machine learning is not human programming. Machine learning is not just software; it is the entire data pipeline. The flywheel of machine learning is what matters most. So how do you think about enabling this flywheel, on the one hand enabling data scientists and researchers to work efficiently in every stage of it, starting from the very beginning? Many people don't even realize that it takes AI to curate the data that teaches AI, and that AI itself is quite complex.

Brad Gerstner:

Is the AI that curates the data itself improving? Is it also accelerating? Again, when we consider competitive advantage, right — it's the combination of all of these.

Huang Renxun:

Exactly — smarter AI curating the data is what leads to this. Now we even have synthetic data generation and all kinds of ways of presenting data. So before the data ever reaches training, there is already massive data processing involved. People think, oh, PyTorch — that's the beginning of the world and the end of it. PyTorch is very important, but it is not the whole story.

But don't forget there is work before PyTorch and after it. The point of the flywheel is that you have to think about the entire flywheel: how do I design a computing system, a computing architecture, that leverages the whole flywheel and makes it as efficient as possible? It is not about how fast one application's training step is. Does that make sense? That is just one step. Every step of the flywheel is difficult. So the first thing you should do is not ask how to make Excel faster or how to make Doom faster — that's the past. Now you have to ask: how do I make this flywheel faster? The flywheel has many different steps, and machine learning is not easy, as you know.

What the OpenAI, xAI, or Gemini teams are doing is not easy — they are thinking deeply about this. So we decided: this is what you should be thinking about. It's the whole process, and you want to accelerate every part of it. You must respect Amdahl's law: if a step is 30% of the time and I accelerate it by three times, I haven't really accelerated the whole process. Does that make sense? You really want to build a system that speeds up every step, because only by doing the whole thing do you substantially improve the cycle time — the flywheel, the rate of learning — which is ultimately what drives the exponential.
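The 30% example is Amdahl's law in action; here is a minimal sketch of the arithmetic (ours, using Huang's numbers):

```python
# Amdahl's law: end-to-end speedup when only a fraction of the work is accelerated.
def overall_speedup(fraction: float, step_speedup: float) -> float:
    """Overall speedup when `fraction` of total time gets `step_speedup`."""
    return 1.0 / ((1.0 - fraction) + fraction / step_speedup)

print(overall_speedup(0.30, 3.0))  # ~1.25x: tripling a 30% step barely moves the total
print(overall_speedup(1.00, 3.0))  # 3.0x: only accelerating everything gives the full gain
```

Tripling a step that is 30% of the cycle yields only a 1.25x end-to-end gain, which is why every step of the flywheel has to be accelerated.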

So what I want to say is that our view of what the company really does shows up in our products. Notice I keep talking about this flywheel — the entire cycle. We accelerate everything.

Right now a major focus is video. Many people are working on physical AI and video processing. Imagine the front end: terabytes of data entering the system every second. A pipeline has to ingest all of that data and prepare it before training even begins. That way the entire process is accelerated.

Clark Tang:

Today, people only think about text models. But in the future it will be about these video models, and about using text models like o1 to actually process huge amounts of data before we get there.

Huang Renxun:

Yes. Language models will be involved in everything. Our industry has spent enormous technology and energy training these large language models, and now we use large language models at every step. It's amazing.

Brad Gerstner:

What I hear you saying is that in a combined system the advantage grows over time — that our advantage today is greater than three or four years ago because you are improving every component. It's the combination. Think of Intel as a business case study: it had a dominant moat, a dominant position in the stack, relative to where you are now. Perhaps, to summarize, compare your competitive advantage with theirs at their peak.

Huang Renxun:

What set Intel apart is that it was perhaps the first company exceptionally good at manufacturing — process technology — and making chips. Designing chips, building chips on the x86 architecture, making ever faster x86 chips: that was their talent, and they fused it with manufacturing.

Our company is a bit different, and we recognized this fact: parallel processing does not require every transistor to be excellent, while serial processing does. Parallel processing wants a large number of transistors that are cost-effective. I would rather have ten times more transistors and be 20% slower than have ten times fewer transistors and be 20% faster. Does that make sense? They want the opposite. So single-thread performance, single-thread processing, and parallel processing are very different, and we observed that our world is genuinely evolving in the parallel direction. We want to be as good as we can be, but that is where the world is going.

Parallel computing, parallel processing, is challenging because every algorithm requires a different restructuring — the algorithm has to be redesigned for the architecture. What people don't realize is that you can have three different CPUs, each with its own C compiler, and compile the same software onto any of them.

That is impossible in accelerated computing. A company proposing an architecture must bring its own equivalent of OpenGL. That is how we revolutionized deep learning: with domain-specific libraries — cuDNN, our deep neural network library, cuQuantum for quantum simulation, and others.
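As an illustration of what "domain-specific library" means in practice, here is how a PyTorch user hits cuDNN without ever calling it directly — a sketch that assumes a CUDA-capable GPU is available:

```python
import torch
import torch.nn as nn

# Illustrative only: in PyTorch, convolutions on NVIDIA GPUs are dispatched to
# cuDNN under the hood; application code never calls the library directly.
torch.backends.cudnn.benchmark = True   # let cuDNN autotune the conv algorithm

conv = nn.Conv2d(3, 64, kernel_size=3, padding=1).cuda()   # assumes a CUDA GPU
x = torch.randn(1, 3, 224, 224, device="cuda")
y = conv(x)   # executed by a cuDNN kernel, not hand-written application code
```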

Brad Gerstner:

Those domain-specific algorithm libraries sit below the PyTorch layer that, as I often hear, everyone focuses on.

Huang Renxun:

If we hadn't invented them, none of the applications on top would work. Do you understand what I mean? Fusing the science above with the architecture underneath is what NVIDIA truly excels at.

NVIDIA is building a complete AI computing platform: hardware, software, and ecosystem.

Clark Tang:

Now all the attention is on inference. But I remember two years ago, when Brad and I asked you a question over dinner: do you think your moat will be as strong in inference as it is in training?

Huang Renxun:

I'm not sure if I ever said it would be stronger.

Clark Tang:

You just mentioned many of these elements — the compatibility between the two, the overall combination. For customers it's very important to keep flexibility between training and inference. Now that we're entering the era of inference, can you talk about that?

Huang Renxun:

Training is inference at enormous scale. What I mean is, you're right: if you train well, it's very likely you'll inference well. And if you build on this architecture, then without any extra effort it will run on this architecture. You can still optimize it for other architectures, but since it was built on NVIDIA architecture, it will run on NVIDIA.

The other side of it is capital investment: when you train a new model, you want to train on your best new equipment, which leaves behind the equipment you used yesterday — and that equipment is very well suited to inference. So trailing the new infrastructure is a whole installed base of compatible equipment. That's why we are rigorous about always staying compatible, so that everything we leave behind continues to excel.

We also put a lot of effort into continually reinventing algorithms, so that when the time comes, the Hopper architecture is two, three, or four times better than when it was purchased, and the infrastructure stays genuinely effective. All the work we do on new algorithms and new frameworks benefits every installed base we have: Hopper gets better, Ampere gets better, even Volta gets better.

I just heard from Sam Altman that OpenAI recently retired its Volta infrastructure. So we leave behind this trail of installed base, and it matters, just as every installed base of computing does. NVIDIA is in every cloud, on premises, and at the edge.

The VILA vision-language model can be created in the cloud and, without modification, run perfectly at the edge on a robot. Everything is architecturally compatible. So I believe architectural compatibility is very important for large systems, just as it is for the iPhone and other devices. The installed base is crucial for inference.

But what really helps us is that because we train these large language models on new architectures, we can think about how to create architectures that will one day excel at inference. We've been thinking about iterative reasoning models, and about how to create a highly interactive inference experience — your personal agent. You don't want it to go off and think for a while after you finish speaking; you want it to interact with you quickly. So how do we create something like that?

With NVLink, we can take systems that are excellent for training and, when that phase is done, have inference performance that is also excellent. If you want to optimize time to first token, that is actually very hard, because it requires enormous bandwidth. And if your context is long, you also need enormous FLOPS. So you need effectively infinite bandwidth and infinite FLOPS at the same time to deliver a response in a few milliseconds. That architecture is really hard to build, and it's why we invented Grace Blackwell with NVLink.
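A rough, illustrative calculation of the tension Huang describes, using assumed numbers (a hypothetical 70B-parameter model in fp16; the bandwidth figure is roughly H100-class HBM, not an NVIDIA quote):

```python
# Why time-to-first-token stresses bandwidth while long contexts stress FLOPS.
params = 70e9                 # hypothetical 70B-parameter model
bytes_per_param = 2           # fp16 weights
hbm_bandwidth = 3.35e12       # ~3.35 TB/s, roughly H100-class HBM

# Generating one token at batch size 1 requires streaming the weights through
# the chip at least once, so memory bandwidth bounds per-token latency:
per_token_s = params * bytes_per_param / hbm_bandwidth
print(f"weight-streaming bound per token: {per_token_s * 1e3:.1f} ms")   # ~41.8 ms

# Prefill over a long context is compute-bound: ~2 * params FLOPs per token.
context = 32_000
prefill_flops = 2 * params * context
print(f"prefill FLOPs for {context} tokens: {prefill_flops:.2e}")        # ~4.5e15
```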

Brad Gerstner:

Earlier this week I had dinner with Andy Jassy, Amazon's President and CEO. Andy said: we have Trainium and Inferentia coming. Most people again read that as a problem for NVIDIA. But then he said NVIDIA is an important partner for us and will continue to be; as far as I can see, the world runs on NVIDIA.

So when you think about the custom ASICs being built for targeted applications — Meta's inference accelerator, Amazon's training chips, Google's TPUs — and the supply shortages you face today: do these change the dynamic, or do they complement the systems customers buy from you?

Huang Renxun:

We are just doing different things. NVIDIA is trying to build a computing platform for this new world — the machine-learning world, the generative-AI world, the agentic-AI world. One of the profound things we did was recognize that after 60 years, the entire computing stack was due for reinvention: from programming to machine learning, from CPUs to GPUs, from software to AI, from software tools to AI applications. Every layer of the computing stack and the technology stack has changed.

What we want to create is a computing platform that is available everywhere. The complexity of our work is that, if you think carefully about what we do, you realize we are building an entire AI infrastructure and treating it as one computer. As I've said before, the data center is now the unit of computing. When I think of a computer, I am not thinking about a chip; I am thinking about this whole thing — all the software, all the orchestration, all the machines. That is my computer.

We try to build a new one every year. That's crazy; no one has ever done that. Every year we deliver two to three times the performance, so every year we cut costs by two to three times and improve energy efficiency by two to three times. So we ask customers not to buy everything at once but to buy a little each year, so their costs ride the improvement curve. Everything stays architecturally compatible, and building these things independently at our speed would be very difficult.

The doubly hard part is that we take all of this and, rather than selling it as a finished infrastructure or a service, we disaggregate it and integrate it into GCP, AWS, Azure, and the others — and everyone's integration is different. We have to integrate our architecture libraries, our algorithms, and our frameworks into their frameworks, our security into their systems, our networking into their systems. And then we basically do ten of these integrations, every year. That's a miracle.

Brad Gerstner:

I mean, you try to do this every year. It's crazy. What drives you to do it every year?

Huang Renxun:

Yes — when you break it down systematically, the more you break it down, the more astonishing it is. How does the entire electronics ecosystem today commit to working with us, so that we ultimately build a computer that integrates into all these different ecosystems and coordinates this seamlessly? What we propagate backward are APIs, methodologies, business processes, and design rules; what we propagate forward are methodologies, architectures, and APIs.

Brad Gerstner:

And those have stayed stable?

Huang Renxun:

They were built over decades of hard work, and they keep developing as we progress. And these interfaces all have to integrate together.

Clark Tang:

Some people just need to call the OpenAI API, and it works. That's it.

Huang Renxun:

Yes — it's a bit crazy. It's a whole. We invented this huge computing infrastructure, and the whole planet collaborates with us on it. It integrates everywhere: you can sell it through Dell or through HP; it's hosted in the cloud; it's everywhere. People are now using it in robotic systems, humanoid robots, self-driving cars — all architecturally compatible. Quite crazy.

Brad Gerstner:

This is too crazy.

Huang Renxun:

I don't want to leave the impression that I didn't answer the question — I did. When we talk about all these layers, what I'm describing is a way of thinking. We are simply doing different things. As a company, we want to be aware of our surroundings; I am very familiar with everything around the company and the ecosystem.

I know what everyone else is doing. Sometimes it works in our favor, sometimes it doesn't. I am very clear-eyed about that, but it doesn't change the company's goal. The company's sole goal is to build a platform architecture that is available everywhere. That is our goal.

We are not trying to take share from anyone. NVIDIA is a market maker, not a share taker. If you looked at the slides we don't show publicly, you would find that this company doesn't spend a single day talking about market share, not even internally. All we talk about is how we create the next thing.

What is the next problem we can solve in this flywheel? How can we serve people better? How can we take a flywheel that used to take about a year and shorten it to about a month? What is the speed of light — the limit?

So we think about all these different things, and we understand everything around us, but we're confident our mission is singular and unique. The only question is whether the mission is necessary. Does that make sense? All great companies should hold that at their core: what are you doing?

The only questions are: is it necessary? Is it valuable? Is it impactful? Does it help people? Say you are a developer, a generative-AI startup, deciding how to become a company.

One choice you don't have to make is which ASIC to support. If you build on CUDA, you can go anywhere, and you can always change your mind later. We are the on-ramp to the AI world.

Once you decide on our platform, you can postpone every other decision. You can always build your own chips later; we don't object, and we don't get upset about it. When we work with GCP, Azure, and the others, we show them our roadmap several years in advance.

They don't show us their chip roadmaps, and that has never offended us. Does that make sense? If you have a singular purpose, your goal is meaningful, and your mission is dear to you and to others, then you can be transparent. Note that my roadmap is public at GTC; for our friends at Azure, AWS, and other companies, it goes deeper. We have no problem doing any of this, even as they build their own chips.

Brad Gerstner:

When people look at the business — you recently said demand for Blackwell is insane, and that one of the hardest, most emotional parts of the job is saying 'no' when the world wants more computing than you can produce and deliver. But the critics say: wait, this is Cisco in 2000, overbuilding fiber; this will be a boom-and-bust cycle. I think back to our dinner in January '23, when the forecast for NVIDIA's 2023 revenue was $26 billion. You did $60 billion.

Huang Renxun:

Let the record show: that was the biggest forecasting failure in history. We can at least acknowledge that.

GPUs are playing an increasingly important role in AI computing.

Brad Gerstner:

Right — in November '22 we were very excited, because people like Mustafa from Inflection and folks from Character were coming in to talk about investing in their companies, and they said: if you can't invest in our company, buy NVIDIA, because everyone in the world is trying to get NVIDIA chips to build these world-changing applications. Of course the Cambrian moment was happening with ChatGPT. And yet the 25 analysts covering the company were so focused on the crypto winter that they couldn't imagine what was happening in the world. So it ended up being far bigger. In very plain English: demand for Blackwell is insane, and for as far as you can see, it continues. The future is of course unknown and unknowable — but why did the critics get it so wrong in thinking this is an overbuild, like Cisco in 2000?

Huang Renxun:

The best way to think about the future is from first principles, right? So what are the first principles of what we're doing? First: we are reinventing computing. As we just said, the future of computing will be heavily machine-learned. Almost everything we do, almost every application — Word, Excel, PowerPoint, Photoshop, Premiere, AutoCAD, your favorite tools — was hand-engineered. I assure you, in the future it will be heavily machine-learned. All of these tools will be like that, and on top of them you will have agents — machines — helping you use them. So we know this now; it's a fact. We have reinvented computing, and we are not going back. The entire computing stack is being reinvented. Given that: what software can be written will be different, and the way we use software will be different. Let's acknowledge that. These are my ground truths.

The question now is what happens next. Look at the installed base of computing from the past: a trillion dollars' worth. Just open the door and look at the data centers. Are those the computers you want for the future? The answer is no. All those CPUs — we know what they can and cannot do. We know we have a trillion dollars of data centers that need modernizing. So, as we speak, modernizing these old things over the next four to five years is not unreasonable.

So that is trend one: talk to the people who have to modernize — they are modernizing on GPUs. That's right.

Let's do another test. You have $50 billion of capital expenditure. Option A: build capex for the future. Option B: build capex the way you did in the past. You already have the past capex, right? It's sitting there, and it isn't getting much better anyway — Moore's Law is basically over. So why rebuild it?

Take that $50 billion and put it into generative AI instead, and now your company gets better. How much of the $50 billion would you put in? I would put in 100% of it, because I already have four years of the old infrastructure behind me.

So I'm just deducing what someone reasoning from first principles would do — and it's exactly what they are doing. Smart people doing smart things. Now, the second part: we have a trillion dollars of capacity — a trillion dollars of infrastructure to build over the next four to five years, of which perhaps $150 billion is built. And the second thing we observe is that not only is the way software is written different; the way software is used is also different.

In the future we will have agents; our companies will have digital employees. Today your inbox shows little dots with the faces of your colleagues; in the future it will show the icons of AIs, and I will send work to them.

I no longer program computers in C++; I program AI with prompts. This is no different from how I chatted with my team this morning.

Before I came here I wrote a bunch of emails briefing my team. I describe the context, the constraints I know, and their task. I give enough direction that they understand what I need, I make the desired outcome as clear as I can, but I leave enough ambiguity — some creative space — so they can surprise me.

That is no different from briefing an AI today; it is exactly how I prompt an AI. So on top of the modernized infrastructure there will be a new infrastructure: AI factories that run these digital humans 24/7.

We will provide this to every company in the world; we will have it in factories and in autonomous systems. There is a whole layer of computing fabric — a layer I call the AI factory — that the world must build, and that does not exist today.

So the question is: how big is this? Nobody knows yet — it could be in the trillions. But the beautiful thing, as we sit here building, is that the architecture of the modernized data center and the architecture of the AI factory are the same. That's a good thing.

Brad Gerstner:

Just to be clear: you have a trillion dollars of old equipment that needs modernizing, and at least a trillion dollars of new AI workloads coming. Your revenue this year will reach $125 billion. People once told you this company's market cap would never exceed a billion dollars. Sitting here today, with $125 billion against a multi-trillion-dollar TAM, is there any reason your future revenue won't be two or three times what it is now?

Huang Renxun:

As you know, it doesn't work that way. A company is only limited by the size of its pond — a goldfish pond can only hold a goldfish. So the question is: what is our pond? That takes a lot of imagination, and it's why market makers think about the future and create new ponds, rather than looking backward and trying to grab share. Share takers can only get so big; market makers can be very large.

Our company's good fortune is that from the very beginning we had to create the market we wanted to swim in. People didn't realize it then, but we were at ground zero of creating the 3D gaming PC market; we essentially invented that market and its whole ecosystem, the graphics-card ecosystem. So inventing a new market in order to serve it later is something we are very comfortable with.

Huang Renxun: I am delighted with OpenAI's success

Brad Gerstner:

As everyone knows, OpenAI raised $6.5 billion this week at a $150 billion valuation. We all participated.

Huang Renxun:

Yes, really happy for them, really glad they came together. Yes, they did a great thing, and the team did a great job.

Brad Gerstner:

Reports say their revenue this year will be around $5 billion, possibly $10 billion next year. The business today has roughly twice the revenue Google had at its IPO, and 250 million weekly average users — we estimate twice Google's user base at its IPO. If you believe the $10 billion figure, the company is valued at about 15 times expected revenue, the same multiple Google and Meta carried at their IPOs. And this is a company that had zero revenue and zero weekly users just 22 months ago.

Tell us about the importance of OpenAI as a partner to you, and the power of OpenAI in driving public awareness and usage of AI.

Huang Renxun:

Well, this is one of the most consequential companies of our time, a pure AI company pursuing the AGI vision — whatever its definition may be. I hardly think the definition matters, nor the timing. One thing I know is that AI will have a roadmap of capabilities over time, and that roadmap will be spectacular. Along the way, long before it reaches anyone's definition of AGI, we will make full use of it.

Right now, as we speak, go talk to digital biologists, climate-tech researchers, materials researchers, physicists, astrophysicists, quantum chemists. Talk to video game designers, manufacturing engineers, robotics experts — pick your favorite. Whatever industry you choose, go deep, talk to the key people, and ask them whether AI has completely changed how they work. Collect those data points, then ask yourself how skeptical you want to be — because they aren't talking about AI as a concept someday; they are using AI now. Agricultural technology, materials technology, climate tech — pick your technology, pick your field of science. They are advancing, and AI is helping them advance their work.

Now, as we said: every industry, every company, every university. Unbelievable, right? AI will change business in some way. We know that — we know it's utterly real.

It's happening today. The ChatGPT awakening triggered it, which is absolutely incredible. I admire their speed and their singular goal of advancing this field. It really matters.

Brad Gerstner:

They've built an economic engine that can fund the next frontier of models. A consensus is forming in Silicon Valley that the model layer is commoditizing — Llama makes it very cheap to build a model. So in the early days we had many model companies, and the list goes on.

Many question whether those companies can reach escape velocity on the economic engine needed to fund the next generation. My sense is that's why you're seeing consolidation. OpenAI clearly has achieved escape velocity; they can fund their own future. I'm not sure how many others can. Is that a fair assessment of the model layer — that, as in many other markets, it consolidates toward leaders who have the economic engine and the applications to keep investing?

Simply having powerful GPUs does not guarantee a company's success in the AI field.

Huang Renxun:

First, there is a fundamental difference between a model and an AI. A model is an essential ingredient — necessary but not sufficient for an AI. An AI is a capability, but for what? What's its application? The AI for a self-driving car is related to, but not the same as, the AI for a humanoid robot, which is related to, but not the same as, the AI for a chatbot.

So you have to understand the taxonomy — the taxonomy of the stack. There is opportunity at every layer of the stack, but not every layer offers unlimited opportunity for everyone.

I just said a sentence in which you could replace the word 'model' with 'GPU.' In fact this was our company's great observation 32 years ago: there is a fundamental difference between a graphics chip — a GPU — and accelerated computing, and accelerated computing differs again from the work we do in AI infrastructure. They are related and layered on one another, but they are not the same, and each layer of abstraction requires entirely different skills.

People who are truly good at building GPUs do not necessarily know how to become an accelerated computing company. Lots of people build GPUs — we invented the GPU, but we are hardly the only company making them today. GPUs are everywhere, yet those companies are not accelerated computing companies. An accelerator that speeds up one application is different from an accelerated computing company. For example, a very specialized AI accelerator can be a very successful thing.

Brad Gerstner:

Like MTIA, the next-generation AI accelerator chip developed by Meta.

Huang Renxun:

Yes — but it may not be that kind of broadly impactful, capable company. You have to decide what you want to be. There may be opportunity in all of these different areas, but as when building a company, you have to watch how the ecosystem shifts and what gets commoditized over time, and recognize what is a feature, what is a product, and what is a company. You can think about this in many different ways.

xAI and the Memphis supercluster: entering the era of 200,000-to-300,000-GPU clusters.

Brad Gerstner:

Of course, there is a newcomer with money, intelligence, and ambition: xAI. There are reports that you, Larry Ellison (Oracle's co-founder), and Musk had dinner together, and they talked you into parting with 100,000 H100 chips. They went to Memphis and, within months, built a large coherent supercluster.

Huang Renxun:

A few points — and let's give credit where it's due, okay? Yes, I had dinner with them.

Brad Gerstner:

Do you think they have the ability to build this supercluster? There are rumors they want another hundred thousand H200s to expand it. First, tell us about xAI, their ambitions and what they've achieved — and also, have we arrived at the era of 200,000-to-300,000-GPU clusters?

Huang Renxun:

The answer is yes. But first, credit where it's due: from the concept, to a data center ready for NVIDIA to install our equipment, to powering it up, connecting it, and doing the first training run — think about what that took.

Okay. So the first part — building a huge factory in that amount of time, with liquid cooling, power, and permits — I mean, it's Superman stuff. As far as I know, there is only one person in the world who could do it. Musk's understanding of large-scale systems engineering, construction, and resource marshaling is singular. It's truly incredible. And of course his engineering team is excellent too — the software team is great, the networking team is great, the infrastructure team is great. Musk understands this deeply.

From the moment we decided to go, our engineering, networking, infrastructure-computing, and software teams planned everything ahead of time. Then all the infrastructure and logistics — the sheer volume of technology and equipment arriving that day, NVIDIA's infrastructure and computing gear, and all the technology needed for training — was stood up in 19 days. Think about that. Done.

Step back: do you know how long 19 days is? A couple of weeks. If you saw it with your own eyes — the amount of technology is incredible. All the cabling and networking: the networking of NVIDIA equipment is very different from that of hyperscale data centers. How many cables does one node take? The back of the computer is nothing but cables. Integrating that mountain of technology and all the software is incredible.

So I think what Musk and the xAI team did — and I'm grateful that he acknowledges the engineering and planning work we did together — is singular, never done before. A hundred thousand GPUs as one cluster: that is easily the fastest supercomputer on the planet. A supercomputer normally takes three years to plan, and after the equipment is delivered, another year to get it all running. We're talking about 19 days.

Clark Tang:

How much of the credit goes to NVIDIA?

Huang Renxun:

Everything is running smoothly now. And of course there's a mountain of X algorithms, X frameworks, X stacks on top — we had a huge amount of integration to do in turn — but the planning was superb. Just the pre-planning.

Large-scale distributed computing is an important direction for the future development of AI.

Brad Gerstner:

To be fair, Musk is an n of one. But when you answered the question just now, you said yes — 200,000-to-300,000-GPU clusters are already here. Can that scale to 500,000? Can it scale to a million? And does demand for your products depend on it scaling to two million?

Huang Renxun:

To the last part: no. My sense is that distributed training must be made to work efficiently — that distributed computing will be invented, that some form of federated learning and asynchronous distributed computing will be discovered.

I am very enthusiastic and optimistic about this. Keep in mind that scaling laws used to be about pre-training. Now we've moved to multimodality and synthetic data generation; post-training is now scaling incredibly — synthetic data generation, reward systems, reinforcement learning — and inference-time scaling has taken off. A model may do 10,000 internal inferences before answering your question.

That may be entirely reasonable. It may have done a tree search; it may have done reinforcement learning on top of that; it may have run some simulations, certainly a lot of reflection, maybe looked up some data, read some information, right? So its context may be enormous. That kind of intelligence is — well, that's what we do, that's what we do. So in terms of capability, this scaling — I did the math — compounds with model and compute size quadrupling every year.
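Compounding the quadrupling figure Huang cites gives a feel for the scaling (simple arithmetic, not a forecast):

```python
# Compound a 4x-per-year growth in model/compute scale (illustrative only).
for years in (1, 2, 5, 10):
    print(f"{years:>2} years -> {4 ** years:>9,}x")   # 4x, 16x, 1,024x, 1,048,576x
```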

On the other hand, demand for usage keeps growing. Do we think we need millions of GPUs? Without a doubt — that's certain now. The question is how we build it from a data-center perspective: does it come a few gigawatts at a time or 250 megawatts at a time? My sense is you'll get both.

Clark Tang:

I think analysts always focus on the current architectural bet, but one of the biggest takeaways from this conversation is that you think about the whole ecosystem many years ahead. So NVIDIA's scaling isn't hostage to a world of 500,000- or even million-GPU clusters; when distributed training emerges, you will write the software to enable it.

Huang Renxun:

We developed Megatron seven years ago so that the scaling of these giant training jobs could happen. We invented Megatron, so all the model parallelism going on, all the breakthroughs in distributed training, all the batching — all of that happened because we did the early work, and now we are doing the early work for the next generation.
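For readers unfamiliar with the pattern, here is a toy simulation of data parallelism, the simplest of the distributed-training schemes mentioned here (our sketch, unrelated to Megatron's actual code): each "worker" computes a gradient on its own shard of the batch, and the gradients are averaged — the all-reduce step — before the weight update.

```python
import numpy as np

# Toy data parallelism on a linear-regression loss: each worker gets a shard
# of the batch, computes its local gradient, then gradients are averaged
# (the "all-reduce") before one shared SGD update.
rng = np.random.default_rng(0)
w = rng.standard_normal(4)                        # shared model weights
data = rng.standard_normal((8, 4))                # global batch
target = data @ np.array([1.0, -2.0, 0.5, 3.0])   # synthetic regression target

shards = np.split(np.arange(8), 4)                # 4 workers, 2 examples each
grads = []
for idx in shards:                                # per-worker backward pass
    x, y = data[idx], target[idx]
    grads.append(2 * x.T @ (x @ w - y) / len(idx))   # local MSE gradient

w -= 0.01 * np.mean(grads, axis=0)                # "all-reduce" then SGD step
```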

AI is changing the way we work.

Brad Gerstner:

So let's talk about Strawberry and o1. I think it's cool that they named it o1 — after the O-1 visa, which is about recruiting the world's best and brightest and bringing them to the USA, something we're all passionate about. So I love the idea of building a model that thinks, taking us to the next rung of scaling intelligence — a tribute to the people who came to the USA through immigration and, with their collective brilliance, made us what we are today.

Huang Renxun:

Certainly. And extraterrestrial intelligence.

Brad Gerstner:

Certainly. This was led by our friend Noam Brown. Inference-time reasoning as a new vector for scaling intelligence is crucial, separate from just building larger models.

Huang Renxun:

This is a big deal. This is a big deal. I think a lot of intelligence cannot be done a priori, and a lot of computation cannot be reordered. Think of out-of-order execution: some things can be done ahead of time, but many things can only be done at runtime.

So whether you think of it from a computer-science perspective or an intelligence perspective, too much requires context — circumstance — and the quality and type of answer you're looking for. Sometimes a quick answer is enough; it depends on the consequence and impact of the answer, and on the nature of its use. Some answers can take an evening; some take a week.

Right? I can totally imagine sending my AI a prompt and telling it: take the evening. Don't answer right away; think about it all night, and tell me tomorrow your best answer and your reasoning. So from a product perspective, intelligence will be segmented by quality: there will be one-shot versions, and there will be versions that take five minutes.

And humans? If you like, we will employ a large workforce — some of it digital, in AI, some biological, and I hope some even super-robotic.

Brad Gerstner:

From a business perspective, this is badly misunderstood. You've just described a company producing the output of one with 150,000 people, but achieving it with only 50,000. And you didn't say you'd lay everyone off; you're still growing headcount, while the organization's output grows far more.

Huang Renxun:

This is often misunderstood. AI will not change every job, but it will profoundly change how people work — let's admit that. AI has the potential to bring incredible benefit, and it has the potential to cause harm, so we must build safe AI. Let's lay that foundation.

What people overlook is that when a company uses AI to increase productivity, the result is usually better profits or better growth, or both — and when that happens, the CEO's next email is rarely a layoff announcement.

Brad Gerstner:

Right — it's a hiring announcement, because you are growing.

Huang Renxun:

The reason is that we have more ideas than we can pursue, and we need people to help us think them through before we automate. Automation itself AI can help us achieve — and obviously it helps us think too — but we still have to figure out which problems to solve. There are trillions of dollars of problems a company could address: choose the ideas, then find ways to automate and scale. So as we become more productive, we will hire more people. People forget this: we obviously have more ideas today than 200 years ago. That's why GDP is larger and there are more jobs — even as we automate aggressively underneath.

Brad Gerstner:

This is the critical point of this moment. We're entering an era in which almost all human productivity, almost all human prosperity, is a byproduct of the automation of the past 200 years. From Adam Smith to Joseph Schumpeter's creative destruction — look at the per-capita GDP growth chart over those 200 years — and now it is accelerating.

Which brings me to this. Through the 1990s, US productivity grew about 2.5% to 3% a year. In the 2010s it slowed to about 1.8%, and the past ten years have been the slowest decade of productivity growth on record — output for fixed quantities of labor and capital growing at its slowest since records began.

People debate the reasons. But if the world is really as you describe it — if we are harnessing and manufacturing intelligence — then aren't we on the cusp of a dramatic expansion in human productivity?

Huang Renxun:

This is our hope. This is our hope. Of course, we live in this world, so we have direct evidence.

We have direct evidence — whether in isolated cases or individual researchers — of people using AI to explore science at a previously unimaginable scale. That is productivity, plain and simple. Or look at us: we design chips of exponentially growing complexity at incredible speed, while the company's employee base does not grow in step — another measure of productivity.

The software we build keeps getting better because we use AI and supercomputers to help us, while headcount grows only roughly linearly. Yet another manifestation of productivity.

So I can drill into many different industries, sample them, and inspect this firsthand.

Of course, we might be overfitting. The art, of course, is generalizing from what we observe and judging whether it will show up in other industries.

Undoubtedly, AI is the most valuable commodity the world has known, and now we are going to mass-produce it. All of us have to get good at asking: what happens when you are surrounded by AIs that do things very well, much better than you? Looking back, this has been my life. I have 60 direct reports.

They are world-class in their fields — better than me, much better. I have no trouble interacting with them, and no trouble 'programming' them — briefing them. So I think what people will learn is that they are all going to become CEOs.

They will all become CEOs of AI agents. What they need is the ability to be creative, some knowledge, and knowing how to reason and break down problems, so they can program these AIs to help them achieve goals, just as I do. That's running a company.

AI safety requires multi-party effort

Brad Gerstner:

You touched on something: coordination and safe AI. You mentioned the tragedies unfolding in the Middle East; there is a lot of autonomy, a lot of AI, being used around the world. So let's talk about bad actors, safe AI, and coordination with Washington. How do you feel today — are we on the right track? Is there enough coordination? I think Mark Zuckerberg once said the way we beat bad AI is to make good AI better. How would you describe our path to ensuring AI is a net positive for humanity, rather than leading us into a dystopian world?

Huang Renxun:

The conversation about safety is indeed important and good. The abstract view — AI as one giant neural network — is not so useful, because, as we all know, AI and large language models are related but not the same. There is a lot being done very well. First, open-source models, so that every research community, every industry, and every company can engage with AI and learn how to harness the capability for their applications. Very good.

Second, people underestimate how much technology is devoted to inventing AI that keeps AI safe: AI to curate data, to carry information, to train; AI created to coordinate other AI; synthetic data generation that expands AI's knowledge and reduces hallucination; AI built for vectorizing, for graphing, for monitoring — AI created to inform AI, guard AI, and watch over other AI. All these safety AIs are being built, and they go largely uncelebrated, right?

Brad Gerstner:

So we have built.

Huang Renxun:

So we are building all of it: across the entire industry, methodologies, red teams, processes, model cards, evaluation systems, benchmarking systems — all of it being built at an incredible pace, and it deserves to be celebrated. Do you understand?

Brad Gerstner:

And there's no government regulation saying you must do this. Today the players building AI in this field are taking these critical issues seriously and coordinating around best practices. That's right.

Huang Renxun:

Exactly — and it hasn't gotten enough attention or been fully understood. Everyone needs to start talking about AI as what it is: an engineered system, carefully designed, built from first principles, thoroughly tested, and so on. And remember, AI is a capability that gets applied. It isn't necessary to regulate the core technology with a single rule, but applications should not go unregulated either: most regulation belongs at the application level. All the ecosystems that already regulate applications of technology must now regulate applications that incorporate AI.

So don't misunderstand me — and don't ignore the mountain of AI regulation that must happen around the world at the application level. Don't count on one universal silver bullet, a single AI commission that does it all; all those different regulatory bodies were established for a reason. Going back to first principles, that's where I would start.

The opposition between open source and closed source is wrong.

Brad Gerstner:

You have launched a very important, very large, and very powerful open-source model.

Huang Renxun:

Nemotron.

Brad Gerstner:

Yes — and obviously Meta has contributed enormously to open source. Reading Twitter, I see a lot of open-versus-closed debate. How do you view open source — can your own open-source models keep pace with the frontier? That's the first question. The second: open-source models alongside closed-source models powering businesses — is that your view of the future? And do the two create a healthy tension for safety?

Huang Renxun:

Open source versus closed source is related to safety, but it isn't only about safety. For example, there is absolutely nothing wrong with closed-source models: they are the engine of the economic model needed to sustain innovation. I fully endorse that. I think framing it as closed versus open is wrong.

Openness is a necessary condition for activating many industries. Without open source, how would all these different fields of science activate AI? They have to develop AI for their own domains, using open-source models to create domain-specific AI. Related — let me say it again — not the same: having an open-source model does not mean you have an AI; you need the open-source model to create an AI. So industries and scientific fields — financial services, healthcare, transportation, and the list goes on — have been activated because of open source.

Brad Gerstner:

Unbelievable. Have you seen the huge demand for your open-source models?

Huang Renxun:

Our open-source models? First, look at Llama downloads. Obviously the work Mark and his team have done is incredible — beyond imagination. It has fully activated and engaged every industry and every field of science.

The reason we built Nemotron was synthetic data generation. Intuitively, the idea of one AI sitting in a loop generating data to teach itself sounds fragile — how many trips around that infinite loop can you make is questionable. The picture in my mind is like taking a super-smart person, locking him in a padded room for a month, and expecting someone smarter to walk out — probably not. But put two or three people together — different AIs with different distributions of knowledge — and they can go back and forth, quality-checking each other, and all three come out smarter.

So the idea that AI models can exchange, interact, go back and forth, debate, do reinforcement learning and synthetic data generation makes intuitive sense. And our model Nemotron 340B is the best reward model in the world — the best critic.

Interesting — a great model for enhancing other models. However good someone else's model is, I'd recommend using Nemotron 340B to enhance and improve it. We've seen it make Llama better, and make every other model better.
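The generate-score-filter loop Huang describes can be sketched as follows; `generate` and `reward` are hypothetical stand-ins for calls to a generator model and a reward model such as the Nemotron 340B reward model he mentions:

```python
# Hypothetical sketch of reward-model-filtered synthetic data generation:
# sample several candidate answers, score them with a reward model, and keep
# only high-scoring samples as synthetic training data.
def synthesize(prompts, generate, reward, k=4, threshold=0.8):
    kept = []
    for prompt in prompts:
        candidates = [generate(prompt) for _ in range(k)]    # sample k answers
        scored = [(reward(prompt, c), c) for c in candidates]
        best_score, best = max(scored)                       # keep the top answer
        if best_score >= threshold:                          # quality gate
            kept.append({"prompt": prompt, "response": best})
    return kept
```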

Brad Gerstner:

Having delivered the first DGX-1 in 2016, yours has been a truly incredible — and improbable — journey. It's amazing you even survived the early days. You delivered the first DGX-1 in 2016, and we reached the Cambrian moment in 2022.

So here's a question I've long wanted to ask: how long can you sustain your current role, with 60 direct reports, everywhere at once, driving this revolution? Are you having fun? Is there anything else you'd rather be doing?

Huang Renxun:

The answer to this past hour and a half is: "I enjoyed it very much." It was a great time, and I can't imagine anything I'd rather be doing. But let's see — I don't think we should leave the impression that the work is always fun. My work is not always fun, and I don't expect it to be; did I ever expect it to be? I think it has always been important.

Yes, I don't take myself too seriously. I take work very seriously. I take our responsibilities very seriously. I take our contributions and our moments very seriously.

Is it always fun? No. But have I always enjoyed it? Yes. Like everything, whether it's family, friends, or children. Is it always fun? No. Do we always love it? Absolutely.

So, how long can I go on? The real question is how long I can remain relevant. That's what matters most, and the answer comes down to: how will I keep learning? I am more optimistic today — and I don't say this only because of our topic — about my relevance and my ability to keep learning, because of AI. I use it every day; I suspect you all do too.

There isn't a piece of research I do without AI — not a single question, even when I know the answer, goes without being cross-checked against AI. And surprisingly, the next two or three questions I ask reveal things I didn't know. Pick your topic. I think of AI as a tutor.

AI as an assistant, AI as a partner to brainstorm with, to check my work — a colleague. It's completely revolutionary. I am an information worker; my output is information. So AI's contribution to society is extraordinary. If I can stay relevant that way and keep contributing, I know this work matters enough that I want to keep pursuing it — and my quality of life is incredible. So I will.

Brad Gerstner:

You and I have worked in this field for decades; I can't imagine missing this moment. It's the most important moment of our careers. We are deeply grateful for the partnership.

Huang Renxun:

Looking forward to the next ten years.

Brad Gerstner:

The thought partnership — yes, it makes us all smarter. Thank you. I think leadership like yours, steering all of this forward optimistically and safely, really matters. So thank you.

Huang Renxun:

I'm really happy to be with you all. Really — thank you.

Disclaimer: This content is for informational and educational purposes only and does not constitute a recommendation or endorsement of any specific investment or investment strategy.