NVIDIA Ethernet Networking Accelerates World's Largest AI Supercomputer, Built by xAI

GlobeNewswire ·  10/28 11:00

NVIDIA Spectrum-X Makes Colossal NVIDIA Hopper 100,000-GPU System Possible

SANTA CLARA, Calif., Oct. 28, 2024 (GLOBE NEWSWIRE) -- NVIDIA today announced that xAI's Colossus supercomputer cluster comprising 100,000 NVIDIA Hopper Tensor Core GPUs in Memphis, Tennessee, achieved this massive scale by using the NVIDIA Spectrum-X Ethernet networking platform, which is designed to deliver superior performance to multi-tenant, hyperscale AI factories using standards-based Ethernet, for its Remote Direct Memory Access (RDMA) network.

Colossus, the world's largest AI supercomputer, is being used to train xAI's Grok family of large language models, with chatbots offered as a feature for X Premium subscribers. xAI is in the process of doubling the size of Colossus to a combined total of 200,000 NVIDIA Hopper GPUs.

The supporting facility and state-of-the-art supercomputer were built by xAI and NVIDIA in just 122 days, whereas systems of this size typically take many months to years. It took 19 days from the time the first rack rolled onto the floor until training began.

While training the extremely large Grok model, Colossus achieves unprecedented network performance. Across all three tiers of the network fabric, the system has experienced zero application latency degradation or packet loss due to flow collisions. It has maintained 95% data throughput enabled by Spectrum-X congestion control.

This level of performance cannot be achieved at scale with standard Ethernet, which creates thousands of flow collisions while delivering only 60% data throughput.
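
To put the two throughput figures side by side, the short sketch below converts them into usable bandwidth on a single link. It is illustrative arithmetic only: the 800 Gb/s line rate is borrowed from the SN5600's maximum port speed as an assumption, and the helper name `effective_gbps` is hypothetical, not part of any NVIDIA tool.

```python
def effective_gbps(line_rate_gbps: float, utilization: float) -> float:
    """Return usable bandwidth for a link running at a fraction of line rate."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return line_rate_gbps * utilization


# Assumption: an 800 Gb/s link, the maximum port speed quoted for the SN5600.
LINE_RATE_GBPS = 800.0

spectrum_x = effective_gbps(LINE_RATE_GBPS, 0.95)  # 95% throughput figure
standard = effective_gbps(LINE_RATE_GBPS, 0.60)    # 60% throughput figure
ratio = spectrum_x / standard

print(f"Spectrum-X:        {spectrum_x:.0f} Gb/s usable")
print(f"Standard Ethernet: {standard:.0f} Gb/s usable")
print(f"Ratio:             {ratio:.2f}x")
```

Under that assumption, the gap between 95% and 60% throughput is roughly 280 Gb/s per link, or about 1.58 times the usable bandwidth.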

"AI is becoming mission-critical and requires increased performance, security, scalability and cost-efficiency," said Gilad Shainer, senior vice president of networking at NVIDIA. "The NVIDIA Spectrum-X Ethernet networking platform is designed to provide innovators such as xAI with faster processing, analysis and execution of AI workloads, and in turn accelerates the development, deployment and time to market of AI solutions."

"Colossus is the most powerful training system in the world," said Elon Musk on X. "Nice work by xAI team, NVIDIA and our many partners/suppliers."

"xAI has built the world's largest, most-powerful supercomputer," said a spokesperson for xAI. "NVIDIA's Hopper GPUs and Spectrum-X allow us to push the boundaries of training AI models at a massive scale, creating a super-accelerated and optimized AI factory based on the Ethernet standard."

At the heart of the Spectrum-X platform is the Spectrum SN5600 Ethernet switch, which supports port speeds of up to 800Gb/s and is based on the Spectrum-4 switch ASIC. xAI chose to pair the Spectrum-X SN5600 switch with NVIDIA BlueField-3 SuperNICs for unprecedented performance.

Spectrum-X Ethernet networking for AI brings advanced features that deliver highly effective and scalable bandwidth with low latency and short tail latency, previously exclusive to InfiniBand. These features include adaptive routing with NVIDIA Direct Data Placement technology, congestion control, as well as enhanced AI fabric visibility and performance isolation — all key requirements for multi-tenant generative AI clouds and large enterprise environments.

About NVIDIA
NVIDIA (NASDAQ: NVDA) is the world leader in accelerated computing.

For further information, contact:
Alex Shapiro
NVIDIA Corporation
+1-415-608-5044
ashapiro@nvidia.com

Certain statements in this press release including, but not limited to, statements as to: the benefits, impact, and performance of NVIDIA's products, services, and technologies, including NVIDIA Hopper Tensor Core GPUs, NVIDIA Spectrum-X Ethernet networking platform, NVIDIA Spectrum SN5600 Ethernet switch, Spectrum-4 switch ASIC, and NVIDIA BlueField-3 SuperNICs; features of xAI's Colossus supercomputer cluster; xAI being in the process of doubling the size of Colossus to a combined total of 200,000 NVIDIA Hopper GPUs; the NVIDIA Spectrum-X Ethernet networking platform being designed to provide innovators such as xAI with faster processing, analysis and execution of AI workloads, and in turn accelerating the development, deployment and time to market of AI solutions; NVIDIA's Hopper GPUs and Spectrum-X allowing xAI to push the boundaries of training AI models at a massive scale, creating a super-accelerated and optimized AI factory based on the Ethernet standard are forward-looking statements that are subject to risks and uncertainties that could cause results to be materially different than expectations. 
Important factors that could cause actual results to differ materially include: global economic conditions; our reliance on third parties to manufacture, assemble, package and test our products; the impact of technological development and competition; development of new products and technologies or enhancements to our existing product and technologies; market acceptance of our products or our partners' products; design, manufacturing or software defects; changes in consumer preferences or demands; changes in industry standards and interfaces; unexpected loss of performance of our products or technologies when integrated into systems; as well as other factors detailed from time to time in the most recent reports NVIDIA files with the Securities and Exchange Commission, or SEC, including, but not limited to, its annual report on Form 10-K and quarterly reports on Form 10-Q. Copies of reports filed with the SEC are posted on the company's website and are available from NVIDIA without charge. These forward-looking statements are not guarantees of future performance and speak only as of the date hereof, and, except as required by law, NVIDIA disclaims any obligation to update these forward-looking statements to reflect future events or circumstances.

© 2024 NVIDIA Corporation. All rights reserved. NVIDIA, the NVIDIA logo, NVIDIA Spectrum-X and BlueField are trademarks and/or registered trademarks of NVIDIA Corporation in the U.S. and other countries. Other company and product names may be trademarks of the respective companies with which they are associated. Features, pricing, availability and specifications are subject to change without notice.

A photo accompanying this announcement is available at
