Intelligence of Large Models Comparable to That of a Five-Year-Old Child, Says Professional at Shanghai-Chongqing Institute of Artificial Intelligence

TMTPost · 07/22 04:53

TMTPOST—A recent news report about poor capabilities of many large AI models in solving simple arithmetic problems has sparked heated discussions in China.

Users asked 12 AI models, including GPT-4o, "Which number is bigger, 9.11 or 9.9?" Only four models (Alibaba's Tongyi Qianwen, Baidu's Wenxin Yiyan, Minimax, and Tencent's Yuanbao) gave the correct answer, while the other eight, including GPT-4o, answered incorrectly.

The results highlight significant weaknesses in the mathematical capabilities of large AI models that still need to be addressed.
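
For reference, the question put to the models is easy to settle once both numbers are written to the same number of decimal places: 9.90 is greater than 9.11. The short Python sketch below is an illustration added here, not part of the report, and the function names are hypothetical. It also shows one possible misreading, in which the digits after the decimal point are compared as whole numbers, as with software version numbers, so that 9.11 ranks higher. Whether any of the tested models reasoned this way is an assumption, not a finding of the report.

```python
from decimal import Decimal


def larger(a: str, b: str) -> str:
    """Return whichever decimal string has the greater numeric value."""
    return a if Decimal(a) > Decimal(b) else b


def version_style_larger(a: str, b: str) -> str:
    """Compare dot-separated parts as integers, the way version numbers are ordered."""
    def key(s: str) -> tuple:
        return tuple(int(part) for part in s.split("."))
    return a if key(a) > key(b) else b


print(larger("9.11", "9.9"))                # 9.9, because 9.90 > 9.11
print(version_style_larger("9.11", "9.9"))  # 9.11, because 11 > 9
```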

In an exclusive interview with TMTPost, Qi Peng, Director of the AI Large Model Center at Shanghai-Chongqing Institute of Artificial Intelligence, noted that while large models have immense potential and can handle complex problems with generalization abilities, their current level of intelligence is still rudimentary.

Qi likened these models to "five-year-old children" due to limitations such as insufficient computational power, inadequate text data, and challenges with accuracy and reliability.

Qi holds bachelor's and master's degrees from Tsinghua University and a Ph.D. from the University of Wisconsin-Madison, and has extensive experience in data science and AI. Under his leadership, the Shanghai-Chongqing Institute of Artificial Intelligence has developed the "Zhao Yan" large language model, which ranked third globally and second domestically in the SuperCLUE Chinese Large Model Intelligence Benchmark in March this year.

Additionally, in July, Qi and his team, including PhD student Zhuang Shaobin, replicated the Sora text-to-video model in an open-source community project. The advanced Latte spatiotemporal decoupling attention architecture enabled the generation of 16-second (128-frame) videos, a significant improvement over the previous 3-second (24-frame) capability.

Qi explained that the Sora model functions like a new "tool" that addresses various issues. Beyond video generation, Sora can be applied in areas such as autonomous driving and physical world simulation. The most immediate application is in video creation, where users can input text descriptions to rapidly produce videos, thus enhancing efficiency and convenience.

Qi also observed that while large models have broad applications across various sectors, real-world deployment remains limited. The primary challenges include the models' mathematical and engineering deficiencies and the inherent limitations of statistical methods in achieving 100% accuracy.

Looking to the future of artificial general intelligence (AGI) development, Qi emphasized that humanity is at a pivotal moment on the path to AGI. Although current models have not yet reached AGI standards, he believes that ChatGPT has positioned human beings at a critical juncture in history.

While the intelligence of large models can continue to advance from a child's level to that of top experts, they will always require supportive infrastructure and tools for effective operation and application. Although developing these facilities might be relatively inexpensive, they are crucial for the practical use and societal value of large models, Qi added.
