Nearly seven months after its first demonstration, OpenAI has launched a ChatGPT feature that can understand real-time video.
Zhito Finance learned that nearly seven months after its first public demonstration, OpenAI has officially launched real-time video conversations for paying users, a new capability of ChatGPT's human-like Advanced Voice Mode. In a livestream on Thursday, the company announced that Advanced Voice Mode is gaining vision. Advanced Voice is powered by GPT-4o, OpenAI's multimodal model.
OpenAI announced that video and screen-sharing features for Advanced Voice Mode are now available in the ChatGPT mobile app, allowing ChatGPT Plus, Team, and Pro subscribers to point their phones at objects and have ChatGPT respond in near real time.
OpenAI researchers demonstrated the new feature in the livestream: tap the voice icon next to the ChatGPT chat bar, then tap the video icon in the lower-left corner to start a video conversation. To share their screen, mobile users tap the three-dot menu and select "Share Screen". Through screen sharing, Advanced Voice can understand what is on the device's screen; for example, it can explain various settings menus or offer suggestions on a math problem.
OpenAI said most ChatGPT Plus and Pro subscribers, as well as all Team users, will gain access to the new features launched on Thursday through the ChatGPT app within the next few days, and that Plus and Pro users in the EU, Switzerland, Iceland, Norway, and Liechtenstein are expected to get them soon. ChatGPT Enterprise and Edu, the enterprise and education versions of ChatGPT, will receive the new features in January next year.
Advanced Voice has been delayed several times, reportedly in part because OpenAI announced the feature before it was ready to ship. In April of this year, OpenAI promised that Advanced Voice would roll out to users "within a few weeks"; months later, the company said it needed more time.
At the end of June, OpenAI introduced the voice mode to a small group of Plus users, then announced a one-month delay to ensure the feature could safely and effectively handle requests from millions of users. At the time, OpenAI said it planned to give all Plus users access in the fall, with the exact timing depending on whether its internal high bar for safety and reliability was met. At the end of July, OpenAI rolled out Advanced Voice Mode in ChatGPT to a limited number of paying Plus users, saying the voice mode cannot imitate how other people speak and that it had added new filters to ensure the software can detect and refuse requests to generate music or other forms of copyrighted audio.
In addition, competitors such as Google (GOOGL.US) and Meta (META.US) are developing similar features for their respective chatbot products. This week, Google rolled out Project Astra, its real-time video-analysis conversational AI, to a group of "trusted testers".