SenseTime wants to create a "super moment"

wallstreetcn ·  Jul 6 01:24

Author | Liu Baodan

From performance to market confidence, Meituan is emerging from a three-year low, but Wang Xing is not stopping there: he has even bigger plans.

Going overseas has become a must for Chinese companies. Meituan, which has been warming up for eight years, has finally made up its mind to put overseas expansion on the agenda.

Recently, Meituan began recruiting senior engineers for international bank-enterprise direct connection. After its model succeeded in the Hong Kong market, Meituan officially kicked off its overseas expansion, accelerated recruitment, and chose Saudi Arabia in the Middle East as the first stop.

Going overseas is a critical turning point. After more than ten years of accumulating capabilities, Meituan is set to export its local-services capabilities to the world, a move as significant as TikTok was for ByteDance.

In the wave of Internet companies going overseas, Meituan moved later because the local-services model is heavier to operate than social media, e-commerce and other sectors. However, Wang Xing must make this move. With domestic competition intensifying and community group buying shrinking, he must find a new growth story.

On his entrepreneurial journey, Wang Xing remains determined to create a new business legend in this global adventure.

A question that must be answered

Meituan has fought an impressive takeaway battle in Hong Kong.

On May 6, the market research firm Measurable AI released data showing that, as of March 2024, KeeTa, Meituan's takeaway business in Hong Kong, held a 44% market share by number of orders, making it the largest takeaway platform in Hong Kong.

However, Hong Kong is only a stopover in Meituan's overseas expansion. Meituan has set its first real overseas destination in Saudi Arabia.
Wallstreetcn has learned that over the past two months Meituan has been recruiting for its overseas push. The positions include engineers, overseas human resources and operations experts, and international payment and transaction product managers, mainly responsible for payments, employee management and related products in overseas markets.

More important is the recruitment of local talent. More than a month ago, Meituan posted job listings on LinkedIn and the Middle East recruitment platform Bayt.com, with Riyadh, the capital of Saudi Arabia, as the place of work.

In its choice of city, Meituan picked neither the United States, with its larger market, nor Southeast Asia, where culture and food are more similar, but Saudi Arabia. This suggests that Meituan's overseas strategy is still largely experimental and cautious.

Wang Xing is not going into battle unprepared. Meituan has been planning this overseas expansion for many years.

As early as 2016, Wang Xing began to consider going overseas and visited Silicon Valley, Berlin, Israel, Jakarta and other places. In 2017, Meituan formally entered the overseas accommodation business, first connecting hotels in nearly 100 countries to the Meituan app.

At that time, the domestic takeaway war was in full swing, and with Meituan's listing in Hong Kong in 2018, Wang Xing's overseas strategy had to be shelved.

Since then, Meituan has made a series of international investments, including Swiggy in India, Gojek in Indonesia, and Opay in Nigeria, spanning food delivery, ride-hailing, payments and other fields, to prepare for going overseas.

As news of Meituan's success in Hong Kong kept arriving, its overseas plan was finally raised to an unprecedented strategic level in 2024, and Wang Xing once again stepped to the front line.
In February, Meituan merged its home delivery business group, in-store business group and other units into a core local commerce segment and appointed Wang Puzhong as its CEO, while Wang Xing personally took charge of the overseas business. This anchored the overseas expansion strategy in the organizational structure.

In fact, before the overseas strategy was confirmed, Wang Xing personally visited the Middle East last May and met with members of the Saudi royal family, laying the foundation for Meituan's expansion into Saudi Arabia.

The explosive popularity of ChatGPT has shown people the huge potential of large AI models. After more than a year of technical catch-up, Chinese large-model companies are now betting on applications.

However, it is not an easy task to develop a truly influential product.

At the 2024 World Artificial Intelligence Conference, Xu Li, CEO of SenseTime, cautiously pointed out that although the trend is surging, the industry is still some distance away from the "super moment" that truly shocks it. He emphasized that AI has not yet penetrated deeply into all walks of life, nor has it created broad and profound waves of change in society.

Based on this clear understanding, SenseTime has focused on the performance of the large-scale model itself.

On July 5, at the "Da Ai No Boundaries, Towards New Power" Artificial Intelligence Forum, SenseTime released "Ri Ri Xin 5o", the first what-you-see-is-what-you-get (WYSIWYG) model in China, with an interactive experience benchmarked against GPT-4o.

Specifically, "Ri Ri Xin 5o" brings a new mode of AI interaction: real-time, streaming multi-modal interaction that integrates cross-modal information from voice, text, images, video and other forms.

Asked why the model is named 5o, Lu Lewei, director of SenseTime Research Institute, told Wallstreetcn that this version introduces many cutting-edge capabilities that can now compete with GPT-4o, and that the company was relatively conservative in its version naming. A larger plan is reserved for V6, which will bring a more comprehensive and fundamental upgrade.

Innovative interaction mode

At the event, SenseTime demonstrated the capabilities of "Ri Ri Xin 5o":

To begin, a staff member simply greeted "Ri Ri Xin 5o". It automatically recognized the words on the badge hanging around the staff member's neck, judged that it was at the venue of the World Artificial Intelligence Conference, and said that it could "study well" here.

Next, the staff member picked up a cute puppy doll, and "Ri Ri Xin 5o" accurately described its appearance, its expression, and its most notable accessory, a white hat with the SenseTime logo, which impressed the audience.

Moving on to something more challenging, the staff member opened a book to a random page, and "Ri Ri Xin 5o" introduced it automatically. This went beyond simple OCR text recognition: the model recognized the graphics and text and summarized them into easy-to-understand content. All of this happened instantly, in real-time interaction.

The staff member also drew on the spot, casually sketching a simple rabbit, and "Ri Ri Xin 5o" praised how cute it was. After the staff member added a smile, the model noticed that the calm expression had turned joyful. The staff member then drew another line to make the mouth bigger and added a tongue. "Ri Ri Xin 5o" immediately noticed the change and said the expression looked even happier.

"Ri Ri Xin 5o" creates a chat-like dialogue, and according to SenseTime, this interactive mode is particularly suitable for real-time conversation and voice recognition applications. An interactive experience on par with GPT-4o is possible because of the comprehensive improvement in the base model capabilities of "Ri Ri Xin 5.5".

Next plan

In April of this year, SenseTime released "Ri Ri Xin 5.0", the first Chinese large model benchmarked against GPT-4 Turbo, triggering a wave of excitement in the capital markets.

In just over two months, the system has undergone multiple upgrades to become the brand-new "Ri Ri Xin 5.5", with comprehensive performance improving by an average of 30% over "Ri Ri Xin 5.0" and significant gains in mathematical reasoning, English proficiency, and instruction following. Its interactive experience and several core indicators are on par with GPT-4o.

Lu Lewei stated that, from a technical research perspective, version 5.5 is not the product of the last few months alone. Rather, it integrates a native multi-modal methodology that SenseTime has been researching since the end of last year. "This area happens to be the same as the actual meaning of 'Omni' in GPT-4o. We predicted this trend early on, and there was a technical team doing this research and development."

"It can cover the knowledge brought by multiple modalities in the training process and then fuse them. This is a big help for improving the accuracy of the algorithm." Lu Lewei further emphasized that this native multi-modal solution integrates audio, video, and images (the earliest supported modality) into a single model, from the input encoder to the output decoder.

In addition, "Ri Ri Xin 5.5" adopts a mixture-of-experts architecture with edge-cloud collaboration, maximizing cooperation between the cloud and the edge to reduce inference costs. The model is trained on more than 10TB tokens of high-quality training data, including a large amount of synthetic chain-of-thought data to improve reasoning ability.

Regarding the upcoming version plan, Lu Lewei stated that this version update is still quite significant. The team had considered following convention and using a V6 version number, but it is working on a larger plan for V6 that can support a more comprehensive and fundamental upgrade.

"We will first promote the release of version 5.5 conservatively, hoping to give everyone something to look forward to, and then V6 will bring a more comprehensive upgrade."

Disclaimer: This content is for informational and educational purposes only and does not constitute a recommendation or endorsement of any specific investment or investment strategy.