Is Apple Intelligence impressive enough?

Analysis and discussion of the large-model content at Apple's WWDC: large vs. small models (threshold control), partnership verification, agent orchestration, traffic estimation, memory, and security issues

lee… joined discussion · Jun 13 03:26
1. Background
On June 10, 2024, Apple $Apple (AAPL.US)$ held its WWDC 2024 keynote. The author is mainly interested in the large-model content. After the keynote, several points can be confirmed:
1. Apple and OpenAI have reached a partnership, and GPT-4o will land on iOS 18
2. Apple has put small models of its own on-device
3. Simple tasks use the small models, while complex tasks use the cloud model GPT-4o, which supports online queries. This raises the derived security issues and the question of threshold control between large and small models, i.e. how the switch is made; most likely routing is fixed per application, with small-model capabilities used in simple application scenarios (a minimal routing sketch follows this list)
4. Complex task flows are chained together through orchestration, which further exercises the agent capabilities backed by the large model, for example using Siri as the trigger point of a workflow
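To make the routing idea concrete, here is a minimal sketch of what threshold-style gating between an on-device small model and the cloud model might look like. Apple has not published its actual gating logic; every type, rule and threshold below is an assumption for illustration only.

```swift
// Hypothetical large/small model routing; Apple has not disclosed its real gating
// logic, so all names and thresholds here are invented for illustration.
enum ModelTarget {
    case onDeviceSmallModel   // simple tasks: rewriting, summarization, prioritization
    case cloudLargeModel      // complex tasks routed to GPT-4o, with online queries
}

struct UserRequest {
    let text: String
    let needsWorldKnowledge: Bool   // e.g. requires an online lookup
    let estimatedSteps: Int         // rough proxy for task complexity
}

func route(_ request: UserRequest) -> ModelTarget {
    // Threshold-style, per-scenario gating as described in point 3 above.
    if request.needsWorldKnowledge || request.estimatedSteps > 2 {
        return .cloudLargeModel
    }
    return .onDeviceSmallModel
}
```

A fixed rule set like this matches the "fixed application method" mentioned above: each scenario decides up front which model serves it, rather than the model deciding dynamically at run time.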
2. Model size
According to the keynote, GPT-4o is confirmed to land on iOS 18. In addition, Apple ships small-model capabilities of its own (several were mentioned in the keynote), such as message importance inference and content rewriting, which users can currently use for free. If you want to use GPT-4o on its own, you can also call it directly with your own API key.
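For reference, a direct call with your own key would go through OpenAI's public Chat Completions API; a minimal sketch follows. The endpoint and payload shape reflect OpenAI's published API, but check the current documentation before relying on the details.

```swift
import Foundation

// Minimal sketch of calling GPT-4o directly with your own API key, independent of
// any system-level integration. Follows OpenAI's public Chat Completions API;
// verify against current documentation before use.
func askGPT4o(prompt: String, apiKey: String) async throws -> Data {
    var request = URLRequest(url: URL(string: "https://api.openai.com/v1/chat/completions")!)
    request.httpMethod = "POST"
    request.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    let body: [String: Any] = [
        "model": "gpt-4o",
        "messages": [["role": "user", "content": prompt]]
    ]
    request.httpBody = try JSONSerialization.data(withJSONObject: body)
    let (data, _) = try await URLSession.shared.data(for: request)
    return data   // raw JSON; parse the "choices" array for the reply text
}
```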
Since the author has been following Apple's on-device large-model efforts, the large-vs-small-model question has been analyzed here before.
Based on the information Apple has disclosed so far, complex tasks call the cloud large model (GPT-4o), while simple tasks run locally; local models are constrained by on-device compute, so their capabilities are limited. As for whether the cloud model might be Google's Gemini, it most likely is not, given the current information (neither party has disclosed anything suggesting it), so Google has probably been ruled out.
In addition, several models were mentioned in the keynote, so the inference is that on-device multimodality (text, photos, etc.) is realized with multiple models rather than with a single model whose multimodal capabilities are unified at the foundation.
This raises a question to be verified: whether the on-device models use the Transformer architecture or build on Apple's earlier research. Given Apple's current technology and the maturity of large-model work on the Transformer architecture, building a multimodal "small" model for relatively simple tasks should not be too complicated for Apple.
3. Partnership verification
In the run-up to Apple putting large models on-device, several rumors circulated:
1. Apple partnering with OpenAI to put GPT on-device: when OpenAI released GPT-4o in May, especially with the emphasis on response speed, one could vaguely sense a push in this direction, and the voice-assistant demo strengthened that speculation further. This partnership has now been confirmed.
2. Apple partnering with Google to put Gemini on-device: thanks to the small Gemini models Google developed, the outside world once rumored a partnership to put Gemini on-device. Moreover, by the TOPS compute estimates at the time, Apple's chips could comfortably handle a model at the 3-billion-parameter scale, which deepened this inference (Apple itself had not produced standout results). At present the rumor has collapsed and no such partnership was reached.
3. Apple partnering with Baidu: because of network restrictions, GPT-4o cannot be called from mainland China (circumventing the firewall does not satisfy national security requirements), which opens the possibility of a partnership with Baidu. Also, according to Apple, different regional versions of iOS 18 will differ in their AI capabilities, which reinforces the case for such a partnership. Although there were earlier rumors and nothing has been confirmed, the staged rollout of AI capabilities with iOS 18 will allow this to be verified. Given the current state of large models in China, partnering with Baidu would be the best choice; after all, Wenxin Yiyan (ERNIE Bot) has consolidated its advantage on the consumer side.
Finally, on the pace of progress: Apple plans to use large language models to upgrade its assistant Siri, but these features may not launch until 2025, so reliable news is unlikely in the short term.
4. Agent orchestration, traffic estimation and memory
Frankly, this capability is not surprising. GPT-4o and the Google developer conference already showed concretely what large models can do today; Apple has integrated those pieces. Over the past two days the outside world has praised the combination of software and hardware, but the earlier demos from Google and Meta already demonstrated this kind of content.

One point deserves special mention. On travel planning and plan execution, the author has also built things on top of large models. Judging from the keynote, thanks to GPT-4o's own capabilities, its response speed, and the integration with Apple's system, the overall information flow is complete: task decomposition, orchestration, processing and recombination are all realized. Once the problem of slow large-model responses is overcome, efficient communication and task solving between the cloud and the device become possible.

What deserves emphasis here is the agent capability. After a task is decomposed, injecting the agent's shaping ability into each step improves overall capability more than GPT-4o alone would. Second, scenarios on the phone are relatively fixed steps, or steps that can be made relatively fixed, which makes this mode of operation feasible (open-ended -> relatively constrained). A sketch of this decompose/orchestrate/recombine flow follows.
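The sketch below illustrates the decompose, orchestrate and recombine flow described above in the simplest possible terms. None of these types correspond to actual Apple or OpenAI APIs; the sub-tasks are an invented example.

```swift
// Illustrative decompose -> orchestrate -> recombine flow; all types and the
// example sub-tasks are hypothetical, not real Apple or OpenAI APIs.
struct SubTask {
    let description: String
    let runsLocally: Bool   // fixed on-device step vs. open-ended cloud step
}

// Hypothetical planner: in practice the plan would come from the large model.
func decompose(_ request: String) -> [SubTask] {
    [
        SubTask(description: "Extract the flight time from Mail", runsLocally: true),
        SubTask(description: "Check the calendar for conflicts", runsLocally: true),
        SubTask(description: "Draft a reply proposing a new meeting time", runsLocally: false)
    ]
}

// Each sub-task goes to the agent best placed for it: fixed, on-device actions
// run locally, open-ended generation goes to the cloud model.
func execute(_ task: SubTask) -> String {
    task.runsLocally ? "local: \(task.description)" : "cloud: \(task.description)"
}

// Serial orchestration: run the sub-tasks in order, then recombine the outputs.
func orchestrate(_ request: String) -> String {
    decompose(request).map(execute).joined(separator: "\n")
}
```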
Here are also several sets of data released by Apple itself:
1. Siri users issue 1.5 billion voice requests every day
2. More than 1 billion Apple users worldwide
These figures raise two questions worth discussing:
1. How much concurrency can GPT-4o handle, and what is the response speed under that load?
2. Why does iOS also need to integrate local models?
Regarding question 1: GPT-4o itself has nearly 200 million users, and Apple has about 1 billion users worldwide (the share of GPT-4o users able to make normal network requests would need to be scaled by some coefficient). Assume the number of Apple users in the United States is about 140 million.
Siri handles 1.5 billion voice requests per day. Scaling by the US share of users, the estimated number of US requests is:
(140 million / 1 billion) × 1.5 billion ≈ 210 million requests per day
What fraction of Siri wake-ups and actions will actually invoke GPT-4o cannot be verified.
And how many user requests does GPT-4o already serve per day on its own? According to the latest data, OpenAI received about 750M visits in May, i.e. roughly 25 million per day over 30 days.
Combining the two gives roughly 235 million requests per day, and the real total may be higher still. Can OpenAI hold up, both in inference speed and in overall stability? A worked version of this estimate is sketched below.
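For transparency, the back-of-the-envelope numbers above can be reproduced directly; the inputs are the rough public figures quoted in this post, not measured data.

```swift
// Reproducing the rough load estimate above; all inputs are the public figures
// quoted in the post, not measured data.
let appleUsersWorldwide = 1_000_000_000.0
let appleUsersUS        = 140_000_000.0
let siriRequestsPerDay  = 1_500_000_000.0

// Estimated US share of daily Siri voice requests: ~210 million/day
let usSiriRequests = siriRequestsPerDay * (appleUsersUS / appleUsersWorldwide)

// OpenAI's existing traffic: ~750M visits in May ≈ 25 million/day over 30 days
let openAIDailyRequests = 750_000_000.0 / 30.0

// Even if only a fraction of Siri activations reach GPT-4o, the upper bound of
// the combined load approaches ~235 million requests/day.
let combined = usSiriRequests + openAIDailyRequests
print(usSiriRequests, openAIDailyRequests, combined)
// 210000000.0 25000000.0 235000000.0
```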
The answer to question 1 leads to question 2:
The author speculates that this is also part of the rationale for on-device local models: traffic diversion. Offloading to local models saves call costs on the one hand, and on the other buffers GPT-4o under high-concurrency pressure, preventing the poor user experience that stalls would cause.
Second, it also serves as a proving ground for Apple's on-device strategy; opportunities like this are rare.
However, the memory limitation problem follows. According to the latest news, although iOS 18 will eventually bring these model capabilities, they require more than 8 GB of memory, so many device models will be excluded by memory and chip compute constraints. In other words, is the market indirectly pushing handset upgrades? From the author's point of view, given the current state of large-model deployment and the overall performance of Apple's system, it is not worth replacing a phone just to try one round of large-model features. (A small capability-gate sketch follows.)
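A gate of the kind implied by the reported 8 GB floor might look like the sketch below; the threshold and the gating mechanism are assumptions, not Apple's published behavior.

```swift
import Foundation

// Hypothetical capability gate based on the reported "8 GB or more" requirement;
// the threshold and the gating logic are assumptions, not Apple's published rules.
let requiredBytes: UInt64 = 8 * 1024 * 1024 * 1024   // 8 GB
let installedBytes = ProcessInfo.processInfo.physicalMemory

if installedBytes >= requiredBytes {
    print("Device meets the reported memory floor for on-device model features.")
} else {
    print("Device falls below the reported floor; only cloud-backed features apply.")
}
```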
5. Security issues
As for the approach of partnering with OpenAI and combining the device side with a cloud large model, there is a significant risk to user information security. Musk weighed in that same morning:
1. Musk said the problem with "agreeing" to Apple's data-sharing terms is that no one really reads the terms and conditions, so users may be sold out without knowing it
2. Second, the harm to data security may not be obvious in the short term, but once enough data accumulates the harm is severe (data -> profiling users -> monitoring users). Moreover, with OpenAI's access, Apple's data becomes food for OpenAI, which keeps learning and growing from it. On this, Musk also said: "If Apple integrates OpenAI at the operating system level, then Apple devices will be banned from use in all my companies. This is an unacceptable security violation." Visitors to his companies would have to check their Apple devices at the door, to be stored in a Faraday cage.
3. In Musk's view, Apple is not smart enough to make its own artificial intelligence, yet claims it can ensure that OpenAI protects users' security and privacy, which he calls obviously ridiculous: once Apple gives your data to OpenAI, there is no way to know what will happen; they are selling you out.

The author is wary of this arrangement. Unlike ordinary information queries and conversations, once information on the phone is silently extracted by the model and passed to the cloud, and personal behavior is retained by the model through on-device services, OpenAI is fully capable of reconstructing an individual from real data in the cloud. Even if OpenAI and Apple do not abuse the data, a single data leak would be devastating to users, who carry enormous risk in the process. Moreover, the large-model capabilities Apple demonstrated are not worth that risk for users. Phones at home and abroad all have cloud components and information may be collected, but that information is not used for interaction. And the consequences of information integration under the backing of a new technology are always daunting.

6. Summary
1. This round of Apple putting large models on-device had long been anticipated by both the market and the author; after all, both the market and Apple itself have been promoting the topic from the angles of chips, model architecture and technology. In terms of actual results, however, after GPT-4o and the Google developer conference, the author did not find it particularly amazing. And since the on-device capabilities have not been fully opened up, whether they will fall short of that baseline is unknown.
2. Security is one of the most important issues. Judging from OpenAI's current handling of security, the risk will undoubtedly increase: its security problems and the state of its superalignment team have been publicly reported, and both the head of security and the key engineer responsible for security have resigned. Moreover, in interactions with GPT-4o, or the future GPT-5, all information sits in the cloud. How is that problem solved? Marketing copy alone is not enough. The same goes for storage after information is collected: how is data leakage prevented? The author believes this is an urgent problem that the large-model side of this arrangement needs to solve.

$Apple(AAPL.US)$