
Changjiang Securities: Apple (AAPL.US) Releases the Apple Intelligence Assistant; Heterogeneous Chips May Become a New Direction for AI Computing Power

Zhitong Finance ·  Aug 13 23:11

Compared with GPUs, TPUs do not need to access memory as frequently, reducing interaction with storage and greatly improving computing efficiency.

According to the Zhitong Finance APP, Changjiang Securities released a research report stating that Apple (AAPL.US) launched a personal intelligent assistant, Apple Intelligence, at the 2024 Worldwide Developers Conference. The assistant contains multiple powerful generative models that can quickly and efficiently handle users' daily tasks and adapt instantly to their current activity. In addition, Apple released the AFM foundation models, which empower the underlying operating system; Apple Intelligence is powered by the AFM foundation models. In terms of computing power, the AFM models are supported by Google's (GOOG.US) TPU chips, whose performance is comparable to that of Nvidia's (NVDA.US) flagship chips.

Apple released Apple Intelligence, an intelligent assistant.

Apple released the AFM foundation models, which empower the underlying operating system. It also launched a personal intelligent assistant, Apple Intelligence, which contains multiple powerful generative models that can quickly and efficiently handle users' daily tasks and adapt instantly to their current activity. Apple Intelligence can write and refine text, prioritize and summarize notifications, and create fun images for conversations with family and friends, as well as simplify cross-app interactions by taking actions within apps. Apple Intelligence will be available on the iOS 18, iPadOS 18, and macOS Sequoia operating systems. It is currently available only in the iOS 18.1 beta for registered developers to try out (a developer subscription costs $99 per year); ordinary users still need to wait.

Apple Intelligence is powered by the AFM foundation models.

The AFM foundation models mainly comprise two parts: an on-device model and a cloud-side (server) model. The on-device model is designed for on-device application scenarios, handles only language-related single-modal tasks, and runs locally on devices such as the iPhone, iPad, and Mac, with roughly 3 billion parameters. The cloud-side model is designed for private-cloud application scenarios, has multi-modal capabilities, and generalizes better to more generic tasks. These two foundation models are part of the family of generative models created by Apple. Beyond them, Apple Intelligence also includes a coding model and a diffusion model: the coding model, built on the AFM language model, injects intelligent features into Xcode, while the diffusion model helps users express themselves visually, for example in the Messages app.
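To make the two-tier design concrete, here is a minimal, purely illustrative Python sketch of how a request might be routed between a local ~3B-parameter model and a private-cloud model based on task modality and difficulty. All names (`Task`, `route_request`, the threshold) are hypothetical and do not correspond to Apple's actual API.

```python
from dataclasses import dataclass

# Hypothetical illustration of the two-tier AFM design described above:
# a small on-device model for single-modal language tasks, and a larger
# private-cloud model for multi-modal or more general tasks.
# None of these names correspond to Apple's real APIs.

@dataclass
class Task:
    text: str
    has_image: bool = False           # multi-modal input?
    estimated_difficulty: float = 0.0 # 0.0 (trivial) .. 1.0 (hard)

ON_DEVICE_PARAMS = 3_000_000_000  # ~3 billion parameters, per the report
DIFFICULTY_CUTOFF = 0.7           # hypothetical routing threshold

def route_request(task: Task) -> str:
    """Pick an execution target for a task (illustrative only)."""
    if task.has_image:
        # The on-device model is described as language-only (single-modal),
        # so any multi-modal task would have to go to the cloud-side model.
        return "private-cloud model"
    if task.estimated_difficulty > DIFFICULTY_CUTOFF:
        # Harder, more generic tasks favor the larger server model.
        return "private-cloud model"
    return f"on-device model (~{ON_DEVICE_PARAMS // 10**9}B params)"

if __name__ == "__main__":
    print(route_request(Task("Summarize this notification")))          # on-device
    print(route_request(Task("Describe this photo", has_image=True)))  # cloud
```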

The performance of the AFM cloud-side model is comparable to that of GPT-3.5 and slightly inferior to that of GPT-4.

In the model-evaluation phase, Apple designed 1,393 tasks to compare the AFM models with other mainstream models. The results show that the AFM cloud-side model outperforms the Mixtral-8x22B mixture-of-experts model and GPT-3.5, and is slightly inferior to GPT-4 and LLaMA-3-70B; the AFM on-device model performs on par with mainstream on-device models on the market. Human evaluation shows that the AFM on-device model outperforms mainstream models such as Gemma-7B, Phi-3-mini, Mistral-7B, and Gemma-2B, and is slightly inferior to LLaMA-3-8B. These results demonstrate the strong performance of the AFM on-device model, which is expected to be highly practical on devices such as the iPhone and iPad.
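Human-evaluation results of this kind are typically reported as pairwise win rates. The short Python sketch below shows how such a comparison is commonly tallied; the sample counts are invented for illustration and are not figures from the report.

```python
# Illustrative tally of a pairwise human evaluation: raters see outputs from
# two models for the same prompt and mark a win, tie, or loss. The counts
# below are made up for demonstration and are not Apple's actual results.

def win_rate(wins: int, ties: int, losses: int) -> float:
    """Fraction of comparisons won, counting ties as half a win."""
    total = wins + ties + losses
    return (wins + 0.5 * ties) / total if total else 0.0

comparisons = {
    "AFM-on-device vs. Gemma-7B":   (620, 180, 200),  # hypothetical counts
    "AFM-on-device vs. Phi-3-mini": (580, 220, 200),
    "AFM-on-device vs. LLaMA-3-8B": (430, 190, 380),
}

for matchup, (w, t, l) in comparisons.items():
    print(f"{matchup}: win rate {win_rate(w, t, l):.1%}")
```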

Heterogeneous chips may become a new direction for AI computing power development.

In terms of computing power, the AFM models are supported by Google's TPU chips, with Google providing the computing power for this training. The cloud-side AFM-server model was trained on 8,192 TPU v4 chips. During training, Apple divided the 8,192 chips into 8 groups, each group consisting of 1,024 interconnected chips forming a basic unit; the groups run in parallel with one another, and training data and iterations are completed within each group. The on-device AFM-on-device model was trained on 2,048 TPU v5p chips.
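As a sanity check on the topology described above (8 groups x 1,024 chips = 8,192 TPU v4 chips, with the groups running in parallel), here is a small Python sketch that splits a global training batch across the groups. The batch size and helper names are assumptions chosen for illustration, not details from the report.

```python
# Illustrative arithmetic for the training setup described above:
# 8 parallel groups of 1,024 TPU v4 chips each (8,192 chips total), with the
# global batch split across groups so each group trains on its own shard.
# The batch size below is an invented example, not a figure from the report.

NUM_GROUPS = 8
CHIPS_PER_GROUP = 1024
TOTAL_CHIPS = NUM_GROUPS * CHIPS_PER_GROUP
assert TOTAL_CHIPS == 8192

def shard_batch(global_batch_size: int, num_groups: int = NUM_GROUPS) -> list[int]:
    """Split a global batch as evenly as possible across parallel groups."""
    base, rem = divmod(global_batch_size, num_groups)
    return [base + (1 if i < rem else 0) for i in range(num_groups)]

if __name__ == "__main__":
    global_batch = 4096  # hypothetical example
    per_group = shard_batch(global_batch)
    print(f"{TOTAL_CHIPS} chips in {NUM_GROUPS} groups of {CHIPS_PER_GROUP}")
    print(f"per-group batch sizes: {per_group}")
```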

A TPU (Tensor Processing Unit) is an ASIC chip designed specifically for tensor operations. The TPU achieves efficient computation through a systolic-array mechanism. Compared with GPUs, TPUs do not need to access memory as frequently, reducing interaction with storage and greatly improving computing efficiency. As a result, the effective utilization rate of a TPU's computing power is higher than that of a GPU: GPU compute utilization is usually 20%-40%, while TPU compute utilization often exceeds 50%.
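The utilization figures above can be read as the ratio of sustained to peak throughput. The sketch below works through that arithmetic with a hypothetical 100 TFLOP/s accelerator; the peak figure is an assumption chosen only to keep the numbers easy to read.

```python
# Compute utilization = sustained throughput / peak throughput, using the
# 20%-40% (GPU) and >50% (TPU) ranges quoted above. The 100 TFLOP/s peak is
# a hypothetical round number, not a spec of any particular chip.

PEAK_TFLOPS = 100.0  # hypothetical accelerator peak throughput

def utilization(sustained_tflops: float, peak_tflops: float = PEAK_TFLOPS) -> float:
    """Fraction of peak compute actually sustained by a training job."""
    return sustained_tflops / peak_tflops

examples = {
    "GPU, middle of quoted 20%-40% range": 30.0,  # 30 of 100 TFLOP/s sustained
    "TPU, above quoted 50% threshold":     55.0,  # 55 of 100 TFLOP/s sustained
}

for name, sustained in examples.items():
    print(f"{name}: {utilization(sustained):.0%} utilization")
```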

Risk warning

1. AI technology develops more slowly than expected;

2. Downstream application demand falls short of expectations.

Disclaimer: This content is for informational and educational purposes only and does not constitute a recommendation or endorsement of any specific investment or investment strategy.