Source: Arm | Author: Arm
For artificial intelligence (AI), no single hardware or computing component can be a one-size-fits-all solution for every type of workload. AI spans the entire modern computing landscape, from the cloud to the edge, and meeting the full range of AI use cases requires a heterogeneous computing platform that can flexibly combine different compute engines such as CPUs, GPUs, and NPUs.
Thanks to their performance, power efficiency, accessibility, ease of programming, and flexibility, Arm CPUs have laid the foundation for AI acceleration on a wide variety of platforms, from small embedded devices to large data centers.
This flexibility benefits the ecosystem in three main ways. First, Arm CPUs can handle a wide range of AI inference use cases, many of which already run on billions of devices, from today's smartphones to cloud and data center servers; beyond inference, the CPU is also frequently used for adjacent tasks in the technology stack, such as data preprocessing and orchestration. Second, developers can run a wider range of software, across a wider variety of data formats, without having to build multiple versions of their code. Third, this same flexibility makes the CPU an ideal foundation for accelerating AI workloads.
Providing diversity and choice to help the industry deploy AI compute flexibly
In addition to its CPU portfolio, the Arm computing platform includes AI accelerator technologies such as GPUs and NPUs, which are being integrated alongside CPUs in many markets.
On mobile, Arm Compute Subsystems (CSS) for Client pairs Armv9.2 CPU clusters with the Arm Immortalis-G925 GPU to accelerate a variety of AI use cases, including image segmentation, object detection, natural language processing, and speech-to-text. In the Internet of Things (IoT), the Arm Ethos-U85 NPU can be designed into Arm Cortex-A based systems that require accelerated AI performance, for example in factory automation.
Beyond Arm's own accelerator technology, partners have the flexibility to build differentiated chip solutions around Arm CPUs. For example, NVIDIA's Grace Blackwell and Grace Hopper superchips for AI infrastructure combine Arm CPUs with NVIDIA's AI accelerator technology to significantly improve AI performance.
The NVIDIA Grace Blackwell superchip combines NVIDIA's Blackwell GPU architecture with Arm Neoverse-based Grace CPUs. Arm's unique product portfolio enabled NVIDIA to make system-level design optimizations that deliver 25 times lower power consumption and 30 times higher performance per GPU compared with the NVIDIA H100 GPU. Specifically, the flexibility of the Arm Neoverse platform allowed NVIDIA to implement its own high-bandwidth NVLink interconnect technology, improving data bandwidth and latency between CPUs, GPUs, and memory.
Arm is committed to bringing AI acceleration opportunities to the entire ecosystem through the Arm Total Design ecosystem program. Through it, partners gain faster access to Arm CSS technology, enabling hardware and software advances that drive AI and chip innovation and accelerate the development and deployment of AI-optimized chip solutions.
The Arm architecture provides the unique flexibility required by AI
The key to the flexibility of Arm CPU design is Arm's industry-leading architecture. It provides a foundational platform that can be tightly integrated with AI accelerator technology, and it supports vector lengths from 128 bits to 2,048 bits, making it easy to execute multiple neural networks efficiently across many different data points.
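As an illustration of that vector-length flexibility, the sketch below uses the Arm C Language Extensions (ACLE) intrinsics for SVE to write a vector-length-agnostic loop: the same binary runs unchanged on hardware with any SVE vector length from 128 to 2,048 bits. The saxpy kernel and its signature are our own illustrative choices, not something described in this article.

```c
/* A minimal vector-length-agnostic SAXPY (y = a*x + y) using SVE
 * intrinsics. Compile with, e.g.: gcc -O2 -march=armv8-a+sve saxpy.c
 * The loop never hard-codes a vector width: svcntw() reports how many
 * 32-bit lanes this CPU's vectors hold (4 on a 128-bit implementation,
 * 64 on a 2,048-bit one), and the predicate from svwhilelt masks off
 * the tail elements automatically.
 */
#include <arm_sve.h>
#include <stdint.h>

void saxpy(float a, const float *x, float *y, int64_t n) {
    for (int64_t i = 0; i < n; i += svcntw()) {
        svbool_t pg = svwhilelt_b32_s64(i, n);  /* active-lane predicate */
        svfloat32_t vx = svld1_f32(pg, x + i);  /* predicated loads */
        svfloat32_t vy = svld1_f32(pg, y + i);
        /* vy + a * vx, computed only in the active lanes */
        svst1_f32(pg, y + i, svmla_n_f32_m(pg, vy, vx, a));
    }
}
```

Because the width is discovered at run time rather than baked into the code, one build can scale from a small embedded core to a wide server core, which is exactly the portability claim made above.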
The flexibility of Arm's architecture creates diverse customization opportunities across the entire chip ecosystem, and Arm is committed to helping partners build their own differentiated chip solutions faster. This flexibility also lets Arm innovate continuously on the architecture, regularly rolling out key instructions and features that accelerate AI computing, benefiting the whole ecosystem, from leading chip partners to the more than 20 million software developers building applications on the Arm computing platform.
It all started with the Armv7 architecture, which introduced the Advanced Single Instruction Multiple Data (SIMD) extension known as Neon technology, Arm's first step toward machine learning (ML) workloads. The architecture has been enhanced steadily since: Armv8 added vector dot-product and matrix-multiplication instructions, and Armv9 introduced Arm SVE2 and the new Arm SME technology, improving compute performance and reducing power consumption for a wide range of generative AI workloads and use cases.
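To make those dot-product instructions concrete, here is a minimal sketch using the Neon SDOT intrinsic (vdotq_s32, available on Armv8.2-A and later cores with the dot-product feature) to accumulate int8 products into int32 lanes, the pattern at the heart of quantized ML inference. The function name and loop structure are illustrative assumptions, not code from Arm.

```c
/* int8 dot product with the Neon SDOT instruction, a building block of
 * quantized (int8) neural-network inference on Arm CPUs.
 * Compile with, e.g.: gcc -O2 -march=armv8.2-a+dotprod dot.c
 * Each vdotq_s32 multiplies 16 pairs of int8 values and accumulates
 * them into 4 int32 lanes in a single instruction.
 */
#include <arm_neon.h>
#include <stdint.h>

int32_t dot_s8(const int8_t *a, const int8_t *b, int n) {
    int32x4_t acc = vdupq_n_s32(0);
    int i = 0;
    for (; i + 16 <= n; i += 16) {
        int8x16_t va = vld1q_s8(a + i);
        int8x16_t vb = vld1q_s8(b + i);
        acc = vdotq_s32(acc, va, vb);  /* each lane += sum of 4 products */
    }
    int32_t sum = vaddvq_s32(acc);     /* horizontal add of the 4 lanes */
    for (; i < n; i++)                 /* scalar tail */
        sum += (int32_t)a[i] * b[i];
    return sum;
}
```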
Seamless integration with AI accelerator technology
Arm is the computing platform of the AI era, driving continuous architectural innovation to support ever faster, more interactive, and more immersive AI applications. Arm CPUs sit at the center of a heterogeneous approach to AI workloads, seamlessly complementing and integrating with AI accelerator technologies such as GPUs and NPUs.
The Arm CPU is a practical choice for many AI inference workloads, and thanks to its flexibility it integrates seamlessly with accelerator technology to deliver more powerful, higher-performance AI capabilities that precisely match specific use cases and computing needs. For Arm's technology partners, this flexibility opens up rich customization options, allowing them to build complete chip solutions for AI workloads.