Why did Intel Falcon Shores go from XPU to GPU?
By Yuwei Liu, Associate Principal Analyst, Electronic Engineering Album

Falcon Shores is the designated successor to Ponte Vecchio, taking the place of Rialto Bridge after that chip's cancellation. According to earlier reports, Falcon Shores was to be a hybrid XPU product that integrated GPU, CPU and memory in one package. However, at Monday's event, Intel confirmed that Falcon Shores will no longer be an XPU and will instead be simply a GPU.

According to Reuters, Intel provided more details on Monday (May 22) local time about an artificial intelligence (AI) computing chip it plans to launch in 2025.

Speaking at a supercomputing conference in Germany, Intel said the upcoming "Falcon Shores" chip will have 288GB of HBM3 memory and 9.8TB/s of total memory throughput and, as expected, will support smaller data types such as FP8 and BF16. These are the first details Intel has disclosed about its strategic shift toward the AI processor market, where it is trying to catch up with Nvidia and AMD.
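
To put these figures in perspective, the short Python sketch below estimates how many values would fit in 288GB at different data widths, and how long one full pass over that memory would take at 9.8TB/s. The function names are hypothetical, decimal units (1GB = 10^9 bytes) are assumed, and the estimate ignores activations, caches and every other overhead, so it is rough arithmetic rather than a performance prediction.

```python
# Rough arithmetic only; capacity and bandwidth are the figures quoted above.
HBM_CAPACITY_BYTES = 288e9       # 288 GB, assuming decimal gigabytes
HBM_BANDWIDTH_BYTES_S = 9.8e12   # 9.8 TB/s, assuming decimal terabytes

# Storage size of one value for each data type.
BYTES_PER_VALUE = {"FP32": 4, "BF16": 2, "FP8": 1}


def max_parameters(dtype: str) -> float:
    """Upper bound on values that fit in memory, ignoring all overhead."""
    return HBM_CAPACITY_BYTES / BYTES_PER_VALUE[dtype]


def full_sweep_seconds() -> float:
    """Idealised time to read the whole memory once at peak bandwidth."""
    return HBM_CAPACITY_BYTES / HBM_BANDWIDTH_BYTES_S


if __name__ == "__main__":
    for dtype in BYTES_PER_VALUE:
        print(f"{dtype}: ~{max_parameters(dtype) / 1e9:.0f} billion values")
    print(f"One full memory sweep: ~{full_sweep_seconds() * 1e3:.1f} ms")
```

Under these assumptions, halving the data width doubles the number of values that fit: about 144 billion at BF16 and 288 billion at FP8. That is one reason support for smaller data types matters for large AI models.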

The Falcon Shores GPU will be part of Intel's Data Center GPU Max family and will come with standard Ethernet switching, much like Intel's AI-focused Gaudi architecture. In addition, Falcon Shores GPUs will use a modular chiplet design, like Ponte Vecchio, while letting software target the package as a single GPU. This foundational architecture is flexible enough to integrate new Intel and customer IP (including CPU cores and other artefacts) over time, manufactured using the Intel IDM 2.0 model.

The basic outline of the device also includes a common GPU programming interface, oneAPI, which will allow for broad compatibility with other CPUs and architectures. Intel also lists CXL (Compute Express Link) support as a key differentiator, since it allows GPUs, AI chips and other accelerators to easily access large pools of storage and memory.


Why the change from XPU to GPU?

Here we break down the deeper reasons behind Intel pulling CPU cores out of Falcon Shores.

[Figure: image with no alt text]

Intel says that the current computing environment is not yet mature enough to achieve the original goal of mixing CPU and GPU cores in the same Falcon Shores package. As the figure above shows, the optimal mix of CPU and GPU cores keeps changing as generative AI and large language models (LLMs) move into the HPC space and shift the workloads placed on each type of processor. This has changed Intel's thinking about how to build the next generation of supercomputing architectures, and the company believes that now is not the time to lock customers into a specific CPU-to-GPU ratio.

That said, cutting-edge supercomputers are by design highly specialised for specific tasks, and tuning software for the architecture is a routine part of running one. This suggests that the CPU/GPU ratio is not the only reason Intel is removing CPU cores from the design.

Intel also noted that Falcon Shores allows its customers to use a variety of different CPUs alongside its GPU design, presumably including AMD's x86 and Nvidia's Arm chips, so customers are not limited to Intel's x86 cores. Decoupling CPUs from GPUs will give customers with different workloads more options.

Intel said the purpose of using the CXL interface is to let its customers combine various CPU/GPU ratios in their custom designs using a composable architecture. However, the CXL interface provides only 64 GB/s of throughput between chips, while custom CPU+GPU designs like Nvidia's Grace Hopper can provide up to 1 TB/s of memory throughput between the CPU and GPU. For many types of workloads, especially AI workloads that depend on memory bandwidth, the tighter coupling offers performance and efficiency advantages over CXL implementations.
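
The size of that gap is easier to see with some arithmetic. The Python sketch below compares how long it would take to move a buffer between CPU and GPU at the two bandwidths quoted above. The 100 GB buffer size and the function name are assumptions chosen for illustration, and the calculation uses peak figures with no latency or protocol overhead.

```python
# Peak figures quoted in the text; real links add latency and protocol overhead.
CXL_BYTES_PER_S = 64e9        # 64 GB/s over CXL
COUPLED_BYTES_PER_S = 1e12    # up to 1 TB/s for a Grace Hopper-style design


def transfer_seconds(num_bytes: float, bytes_per_s: float) -> float:
    """Idealised time to move num_bytes over a link at its peak bandwidth."""
    if bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return num_bytes / bytes_per_s


if __name__ == "__main__":
    buffer_bytes = 100e9  # hypothetical 100 GB working set
    cxl = transfer_seconds(buffer_bytes, CXL_BYTES_PER_S)
    coupled = transfer_seconds(buffer_bytes, COUPLED_BYTES_PER_S)
    print(f"CXL: {cxl:.2f} s")
    print(f"Tightly coupled: {coupled:.2f} s")
    print(f"Ratio: {cxl / coupled:.1f}x")
```

At these peak rates the tightly coupled design moves the same data roughly 15 to 16 times faster, which matters most for workloads that repeatedly shuttle large buffers between CPU and GPU memory.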


Ponte Vecchio will last two more years

Until Falcon Shores comes out in 2025, Ponte Vecchio will remain Intel's best GPU solution for the AI and HPC markets. It will have to compete with more advanced HPC architectures such as Nvidia's Grace Superchips and AMD's upcoming CDNA 3/Zen 4 hybrid (exascale APU) Instinct MI300, both due out in 2023.

[Figure: Intel Data Center GPU Max 1550 (Ponte Vecchio) versus NVIDIA H100 PCIe (Hopper)]

The main reason for the change to Falcon Shores is that Intel currently plans to have two product lines, and launching Falcon Shores as a GPU will significantly increase the flexibility of its lineup. The XPU part for HPC is still in development, but it will not be part of the initial Falcon Shores release.

The Arctic Sound-M parts were originally intended for the virtual, cloud-accelerated market. As previously planned, the current products were supposed to be replaced this year by the new Rialto Bridge and Lancaster Sound series. However, the latest roadmap shows development of these two products being discontinued in favour of moving directly to the next generation, Falcon Shores.

On the other hand, the Habana Gaudi line for the AI segment will not be updated beyond the third generation and will be replaced by Falcon Shores. Intel said it "plan[s] to integrate the Habana and AXG product (GPU) roadmaps", but revealed few details of the integration.

The Gaudi compute architecture is so different from standard GPUs that it seems impossible to fully integrate its compute architecture into a GPU. Intel could therefore integrate smaller parts of the Gaudi design (such as its network interface or other IP blocks) into its GPUs. 

AMD's Instinct MI300 and NVIDIA's Grace Hopper are both known to use hybrid CPU+GPU designs, which have the advantage of reducing cost and saving power, but can tie customers' product designs closely to the vendor's chosen configurations.

In contrast, Intel's approach of a pure GPU that can be flexibly paired with a separate CPU suits certain workloads, but it may not be able to compete with its rivals on power, cost or performance in some applications.
