The curtains are drawn and it’s almost showtime for Qualcomm and its Snapdragon X SoC team. After first detailing the SoC nearly 8 months ago at the company’s most recent Snapdragon Summit, and making numerous performance disclosures in the intervening months, the Snapdragon X Elite and Snapdragon X Plus launch is nearly upon us. The chips have already shipped to Qualcomm’s laptop partners, and the first laptops are set to ship next week.
In the last 8 months Qualcomm has made a lot of interesting claims for their high-performance Windows-on-Arm SoC – many of which will be put to the test in the coming weeks. But beyond all the performance claims and bluster amidst what is shaping up to be a highly competitive environment for PC CPUs, there’s an even more fundamental question about the Snapdragon X that we’ve been dying to get to: how does it work?
Ahead of next week’s launch, then, we’re finally getting the answer to that, as today Qualcomm is releasing their long-awaited architectural disclosure on the Snapdragon X SoC. This includes not only their new, custom Arm v8 “Oryon” CPU core, but also technical disclosures on their Adreno GPU, and the Hexagon NPU that backs their heavily-promoted AI capabilities. The company has made it clear in the past that the Snapdragon X is a serious, top-priority effort for the company – that they’re not just slapping together a Windows SoC from their existing IP blocks and calling it a day – so there’s a great deal of novel technology within the SoC.
And while we’re excited to look at it all, we’ll also be the first to admit that we’re the most excited to finally get to take a deep dive on Oryon, Qualcomm’s custom-built Arm CPU cores. The first new high-performance CPU design created from scratch in the last several years, the significance of Oryon cannot be overstated. Besides providing the basis of a new generation of Windows-on-Arm SoCs that Qualcomm hopes will vault them into contention in the Windows PC marketplace, Oryon will also be the basis of Qualcomm’s traditional Snapdragon mobile handset and tablet SoCs going forward.
So a great deal of the company’s hardware over the next few years is riding on this CPU architecture – and if all goes according to plan, there will be many more generations of Oryon to follow. One way or another, it’s going to set Qualcomm apart from its competitors in both the PC and mobile spaces, as it means Qualcomm is moving on from Arm’s reference designs, which by their very nature are accessible Qualcomm’s competition as well.
So without further ado, let’s dive in to Qualcomm’s Snapdragon X SoC architecture.
CPUsLater this year Intel is set to introduce its Xeon 6-branded processors, codenamed Granite Rapids (6x00P) and Sierra Forest (6x00E). And with it will come a new slew of server motherboards and pre-built server platforms to go with it. On the latter note, this will be the first generation where Intel won't be offering any pre-builts of its own, after selling that business off to MiTAC last year.
To that end, MiTAC and its subsidiary Tyan were at this year's event to demonstrate what they've been up to since acquiring Intel's server business unit, as well as to show off the server platforms they're developing for the Xeon 6 family. Altogether, the companies had two server platforms on display – a compact 2S system, and a larger 2S system with significant expansion capabilities – as well as a pair of single-socket designs from Tyan.
The most basic platform that MiTAC had to show is their TX86-E7148 (Katmai Pass), a half-width 1U system that's the successor to Intel's D50DNP platform. Katmai Pass has two CPU sockets, supports up to 2 TB of DDR5-6400 RDIMMs over 16 slots (8 per CPU), and has two low-profile PCIe 5.0 x16 slots. Like its predecessor, this platform is aimed at mainstream servers that do not need a lot of storage or room to house bulky add-in cards like AI accelerators.
The company's other platform is TX77A-E7142 (Deer Creek Pass), a considerably more serious offering that replaces Intel's M50FCP platform. This board can house up to 4 TB of DDR5-6400 RDIMMs over 32 slots (16 per CPU with 2DPC), four PCIe 5.0 x16 slots, one PCIe 5.0 x8 slot, two OCP 3.0 slots, and 24 hot-swap U.2 bays. Deer Creek Pass can be used both for general-purpose workloads, high-performance storage, as well as workloads that require GPUs or other special-purpose accelerators.
Meanwhile Tyan had the single-socket Thunder CX GC73A-B5660 on display. That system supports up to 2 TB of DDR5-6400 memory over 16 RDIMMs and offers two PCIe 5.0 x16 slots, one PCIe 4.0 x4 M.2 slot, two OCP 3.0 slots, and 12 hot-swappable U.2 drive bays.
Finally, Tyan's Thunder HX S5662 is an HPC server board specifically designed to house multiple AI accelerators and other large PCIe cards. This board supports one Xeon 6 6700 processor, up to 1 TB of memory over eight DDR5-6400 RDIMMs, and has five tradiitonal PCIe 5.0 x16 slots as well as two PCIe 5.0 x2 M.2 slots for storage.
MiTAC is expected to start shipments of these new Xeon 6 motherboards in the coming months, as Intel rolls out its next-generation datacenter CPUs. Pricing of these platforms is unknown for now, but expect it to be comparable to... Servers
The CXL consortium has had a regular presence at FMS (which rechristened itself from 'Flash Memory Summit' to the 'Future of Memory and Storage' this year). Back at FMS 2022, the company had announced v3.0 of the CXL specifications. This was followed by CXL 3.1's introduction at Supercomputing 2023. Having started off as a host to device interconnect standard, it had slowly subsumed other competing standards such as OpenCAPI and Gen-Z. As a result, the specifications started to encompass a wide variety of use-cases by building a protocol on top of the the ubiquitous PCIe expansion bus. The CXL consortium comprises of heavyweights such as AMD and Intel, as well as a large number of startup companies attempting to play in different segments on the device side. At FMS 2024, CXL had a prime position in the booth demos of many vendors.
The migration of server platforms from DDR4 to DDR5, along with the rise of workloads demanding large RAM capacity (but not particularly sensitive to either memory bandwidth or latency), has opened up memory expansion modules as one of the first set of widely available CXL devices. Over the last couple of years, we have had product announcements from Samsung and Micron in this area.
At FMS 2024, SK hynix was showing off their DDR5-based CMM-DDR5 CXL memory module with a 128 GB capacity. The company was also detailing their associated Heterogeneous Memory Software Development Kit (HMSDK) - a set of libraries and tools at both the kernel and user levels aimed at increasing the ease of use of CXL memory. This is achieved in part by considering the memory pyramid / hierarchy and relocating the data between the server's main memory (DRAM) and the CXL device based on usage frequency.
The CMM-DDR5 CXL memory module comes in the SDFF form-factor (E3.S 2T) with a PCIe 3.0 x8 host interface. The internal memory is based on 1α technology DRAM, and the device promises DDR5-class bandwidth and latency within a single NUMA hop. As these memory modules are meant to be used in datacenters and enterprises, the firmware includes features for RAS (reliability, availability, and serviceability) along with secure boot and other management features.
SK hynix was also demonstrating Niagara 2.0 - a hardware solution (currently based on FPGAs) to enable memory pooling and sharing - i.e, connecting multiple CXL memories to allow different hosts (CPUs and GPUs) to optimally share their capacity. The previous version only allowed capacity sharing, but the latest version enables sharing of data also. SK hynix had presented these solutions at the CXL DevCon 2024 earlier this year, but some progress seems to have been made in finalizing the specifications of the CMM-DDR5 at FMS 2024.
Micron had unveiled the CZ120 CXL Memory Expansion Module last year based on the Microchip SMC 2000 series CXL memory controller. At FMS 2024, Micron and Microchip had a demonstration of the module on a Granite Rapids server.
Additional insights into the SMC 2000 controller were also provided.
The CXL memory controller also incorporates DRAM die failure handling, and Microchip also provides diagnostics and debug tools to analyze failed modules. The memory controller also supports ECC, which forms part of the enterprise... Storage
0 Comments