Already solidly in the driver’s seat of the generative AI accelerator market, NVIDIA has long made it clear that the company isn’t about to slow down and check out the view. Instead, NVIDIA intends to continue iterating along its multi-generational product roadmap for GPUs and accelerators, to leverage its early advantage and stay ahead of its ever-growing coterie of competitors in the accelerator market. So while NVIDIA’s ridiculously popular H100/H200/GH200 series of accelerators is the hottest ticket in Silicon Valley, it’s already time to talk about the next generation accelerator architecture to feed NVIDIA’s AI ambitions: Blackwell.
GPUs
Computex keynote season is kicking into high gear this morning with the show's leading keynote, which is being delivered by AMD. Company CEO Dr. Lisa Su will be presenting a keynote entitled “The future of high-performance computing in the AI era,” and with a run time of 90 minutes, we're expecting AMD to have a whole host of product announcements covering their full spectrum of product categories.
The big expectation here is fresh news around AMD’s Zen 5 CPU core architecture, and the chips built around it. AMD’s most recent Zen 5 roadmap has it slated to deliver all three flavors of Zen 5 by the end of this year, and we’re coming up on the two-year anniversary of the Zen 4 architecture launch.
Along with client chips, AMD has been pushing their server CPUs hard, and they’ve previously told investors that the next-gen EPYC Turin CPU is “looking great”. So we’ll likely hear about both client and server Zen 5 product plans during this keynote.
On the GPU/accelerator side of matters, AMD is mid-cycle (at best) with their Instinct MI300 series accelerators. With the company’s sales repeatedly beating their own expectations, AMD doesn’t seem to need much help moving this premium silicon right now. But with AI being the operative buzzword of this year’s Computex (and indeed, the computing industry as a whole), it would be weird for AMD to not have something to say about their rapidly growing AI accelerator product line.
Come join us at 6:30pm PT / 9:30pm ET / 01:30 UTC to get all the details.
Live Blog
A few years back, the Japanese government's New Energy and Industrial Technology Development Organization (NEDO) allocated funding for the development of green datacenter technologies. With the aim of obtaining up to 40% savings in overall power consumption, several Japanese companies have been developing an optical interface for their enterprise SSDs. And at this year's FMS, Kioxia had their optical interface on display.
For this demonstration, Kioxia took its existing CM7 enterprise SSD and created an optical interface for it. A PCIe card with on-board optics developed by Kyocera is installed in the server slot. An optical interface allows data transfer over long distances (it was 40m in the demo, but Kioxia promises lengths of up to 100m for the cable in the future). This allows the storage to be kept in a separate room with minimal cooling requirements compared to the rack with the CPUs and GPUs. Disaggregation of different server components will become an option as very high throughput interfaces such as PCIe 7.0 (with 128 GT/s rates) become available.
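To put the 128 GT/s figure in context, here is a rough back-of-the-envelope sketch of raw link bandwidth. The per-lane rates for PCIe 5.0 and 6.0 are the published spec rates; the x4 link width is an assumption chosen for illustration (it is not stated for the demo), and the helper name `raw_link_gbytes_per_s` is hypothetical. The calculation ignores encoding, FLIT, and protocol overhead, so real-world throughput would be lower.

```python
# Raw per-lane signaling rates in GT/s. Each transfer carries one bit per lane,
# so GT/s is numerically equal to raw Gbit/s per lane.
PCIE_GTS = {"5.0": 32, "6.0": 64, "7.0": 128}


def raw_link_gbytes_per_s(gen: str, lanes: int) -> float:
    """Raw one-direction bandwidth in GB/s (no encoding or protocol overhead)."""
    return PCIE_GTS[gen] * lanes / 8  # 8 bits per byte


if __name__ == "__main__":
    # Assumption: a x4 link, used here purely as an example width.
    for gen in PCIE_GTS:
        print(f"PCIe {gen} x4: {raw_link_gbytes_per_s(gen, 4):.0f} GB/s raw per direction")
```

Under these assumptions, each generation doubles the raw bandwidth available to a link of the same width, which is what makes carrying storage traffic to a separate room more plausible over time.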
The demonstration of the optical SSD showed a slight loss in IOPS performance, but a significant advantage in the latency metric over the shipping enterprise SSD behind a copper network link. Obviously, there are advantages in wiring requirements and signal integrity maintenance with optical links.
Since this is a proof-of-concept demonstration, an industry-standard approach will be required if the technology is to gain adoption among different datacenter vendors. The PCI-SIG optical workgroup will need to get its act together soon to create a standards-based approach to this problem.
Storage
Kioxia's booth at FMS 2024 was a busy one with multiple technology demonstrations keeping visitors occupied. A walk-through of the BiCS 8 manufacturing process was the first to grab my attention. Kioxia and Western Digital announced the sampling of BiCS 8 in March 2023. We had touched briefly upon its CMOS Bonded Array (CBA) scheme in our coverage of Kioxia's 2Tb QLC NAND device and coverage of Western Digital's 128 TB QLC enterprise SSD proof-of-concept demonstration. At Kioxia's booth, we got more insights.
Traditionally, fabrication of flash chips involved placement of the associated logic circuitry (CMOS process) around the periphery of the flash array. The process then moved on to putting the CMOS under the cell array, but the wafer development process was serialized, with the CMOS logic getting fabricated first followed by the cell array on top. However, this has some challenges because the cell array requires a high-temperature processing step to ensure higher reliability, and that step can be detrimental to the health of the CMOS logic. Thanks to recent advancements in wafer bonding techniques, the new CBA process allows the CMOS wafer and cell array wafer to be processed independently in parallel and then pieced together, as shown in the models above.
The BiCS 8 3D NAND incorporates 218 layers, compared to 112 layers in BiCS 5 and 162 layers in BiCS 6. The company decided to skip over BiCS 7 (or, rather, it was probably a short-lived generation meant as an internal test vehicle). The generation retains the four-plane charge trap structure of BiCS 6. In its TLC avatar, it is available as a 1 Tbit device. The QLC version is available in two capacities - 1 Tbit and 2 Tbit.
Kioxia also noted that while the number of layers (218) doesn't compare favorably with the latest layer counts from the competition, its lateral scaling / cell shrinkage has enabled it to be competitive in terms of bit density as well as operating speeds (3200 MT/s). For reference, the latest shipping NAND from Micron - the G9 - has 276 layers with a bit density in TLC mode of 21 Gbit/mm², and operates at up to 3600 MT/s. However, Micron's 232L NAND operates only up to 2400 MT/s and has a bit density of 14.6 Gbit/mm².
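A quick way to see why layer count alone does not determine bit density is to divide the quoted density by the layer count. The sketch below does this for the two Micron parts using only the figures cited above; it is a crude illustration that ignores differences in periphery area, cell type, and die layout. The function names are hypothetical, and the conversion from MT/s to MB/s assumes an 8-bit-wide NAND data bus.

```python
# Figures quoted in the text for Micron's NAND (TLC mode).
MICRON_NAND = {
    "G9 (276L)": {"layers": 276, "gbit_per_mm2": 21.0, "mt_per_s": 3600},
    "232L": {"layers": 232, "gbit_per_mm2": 14.6, "mt_per_s": 2400},
}


def density_per_layer(layers: int, gbit_per_mm2: float) -> float:
    """Bit density contributed per layer, in Gbit/mm^2 per layer."""
    return gbit_per_mm2 / layers


def interface_mbytes_per_s(mt_per_s: int, bus_bits: int = 8) -> float:
    """Peak interface throughput in MB/s. Assumption: 8-bit data bus by default."""
    return mt_per_s * bus_bits / 8


if __name__ == "__main__":
    for name, part in MICRON_NAND.items():
        per_layer = density_per_layer(part["layers"], part["gbit_per_mm2"])
        peak = interface_mbytes_per_s(part["mt_per_s"])
        print(f"{name}: {per_layer:.3f} Gbit/mm^2 per layer, {peak:.0f} MB/s peak interface")
```

If the per-layer figure rises between generations, part of the density gain comes from lateral scaling rather than from stacking more layers, which is the lever Kioxia says it is relying on with BiCS 8.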
It must be noted that the CBA hybrid bonding process has advantages over the current processes used by other vendors - including Micron's CMOS under array (CuA) and SK hynix's 4D PUC (periphery-under-cell) developed in the late 2010s. It is expected that other NAND vendors will also eventually move to some variant of the hybrid bonding scheme used by Kioxia.
Storage