At FMS 2024, Kioxia gave a proof-of-concept demonstration of its proposed RAID offload methodology for enterprise SSDs. The impetus for this is quite clear: as SSDs get faster with each generation, RAID arrays struggle to maintain (and scale up) performance. Even in cases where the RAID operations are handled by a dedicated RAID card, a simple write request in, say, a RAID 5 array involves two reads and two writes to different drives. In cases where there is no hardware acceleration, the data from the reads needs to travel all the way back to the CPU and main memory for further processing before the writes can be done.
Kioxia has proposed the use of the PCIe direct memory access feature along with the SSD controller's controller memory buffer (CMB) to avoid the movement of data up to the CPU and back. The required parity computation is done by an accelerator block resident within the SSD controller.
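The two reads and two writes behind a small RAID 5 write, and the parity computation that would move into the SSD controller, can be illustrated with a minimal Python sketch. It assumes conventional single-parity XOR as used in RAID 5; the function names and the three-drive stripe are hypothetical, and this is not Kioxia's implementation.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """Bytewise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same length")
    out = bytearray(size)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def raid5_small_write(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """Return the updated parity chunk for a single-chunk write.

    Read 1: old_data (from the data drive)
    Read 2: old_parity (from the parity drive)
    Write 1: new_data (to the data drive)
    Write 2: the returned parity (to the parity drive)
    """
    return xor_blocks(old_parity, old_data, new_data)


# Hypothetical stripe with three data chunks and one parity chunk.
d0, d1, d2 = b"\x11\x22", b"\x33\x44", b"\x55\x66"
parity = xor_blocks(d0, d1, d2)

# Overwrite d1 without touching d0 or d2.
new_d1 = b"\xaa\xbb"
new_parity = raid5_small_write(d1, parity, new_d1)

# The incremental update matches a full recomputation over the stripe.
assert new_parity == xor_blocks(d0, new_d1, d2)
```

In software RAID, every buffer in this sketch lives in host DRAM and the XOR runs on the host CPU; the proposal described above keeps both on the drives.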
In Kioxia's PoC implementation, the DMA engine can access the entire host address space (including the peer SSD's BAR-mapped CMB), allowing it to receive and transfer data as required from neighboring SSDs on the bus. Kioxia noted that its offload PoC saw a reduction of close to 50% in CPU utilization and upwards of 90% in system DRAM utilization compared to software RAID done on the CPU. The proposed offload scheme can also handle scrubbing operations without taking up host CPU cycles for the parity computation task.
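For context, a scrub pass checks that each stripe's stored parity still matches the parity recomputed from its data chunks. The sketch below shows that check in Python, assuming single-parity XOR; the function name and sample chunks are hypothetical and not taken from Kioxia's demonstration.

```python
from functools import reduce
from operator import xor


def stripe_is_consistent(data_chunks: list[bytes], stored_parity: bytes) -> bool:
    """Recompute XOR parity over the data chunks and compare it to the stored parity."""
    if any(len(chunk) != len(stored_parity) for chunk in data_chunks):
        raise ValueError("all chunks must match the parity length")
    # zip(*chunks) walks the stripe column by column (one byte from each chunk).
    recomputed = bytes(reduce(xor, column) for column in zip(*data_chunks))
    return recomputed == stored_parity


chunks = [b"\x01\x02", b"\x04\x08"]
good = stripe_is_consistent(chunks, b"\x05\x0a")  # parity matches
bad = stripe_is_consistent(chunks, b"\x05\x0b")   # one bit flipped in the parity
```

This is the per-stripe computation that the offload scheme would run on the SSD controller rather than on the host CPU.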
Kioxia has already taken steps to contribute these features to the NVM Express working group. If accepted, the proposed offload scheme will be part of a standard that could become widely available across multiple SSD vendors.
Storage
The AMD Ryzen 9 9950X and Ryzen 9 9900X Review: Flagship Zen 5 Soars - and Stalls
Earlier this month, AMD launched the first two desktop CPUs using their latest Zen 5 microarchitecture: the Ryzen 7 9700X and the Ryzen 5 9600X. As part of the new Ryzen 9000 family, these chips brought the latest Zen 5 cores to the desktop market; AMD had actually launched Zen 5 first on its mobile platform last month, with the Ryzen AI 300 series (which we reviewed). Today, AMD is launching the remaining two Ryzen 9000 SKUs first announced at Computex 2024, completing the current Ryzen 9000 product stack. Both chips hail from the premium Ryzen 9 series: the flagship Ryzen 9 9950X has 16 Zen 5 cores and can boost as high as 5.7 GHz, while the Ryzen 9 9900X has 12 Zen 5 cores and offers boost clock speeds of up to 5.6 GHz. Although they took slightly longer than expected to launch, as there was a delay from the initial launch date of July 31st, the full quartet of Ryzen 9000 X series processors armed with the latest Zen 5 cores is now available. All of the Ryzen 9000 series processors use the same AM5 socket as the previous Ryzen 7000 (Zen 4) series, which means users can use current X670E and X670 motherboards with the new chips. Unfortunately, as we highlighted in our Ryzen 7 9700X and Ryzen 5 9600X review, the X870E/X870 motherboards, which were meant to launch alongside the Ryzen 9000 series, won't be available until sometime in September. We've seen how the entry-level Ryzen 5 9600X and the mid-range Ryzen 7 9700X perform against the competition, but it's time to see how the flagship Ryzen 9 pairing competes. The Ryzen 9 9950X (16C/32T) and the Ryzen 9 9900X (12C/24T) both have a higher TDP (170 W and 120 W, respectively) than the Ryzen 7 and Ryzen 5 (65 W), but they have more cores and are clocked faster at both base and turbo frequencies.
With this in mind, it's time to see how AMD's Zen 5 flagship Ryzen 9 series for desktops performs with more firepower, with our review of the Ryzen 9 9950X and Ryzen 9 9900X processors.
CPUs
The Endorfy Fortis 5 Dual Fan CPU Cooler Review: Towering Value
Standard CPU coolers, while adequate for managing basic thermal loads, often fall short in terms of noise reduction and superior cooling efficiency. This limitation drives advanced users and system builders to seek aftermarket solutions tailored to their specific needs. The high-end aftermarket cooler market is highly competitive, with manufacturers striving to offer products with exceptional performance. Endorfy, previously known as SilentiumPC, is a Polish manufacturer that has undergone a significant transformation to expand its presence in global markets. The brand is known for delivering high-performance cooling solutions with a strong focus on balancing efficiency and affordability. By rebranding as Endorfy, the company aims to enter premium market segments while continuing to offer reliable, high-quality cooling products. SilentiumPC became very popular in the value/mainstream segments of the PC market, spearheaded by the Fera 5 cooler that we reviewed a little over two years ago, which offered remarkable value for money. Today’s review places the Fortis 5 Dual Fan on our laboratory test bench. The Fortis 5 is the largest CPU air cooler the company currently offers and is significantly more expensive than the Fera 5, yet it is still a single-tower cooler that strives to strike a balance between value, compatibility, and performance.
Cases/Cooling/PSUs
G.Skill on Tuesday introduced its ultra-low-latency DDR5-6400 memory modules that feature a CAS latency of 30 clocks, which appears to be the industry's most aggressive timing yet for DDR5-6400 sticks. The modules will be available for both AMD and Intel CPU-based systems.
With every new generation of DDR memory comes an increase in data transfer rates and an increase in relative latencies (timings measured in clock cycles). While for the vast majority of applications, the increased bandwidth offsets the performance impact of higher timings, there are applications that favor low latencies. However, shrinking latencies is sometimes harder than increasing data transfer rates, which is why low-latency modules are rare.
Nonetheless, G.Skill has apparently managed to cherry-pick enough DDR5 memory chips and build appropriate printed circuit boards to produce DDR5-6400 modules with CL30 timings, which are substantially lower than the CL46 timings recommended by JEDEC for this speed bin. This means that while JEDEC-standard modules have an absolute latency of 14.375 ns, G.Skill's modules can boast a latency of just 9.375 ns – an approximately 35% decrease.
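The absolute latency figures above follow from the CAS latency in clock cycles and the memory clock, which for DDR memory is half the data transfer rate. A minimal Python sketch of that arithmetic (the function name is ours, for illustration only):

```python
def cas_latency_ns(data_rate_mts: float, cas_cycles: int) -> float:
    """Absolute CAS latency in nanoseconds.

    DDR memory performs two transfers per clock, so the clock in MHz is
    data_rate_mts / 2 and one cycle lasts 2000 / data_rate_mts nanoseconds.
    """
    return cas_cycles * 2000 / data_rate_mts


jedec_ns = cas_latency_ns(6400, 46)   # JEDEC-recommended CL46 at DDR5-6400
gskill_ns = cas_latency_ns(6400, 30)  # G.Skill's CL30 at DDR5-6400
reduction_pct = (jedec_ns - gskill_ns) / jedec_ns * 100

print(f"JEDEC CL46:   {jedec_ns} ns")
print(f"G.Skill CL30: {gskill_ns} ns")
print(f"Reduction:    {reduction_pct:.1f}%")
```

The same function can be used to compare other speed bins, for example a DDR5-6000 CL30 kit against DDR5-6400 CL30.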
G.Skill's DDR5-6400 CL30 39-39-102 modules have a capacity of 16 GB and will be available in 32 GB dual-channel kits, though the company does not disclose voltages, which are likely considerably higher than those standardized by JEDEC.
The company plans to make its DDR5-6400 modules available both for AMD systems with EXPO profiles (Trident Z5 Neo RGB and Trident Z5 Royal Neo) and for Intel-powered PCs with XMP 3.0 profiles (Trident Z5 RGB and Trident Z5 Royal). AM5 systems have a practical limit of 6000 MT/s to 6400 MT/s for DDR5 memory (roughly as fast as AMD's Infinity Fabric can operate with a 1:1 ratio), so the new modules will be particularly beneficial for AMD's Ryzen 7000 and Ryzen 9000-series processors.
G.Skill notes that since its modules are non-standard, they will not work with all systems but will operate on high-end motherboards with properly cooled CPUs.
The new ultra-low-latency memory kits will be available worldwide from G.Skill's partners starting in late August 2024. The company did not disclose the pricing of these modules, but since we are talking about premium products that boast unique specifications, they are likely to be priced accordingly.
Memory
Kioxia's booth at FMS 2024 was a busy one with multiple technology demonstrations keeping visitors occupied. A walk-through of the BiCS 8 manufacturing process was the first to grab my attention. Kioxia and Western Digital announced the sampling of BiCS 8 in March 2023. We had touched briefly upon its CMOS Bonded Array (CBA) scheme in our coverage of Kioxia's 2Tb QLC NAND device and of Western Digital's 128 TB QLC enterprise SSD proof-of-concept demonstration. At Kioxia's booth, we got more insights.
Traditionally, fabrication of flash chips involved placement of the associated logic circuitry (CMOS process) around the periphery of the flash array. The process then moved on to putting the CMOS under the cell array, but the wafer development process was serialized, with the CMOS logic getting fabricated first, followed by the cell array on top. However, this poses some challenges because the cell array requires a high-temperature processing step to ensure higher reliability, and that step can be detrimental to the health of the CMOS logic. Thanks to recent advancements in wafer bonding techniques, the new CBA process allows the CMOS wafer and cell array wafer to be processed independently in parallel and then pieced together, as shown in the models above.
The BiCS 8 3D NAND incorporates 218 layers, compared to 112 layers in BiCS 5 and 162 layers in BiCS 6. The company decided to skip over BiCS 7 (or, rather, it was probably a short-lived generation meant as an internal test vehicle). The generation retains the four-plane charge trap structure of BiCS 6. In its TLC avatar, it is available as a 1 Tbit device. The QLC version is available in two capacities - 1 Tbit and 2 Tbit.
Kioxia also noted that while the number of layers (218) doesn't compare favorably with the latest layer counts from the competition, its lateral scaling / cell shrinkage has enabled it to be competitive in terms of bit density as well as operating speeds (3200 MT/s). For reference, the latest shipping NAND from Micron - the G9 - has 276 layers with a bit density in TLC mode of 21 Gbit/mm², and operates at up to 3600 MT/s. However, Micron's 232L NAND operates only up to 2400 MT/s and has a bit density of 14.6 Gbit/mm².
It must be noted that the CBA hybrid bonding process has advantages over the current processes used by other vendors - including Micron's CMOS under array (CuA) and SK hynix's 4D PUC (periphery-under-cell) developed in the late 2010s. It is expected that other NAND vendors will also eventually move to some variant of the hybrid bonding scheme used by Kioxia.
Storage