At FMS 2024, Kioxia had a proof-of-concept demonstration of their proposed a new RAID offload methodology for enterprise SSDs. The impetus for this is quite clear: as SSDs get faster in each generation, RAID arrays have a major problem of maintaining (and scaling up) performance. Even in cases where the RAID operations are handled by a dedicated RAID card, a simple write request in, say, a RAID 5 array would involve two reads and two writes to different drives. In cases where there is no hardware acceleration, the data from the reads needs to travel all the way back to the CPU and main memory for further processing before the writes can be done.
Kioxia has proposed the use of the PCIe direct memory access feature along with the SSD controller's controller memory buffer (CMB) to avoid the movement of data up to the CPU and back. The required parity computation is done by an accelerator block resident within the SSD controller.
In Kioxia's PoC implementation, the DMA engine can access the entire host address space (including the peer SSD's BAR-mapped CMB), allowing it to receive and transfer data as required from neighboring SSDs on the bus. Kioxia noted that their offload PoC saw close to 50% reduction in CPU utilization and upwards of 90% reduction in system DRAM utilization compared to software RAID done on the CPU. The proposed offload scheme can also handle scrubbing operations without taking up the host CPU cycles for the parity computation task.
Kioxia has already taken steps to contribute these features to the NVM Express working group. If accepted, the proposed offload scheme will be part of a standard that could become widely available across multiple SSD vendors.
StorageStandard CPU coolers, while adequate for managing basic thermal loads, often fall short in terms of noise reduction and superior cooling efficiency. This limitation drives advanced users and system builders to seek aftermarket solutions tailored to their specific needs. The high-end aftermarket cooler market is highly competitive, with manufacturers striving to offer products with exceptional performance.
Endorfy, previously known as SilentiumPC, is a Polish manufacturer that has undergone a significant transformation to expand its presence in global markets. The brand is known for delivering high-performance cooling solutions with a strong focus on balancing efficiency and affordability. By rebranding as Endorfy, the company aims to enter premium market segments while continuing to offer reliable, high-quality cooling products.
SilentiumPC became very popular in the value/mainstream segments of the PC market with their products, the spearhead of which probably was the Fera 5 cooler that we reviewed a little over two years ago and had a remarkable value for money. Today’s review places Endorfy’s largest CPU cooler, the Fortis 5 Dual Fan, on our laboratory test bench. The Fortis 5 is the largest CPU air cooler the company currently offers and is significantly more expensive than the Fera 5, yet it still is a single-tower cooler that strives to strike a balance between value, compatibility, and performance.
Cases/Cooling/PSUsKioxia's booth at FMS 2024 was a busy one with multiple technology demonstrations keeping visitors occupied. A walk-through of the BiCS 8 manufacturing process was the first to grab my attention. Kioxia and Western Digital announced the sampling of BiCS 8 in March 2023. We had touched briefly upon its CMOS Bonded Array (CBA) scheme in our coverage of Kioxial's 2Tb QLC NAND device and coverage of Western Digital's 128 TB QLC enterprise SSD proof-of-concept demonstration. At Kioxia's booth, we got more insights.
Traditionally, fabrication of flash chips involved placement of the associate logic circuitry (CMOS process) around the periphery of the flash array. The process then moved on to putting the CMOS under the cell array, but the wafer development process was serialized with the CMOS logic getting fabricated first followed by the cell array on top. However, this has some challenges because the cell array requires a high-temperature processing step to ensure higher reliability that can be detrimental to the health of the CMOS logic. Thanks to recent advancements in wafer bonding techniques, the new CBA process allows the CMOS wafer and cell array wafer to be processed independently in parallel and then pieced together, as shown in the models above.
The BiCS 8 3D NAND incorporates 218 layers, compared to 112 layers in BiCS 5 and 162 layers in BiCS 6. The company decided to skip over BiCS 7 (or, rather, it was probably a short-lived generation meant as an internal test vehicle). The generation retains the four-plane charge trap structure of BiCS 6. In its TLC avatar, it is available as a 1 Tbit device. The QLC version is available in two capacities - 1 Tbit and 2 Tbit.
Kioxia also noted that while the number of layers (218) doesn't compare favorably with the latest layer counts from the competition, its lateral scaling / cell shrinkage has enabled it to be competitive in terms of bit density as well as operating speeds (3200 MT/s). For reference, the latest shipping NAND from Micron - the G9 - has 276 layers with a bit density in TLC mode of 21 Gbit/mm2, and operates at up to 3600 MT/s. However, its 232L NAND operates only up to 2400 MT/s and has a bit density of 14.6 Gbit/mm2.
It must be noted that the CBA hybrid bonding process has advantages over the current processes used by other vendors - including Micron's CMOS under array (CuA) and SK hynix's 4D PUC (periphery-under-chip) developed in the late 2010s. It is expected that other NAND vendors will also move eventually to some variant of the hybrid bonding scheme used by Kioxia.
StorageA few years back, the Japanese government's New Energy and Industrial Technology Development Organization (NEDO ) allocated funding for the development of green datacenter technologies. With the aim to obtain up to 40% savings in overall power consumption, several Japanese companies have been developing an optical interface for their enterprise SSDs. And at this year's FMS, Kioxia had their optical interface on display.
For this demonstration, Kioxia took its existing CM7 enterprise SSD and created an optical interface for it. A PCIe card with on-board optics developed by Kyocera is installed in the server slot. An optical interface allows data transfer over long distances (it was 40m in the demo, but Kioxia promises lengths of up to 100m for the cable in the future). This allows the storage to be kept in a separate room with minimal cooling requirements compared to the rack with the CPUs and GPUs. Disaggregation of different server components will become an option as very high throughput interfaces such as PCIe 7.0 (with 128 GT/s rates) become available.
The demonstration of the optical SSD showed a slight loss in IOPS performance, but a significant advantage in the latency metric over the shipping enterprise SSD behind a copper network link. Obviously, there are advantages in wiring requirements and signal integrity maintenance with optical links.
Being a proof-of-concept demonstration, we do see the requirement for an industry-standard approach if this were to gain adoption among different datacenter vendors. The PCI-SIG optical workgroup will need to get its act together soon to create a standards-based approach to this problem.
StorageStandard CPU coolers, while adequate for managing basic thermal loads, often fall short in terms of noise reduction and superior cooling efficiency. This limitation drives advanced users and system builders to seek aftermarket solutions tailored to their specific needs. The high-end aftermarket cooler market is highly competitive, with manufacturers striving to offer products with exceptional performance.
Endorfy, previously known as SilentiumPC, is a Polish manufacturer that has undergone a significant transformation to expand its presence in global markets. The brand is known for delivering high-performance cooling solutions with a strong focus on balancing efficiency and affordability. By rebranding as Endorfy, the company aims to enter premium market segments while continuing to offer reliable, high-quality cooling products.
SilentiumPC became very popular in the value/mainstream segments of the PC market with their products, the spearhead of which probably was the Fera 5 cooler that we reviewed a little over two years ago and had a remarkable value for money. Today’s review places Endorfy’s largest CPU cooler, the Fortis 5 Dual Fan, on our laboratory test bench. The Fortis 5 is the largest CPU air cooler the company currently offers and is significantly more expensive than the Fera 5, yet it still is a single-tower cooler that strives to strike a balance between value, compatibility, and performance.
Cases/Cooling/PSUsKioxia's booth at FMS 2024 was a busy one with multiple technology demonstrations keeping visitors occupied. A walk-through of the BiCS 8 manufacturing process was the first to grab my attention. Kioxia and Western Digital announced the sampling of BiCS 8 in March 2023. We had touched briefly upon its CMOS Bonded Array (CBA) scheme in our coverage of Kioxial's 2Tb QLC NAND device and coverage of Western Digital's 128 TB QLC enterprise SSD proof-of-concept demonstration. At Kioxia's booth, we got more insights.
Traditionally, fabrication of flash chips involved placement of the associate logic circuitry (CMOS process) around the periphery of the flash array. The process then moved on to putting the CMOS under the cell array, but the wafer development process was serialized with the CMOS logic getting fabricated first followed by the cell array on top. However, this has some challenges because the cell array requires a high-temperature processing step to ensure higher reliability that can be detrimental to the health of the CMOS logic. Thanks to recent advancements in wafer bonding techniques, the new CBA process allows the CMOS wafer and cell array wafer to be processed independently in parallel and then pieced together, as shown in the models above.
The BiCS 8 3D NAND incorporates 218 layers, compared to 112 layers in BiCS 5 and 162 layers in BiCS 6. The company decided to skip over BiCS 7 (or, rather, it was probably a short-lived generation meant as an internal test vehicle). The generation retains the four-plane charge trap structure of BiCS 6. In its TLC avatar, it is available as a 1 Tbit device. The QLC version is available in two capacities - 1 Tbit and 2 Tbit.
Kioxia also noted that while the number of layers (218) doesn't compare favorably with the latest layer counts from the competition, its lateral scaling / cell shrinkage has enabled it to be competitive in terms of bit density as well as operating speeds (3200 MT/s). For reference, the latest shipping NAND from Micron - the G9 - has 276 layers with a bit density in TLC mode of 21 Gbit/mm2, and operates at up to 3600 MT/s. However, its 232L NAND operates only up to 2400 MT/s and has a bit density of 14.6 Gbit/mm2.
It must be noted that the CBA hybrid bonding process has advantages over the current processes used by other vendors - including Micron's CMOS under array (CuA) and SK hynix's 4D PUC (periphery-under-chip) developed in the late 2010s. It is expected that other NAND vendors will also move eventually to some variant of the hybrid bonding scheme used by Kioxia.
StorageA few years back, the Japanese government's New Energy and Industrial Technology Development Organization (NEDO ) allocated funding for the development of green datacenter technologies. With the aim to obtain up to 40% savings in overall power consumption, several Japanese companies have been developing an optical interface for their enterprise SSDs. And at this year's FMS, Kioxia had their optical interface on display.
For this demonstration, Kioxia took its existing CM7 enterprise SSD and created an optical interface for it. A PCIe card with on-board optics developed by Kyocera is installed in the server slot. An optical interface allows data transfer over long distances (it was 40m in the demo, but Kioxia promises lengths of up to 100m for the cable in the future). This allows the storage to be kept in a separate room with minimal cooling requirements compared to the rack with the CPUs and GPUs. Disaggregation of different server components will become an option as very high throughput interfaces such as PCIe 7.0 (with 128 GT/s rates) become available.
The demonstration of the optical SSD showed a slight loss in IOPS performance, but a significant advantage in the latency metric over the shipping enterprise SSD behind a copper network link. Obviously, there are advantages in wiring requirements and signal integrity maintenance with optical links.
Being a proof-of-concept demonstration, we do see the requirement for an industry-standard approach if this were to gain adoption among different datacenter vendors. The PCI-SIG optical workgroup will need to get its act together soon to create a standards-based approach to this problem.
StorageAs the deployment of PCIe 5.0 picks up steam in both datacenter and consumer markets, PCI-SIG is not sitting idle, and is already working on getting the ecosystem ready for the updats to the PCIe specifications. At FMS 2024, some vendors were even talking about PCIe 7.0 with its 128 GT/s capabilities despite PCIe 6.0 not even starting to ship yet. We caught up with PCI-SIG to get some updates on its activities and have a discussion on the current state of the PCIe ecosystem.
PCI-SIG has already made the PCIe 7.0 specifications (v 0.5) available to its members, and expects full specifications to be officially released sometime in 2025. The goal is to deliver a 128 GT/s data rate with up to 512 GBps of bidirectional traffic using x16 links. Similar to PCIe 6.0, this specification will also utilize PAM4 signaling and maintain backwards compatibility. Power efficiency as well as silicon die area are also being kept in mind as part of the drafting process.
The move to PAM4 signaling brings higher bit-error rates compared to the previous NRZ scheme. This made it necessary to adopt a different error correction scheme in PCIe 6.0 - instead of operating on variable length packets, PCIe 6.0's Flow Control Unit (FLIT) encoding operates on fixed size packets to aid in forward error correction. PCIe 7.0 retains these aspects.
The integrators list for the PCIe 6.0 compliance program is also expected to come out in 2025, though initial testing is already in progress. This was evident by the FMS 2024 demo involving Cadence's 3nm test chip for its PCIe 6.0 IP offering along with Teledyne Lecroy's PCIe 6.0 analyzer. These timelines track well with the specification completion dates and compliance program availability for previous PCIe generations.
We also received an update on the optical workgroup - while being optical-technology agnostic, the WG also intends to develop technology-specific form-factors including pluggable optical transceivers, on-board optics, co-packaged optics, and optical I/O. The logical and electrical layers of the PCIe 6.0 specifications are being enhanced to accommodate the new optical PCIe standardization and this process will also be done with PCIe 7.0 to coincide with that standard's release next year.
The PCI-SIG also has ongoing cabling initiatives. On the consumer side, we have seen significant traction for Thunderbolt and external GPU enclosures. However, even datacenters and enterprise systems are moving towards cabling solutions as it becomes evident that disaggregation of components such as storage from the CPU and GPU are better for thermal design. Additionally maintaining signal integrity over longer distances becomes difficult for on-board signal traces. Cabling internal to the computing systems can help here.
OCuLink emerged as a good candidate and was adopted fairly widely as an internal link in server systems. It has even made an appearance in mini-PCs from some Chinese manufacturers in its external avatar for the consumer market, albeit with limited traction. As speeds increase, a widely-adopted standard for external PCIe peripherals (or even connecting components within a system) will become imperative.
StorageNVIDIA on Tuesday said that future monitor scalers from MediaTek will support its G-Sync technologies. NVIDIA is partnering with MediaTek to integrate its full range of G-Sync technologies into future monitors without requiring a standalone G-Sync module, which makes advanced gaming features more accessible across a broader range of displays.
Traditionally, G-Sync technology relied on a dedicated G-sync module – based on an Altera FPGA – to handle syncing display refresh rates with the GPU in order to reduce screen tearing, stutter, and input lag. As a more basic solution, in 2019 NVIDIA introduced G-Sync Compatible certification and branding, which leveraged the industry-standard VESA AdaptiveSync technology to handle variable refresh rates. In lieu of using a dedicated module, leveraging AdaptiveSync allowed for cheaper monitors, with NVIDIA's program serving as a stamp of approval that the monitor worked with NVIDIA GPUs and met NVIDIA's performance requirements. Still, G-Sync Compatible monitors still lack some features that, to date, require the dedicated G-Sync module.
Through this new partnership with MediaTek, MediaTek will bring support for all of NVIDIA's G-Sync technologies, including the latest G-Sync Pulsar, directly into their scalers. G-Sync Pulsar enhances motion clarity and reduces ghosting, providing a smoother gaming experience. In addition to variable refresh rates and Pulsar, MediaTek-based G-Sync displays will support such features as variable overdrive, 12-bit color, Ultra Low Motion Blur, low latency HDR, and Reflex Analyzer. This integration will allow more monitors to support a full range of G-Sync features without having to incorporate an expensive FPGA.
The first monitors to feature full G-Sync support without needing an NVIDIA module include the AOC Agon Pro AG276QSG2, Acer Predator XB273U F5, and ASUS ROG Swift 360Hz PG27AQNR. These monitors offer 360Hz refresh rates, 1440p resolution, and HDR support.
What remains to be seen is which specific MediaTek's scalers will support NVIDIA's G-Sync technology – or if the company is going to implement support into all of their scalers going forward. It also remains to be seen whether monitors with NVIDIA's dedicated G-Sync modules retain any advantages over displays with MediaTek's scalers.
MonitorsStandard CPU coolers, while adequate for managing basic thermal loads, often fall short in terms of noise reduction and superior cooling efficiency. This limitation drives advanced users and system builders to seek aftermarket solutions tailored to their specific needs. The high-end aftermarket cooler market is highly competitive, with manufacturers striving to offer products with exceptional performance.
Endorfy, previously known as SilentiumPC, is a Polish manufacturer that has undergone a significant transformation to expand its presence in global markets. The brand is known for delivering high-performance cooling solutions with a strong focus on balancing efficiency and affordability. By rebranding as Endorfy, the company aims to enter premium market segments while continuing to offer reliable, high-quality cooling products.
SilentiumPC became very popular in the value/mainstream segments of the PC market with their products, the spearhead of which probably was the Fera 5 cooler that we reviewed a little over two years ago and had a remarkable value for money. Today’s review places Endorfy’s largest CPU cooler, the Fortis 5 Dual Fan, on our laboratory test bench. The Fortis 5 is the largest CPU air cooler the company currently offers and is significantly more expensive than the Fera 5, yet it still is a single-tower cooler that strives to strike a balance between value, compatibility, and performance.
Cases/Cooling/PSUsKioxia's booth at FMS 2024 was a busy one with multiple technology demonstrations keeping visitors occupied. A walk-through of the BiCS 8 manufacturing process was the first to grab my attention. Kioxia and Western Digital announced the sampling of BiCS 8 in March 2023. We had touched briefly upon its CMOS Bonded Array (CBA) scheme in our coverage of Kioxial's 2Tb QLC NAND device and coverage of Western Digital's 128 TB QLC enterprise SSD proof-of-concept demonstration. At Kioxia's booth, we got more insights.
Traditionally, fabrication of flash chips involved placement of the associate logic circuitry (CMOS process) around the periphery of the flash array. The process then moved on to putting the CMOS under the cell array, but the wafer development process was serialized with the CMOS logic getting fabricated first followed by the cell array on top. However, this has some challenges because the cell array requires a high-temperature processing step to ensure higher reliability that can be detrimental to the health of the CMOS logic. Thanks to recent advancements in wafer bonding techniques, the new CBA process allows the CMOS wafer and cell array wafer to be processed independently in parallel and then pieced together, as shown in the models above.
The BiCS 8 3D NAND incorporates 218 layers, compared to 112 layers in BiCS 5 and 162 layers in BiCS 6. The company decided to skip over BiCS 7 (or, rather, it was probably a short-lived generation meant as an internal test vehicle). The generation retains the four-plane charge trap structure of BiCS 6. In its TLC avatar, it is available as a 1 Tbit device. The QLC version is available in two capacities - 1 Tbit and 2 Tbit.
Kioxia also noted that while the number of layers (218) doesn't compare favorably with the latest layer counts from the competition, its lateral scaling / cell shrinkage has enabled it to be competitive in terms of bit density as well as operating speeds (3200 MT/s). For reference, the latest shipping NAND from Micron - the G9 - has 276 layers with a bit density in TLC mode of 21 Gbit/mm2, and operates at up to 3600 MT/s. However, its 232L NAND operates only up to 2400 MT/s and has a bit density of 14.6 Gbit/mm2.
It must be noted that the CBA hybrid bonding process has advantages over the current processes used by other vendors - including Micron's CMOS under array (CuA) and SK hynix's 4D PUC (periphery-under-chip) developed in the late 2010s. It is expected that other NAND vendors will also move eventually to some variant of the hybrid bonding scheme used by Kioxia.
Storage
0 Comments