When Western Digital introduced its Ultrastar DC SN861 SSDs earlier this year, the company did not disclose which controller it used for these drives, which made many observers presume that WD was using an in-house controller. But a recent teardown of the drive shows that is not the case; instead, the company is using a controller from Fadu, a South Korean company founded in 2015 that specializes on enterprise-grade turnkey SSD solutions.
The Western Digital Ultrastar DC SN861 SSD is aimed at performance-hungry hyperscale datacenters and enterprise customers which are adopting PCIe Gen5 storage devices these days. And, as uncovered in photos from a recent Storage Review article, the drive is based on Fadu's FC5161 NVMe 2.0-compliant controller. The FC5161 utilizes 16 NAND channels supporting an ONFi 5.0 2400 MT/s interface, and features a combination of enterprise-grade capabilities (OCP Cloud Spec 2.0, SR-IOV, up to 512 name spaces for ZNS support, flexible data placement, NVMe-MI 1.2, advanced security, telemetry, power loss protection) not available on other off-the-shelf controllers – or on any previous Western Digital controllers.
The Ultrastar DC SN861 SSD offers sequential read speeds up to 13.7 GB/s as well as sequential write speeds up to 7.5 GB/s. As for random performance, it boasts with an up to 3.3 million random 4K read IOPS and up to 0.8 million random 4K write IOPS. The drives are available in capacities between 1.6 TB and 7.68 TB with one or three drive writes per day (DWPD) over five years rating as well as in U.2 and E1.S form-factors.
While the two form factors of the SN861 share a similar technical design, Western Digital has tailored each version for distinct workloads: the E1.S supports FDP and performance enhancements specifically for cloud environments. By contrast, the U.2 model is geared towards high-performance enterprise tasks and emerging applications like AI.
Without any doubts, Western Digital's Ultrastar DC SN861 is a feature-rich high-performance enterprise-grade SSD. It has another distinctive feature: a 5W idle power consumption, which is rather low by the standards of enterprise-grade drives (e.g., it is 1W lower compared to the SN840). While the difference with predecessors may be just 1W, hyperscalers deploy thousands of drives and for their TCO every watt counts.
Western Digital's Ultrastar DC SN861 SSDs are now available for purchase to select customers (such as Meta) and to interested parties. Prices are unknown, but they will depend on such factors as volumes.
Sources: Fadu, Storage Review
StorageWhen Western Digital introduced its Ultrastar DC SN861 SSDs earlier this year, the company did not disclose which controller it used for these drives, which made many observers presume that WD was using an in-house controller. But a recent teardown of the drive shows that is not the case; instead, the company is using a controller from Fadu, a South Korean company founded in 2015 that specializes on enterprise-grade turnkey SSD solutions.
The Western Digital Ultrastar DC SN861 SSD is aimed at performance-hungry hyperscale datacenters and enterprise customers which are adopting PCIe Gen5 storage devices these days. And, as uncovered in photos from a recent Storage Review article, the drive is based on Fadu's FC5161 NVMe 2.0-compliant controller. The FC5161 utilizes 16 NAND channels supporting an ONFi 5.0 2400 MT/s interface, and features a combination of enterprise-grade capabilities (OCP Cloud Spec 2.0, SR-IOV, up to 512 name spaces for ZNS support, flexible data placement, NVMe-MI 1.2, advanced security, telemetry, power loss protection) not available on other off-the-shelf controllers – or on any previous Western Digital controllers.
The Ultrastar DC SN861 SSD offers sequential read speeds up to 13.7 GB/s as well as sequential write speeds up to 7.5 GB/s. As for random performance, it boasts with an up to 3.3 million random 4K read IOPS and up to 0.8 million random 4K write IOPS. The drives are available in capacities between 1.6 TB and 7.68 TB with one or three drive writes per day (DWPD) over five years rating as well as in U.2 and E1.S form-factors.
While the two form factors of the SN861 share a similar technical design, Western Digital has tailored each version for distinct workloads: the E1.S supports FDP and performance enhancements specifically for cloud environments. By contrast, the U.2 model is geared towards high-performance enterprise tasks and emerging applications like AI.
Without any doubts, Western Digital's Ultrastar DC SN861 is a feature-rich high-performance enterprise-grade SSD. It has another distinctive feature: a 5W idle power consumption, which is rather low by the standards of enterprise-grade drives (e.g., it is 1W lower compared to the SN840). While the difference with predecessors may be just 1W, hyperscalers deploy thousands of drives and for their TCO every watt counts.
Western Digital's Ultrastar DC SN861 SSDs are now available for purchase to select customers (such as Meta) and to interested parties. Prices are unknown, but they will depend on such factors as volumes.
Sources: Fadu, Storage Review
StorageAt FMS 2024, the technological requirements from the storage and memory subsystem took center stage. Both SSD and controller vendors had various demonstrations touting their suitability for different stages of the AI data pipeline - ingestion, preparation, training, checkpointing, and inference. Vendors like Solidigm have different types of SSDs optimized for different stages of the pipeline. At the same time, controller vendors have taken advantage of one of the features introduced recently in the NVM Express standard - Flexible Data Placement (FDP).
FDP involves the host providing information / hints about the areas where the controller could place the incoming write data in order to reduce the write amplification. These hints are generated based on specific block sizes advertised by the device. The feature is completely backwards-compatible, with non-FDP hosts working just as before with FDP-enabled SSDs, and vice-versa.
Silicon Motion's MonTitan Gen 5 Enterprise SSD Platform was announced back in 2022. Since then, Silicon Motion has been touting the flexibility of the platform, allowing its customers to incorporate their own features as part of the customization process. This approach is common in the enterprise space, as we have seen with Marvell's Bravera SC5 SSD controller in the DapuStor SSDs and Microchip's Flashtec controllers in the Longsys FORESEE enterprise SSDs.
At FMS 2024, the company was demonstrating the advantages of flexible data placement by allowing a single QLC SSD based on their MonTitan platform to take part in different stages of the AI data pipeline while maintaining the required quality of service (minimum bandwidth) for each process. The company even has a trademarked name (PerformaShape) for the firmware feature in the controller that allows the isolation of different concurrent SSD accesses (from different stages in the AI data pipeline) to guarantee this QoS. Silicon Motion claims that this scheme will enable its customers to get the maximum write performance possible from QLC SSDs without negatively impacting the performance of other types of accesses.
Silicon Motion and Phison have market leadership in the client SSD controller market with similar approaches. However, their enterprise SSD controller marketing couldn't be more different. While Phison has gone in for a turnkey solution with their Gen 5 SSD platform (to the extent of not adopting the white label route for this generation, and instead opting to get the SSDs qualified with different cloud service providers themselves), Silicon Motion is opting for a different approach. The flexibility and customization possibilities can make platforms like the MonTitan appeal to flash array vendors.
StorageOne of the core challenges that Rapidus will face when it kicks off volume production of chips on its 2nm-class process technology in 2027 is lining up customers. With Intel, Samsung, and TSMC all slated to offer their own 2nm-class nodes by that time, Rapidus will need some kind of advantage to attract customers away from its more established rivals. To that end, the company thinks they've found their edge: fully automated packaging that will allow for shorter chip lead times than manned packaging operations.
In an interview with Nikkei, Rapidus' president, Atsuyoshi Koike, outlined the company's vision to use advanced packaging as a competitive edge for the new fab. The Hokkaido facility, which is currently under construction and is expecting to begin equipment installation this December, is already slated to both produce chips and offer advanced packaging services within the same facility, an industry first. But ultimately, Rapidus biggest plan to differentiate itself is by automating the back-end fab processes (chip packaging) to provide significantly faster turnaround times.
Rapidus is targetting back-end production in particular as, compared to front-end (lithography) production, back-end production still heavily relies on human labor. No other advanced packaging fab has fully automated the process thus far, which provides for a degree of flexibility, but slows throughput. But with automation in place to handle this aspect of chip production, Rapidus would be able to increase chip packaging efficiency and speed, which is crucial as chip assembly tasks become more complex. Rapidus is also collaborating with multiple Japanese suppliers to source materials for back-end production.
"In the past, Japanese chipmakers tried to keep their technology development exclusively in-house, which pushed up development costs and made them less competitive," Koike told Nikkei. "[Rapidus plans to] open up technology that should be standardized, bringing down costs, while handling important technology in-house."
Financially, Rapidus faces a significant challenge, needing a total of ¥5 trillion ($35 billion) by the time mass production starts in 2027. The company estimates that ¥2 trillion will be required by 2025 for prototype production. While the Japanese government has provided ¥920 billion in aid, Rapidus still needs to secure substantial funding from private investors.
Due to its lack of track record and experience of chip production as. well as limited visibility for success, Rapidus is finding it difficult to attract private financing. The company is in discussions with the government to make it easier to raise capital, including potential loan guarantees, and is hopeful that new legislation will assist in this effort.
SemiconductorsG.Skill on Tuesday introduced its ultra-low-latency DDR5-6400 memory modules that feature a CAS latency of 30 clocks, which appears to be the industry's most aggressive timings yet for DDR5-6400 sticks. The modules will be available for both AMD and Intel CPU-based systems.
With every new generation of DDR memory comes an increase in data transfer rates and an extension of relative latencies. While for the vast majority of applications, the increased bandwidth offsets the performance impact of higher timings, there are applications that favor low latencies. However, shrinking latencies is sometimes harder than increasing data transfer rates, which is why low-latency modules are rare.
Nonetheless, G.Skill has apparently managed to cherry-pick enough DDR5 memory chips and build appropriate printed circuit boards to produce DDR5-6400 modules with CL30 timings, which are substantially lower than the CL46 timings recommended by JEDEC for this speed bin. This means that while JEDEC-standard modules have an absolute latency of 14.375 ns, G.Skill's modules can boast a latency of just 9.375 ns – an approximately 35% decrease.
G.Skill's DDR5-6400 CL30 39-39-102 modules have a capacity of 16 GB and will be available in 32 GB dual-channel kits, though the company does not disclose voltages, which are likely considerably higher than those standardized by JEDEC.
The company plans to make its DDR5-6400 modules available both for AMD systems with EXPO profiles (Trident Z5 Neo RGB and Trident Z5 Royal Neo) and for Intel-powered PCs with XMP 3.0 profiles (Trident Z5 RGB and Trident Z5 Royal). For AMD AM5 systems that have a practical limitation of 6000 MT/s – 6400 MT/s for DDR5 memory (as this is roughly as fast as AMD's Infinity Fabric can operate at with a 1:1 ratio), the new modules will be particularly beneficial for AMD's Ryzen 7000 and Ryzen 9000-series processors.
G.Skill notes that since its modules are non-standard, they will not work with all systems but will operate on high-end motherboards with properly cooled CPUs.
The new ultra-low-latency memory kits will be available worldwide from G.Skill's partners starting in late August 2024. The company did not disclose the pricing of these modules, but since we are talking about premium products that boast unique specifications, they are likely to be priced accordingly.
MemoryOptical connectivity – and especially silicon photonics – is expected to become a crucial technology to enable connectivity for next-generation datacenters, particularly those designed HPC applications. With ever-increasing bandwidth requirements needed to keep up with (and keep scaling out) system performance, copper signaling alone won't be enough to keep up. To that end, several companies are developing silicon photonics solutions, including fab providers like TSMC, who this week outlined its 3D Optical Engine roadmap as part of its 2024 North American Technology Symposium, laying out its plan to bring up to 12.8 Tbps optical connectivity to TSMC-fabbed processors.
TSMC's Compact Universal Photonic Engine (COUPE) stacks an electronics integrated circuit on photonic integrated circuit (EIC-on-PIC) using the company's SoIC-X packaging technology. The foundry says that usage of its SoIC-X enables the lowest impedance at the die-to-die interface and therefore the highest energy efficiency. The EIC itself is produced at a 65nm-class process technology.
TSMC's 1st Generation 3D Optical Engine (or COUPE) will be integrated into an OSFP pluggable device running at 1.6 Tbps. That's a transfer rate well ahead of current copper Ethernet standards – which top out at 800 Gbps – underscoring the immediate bandwidth advantage of optical interconnects for heavily-networked compute clusters, never mind the expected power savings.
Looking further ahead, the 2nd Generation of COUPE is designed to integrate into CoWoS packaging as co-packaged optics with a switch, allowing optical interconnections to be brought to the motherboard level. This version COUPE will support data transfer rates of up to 6.40 Tbps with reduced latency compared to the first version.
TSMC's third iteration of COUPE – COUPE running on a CoWoS interposer – is projected to improve on things one step further, increasing transfer rates to 12.8 Tbps while bringing optical connectivity even closer to the processor itself. At present, COUPE-on-CoWoS is in the pathfinding stage of development and TSMC does not have a target date set.
Ultimately, unlike many of its industry peers, TSMC has not participated in the silicon photonics market up until now, leaving this to players like GlobalFoundries. But with its 3D Optical Engine Strategy, the company will enter this important market as it looks to make up for lost time.
Optical connectivity – and especially silicon photonics – is expected to become a crucial technology to enable connectivity for next-generation datacenters, particularly those designed HPC applications. With ever-increasing bandwidth requirements needed to keep up with (and keep scaling out) system performance, copper signaling alone won't be enough to keep up. To that end, several companies are developing silicon photonics solutions, including fab providers like TSMC, who this week outlined its 3D Optical Engine roadmap as part of its 2024 North American Technology Symposium, laying out its plan to bring up to 12.8 Tbps optical connectivity to TSMC-fabbed processors.
TSMC's Compact Universal Photonic Engine (COUPE) stacks an electronics integrated circuit on photonic integrated circuit (EIC-on-PIC) using the company's SoIC-X packaging technology. The foundry says that usage of its SoIC-X enables the lowest impedance at the die-to-die interface and therefore the highest energy efficiency. The EIC itself is produced at a 65nm-class process technology.
TSMC's 1st Generation 3D Optical Engine (or COUPE) will be integrated into an OSFP pluggable device running at 1.6 Tbps. That's a transfer rate well ahead of current copper Ethernet standards – which top out at 800 Gbps – underscoring the immediate bandwidth advantage of optical interconnects for heavily-networked compute clusters, never mind the expected power savings.
Looking further ahead, the 2nd Generation of COUPE is designed to integrate into CoWoS packaging as co-packaged optics with a switch, allowing optical interconnections to be brought to the motherboard level. This version COUPE will support data transfer rates of up to 6.40 Tbps with reduced latency compared to the first version.
TSMC's third iteration of COUPE – COUPE running on a CoWoS interposer – is projected to improve on things one step further, increasing transfer rates to 12.8 Tbps while bringing optical connectivity even closer to the processor itself. At present, COUPE-on-CoWoS is in the pathfinding stage of development and TSMC does not have a target date set.
Ultimately, unlike many of its industry peers, TSMC has not participated in the silicon photonics market up until now, leaving this to players like GlobalFoundries. But with its 3D Optical Engine Strategy, the company will enter this important market as it looks to make up for lost time.
A few years back, the Japanese government's New Energy and Industrial Technology Development Organization (NEDO ) allocated funding for the development of green datacenter technologies. With the aim to obtain up to 40% savings in overall power consumption, several Japanese companies have been developing an optical interface for their enterprise SSDs. And at this year's FMS, Kioxia had their optical interface on display.
For this demonstration, Kioxia took its existing CM7 enterprise SSD and created an optical interface for it. A PCIe card with on-board optics developed by Kyocera is installed in the server slot. An optical interface allows data transfer over long distances (it was 40m in the demo, but Kioxia promises lengths of up to 100m for the cable in the future). This allows the storage to be kept in a separate room with minimal cooling requirements compared to the rack with the CPUs and GPUs. Disaggregation of different server components will become an option as very high throughput interfaces such as PCIe 7.0 (with 128 GT/s rates) become available.
The demonstration of the optical SSD showed a slight loss in IOPS performance, but a significant advantage in the latency metric over the shipping enterprise SSD behind a copper network link. Obviously, there are advantages in wiring requirements and signal integrity maintenance with optical links.
Being a proof-of-concept demonstration, we do see the requirement for an industry-standard approach if this were to gain adoption among different datacenter vendors. The PCI-SIG optical workgroup will need to get its act together soon to create a standards-based approach to this problem.
StorageKioxia's booth at FMS 2024 was a busy one with multiple technology demonstrations keeping visitors occupied. A walk-through of the BiCS 8 manufacturing process was the first to grab my attention. Kioxia and Western Digital announced the sampling of BiCS 8 in March 2023. We had touched briefly upon its CMOS Bonded Array (CBA) scheme in our coverage of Kioxial's 2Tb QLC NAND device and coverage of Western Digital's 128 TB QLC enterprise SSD proof-of-concept demonstration. At Kioxia's booth, we got more insights.
Traditionally, fabrication of flash chips involved placement of the associate logic circuitry (CMOS process) around the periphery of the flash array. The process then moved on to putting the CMOS under the cell array, but the wafer development process was serialized with the CMOS logic getting fabricated first followed by the cell array on top. However, this has some challenges because the cell array requires a high-temperature processing step to ensure higher reliability that can be detrimental to the health of the CMOS logic. Thanks to recent advancements in wafer bonding techniques, the new CBA process allows the CMOS wafer and cell array wafer to be processed independently in parallel and then pieced together, as shown in the models above.
The BiCS 8 3D NAND incorporates 218 layers, compared to 112 layers in BiCS 5 and 162 layers in BiCS 6. The company decided to skip over BiCS 7 (or, rather, it was probably a short-lived generation meant as an internal test vehicle). The generation retains the four-plane charge trap structure of BiCS 6. In its TLC avatar, it is available as a 1 Tbit device. The QLC version is available in two capacities - 1 Tbit and 2 Tbit.
Kioxia also noted that while the number of layers (218) doesn't compare favorably with the latest layer counts from the competition, its lateral scaling / cell shrinkage has enabled it to be competitive in terms of bit density as well as operating speeds (3200 MT/s). For reference, the latest shipping NAND from Micron - the G9 - has 276 layers with a bit density in TLC mode of 21 Gbit/mm2, and operates at up to 3600 MT/s. However, its 232L NAND operates only up to 2400 MT/s and has a bit density of 14.6 Gbit/mm2.
It must be noted that the CBA hybrid bonding process has advantages over the current processes used by other vendors - including Micron's CMOS under array (CuA) and SK hynix's 4D PUC (periphery-under-chip) developed in the late 2010s. It is expected that other NAND vendors will also move eventually to some variant of the hybrid bonding scheme used by Kioxia.
StorageThanks to the success of the burgeoning market for AI accelerators, NVIDIA has been on a tear this year. And the only place that’s even more apparent than the company’s rapidly growing revenues is in the company’s stock price and market capitalization. After breaking into the top 5 most valuable companies only earlier this year, NVIDIA has reached the apex of Wall Street, closing out today as the world’s most valuable company.
With a closing price of $135.58 on a day that saw NVIDIA’s stock pop up another 3.5%, NVIDIA has topped both Microsoft and Apple in valuation, reaching a market capitalization of $3.335 trillion. This follows a rapid rise in the company’s stock price, which has increased by 47% in the last month alone – particularly on the back of NVIDIA’s most recent estimates-beating earnings report – as well as a recent 10-for-1 stock split. And looking at the company’s performance over a longer time period, NVIDIA’s stock jumped a staggering 218% over the last year, or a mere 3,474% over the last 5 years.
NVIDIA’s ascension continues a trend over the last several years of tech companies all holding the top spots in the market capitalization rankings. Though this is the first time in quite a while that the traditional tech leaders of Apple and Microsoft have been pushed aside.
| Market Capitalization Rankings | ||
| Market Cap | Stock Price | |
| NVIDIA | $3.335T | $135.58 |
| Microsoft | $3.317T | $446.34 |
| Apple | $3.285T | $214.29 |
| Alphabet | $2.170T | $176.45 |
| Amazon | $1.902T | $182.81 |
Driving the rapid growth of NVIDIA and its market capitalization has been demand for AI accelerators from NVIDIA, particularly the company’s server-grade H100, H200, and GH200 accelerators for AI training. As the demand for these products has spiked, NVIDIA has been scaling up accordingly, repeatedly beating market expectations for how many of the accelerators they can ship – and what price they can charge. And despite all that growth, orders for NVIDIA’s high-end accelerators are still backlogged, underscoring how NVIDIA still isn’t meeting the full demands of hyperscalers and other enterprises.
Consequently, NVIDIA’s stock price and market capitalization have been on a tear on the basis of these future expectations. With a price-to-earnings (P/E) ratio of 76.7 – more than twice that of Microsoft or Apple – NVIDIA is priced more like a start-up than a 30-year-old tech company. But then it goes without saying that most 30-year-old tech companies aren’t tripling their revenue in a single year, placing NVIDIA in a rather unique situation at this time.
Like the stock market itself, market capitalizations are highly volatile. And historically speaking, it’s far from guaranteed that NVIDIA will be able to hold the top spot for long, never mind day-to-day fluctuations. NVIDIA, Apple, and Microsoft’s valuations are all within $50 billion (1.%) of each other, so for the moment at least, it’s still a tight race between all three companies. But no matter what happens from here, NVIDIA gets the exceptionally rare claim of having been the most valuable company in the world at some point.
(Carousel image courtesy MSN Money)
GPUsOptical connectivity – and especially silicon photonics – is expected to become a crucial technology to enable connectivity for next-generation datacenters, particularly those designed HPC applications. With ever-increasing bandwidth requirements needed to keep up with (and keep scaling out) system performance, copper signaling alone won't be enough to keep up. To that end, several companies are developing silicon photonics solutions, including fab providers like TSMC, who this week outlined its 3D Optical Engine roadmap as part of its 2024 North American Technology Symposium, laying out its plan to bring up to 12.8 Tbps optical connectivity to TSMC-fabbed processors.
TSMC's Compact Universal Photonic Engine (COUPE) stacks an electronics integrated circuit on photonic integrated circuit (EIC-on-PIC) using the company's SoIC-X packaging technology. The foundry says that usage of its SoIC-X enables the lowest impedance at the die-to-die interface and therefore the highest energy efficiency. The EIC itself is produced at a 65nm-class process technology.
TSMC's 1st Generation 3D Optical Engine (or COUPE) will be integrated into an OSFP pluggable device running at 1.6 Tbps. That's a transfer rate well ahead of current copper Ethernet standards – which top out at 800 Gbps – underscoring the immediate bandwidth advantage of optical interconnects for heavily-networked compute clusters, never mind the expected power savings.
Looking further ahead, the 2nd Generation of COUPE is designed to integrate into CoWoS packaging as co-packaged optics with a switch, allowing optical interconnections to be brought to the motherboard level. This version COUPE will support data transfer rates of up to 6.40 Tbps with reduced latency compared to the first version.
TSMC's third iteration of COUPE – COUPE running on a CoWoS interposer – is projected to improve on things one step further, increasing transfer rates to 12.8 Tbps while bringing optical connectivity even closer to the processor itself. At present, COUPE-on-CoWoS is in the pathfinding stage of development and TSMC does not have a target date set.
Ultimately, unlike many of its industry peers, TSMC has not participated in the silicon photonics market up until now, leaving this to players like GlobalFoundries. But with its 3D Optical Engine Strategy, the company will enter this important market as it looks to make up for lost time.
Social Plugin