A few years back, the Japanese government's New Energy and Industrial Technology Development Organization (NEDO) allocated funding for the development of green datacenter technologies. With the aim of achieving up to 40% savings in overall power consumption, several Japanese companies have been developing an optical interface for their enterprise SSDs. And at this year's FMS, Kioxia had their optical interface on display.
For this demonstration, Kioxia took its existing CM7 enterprise SSD and created an optical interface for it. A PCIe card with on-board optics developed by Kyocera is installed in the server slot. An optical interface allows data transfer over long distances (it was 40m in the demo, but Kioxia promises lengths of up to 100m for the cable in the future). This allows the storage to be kept in a separate room with minimal cooling requirements compared to the rack with the CPUs and GPUs. Disaggregation of different server components will become an option as very high throughput interfaces such as PCIe 7.0 (with 128 GT/s rates) become available.
The demonstration of the optical SSD showed a slight loss in IOPS performance, but a significant advantage in the latency metric over the shipping enterprise SSD behind a copper network link. Obviously, there are advantages in wiring requirements and signal integrity maintenance with optical links.
As this is a proof-of-concept demonstration, an industry-standard approach will be required if the technology is to gain adoption among different datacenter vendors. The PCI-SIG optical workgroup will need to get its act together soon to create a standards-based approach to this problem.
Storage
While neuromorphic computing remains under research for the time being, efforts into the field have continued to grow over the years, as have the capabilities of the specialty chips that have been developed for this research. Following those lines, this morning Intel and Sandia National Laboratories are celebrating the deployment of the Hala Point neuromorphic system, which the two believe is the highest capacity system in the world. With 1.15 billion neurons overall, Hala Point is the largest deployment yet for Intel’s Loihi 2 neuromorphic chip, which was first announced at the tail-end of 2021.
The Hala Point system incorporates 1152 Loihi 2 processors, each of which is capable of simulating a million neurons. As noted back at the time of Loihi 2’s launch, these chips are actually rather small – just 31 mm² per chip with 2.3 billion transistors each, as they’re built on the Intel 4 process (making them one of the only Intel chips to use it, besides Meteor Lake). As a result, the complete system is similarly petite, taking up just 6 rack units of space (or as Sandia likes to compare it to, about the size of a microwave), with a power consumption of 2.6 kW. Now that it’s online, Hala Point has dethroned the SpiNNaker system as the largest disclosed neuromorphic system, offering admittedly just a slightly larger number of neurons at less than 3% of the power consumption of the 100 kW British system.
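The headline figures above are easy to sanity-check. The following is a minimal sketch using only the numbers quoted in the text; the function names are illustrative, not anything from Intel or Sandia.

```python
# Figures as quoted in the article.
CHIPS = 1152
NEURONS_PER_CHIP = 1_000_000  # "a million neurons" per Loihi 2 chip
HALA_POINT_KW = 2.6
SPINNAKER_KW = 100.0


def total_neurons(chips: int, per_chip: int) -> int:
    """Total simulated neurons across all chips."""
    return chips * per_chip


def power_ratio(system_kw: float, reference_kw: float) -> float:
    """Power draw of one system as a fraction of another."""
    return system_kw / reference_kw


neurons = total_neurons(CHIPS, NEURONS_PER_CHIP)
ratio = power_ratio(HALA_POINT_KW, SPINNAKER_KW)
print(f"{neurons / 1e9:.3f} billion neurons")
print(f"{ratio:.1%} of the 100 kW system's power")
```

The product lands on the 1.15 billion neurons cited for the system, and the power ratio sits under the 3% mark mentioned above.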

A Single Loihi 2 Chip (31 mm²)
Hala Point will be replacing an older Intel neuromorphic system at Sandia, Pohoiki Springs, which is based on Intel’s first-generation Loihi chips. By comparison, Hala Point offers ten times as many neurons, and upwards of 12x the performance overall.
Both neuromorphic systems have been procured by Sandia in order to advance the national lab’s research into neuromorphic computing, a computing paradigm that behaves like a brain. The central thought (if you’ll excuse the pun) is that by mimicking the wetware writing this article, neuromorphic chips can be used to solve problems that conventional processors cannot solve today, and that they can do so more efficiently as well.
Sandia, for its part, has said that it will be using the system to look at large-scale neuromorphic computing, with work operating on a scale well beyond Pohoiki Springs. With Hala Point offering a simulated neuron count very roughly on the level of complexity of an owl brain, the lab believes that a larger-scale system will finally enable them to properly exploit the properties of neuromorphic computing to solve real problems in fields such as device physics, computer architecture, computer science and informatics, moving well beyond the simple demonstrations initially achieved at a smaller scale.
One new focus from the lab, which in turn has caught Intel’s attention, is the applicability of neuromorphic computing towards AI inference. Because the neural networks themselves behind the current wave of AI systems are attempting to emulate the human brain, in a sense, there is an obvious degree of synergy with the brain-mimicking neuromorphic chips, even if the algorithms differ in some key respects. Still, with energy efficiency being one of the major benefits of neuromorphic computing, it’s pushed Intel to look into the matter further – and even build a second, Hala Point-sized system of their own.
According to Intel, in their research on Hala Point, the system has reached efficiencies as high as 15 TOPS-per-Watt at 8-bit precision, albeit while using 10:1 sparsity, making it more than competitive with current-generation commercial chips. As an added bonus to that efficiency, the neuromorphic systems don’t require extensive data processing and batching in advance, which is normally necessary to make efficient use of the high density ALU arrays in GPUs and GPU-like processors.
Perhaps the most interesting use case of all, however, is the potent...
CPUs
G.Skill on Tuesday introduced its ultra-low-latency DDR5-6400 memory modules that feature a CAS latency of 30 clocks, which appears to be the industry's most aggressive timings yet for DDR5-6400 sticks. The modules will be available for both AMD and Intel CPU-based systems.
With every new generation of DDR memory comes an increase in data transfer rates and an extension of relative latencies. While for the vast majority of applications, the increased bandwidth offsets the performance impact of higher timings, there are applications that favor low latencies. However, shrinking latencies is sometimes harder than increasing data transfer rates, which is why low-latency modules are rare.
Nonetheless, G.Skill has apparently managed to cherry-pick enough DDR5 memory chips and build appropriate printed circuit boards to produce DDR5-6400 modules with CL30 timings, which are substantially lower than the CL46 timings recommended by JEDEC for this speed bin. This means that while JEDEC-standard modules have an absolute latency of 14.375 ns, G.Skill's modules can boast a latency of just 9.375 ns – an approximately 35% decrease.
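The latency figures above follow directly from the data rate and the CAS cycle count. As a short sketch (the helper name is our own), the arithmetic looks like this:

```python
def cas_latency_ns(data_rate_mts: float, cl_cycles: int) -> float:
    """Absolute CAS latency in nanoseconds for a DDR module.

    DDR memory transfers data twice per clock, so the clock in MHz is
    data_rate / 2 and one clock cycle lasts 2000 / data_rate nanoseconds.
    """
    cycle_ns = 2000.0 / data_rate_mts
    return cl_cycles * cycle_ns


jedec = cas_latency_ns(6400, 46)   # JEDEC-recommended CL46
gskill = cas_latency_ns(6400, 30)  # G.Skill's CL30
reduction = (jedec - gskill) / jedec

print(f"JEDEC CL46:   {jedec:.3f} ns")
print(f"G.Skill CL30: {gskill:.3f} ns")
print(f"Reduction:    {reduction:.1%}")
```

This reproduces the 14.375 ns and 9.375 ns figures, with the reduction working out to a little under 35%.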
G.Skill's DDR5-6400 CL30 39-39-102 modules have a capacity of 16 GB and will be available in 32 GB dual-channel kits, though the company does not disclose voltages, which are likely considerably higher than those standardized by JEDEC.
The company plans to make its DDR5-6400 modules available both for AMD systems with EXPO profiles (Trident Z5 Neo RGB and Trident Z5 Royal Neo) and for Intel-powered PCs with XMP 3.0 profiles (Trident Z5 RGB and Trident Z5 Royal). Since AMD AM5 systems have a practical limit of 6000 MT/s to 6400 MT/s for DDR5 memory (as this is roughly as fast as AMD's Infinity Fabric can operate with a 1:1 ratio), the new modules will be particularly beneficial for AMD's Ryzen 7000 and Ryzen 9000-series processors.
G.Skill notes that since its modules are non-standard, they will not work with all systems but will operate on high-end motherboards with properly cooled CPUs.
The new ultra-low-latency memory kits will be available worldwide from G.Skill's partners starting in late August 2024. The company did not disclose the pricing of these modules, but since we are talking about premium products that boast unique specifications, they are likely to be priced accordingly.
Memory
With the rise of the handheld gaming PC market, we've seen PC vendors and their partners toy with a number of tricks and tweaks to improve framerates in games, with some of their latest efforts on display at this year's Computex trade show. Perhaps the most interesting find thus far comes from ADATA sub-brand XPG, which is demoing its prototype "Nia" handheld PC, which uses eye tracking and dynamic foveated rendering to further improve rendering performance.
For those unfamiliar, dynamic foveated rendering is a graphics technique that is sometimes used to boost performance in virtual reality (VR) and augmented reality (AR) applications by taking advantage of how human vision works. Typically, humans can only perceive detailed imagery in the relatively small central area of our vision called the fovea, while our peripheral vision is much less detailed. Dynamic foveated rendering, in turn, exploits this by using real-time eye tracking to determine where the user is looking, and then rendering just that area in high/full resolution, while rendering the peripheral areas in lower resolution. The net result is that only a fraction of the screen is rendered at full detail, which cuts down on the total amount of rendering work required and boosts framerates on performance-limited devices.
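To make the idea concrete, here is a toy sketch of how a renderer might pick a resolution scale for each screen tile based on the tracked gaze point. The radii, minimum scale, tile size, and function names are all assumptions made for illustration; this is not XPG's or Eyeware's implementation.

```python
import math


def tile_scale(tile_center, gaze, inner_radius, outer_radius, min_scale=0.25):
    """Return a resolution scale in [min_scale, 1.0] for one screen tile.

    Tiles within inner_radius pixels of the gaze point render at full
    resolution, tiles beyond outer_radius render at min_scale, and the
    scale falls off linearly in between.
    """
    dist = math.hypot(tile_center[0] - gaze[0], tile_center[1] - gaze[1])
    if dist <= inner_radius:
        return 1.0
    if dist >= outer_radius:
        return min_scale
    t = (dist - inner_radius) / (outer_radius - inner_radius)
    return 1.0 - t * (1.0 - min_scale)


def shaded_fraction(width, height, tile, gaze, inner_radius, outer_radius):
    """Average per-tile pixel work relative to rendering everything at full resolution."""
    total = 0.0
    count = 0
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            center = (x + tile / 2, y + tile / 2)
            s = tile_scale(center, gaze, inner_radius, outer_radius)
            total += s * s  # the scale applies to both axes
            count += 1
    return total / count


# Hypothetical 1920x1080 panel, 120-pixel tiles, user looking at the center.
work = shaded_fraction(1920, 1080, 120, (960, 540), 200, 600)
print(f"Pixel work relative to full resolution: {work:.0%}")
```

Real implementations typically lean on hardware features such as variable rate shading rather than a per-tile loop, but the trade-off is the same: the smaller the full-resolution region, the less work per frame.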
As stated before, this technology is sometimes used in high-end AR/VR headsets, where high resolution displays are placed mere inches from one's face. This ends up being an ideal use case for the technique, since at those distances, only a small fraction of the screen is within the fovea.
Using dynamic foveated rendering for a handheld, on the other hand, is a more novel application. All of the same visual principles apply, but the resolutions at play are lower, and the screen is farther from the user's eyes. This makes a handheld device a less ideal use case, at least on paper, as a larger portion of the screen is going to be in the fovea, and thus will need to be rendered at full resolution. Nonetheless, it will be interesting to see how XPG's efforts pan out, and if dynamic foveated rendering is beneficial enough for handheld PCs. As we sometimes see with trade show demos, not everything makes it out of the prototype stage.
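The point about viewing distance can be illustrated with some rough geometry. The sketch below assumes a foveal region of about 5 degrees of visual angle and a hypothetical 15 cm by 9.4 cm handheld screen; both are our own illustrative assumptions rather than figures from XPG, and the model ignores eye-tracking error margins that a real renderer would need to add.

```python
import math


def fovea_fraction(width_cm, height_cm, distance_cm, fovea_deg=5.0):
    """Approximate share of a flat screen covered by the foveal spot.

    The spot is modelled as a circle whose radius is the distance to the
    screen times the tangent of half the assumed foveal angle.
    """
    radius_cm = distance_cm * math.tan(math.radians(fovea_deg) / 2)
    spot_area = math.pi * radius_cm ** 2
    return min(1.0, spot_area / (width_cm * height_cm))


# The same hypothetical screen held at increasing distances from the eyes.
for distance in (20, 35, 50):
    share = fovea_fraction(15.0, 9.4, distance)
    print(f"{distance} cm: {share:.1%} of the screen in the fovea")
```

For a fixed panel size the foveal share grows with the square of the viewing distance, which is why a screen held at arm's length benefits less than a headset display sitting right in front of the eye.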
According to a press release put out by ADATA ahead of the trade show, the eye tracking technology is being provided by AMD collaborator Eyeware. Notably, their software-based approach runs on top of standard webcams, rather than requiring IR cameras. So the camera hardware itself should be pretty straightforward.
Foveated rendering aside, XPG is making sure that the Nia won't be a one-trick pony. The handheld's other major claim to fame is its hardware swappability. The prototype handheld not only features a removable M.2-2230 SSD, but the company is also taking advantage of the recently-introduced LPCAMM2 memory module standard to introduce removable DRAM. Via a hatch in the back of the handheld, device owners would be able to swap out LPCAMM2 LPDDR5X modules for higher capacity versions. This would give the handheld an additional degree of future-proofness over current handhelds, which use non-replaceable soldered-down memory.
Rounding out the package, the current prototype is based on AMD's Zen 4 Phoenix APU, which is used across both of the company's current mobile lines (Ryzen Mobile 7000/8000 and Ryzen Z1). Meanwhile, the unit's display is adjustable, allowing it to be angled away from the body of the handheld.
Assuming all goes well with the prototype, XPG aims to release a finished product in 2025.
ADATA
Sabrent tends to get into the news when it launches ultra-high-performance SSDs for enthusiast-grade desktops, but this week the company introduced a completely different type of product: a small form-factor M.2-2242 SSD aimed at Lenovo's Legion Go handheld and ultra-thin laptops that don't accommodate M.2-2280 drives. And even though it's not an enthusiast-grade drive, the Rocket Nano still boasts quite decent performance and capacity.
The Sabrent Rocket Nano 2242 (SB-2142) drive is based on the Phison E27T platform, a PCIe 4.0 x4 controller that is designed for mainstream DRAM-less SSDs, and in the case of the Rocket Nano, is paired with 3D TLC memory. The SSD is available in a single 1TB configuration, and is rated for read speeds up to 5 GB/s. Interestingly, the Phison E27T controller itself is rated for read speeds up to 7 GB/s, so it appears that the petite Rocket Nano isn't making full use of the controller's performance.
Sabrent positions its Rocket Nano 2242 SSD as a drive for upgrading Lenovo's Legion Go portable game console, select Lenovo ThinkPad laptops, and other M.2-2242-sized PCs that can't accommodate larger 2280 drives. Keeping in mind that most devices shipping with M.2-2242 SSDs come with pretty slow stock drives, Sabrent's solution seems to be a viable product for such upgrades. All the while, Sabrent's Rocket Nano 2242 will also work in systems with PCIe 3.0 x4 M.2 slots, so the market for these drives is pretty wide.
Sabrent's Rocket Nano 2242 1 TB SSD (SB-2142-1TB) has a recommended price of $99.99, which is more or less in line with other 1 TB drives in the same form-factor offering comparable performance. The SSD is currently available at Amazon for $101.
Sources: Tom's Hardware, Sabrent
Storage
The Ultra Ethernet Consortium (UEC) has announced this week that the next-generation interconnection consortium has grown to 55 members. And as the group works towards developing the initial version of their ultra-fast Ethernet standard, they have released some of the first technical details on the upcoming standard.
Formed in the summer of 2023, the UEC aims to develop a new standard for interconnection for AI and HPC datacenter needs, serving as a de-facto (if not de-jure) alternative to InfiniBand, which is largely under the control of NVIDIA these days. The UEC began to accept new members back in November, and in just five months' time it gained 45 new members, which highlights massive interest in the new technology. The consortium now boasts 55 members and 715 industry experts, who are working across eight technical groups.
There is a lot of work at hand for the UEC, as the group has laid out in their latest development blog post, as the consortium works to build a unified Ethernet-based communication stack for high-performance networking supporting artificial intelligence and high-performance computing clusters. The consortium's technical objectives include developing specifications, APIs, and source code for Ultra Ethernet communications, updating existing protocols, and introducing new mechanisms for telemetry, signaling, security, and congestion management. In particular, Ultra Ethernet introduces the UEC Transport (UET) for higher network utilization and lower tail latency to speed up RDMA (Remote Direct Memory Access) operation over Ethernet. Key features include multi-path packet spraying, flexible ordering, and advanced congestion control, ensuring efficient and reliable data transfer.
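As a rough illustration of why multi-path packet spraying helps network utilization, the toy model below compares conventional per-flow hashing, where every packet of a flow follows one path, with per-packet spraying across all available paths. This is a simplified sketch with made-up names and a plain round-robin policy; it is not the UET algorithm, whose details are defined by the UEC specification.

```python
import zlib
from collections import Counter


def hashed_path(flow_id: str, num_paths: int) -> int:
    """Per-flow hashing: every packet of a flow takes the same path."""
    return zlib.crc32(flow_id.encode()) % num_paths


def sprayed_path(seq: int, num_paths: int) -> int:
    """Per-packet spraying (round-robin here): packets rotate over all paths."""
    return seq % num_paths


NUM_PATHS = 4
PACKETS = 1000

# One large flow, as is common in AI and HPC traffic.
hashed = Counter(hashed_path("flow-A", NUM_PATHS) for _ in range(PACKETS))
sprayed = Counter(sprayed_path(seq, NUM_PATHS) for seq in range(PACKETS))

print("per-flow hashing:", dict(hashed))
print("packet spraying: ", dict(sprayed))
```

With hashing, the single flow saturates one path while the others sit idle; with spraying, the load is spread evenly. The cost is that packets can arrive out of order, which is where flexible ordering on the receiving side comes in.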
These enhancements are designed to address the needs of large AI and HPC clusters — with separate profiles for each type of deployment — though everything is done in a surgical manner to enhance the technology, but reuse as much of the existing Ethernet as possible to maintain cost efficiency and interoperability.
The consortium's founding members include AMD, Arista, Broadcom, Cisco, Eviden (an Atos Business), HPE, Intel, Meta, and Microsoft. Since the UEC began to accept new members in October 2023, numerous industry heavyweights have joined the group, including Baidu, Dell, Huawei, IBM, Nokia, Lenovo, Supermicro, and Tencent.
The consortium currently plans to release the initial 1.0 version of the UEC specification publicly sometime in the third quarter of 2024.
"There was always a recognition that UEC was meeting a need in the industry," said J Metz, Chair of the UEC Steering Committee. "There is a strong desire to have an open, accessible, Ethernet-based network specifically designed to accommodate AI and HPC workload requirements. This level of involvement is encouraging; it helps us achieve the goal of broad interoperability and stability."
While it is evident that the Ultra Ethernet Consortium is gaining support across the industry, it is still unclear where other industry behemoths like AWS and Google stand. While the hardware companies involved can design Ultra Ethernet support into their hardware and systems, the technology ultimately exists to serve large datacenter and HPC system operators. So it will be interesting to see what interest they take in (and how quickly they adopt) the nascent Ethernet backbone technology once hardware incorporating it is ready.
Networking
SK hynix early on Friday announced that the company has finished the development of its PCB01 PCIe Gen5 SSD, the company's forthcoming high-end SSD for OEMs. Based on the company's new Alistar platform, the PCB01 is designed to deliver chart-topping performance for client machines. And, as a sign of the times, SK hynix is positioning the PCB01 for AI PCs, looking to synergize with the overall industry interest in anything and everything AI.
The bare, OEM-focused drives have previously been shown off by SK hynix, and make no attempt to hide what's under the hood. The PCB01 relies on SK hynix's Alistar controller, which features a PCIe Gen5 x4 host interface on the front end and eight NAND channels on the back end, placing it solidly in the realm of high-end SSDs. Paired with the Alistar controller is the company's latest 238-layer TLC NAND (H25T1TD48C & H25T2TD88C), which offers a maximum transfer speed of 2400 MT/second. Being that this is a high-end client SSD, there's also a DRAM chip on board, though the company isn't disclosing its capacity.
As with other high-end PCIe 5.0 client SSDs, SK hynix is planning on hitting peak read speeds of up to 14GB/second on the drive, while peak sequential write speeds should top 12GB/second (with pSLC caching, of course) – performance figures well within the realm of possibility for an 8 channel drive. As for random performance, at Computex the company was telling attendees that the drives should be able to sustain 4K random read and write rates of 2 million IOPS, which is very high as well. The SSDs are also said to consume up to 30% less power than 'predecessors,' according to SK hynix, though the company didn't elaborate on that figure. Typically in the storage industry, energy figures are based on iso-performance (rather than peak performance) – essentially measuring energy efficiency per bit rather than total power consumption – and that is likely the case here as well.
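A quick back-of-the-envelope check shows why 14GB/second is plausible for this configuration. The sketch below assumes the standard 128b/130b line encoding of PCIe 5.0 and an 8-bit wide bus per NAND channel (typical for NAND interfaces, but an assumption here since SK hynix has not detailed it); it ignores packet and protocol overhead, and the function names are our own.

```python
def pcie_link_gbps(gt_per_s: float, lanes: int, encoding: float = 128 / 130) -> float:
    """Approximate one-direction link bandwidth in GB/s after line encoding."""
    return gt_per_s * lanes * encoding / 8


def nand_bus_gbps(channels: int, mt_per_s: float, bus_bits: int = 8) -> float:
    """Aggregate raw NAND interface bandwidth in GB/s across all channels."""
    return channels * mt_per_s * bus_bits / 8 / 1000


host = pcie_link_gbps(32, 4)   # PCIe 5.0 runs at 32 GT/s per lane, x4 link
nand = nand_bus_gbps(8, 2400)  # eight channels at 2400 MT/s

print(f"PCIe Gen5 x4 link:   {host:.2f} GB/s")
print(f"8-channel NAND bus:  {nand:.2f} GB/s")
print(f"Quoted peak of 14 GB/s fits under both: {14 < min(host, nand)}")
```

Under these assumptions the host link tops out a little below 16 GB/s and the NAND side a little above 19 GB/s, so the quoted 14GB/second leaves headroom on both ends.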
At least initially, SK hynix plans to release its PCB01 in three capacities – 512 GB, 1 TB, and 2 TB. The company has previously disclosed that their 238L TLC NAND has a capacity of 512Gbit, so these are typical capacity figures for single-sided drives. And while the focus of the company's press release this week was on OEM drives, this is the same controller and NAND that is also going into the company's previously-teased retail Platinum P51 SSD, so this week's reveal offers a bit more detail into what to expect from that drive family as well.
Specs aside, Ahn Hyun, the Head of the N-S Committee at SK hynix, said that multiple global CPU providers for on-device AI PCs are seeking collaboration for the compatibility validation process, which is underway, so expect PCB01 drives inside PCs in this year's back-to-school and holiday seasons.
"We will work towards enhancing our leadership as the global top AI memory provider also in the NAND solution space by successfully completing the customer validation and mass production of PCB01, which will be in the limelight," Ahn Hyun said.
SSDs
To say that the global foundry market is booming right now would be an understatement. Demand for leading-edge process technologies driven by AI and HPC applications is unprecedented, and with Intel joining the contract chipmaking game, this market segment is once again becoming rather competitive as well. Yet, this is exactly the market segment that Rapidus, a foundry startup backed by the Japanese government and several major Japanese companies, is going to enter in 2027, when its first fab comes online, just a few years from now.
In a fresh update on the status of bringing up the company's first leading-edge fab, Rapidus has revealed that they are intending to get into the chip packaging game as well. Once complete, the ¥5 trillion ($32 billion) fab will be offering both chip lithography on a 2nm node, as well as packaging services for chips produced within the facility – a notable distinction in an industry where, even if packaging isn't outsourced entirely (OSAT), it's still normally handled at dedicated facilities.
Ultimately, while the company wants to serve the same clients as TSMC, Samsung, and Intel Foundry, the firm plans to do things almost completely differently than its competitors in a bid to speed up chipmaking from finishing design to getting a working chip out of the fab.
"We are very proud of being Japanese," said Henri Richard, general manager and president of Rapidus's subsidiary in the U.S. "[…] I know that some people may be looking at this thinking [that] Japan is known for quality, attention to detail, but not necessarily for speed, or flexibility. But I will tell you that Atsuyoshi Koike (the head of Rapidus) is a very special executive. That is, he has all the quality of Japan, with a lot of American thinking. So he is quite a unique guy, and certainly extraordinarily focused on creating a company that will be extremely flexible and extremely quick on its feet."
Perhaps the most significant difference between Rapidus and traditional foundries is that the company will offer only leading-edge manufacturing technologies to its clients: 2 nm in 2027 (phase 1) and then 1.4 nm in the future (phase 2). This is a stark contrast with other contract fabs, including Intel, which tend to offer their customers a full range of fabrication processes to land more clients and produce more chips. Apparently, Rapidus hopes that there will be enough Japanese and American chip developers that are inclined to use its 2 nm fabrication process to produce their designs. With that said, the number of chip designers that are using the most advanced production node at any given time is relatively small – limited to large firms who need first-mover advantage and have the margins to justify taking the risk – so it remains to be seen whether Rapidus's business model becomes successful. The company believes it will, since the market of chips made on advanced nodes is growing rapidly.
"Until recently IDC was giving an estimation of the 2nm and below market as about $80 billion and I think we are going to see soon a revision of the potential to $150 billion," said Richard. "[…] TSMC is the 800 pound gorilla in the space. Samsung is there and Intel is going to enter that space. But the market growth is so significant and the demand is so high, that it does not take a lot of market share for Rapidus to be successful. One of the things that gives me great comfort is that when I talk to our EDA partners, when I talk to our potential clients, it is obvious that the entire industry is looking for alternative supply from a fully independent foundry. There is a place for Samsung in this industry, there is a place for Intel in this industry, the industry is currently owned by TSMC. But another totally independent foundry is more than welcome by all of the ecosystem partners and by the customers. So, I feel really, really good about Rapidus's positioning."
Speaking of advanced process technologies, it is notable that Rapidus does not plan to use ASML's High-NA Twinscan EXE lithography scanners for 2 nm production. Instead, Rapidus is sticking to ASML's proven Low-NA scanners, which will reduce costs of Rapidus's fab, though it will entail usage of EUV double patterning, which brings up costs and lengthens the production cycle in other ways. Even with those trade-offs, SemiAnalysis analysts believe that given the cost of High-NA EUV litho tools and halved imaging field, ...
Semiconductors