Cerebras WSE-3: Third Generation Superchip for AI

0
286
Cerebras WSE-3: Third Generation Superchip for AI



Sunnyvale, Calif., AI supercomputer agency Cerebras says its subsequent era of waferscale AI chips can do double the efficiency of the earlier era whereas consuming the identical quantity of energy. The Wafer Scale Engine 3 (WSE-3) incorporates 4 trillion transistors, a greater than 50 p.c improve over the earlier era because of using newer chipmaking expertise. The firm says it’s going to use the WSE-3 in a brand new era of AI computer systems, which at the moment are being put in in a datacenter in Dallas to kind a supercomputer able to 8 exaflops (8 billion billion floating level operations per second). Separately, Cerebras has entered right into a joint growth settlement with Qualcomm that goals to spice up a metric of value and efficiency for AI inference 10-fold.

The firm says the CS-3 can prepare neural community fashions as much as 24-trillion parameters in dimension, greater than 10 instances the dimensions of in the present day’s largest LLMs.

With WSE-3, Cerebras can preserve its declare to producing the most important single chip on the planet. Square-shaped with 21.5 centimeters to a facet, it makes use of practically a complete 300-millimeter wafer of silicon to make one chip. Chipmaking tools is usually restricted to producing silicon dies of not more than about 800 sq. millimeters. Chipmakers have begun to flee that restrict through the use of 3D integration and different superior packaging expertise3D integration and different superior packaging expertise to mix a number of dies. But even in these techniques, the transistor rely is within the tens of billions.

As ordinary, such a big chip comes with some mind-blowing superlatives.

Transistors 4 trillion
Square millimeters of silicon 46,225
AI cores 900,000
AI compute 125 petaflops
On chip reminiscence 44 gigabytes
Memory bandwidth 21 petabytes
Network material bandwidth 214 petabits

You can see the impact of Moore’s Law within the succession of WSE chips. The first, debuting in 2019, was made utilizing TSMC’s 16-nanometer tech. For WSE-2, which arrived in 2021, Cerebras moved on to TSMC’s 7-nm course of. WSE-3 is constructed with the foundry big’s 5-nm tech.

The variety of transistors has greater than tripled since that first megachip. Meanwhile, what they’re getting used for has additionally modified. For instance, the variety of AI cores on the chip has considerably leveled off, as has the quantity of reminiscence and the inner bandwidth. Nevertheless, the development in efficiency by way of floating-point operations per second (flops) has outpaced all different measures.

CS-3 and the Condor Galaxy 3

The laptop constructed across the new AI chip, the CS-3, is designed to coach new generations of big massive language fashions, 10 instances bigger than OpenAI’s GPT-4 and Google’s Gemini. The firm says the CS-3 can prepare neural community fashions as much as 24-trillion parameters in dimension, greater than 10 instances the dimensions of in the present day’s largest LLMs, with out resorting to a set of software program methods wanted by different computer systems. According to Cerebras, meaning the software program wanted to coach a one-trillion parameter mannequin on the CS-3 is as simple as coaching a one billion parameter mannequin on GPUs.

As many as 2,048 techniques will be mixed, a configuration that may chew by coaching the favored LLM Llama 70B from scratch in simply sooner or later. Nothing fairly that massive is within the works, although, the corporate says. The first CS-3-based supercomputer, Condor Galaxy 3 in Dallas, will likely be made up of 64 CS-3s. As with its CS-2-based sibling techniques, Abu Dhabi’s G42 owns the system. Together with Condor Galaxy 1 and a pair of, that makes a community of 16 exaflops.

“The existing Condor Galaxy network has trained some of the leading open-source models in the industry, with tens of thousands of downloads,” stated Kiril Evtimov, group CTO of G42 in a press launch. “By doubling the capacity to 16 exaflops, we look forward to seeing the next wave of innovation Condor Galaxy supercomputers can enable.”

A Deal With Qualcomm

While Cerebras computer systems are constructed for coaching, Cerebras CEO Andrew Feldman says it’s inference, the execution of neural community fashions, that’s the actual restrict to AI’s adoption. According to Cerebras estimates, if each individual on the planet used ChatGPT, it could price US $1 trillion yearly—to not point out an awesome quantity of fossil-fueled vitality. (Operating prices are proportional to the dimensions of neural community mannequin and the variety of customers.)

So Cerebras and Qualcomm have fashioned a partnership with the objective of bringing the price of inference down by an element of 10. Cerebras says their answer will contain making use of neural networks methods resembling weight information compression and sparsity—the pruning of unneeded connections. The Cerebras-trained networks would then run effectively on Qualcomm’s new inference chip, the AI 100 Ultra, the corporate says.

From Your Site Articles

Related Articles Around the Web

LEAVE A REPLY

Please enter your comment!
Please enter your name here