This AI Supercomputer Has 13.5 Million Cores—and Was Built in Just Three Days


Artificial intelligence is on a tear. Machines can converse, write, play video games, and generate authentic photos, video, and music. But as AI’s capabilities have grown, so too have its algorithms.

A decade ago, machine learning algorithms relied on tens of millions of internal connections, or parameters. Today’s algorithms regularly reach into the hundreds of billions and even trillions of parameters. Researchers say scaling up still yields performance gains, and models with tens of trillions of parameters may arrive in short order.

To train models that big, you need powerful computers. Whereas AI in the early 2010s ran on a handful of graphics processing units—computer chips that excel at the parallel processing crucial to AI—computing needs have grown exponentially, and top models now require hundreds or thousands of them. OpenAI, Microsoft, Meta, and others are building dedicated supercomputers to handle the task, and they say these AI machines rank among the fastest in the world.
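To get a sense of why, here is a rough back-of-the-envelope sketch (my own numbers, not from the article): merely storing the weights of a trillion-parameter model at 16-bit precision takes terabytes, far more than the memory on a single high-end GPU, and training requires several times more for optimizer states and activations.

```python
# Rough, illustrative arithmetic (not from the article): memory needed just to
# hold a model's weights at 16-bit (2-byte) precision, compared with the 80 GB
# of memory on a single high-end Nvidia A100 GPU. Training needs several times more.

A100_MEMORY_GB = 80  # the larger A100 variant

models = {
    "GPT-3 (175 billion parameters)": 175e9,
    "1-trillion-parameter model": 1e12,
    "10-trillion-parameter model": 10e12,
}

for name, params in models.items():
    weights_gb = params * 2 / 1e9  # 2 bytes per parameter at half precision
    print(f"{name}: ~{weights_gb:,.0f} GB of weights "
          f"(~{weights_gb / A100_MEMORY_GB:,.0f}x one {A100_MEMORY_GB} GB A100)")
```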

But even as GPUs have been crucial to AI scaling—Nvidia’s A100, for example, is still one of the fastest, most commonly used chips in AI clusters—weirder alternatives designed specifically for AI have popped up in recent years.

Cerebras offers one such alternative.

Making a Meal of AI

The size of a dinner plate—about 8.5 inches to a side—the company’s Wafer Scale Engine is the biggest silicon chip in the world, boasting 2.6 trillion transistors and 850,000 cores etched onto a single silicon wafer. Each Wafer Scale Engine serves as the heart of the company’s CS-2 computer.

Alone, the CS-2 is a beast, but last year Cerebras unveiled a plan to link CS-2s together with an external memory system called MemoryX and a system to connect CS-2s called SwarmX. The company said the new tech could link up to 192 chips and train models two orders of magnitude larger than today’s biggest, most advanced AIs.

“The industry is moving past 1-trillion-parameter models, and we are extending that boundary by two orders of magnitude, enabling brain-scale neural networks with 120 trillion parameters,” Cerebras CEO and cofounder Andrew Feldman said.

At the time, all this was theoretical. But last week, the company announced they’d linked 16 CS-2s together into a world-class AI supercomputer.

Meet Andromeda

The new machine, called Andromeda, has 13.5 million cores capable of speeds over an exaflop (one quintillion operations per second) at 16-bit half precision. Because of the unique chip at its core, Andromeda isn’t easily compared to supercomputers running on more traditional CPUs and GPUs, but Feldman told HPC Wire Andromeda is roughly equivalent to Argonne National Laboratory’s Polaris supercomputer, which ranks 17th fastest in the world, according to the latest Top500 list.
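For a sense of scale, a quick bit of arithmetic (mine, not Cerebras’s) spreads that exaflop across Andromeda’s cores; since the machine exceeds one exaflop, the result is a rough lower bound per core.

```python
# Rough arithmetic (mine, not Cerebras's): spreading one exaflop of 16-bit
# compute across Andromeda's 13.5 million cores.

total_ops_per_sec = 1e18   # 1 exaflop = 10^18 operations per second
num_cores = 13.5e6         # 13.5 million cores across the 16 CS-2 systems

per_core = total_ops_per_sec / num_cores
print(f"~{per_core / 1e9:.0f} GFLOPS per core at half precision")
# Prints roughly 74 GFLOPS per core; the real figure is somewhat higher,
# since Andromeda's total exceeds one exaflop.
```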

In addition to performance, Andromeda’s rapid build time, cost, and footprint are notable. Argonne began installing Polaris in the summer of 2021, and the supercomputer went live about a year later. It takes up 40 racks, the filing-cabinet-like enclosures housing supercomputer components. By comparison, Andromeda cost $35 million—a modest price for a machine of its power—took just three days to assemble, and uses a mere 16 racks.

Cerebras tested the system by training five versions of OpenAI’s large language model GPT-3 as well as Eleuther AI’s open source GPT-J and GPT-NeoX. And according to Cerebras, perhaps the most important finding is that Andromeda demonstrated what they call “near-perfect linear scaling” of AI workloads for large language models. In short, that means that as additional CS-2s are added, training times decrease proportionately.

Typically, the company said, performance gains diminish as you add more chips. Cerebras’s WSE chip, however, may prove to scale more efficiently because its 850,000 cores are connected to one another on the same piece of silicon. What’s more, each core has a memory module right next door. Taken together, the chip slashes the amount of time spent shuttling data between cores and memory.

“Linear scaling means when you go from one to two systems, it takes half as long for your work to be completed. That is a very unusual property in computing,” Feldman told HPC Wire. And, he said, it can scale beyond 16 connected systems.
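Here is a minimal sketch of what that claim means in practice, using made-up numbers: under perfect linear scaling, training time shrinks in direct proportion to the number of CS-2s, while a hypothetical cluster that loses some efficiency with each added system falls behind.

```python
# Minimal sketch with hypothetical numbers: perfect linear scaling versus a
# made-up cluster where each added system delivers only 90% of its ideal gain.

BASELINE_HOURS = 100.0  # hypothetical training time on a single CS-2

def ideal_time(num_systems):
    """Perfect linear scaling: time drops in proportion to system count."""
    return BASELINE_HOURS / num_systems

def sublinear_time(num_systems, efficiency=0.9):
    """Each system beyond the first adds only `efficiency` of its ideal speedup."""
    effective = 1 + (num_systems - 1) * efficiency
    return BASELINE_HOURS / effective

for n in (1, 2, 4, 8, 16):
    print(f"{n:>2} CS-2s: ideal {ideal_time(n):6.1f} h, "
          f"90%-efficient {sublinear_time(n):6.1f} h")
```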

Beyond Cerebras’s own testing, the linear scaling results were also demonstrated during work at Argonne National Laboratory, where researchers used Andromeda to train the GPT-3-XL large language algorithm on long sequences of the Covid-19 genome.

Of course, though the system may scale beyond 16 CS-2s, to what degree linear scaling persists remains to be seen. Also, we don’t yet know how Cerebras performs head-to-head against other AI chips. AI chipmakers like Nvidia and Intel have begun participating in regular third-party benchmarking by the likes of MLPerf. Cerebras has yet to take part.

Space to Spare

Still, the approach does appear to be carving out its own niche in the world of supercomputing, and continued scaling in large language AI is a prime use case. Indeed, Feldman told Wired last year that the company was already talking to engineers at OpenAI, a leader in large language models. (OpenAI cofounder Sam Altman is also an investor in Cerebras.)

Upon its launch in 2020, OpenAI’s large language model GPT-3 changed the game in terms of both performance and size. Weighing in at 175 billion parameters, it was the biggest AI model at the time and stunned researchers with its abilities. Since then, language models have reached into the trillions of parameters, and larger models may be forthcoming. There are rumors—just that, so far—that OpenAI will release GPT-4 in the not-too-distant future and that it will be another leap from GPT-3. (We’ll have to wait and see on that count.)

That said, despite their capabilities, large language models are neither perfect nor universally adored. Their flaws include output that can be false, biased, and offensive. Meta’s Galactica, trained on scientific texts, is a recent example. Despite a dataset one might assume is less prone to toxicity than training on the open internet, the model was easily provoked into generating harmful and inaccurate text and was pulled down in just three days. Whether researchers can solve language AI’s shortcomings remains uncertain.

But it seems likely that scaling up will continue until diminishing returns kick in. The next leap could be just around the corner—and we may already have the hardware to make it happen.

Image Credit: Cerebras
