January has been notable for the number of important announcements in AI. For me, two stand out: the US government’s support for the Stargate Project, a massive data center initiative costing $500 billion, with investments coming from Oracle, SoftBank, and OpenAI; and DeepSeek’s release of its R1 reasoning model, trained at an estimated cost of roughly $5 million, a large sum but a fraction of what it cost OpenAI to train its o1 models.
US culture has long assumed that bigger is better, and that more expensive is better. That’s certainly part of what’s behind the most expensive data center ever conceived. But we have to ask a very different question. If DeepSeek was indeed trained for roughly a tenth of what it cost to train o1, and if inference (generating answers) on DeepSeek costs roughly one-thirtieth of what it costs on o1 ($2.19 per million output tokens versus $60 per million output tokens), is the US technology sector headed in the right direction?
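For readers who want to sanity-check those ratios, here is a minimal back-of-the-envelope sketch in Python. Only the per-token prices and DeepSeek’s roughly $5 million training estimate come from the figures above; the o1 training cost used here is a hypothetical, back-derived from the “roughly a tenth” claim rather than a published number.

```python
# Back-of-the-envelope check of the cost ratios cited above.
# Per-token prices and the ~$5M DeepSeek training estimate are the
# figures quoted in this piece; o1's training cost is an ASSUMPTION
# back-derived from the "roughly a tenth" claim, not a published number.

O1_PRICE_PER_M_OUTPUT_TOKENS = 60.00   # USD, OpenAI o1 (quoted above)
R1_PRICE_PER_M_OUTPUT_TOKENS = 2.19    # USD, DeepSeek R1 (quoted above)

R1_TRAINING_COST = 5_000_000           # USD, estimate cited above
O1_TRAINING_COST = 50_000_000          # USD, ASSUMPTION: ~10x R1

inference_ratio = O1_PRICE_PER_M_OUTPUT_TOKENS / R1_PRICE_PER_M_OUTPUT_TOKENS
training_ratio = O1_TRAINING_COST / R1_TRAINING_COST

print(f"Inference: o1 is ~{inference_ratio:.1f}x more expensive per output token")
print(f"Training:  o1 is ~{training_ratio:.0f}x more expensive (estimated)")
# Output:
# Inference: o1 is ~27.4x more expensive per output token
# Training:  o1 is ~10x more expensive (estimated)
```

The exact inference ratio works out to about 27x, which is where the “roughly one-thirtieth” figure comes from.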
It clearly isn’t. Our “bigger is better” mentality is failing us.
I’ve long believed that the key to AI’s success would be minimizing the cost of training and inference. I don’t believe there’s really a race between the US and Chinese AI communities. But if we accept that metaphor, the US, and OpenAI in particular, is clearly behind. And a half-trillion-dollar data center is part of the problem, not the solution. Better engineering beats “supersize it.” Technologists in the US need to learn that lesson.