Meta’s Yann LeCun: Scaling AI Won’t Make It Smarter
For years the AI industry has abided by a set of principles known as “scaling laws.” OpenAI researchers outlined them in the seminal 2020 paper, “Scaling Laws for Neural Language Models.”
“Model performance depends most strongly on scale, which consists of three factors: the number of model parameters N (excluding embeddings), the size of the dataset D, and the amount of compute C used for training,” the authors wrote.
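That paper fits test loss as a power law in each of those factors when the other two are not bottlenecks. A schematic version of the fits, in the paper's own notation (constants omitted here, since the shape matters more than the exact values):

$$
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad
L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D}, \qquad
L(C) \approx \left(\frac{C_c}{C}\right)^{\alpha_C}
$$

Because the fitted exponents are small, each doubling of parameters, data, or compute cuts the loss by the same constant factor, so the absolute gains shrink with every doubling. That diminishing-returns shape is the backdrop for the critique that follows.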
In essence, more is more when it comes to building highly intelligent AI. This idea has fueled massive investment in data centers that let AI models process and learn from enormous amounts of existing information.
But recently, AI experts across Silicon Valley have started to challenge that doctrine.
“Most interesting problems scale extremely badly,” Meta’s chief AI scientist, Yann LeCun, said at the National University of Singapore on Sunday. “You cannot just assume that more data and more compute means smarter AI.”
LeCun’s point hinges on the idea that training AI on vast amounts of basic subject matter, like internet data, won’t lead to some sort of superintelligence. Smart AI is a different breed.
“The mistake is that very simple systems, when they work for simple problems, people extrapolate them to think that they’ll work for complex problems,” he said. “They do some amazing things, but that creates a religion of scaling that you just need to scale systems more and they’re going to naturally become more intelligent.”
Right now, the impact of scaling is magnified because many of the latest breakthroughs in AI are actually “really easy,” LeCun said. The biggest large language models today are trained on roughly as much data as a four-year-old child has taken in through the visual cortex, he said.
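A back-of-the-envelope version of that comparison, using round numbers LeCun has cited in other talks (a four-year-old has been awake roughly 16,000 hours, and the optic nerve carries data on the order of 2 MB per second; treat both constants as illustrative estimates, not measurements):

$$
16{,}000 \ \text{h} \times 3{,}600 \ \tfrac{\text{s}}{\text{h}} \times 2 \times 10^{6} \ \tfrac{\text{bytes}}{\text{s}} \approx 1.2 \times 10^{14} \ \text{bytes}
$$

That is on the same order as the tens of trillions of tokens used to train today’s largest language models, which is the rough equivalence LeCun is drawing.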
“When you deal with real-world problems with ambiguity and uncertainty, it’s not just about scaling anymore,” he added.
AI advances have been slowing lately, due in part to a dwindling corpus of usable public training data.
LeCun is not the only prominent researcher to question the power of scaling. Scale AI CEO Alexandr Wang said scaling is “the biggest question in the industry” at the Cerebral Valley conference last year. Cohere CEO Aidan Gomez called it the “dumbest” way to improve AI models.
LeCun instead advocates a training approach grounded in world models.
“We need AI systems that can learn new tasks really quickly. They need to understand the physical world — not just text and language but the real world — have some level of common sense, and abilities to reason and plan, have persistent memory — all the stuff that we expect from intelligent entities,” he said during his talk Sunday.
Last year, on an episode of Lex Fridman’s podcast, LeCun said that in contrast to large language models, which can only predict the next token in a sequence based on patterns in their training data, world models have a higher level of cognition: “The extra component of a world model is something that can predict how the world is going to evolve as a consequence of an action you might take.”
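A minimal sketch of the distinction he is drawing, in code rather than in LeCun’s actual architecture; every name here (predict_next_token, WorldModel, and so on) is hypothetical, not an API from any real system:

```python
# Illustrative sketch only: contrasts the interface of an autoregressive
# language model (predict the next token from prior tokens) with the
# interface of a world model (predict the next world state from the
# current state and a candidate action). All names are hypothetical.

from typing import Sequence


def predict_next_token(tokens: Sequence[str]) -> str:
    """LLM-style interface: pattern-based prediction of the next step."""
    # Stand-in for a learned model; a real LM would score a vocabulary
    # and return the most likely continuation.
    return "<next-token>"


class WorldModel:
    """World-model interface: (state, action) -> predicted next state."""

    def predict(self, state: dict, action: str) -> dict:
        # Stand-in dynamics; a learned world model would forecast how
        # the world evolves as a consequence of taking `action`.
        next_state = dict(state)
        next_state["last_action"] = action
        return next_state


# The difference that matters for planning: a world model can be rolled
# forward to compare the consequences of candidate actions, which a pure
# next-token predictor has no native interface for.
model = WorldModel()
state = {"position": 0}
for action in ("move_left", "move_right"):
    print(action, "->", model.predict(state, action))
```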