Has AI Progress Really Slowed Down?

21 November 2024 at 17:53

For over a decade, companies have bet on a tantalizing rule of thumb: that artificial intelligence systems would keep getting smarter if only they found ways to continue making them bigger. This wasn’t merely wishful thinking. In 2017, researchers at Chinese technology firm Baidu demonstrated that pouring more data and computing power into machine learning algorithms yielded mathematically predictable improvements—regardless of whether the system was designed to recognize images, speech, or generate language. Noticing the same trend, in 2020, OpenAI coined the term “scaling laws,” which has since become a touchstone of the industry.
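To make "mathematically predictable" concrete, here is a minimal sketch of the power-law shape that scaling laws are usually described as having, where predicted error falls smoothly as compute grows. The constants `a` and `b` and the function names are hypothetical, chosen only for illustration; they are not fitted values from Baidu's or OpenAI's work.

```python
# Illustrative only: `a` and `b` are made-up constants, not published fits.

def predicted_loss(compute: float, a: float = 10.0, b: float = 0.05) -> float:
    """Power-law form: loss = a * compute ** -b (lower loss is better)."""
    return a * compute ** -b


def improvement_from_scaling(compute: float, factor: float) -> float:
    """Absolute drop in predicted loss when compute is multiplied by `factor`."""
    return predicted_loss(compute) - predicted_loss(compute * factor)


if __name__ == "__main__":
    compute = 1.0
    for _ in range(4):
        gain = improvement_from_scaling(compute, 10.0)
        print(
            f"{compute:>6.0f}x compute: loss {predicted_loss(compute):.3f}, "
            f"gain from the next 10x: {gain:.3f}"
        )
        compute *= 10.0
```

Under this assumed curve, every tenfold increase in compute cuts the loss by the same proportion, so the absolute gain from each step shrinks. That is worth keeping in mind when reading claims of "diminishing returns": the question is whether models are falling short of the curve, not whether each step buys less than the last.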

This thesis prompted AI firms to bet hundreds of millions on ever-larger computing clusters and datasets. The gamble paid off handsomely, transforming crude text machines into today’s articulate chatbots.

But now, that bigger-is-better gospel is being called into question.

Last week, reports by Reuters and Bloomberg suggested that leading AI companies are experiencing diminishing returns on scaling their AI systems. Days earlier, The Information reported doubts at OpenAI about continued advancement after the unreleased Orion model failed to meet expectations in internal testing. The co-founders of Andreessen Horowitz, a prominent Silicon Valley venture capital firm, have echoed these sentiments, noting that increasing computing power is no longer yielding the same “intelligence improvements.”

What are tech companies saying?

Many leading AI companies, though, seem confident that progress is marching full steam ahead. In a statement, a spokesperson for Anthropic, developer of the popular chatbot Claude, said “we haven’t seen any signs of deviations from scaling laws.” OpenAI declined to comment. Google DeepMind did not respond to a request for comment. However, last week, after an experimental new version of Google’s Gemini model took GPT-4o’s top spot on a popular AI-performance leaderboard, the company’s CEO, Sundar Pichai, posted to X saying “more to come.”

Recent releases paint a somewhat mixed picture. Anthropic has updated its medium-sized model, Sonnet, twice since its release in March, making it more capable than the company’s largest model, Opus, which has not received such updates. In June, the company said Opus would be updated “later this year,” but last week, speaking on the Lex Fridman podcast, co-founder and CEO Dario Amodei declined to give a specific timeline. Google updated its smaller Gemini Pro model in February, but the company’s larger Gemini Ultra model has yet to receive an update. OpenAI’s recently released o1-preview model outperforms GPT-4o in several benchmarks, but in others it falls short. o1-preview was reportedly called “GPT-4o with reasoning” internally, suggesting the underlying model is similar in scale to GPT-4.

Parsing the truth is complicated by competing interests on all sides. If Anthropic cannot produce more powerful models, “we’ve failed deeply as a company,” Amodei said last week, offering a glimpse at the stakes for AI companies that have bet their futures on relentless progress. A slowdown could spook investors and trigger an economic reckoning. Meanwhile, Ilya Sutskever, OpenAI’s former chief scientist and once an ardent proponent of scaling, now says performance gains from bigger models have plateaued. But his stance carries its own baggage: Sutskever’s new AI startup, Safe Superintelligence Inc., launched in June with less funding and computational firepower than its rivals. A breakdown in the scaling hypothesis would conveniently help level the playing field.

“They had these things they thought were mathematical laws and they’re making predictions relative to those mathematical laws and the systems are not meeting them,” says Gary Marcus, a leading voice on AI, and author of several books including Taming Silicon Valley. He says the recent reports of diminishing returns suggest we have finally “hit a wall”—something he’s warned could happen since 2022. “I didn’t know exactly when it would happen, and we did get some more progress. Now it seems like we are stuck,” he says.

Have we run out of data?

A slowdown could be a reflection of the limits of current deep learning techniques, or simply that “there’s not enough fresh data anymore,” Marcus says. It’s a hypothesis that has gained ground among some following AI closely. Sasha Luccioni, AI and climate lead at Hugging Face, says there are limits to how much information can be learned from text and images. She points to how people are more likely to misinterpret your intentions over text messaging, as opposed to in person, as an example of text data’s limitations. “I think it’s like that with language models,” she says.

The lack of data is particularly acute in certain domains like reasoning and mathematics, where we “just don’t have that much high quality data,” says Ege Erdil, senior researcher at Epoch AI, a nonprofit that studies trends in AI development. That doesn’t mean scaling is likely to stop—just that scaling alone might be insufficient. “At every order of magnitude scale up, different innovations have to be found,” he says, noting that it does not mean AI progress will slow overall.

It’s not the first time critics have pronounced scaling dead. “At every stage of scaling, there are always arguments,” Amodei said last week. “The latest one we have today is, ‘we’re going to run out of data, or the data isn’t high quality enough or models can’t reason.’ ... I’ve seen the story happen for enough times to really believe that probably the scaling is going to continue,” he said. Reflecting on OpenAI’s early days on Y Combinator’s podcast, company CEO Sam Altman partially attributed the company’s success to a “religious level of belief” in scaling, a concept he says was considered “heretical” at the time. In response to a recent post on X from Marcus saying his predictions of diminishing returns were right, Altman posted saying “there is no wall.”

But there could be another reason we are hearing echoes of new models failing to meet internal expectations, says Jaime Sevilla, director of Epoch AI. Following conversations with people at OpenAI and Anthropic, he came away with a sense that people had extremely high expectations. “They expected AI was going to be able to already write a PhD thesis,” he says. “Maybe it feels a bit ... anti-climactic.”

A temporary lull does not necessarily signal a wider slowdown, Sevilla says. History shows significant gaps between major advances: GPT-4, released just 19 months ago, itself arrived 33 months after GPT-3. “We tend to forget that GPT three from GPT four was like 100x scale in compute,” Sevilla says. “If you want to do something like 100 times bigger than GPT-4, you’re gonna need up to a million GPUs.” That is bigger than any known cluster currently in existence, though he notes that there have been concerted efforts to build AI infrastructure this year, such as Elon Musk’s 100,000 GPU supercomputer in Memphis, the largest of its kind, which was reportedly built from start to finish in three months.
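The arithmetic behind that argument can be laid out in a few lines. This is a rough sketch that uses only the figures quoted in the paragraph above; the function and constant names are hypothetical, and nothing here is a forecast of when any model will ship.

```python
# Figures quoted in the article; names are illustrative, not from any source.
MONTHS_GPT3_TO_GPT4 = 33           # gap between the GPT-3 and GPT-4 releases
MONTHS_SINCE_GPT4 = 19             # time elapsed since GPT-4 at publication
GPUS_FOR_100X_UPPER_BOUND = 1_000_000  # Sevilla's "up to a million GPUs"
MEMPHIS_CLUSTER_GPUS = 100_000     # size of the Memphis supercomputer


def months_until_same_gap(previous_gap: int, elapsed: int) -> int:
    """Months left before the current gap equals the previous one (never negative)."""
    return max(previous_gap - elapsed, 0)


def cluster_shortfall(required: int, available: int) -> float:
    """How many times larger the required GPU count is than an existing cluster."""
    return required / available


if __name__ == "__main__":
    remaining = months_until_same_gap(MONTHS_GPT3_TO_GPT4, MONTHS_SINCE_GPT4)
    shortfall = cluster_shortfall(GPUS_FOR_100X_UPPER_BOUND, MEMPHIS_CLUSTER_GPUS)
    print(f"Months until the gap matches GPT-3 to GPT-4: {remaining}")
    print(f"Required GPUs relative to the Memphis cluster: {shortfall:.0f}x")
```

On these numbers, the current wait would have to run another 14 months just to match the previous gap, and the upper-bound GPU estimate is ten times the size of the largest cluster mentioned.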

In the interim, AI companies are likely exploring other methods to improve performance after a model has been trained. OpenAI’s o1-preview has been heralded as one such example, which outperforms previous models on reasoning problems by being allowed more time to think. “This is something we already knew was possible,” Sevilla says, gesturing to an Epoch AI report published in July 2023.

Policy and geopolitical implications

Prematurely diagnosing a slowdown could have repercussions beyond Silicon Valley and Wall St. The perceived speed of technological advancement following GPT-4’s release prompted an open letter calling for a six-month pause on the training of larger systems to give researchers and governments a chance to catch up. The letter garnered over 30,000 signatories, including Musk and Turing Award recipient Yoshua Bengio. It’s an open question whether a perceived slowdown could have the opposite effect, causing AI safety to slip from the agenda.

Much of the U.S.’s AI policy has been built on the belief that AI systems would continue to balloon in size. A provision in Biden’s sweeping executive order on AI, signed in October 2023 (and expected to be repealed by the Trump White House), required AI developers to share information with the government regarding models trained using computing power above a certain threshold. That threshold was set above the largest models available at the time, under the assumption that it would target future, larger models. This same assumption underpins export restrictions (restrictions on the sale of AI chips and technologies to certain countries) designed to limit China’s access to the powerful semiconductors needed to build large AI models. However, if breakthroughs in AI development begin to rely less on computing power and more on factors like better algorithms or specialized techniques, these restrictions may have a smaller impact on slowing China’s AI progress.

“The overarching thing that the U.S. needs to understand is that to some extent, export controls were built on a theory of timelines of the technology,” says Scott Singer, a visiting scholar in the Technology and International Affairs Program at the Carnegie Endowment for International Peace. In a world where the U.S. “stalls at the frontier,” he says, we could see a national push to drive breakthroughs in AI. He says a slip in the U.S.’s perceived lead in AI could spur a greater willingness to negotiate with China on safety principles.

Whether we’re seeing a genuine slowdown or just another pause ahead of a leap remains to be seen. “It’s unclear to me that a few months is a substantial enough reference point,” Singer says. “You could hit a plateau and then hit extremely rapid gains.”

Landmark Bill to Ban Children From Social Media Introduced in Australia’s Parliament

Tech – TIME

Rod McGuirk ／ AP

21 November 2024 at 07:30

MELBOURNE — Australia’s communications minister introduced a world-first law into Parliament on Thursday that would ban children under 16 from social media, saying online safety was one of parents’ toughest challenges.

Michelle Rowland said TikTok, Facebook, Snapchat, Reddit, X and Instagram were among the platforms that would face fines of up to 50 million Australian dollars ($33 million) for systemic failures to prevent young children from holding accounts.

“This bill seeks to set a new normative value in society that accessing social media is not the defining feature of growing up in Australia,” Rowland told Parliament.

“There is wide acknowledgement that something must be done in the immediate term to help prevent young teens and children from being exposed to streams of content unfiltered and infinite,” she added.

X owner Elon Musk warned that Australia intended to go further, posting on his platform: “Seems like a backdoor way to control access to the Internet by all Australians.”

The bill has wide political support. After it becomes law, the platforms would have one year to work out how to implement the age restriction.

“For too many young Australians, social media can be harmful,” Rowland said. “Almost two-thirds of 14- to 17-year-old Australians have viewed extremely harmful content online including drug abuse, suicide or self-harm as well as violent material. One quarter have been exposed to content promoting unsafe eating habits.”

Government research found that 95% of Australian caregivers find online safety to be one of their “toughest parenting challenges,” she said. Social media companies had a social responsibility and could do better in addressing harms on their platforms, she added.

“This is about protecting young people, not punishing or isolating them, and letting parents know that we’re in their corner when it comes to supporting their children’s health and wellbeing,” Rowland said.

Child welfare and internet experts have raised concerns about the ban, including isolating 14- and 15-year-olds from their already established online social networks.

Rowland said there would not be age restrictions placed on messaging services, online games or platforms that substantially support the health and education of users.

“We are not saying risks don’t exist on messaging apps or online gaming. While users can still be exposed to harmful content by other users, they do not face the same algorithmic curation of content and psychological manipulation to encourage near-endless engagement,” she said.

The government announced last week that a consortium led by British company Age Check Certification Scheme has been contracted to examine various technologies to estimate and verify ages.

In addition to removing children under 16 from social media, Australia is also looking for ways to prevent children under 18 from accessing online pornography, a government statement said.

Age Check Certification Scheme’s chief executive Tony Allen said Monday the technologies being considered included age estimation and age inference. Inference involves establishing a series of facts about individuals that point to them being at least a certain age.

Rowland said the platforms would also face fines of up to AU$50 million ($33 million) if they misused personal information of users gained for age-assurance purposes.

Information used for age assurances must be destroyed after serving that purpose unless the user consents to it being kept, she said.

Digital Industry Group Inc., an advocate for the digital industry in Australia, said with Parliament expected to vote on the bill next week, there might not be time for “meaningful consultation on the details of the globally unprecedented legislation.”

“Mainstream digital platforms have strict measures in place to keep young people safe, and a ban could push young people on to darker, less safe online spaces that don’t have safety guardrails,” DIGI managing director Sunita Bose said in a statement. “A blunt ban doesn’t encourage companies to continually improve safety because the focus is on keeping teenagers off the service, rather than keeping them safe when they’re on it.”
