AI’s carbon footprint can be managed, computer science professors say

Eleven months after ChatGPT’s release, Yale computer science professors discussed the carbon footprint associated with artificial intelligence and how the growing industry might better manage its energy use.

Hanwen Zhang 1:59 am, Nov 01, 2023

Staff Columnist

Ellie Park, Photography Editor

ChatGPT has acquired over 180.5 million users and set a record for the fastest-growing consumer application in history, all while stirring fears about job replacement and plagiarized essays since its release last November. As it nears the one-year anniversary of its launch, though, ChatGPT’s carbon footprint has drawn concern from computer scientists.

Studies project that large-language models like ChatGPT could consume between 85.4 and 134.0 TWh of electricity by 2027, the equivalent of Sweden or Argentina’s annual electricity use, so the growing energy demands of artificial intelligence could make it a sneaky electricity guzzler. Foretelling its future, however, is still complicated. The News reached out to Yale computer science professors, who acknowledged the concerns regarding AI’s high energy consumption but also pointed to ways it could attain greater electrical efficiency.

“The problem of AI … is a sub-specialization of just the general issue of how much carbon or energy computing requires,” computational biology professor Mark Gerstein told the News.

Gerstein explained that the problem of AI’s energy use is not entirely new. He added that large-scale data centers and cryptocurrency mining are other notorious consumers of energy, which makes AI’s computational efficiency concerns no different from those of its predecessors.

The heart of AI’s energy problem lies in its “huge” operational demands, according to Amin Karbasi, professor of electrical engineering and computer science. Karbasi explained that large-language models are most energy-intensive in their training phase, during which researchers input massive datasets to refine “hundreds of billions” of parameters. This training process — which ultimately allows models to predict word placement or develop sentences — can take weeks and requires thousands of graphics processing units. This makes for “staggering” figures of electricity consumption, Karbasi said.

Data center electricity use has accounted for one percent of global electricity use in recent years, according to a recent paper published in Cell. By the paper’s projections, AI could account for anywhere between 0.3 and 0.5 percent of the world’s electricity use four years from now.

By comparison, cryptocurrency mining — another energy-intensive process in which computers perform complex math problems to verify transactions — consumed an estimated 161 TWh of electricity last year, or 0.6 percent of the world’s electricity use.

Stephen Slade, professor of computer science, said that AI’s carbon footprint is not impossible to fix — or at least to reduce. Extrapolating from the current electricity usage of large-language models often does not consider the potential effects of scale or increased algorithmic efficiency, he explained. Advancing AI doesn’t always entail more GPUs.

“It’s one thing for the hardware to become more powerful,” Slade told the News. “But there’s a greater impact made in software if you can get algorithms that are more efficient.”

Increasing algorithmic efficiency has been the focus of Karbasi’s lab. Karbasi added that, in a collaboration with Google and Carnegie Mellon University, his team has developed “simpler” algorithms that ease some of the more taxing computational processes used by AI while attaining the same results. By streamlining the AI’s “self-attention unit,” the mechanism that allows large-language models to assess the relative importance and order of words in a sentence, his lab has lowered some computation demands by a factor of 50.

Karbasi said that smarter algorithms could eventually compress some AI onto local “edge devices” such as computers and smartwatches, many of whose applications must connect to outside servers. By migrating AI onto individual computer processing units, devices might decrease their reliance on cloud networks and reduce the strain on cloud servers.

According to Gerstein, training models with “imprecise calculations” or even embedding certain capabilities within devices could help increase AI’s energy efficiency. Gerstein added that most processes, such as facial recognition and email, currently require pushing data onto the cloud, which makes the processing less efficient.

Karbasi said he predicts that most AI models will likely be consolidated in the future, which would also drive down energy demands. AI models would be maintained by just a handful of large enterprises, and users could then fine-tune these pre-trained models with their own data. While Karbasi noted that “fine tuning can be very expensive,” he said it is much more efficient than training individual models from scratch.

In the meantime, Karbasi added that “smaller models can be extremely beneficial” for understanding larger ones at the scale of ChatGPT or Bard. By acting as guinea pigs for potential improvements that accelerate the training process, his lab’s models have helped Google experiment with new methods and fixes to increase efficiency, he said.

As their data centers grow, companies like Google and Amazon have also pursued whatever hardware and physical improvements they can, said Gerstein. He added that strategically positioning certain chips or placing data centers in cooler locations has helped mitigate some of the energy concerns, especially as computers have increased in complexity and power over the decades.

Earlier this year, Alphabet chairman John Hennessy said that an exchange with a large-language model cost the company 10 times more than a standard Google search.

GPT-4, OpenAI’s latest system, can score in the 90th percentile on the Uniform Bar Exam.

