Deep learning is behind machine learning’s most high-profile successes, such as advanced image recognition, the board game champion AlphaGo, and language models like GPT-3. But this incredible performance comes at a cost: training deep-learning models requires huge amounts of energy.
Now, new research shows how scientists who use cloud platforms to train deep-learning algorithms can dramatically reduce the energy they consume, and therefore the emissions this work generates. Simple changes to cloud settings are the key.
Since the first paper studying this technology’s impact on the environment was published three years ago, a movement has grown among researchers to self-report the energy consumed and emissions generated from their work. Having accurate numbers is an important step toward making changes, but actually gathering those numbers can be a challenge.
“You can’t improve what you can’t measure,” says Jesse Dodge, a research scientist at the Allen Institute for AI in Seattle. “The first step for us, if we want to make progress on reducing emissions, is we have to get a good measurement.”
To that end, the Allen Institute recently collaborated with Microsoft, the AI company Hugging Face, and three universities to create a tool that measures the electricity usage of any machine-learning program that runs on Azure, Microsoft’s cloud service. With it, Azure users building new models can view the total electricity consumed by graphics processing units (GPUs), computer chips specialized for running calculations in parallel, during every phase of their project, from selecting a model to training it and putting it to use. That makes Microsoft the first major cloud provider to give users access to information about the energy impact of their machine-learning programs.
While tools already exist that measure energy use and emissions from machine-learning algorithms running on local servers, those tools don’t work when researchers use cloud services provided by companies like Microsoft, Amazon, and Google. Those services don’t give users direct visibility into the GPU, CPU, and memory resources their activities consume—and the existing tools, like Carbontracker, Experiment Tracker, EnergyVis, and CodeCarbon, need those values in order to provide accurate estimates.
The new Azure tool, which debuted in October, currently reports energy use, not emissions. So Dodge and other researchers figured out how to map energy use to emissions, and they presented a companion paper on that work at FAccT, a major computer science conference, in late June. The researchers used a service called WattTime to estimate emissions based on the zip codes of cloud servers running 11 machine-learning models.
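As a rough illustration of what mapping energy use to emissions involves, the sketch below multiplies the energy used in each time interval by the grid's carbon intensity in that interval and adds up the results. It is a minimal example under stated assumptions, not the researchers' method: the function name, the hourly granularity, and every number are hypothetical, and no real WattTime or Azure API is called.

```python
def emissions_grams(energy_kwh, intensity_g_per_kwh):
    """Estimate emissions as the sum of energy * carbon intensity per interval.

    energy_kwh: energy used in each interval, in kWh (hypothetical readings).
    intensity_g_per_kwh: grid carbon intensity for the same intervals,
        in grams of CO2-equivalent per kWh (hypothetical values).
    """
    if len(energy_kwh) != len(intensity_g_per_kwh):
        raise ValueError("both series must cover the same intervals")
    return sum(e * c for e, c in zip(energy_kwh, intensity_g_per_kwh))


# Made-up hourly GPU energy readings and grid intensities for a three-hour job.
energy = [1.0, 2.0, 1.0]
intensity = [400.0, 300.0, 500.0]

total = emissions_grams(energy, intensity)
print(f"Estimated emissions: {total} g CO2e")
```

The same job would produce a different total on a different grid or at a different time, because only the intensity series changes.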
They found that emissions can be significantly reduced if researchers use servers in specific geographic locations and at certain times of day. Emissions from training small machine-learning models can be reduced by up to 80% if the training starts at times when more renewable electricity is available on the grid, while emissions from large models can be reduced by more than 20% if the training work is paused when renewable electricity is scarce and restarted when it’s more plentiful.
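The idea of choosing a start time can be sketched in a few lines. The example below assumes a job that draws constant power for a fixed number of hours and an hourly carbon-intensity forecast. Both are simplifying assumptions, and the function name and forecast values are hypothetical. It slides a window across the forecast and picks the start hour with the lowest total intensity.

```python
def best_start(intensity, duration):
    """Return (start_index, total_intensity) of the cleanest contiguous window.

    intensity: forecast carbon intensity per hour (hypothetical values).
    duration: number of consecutive hours the job needs.
    """
    if duration <= 0 or duration > len(intensity):
        raise ValueError("duration must fit inside the forecast")
    cost = sum(intensity[:duration])
    best_index, best_cost = 0, cost
    for start in range(1, len(intensity) - duration + 1):
        # Slide the window one hour: add the new hour, drop the old one.
        cost += intensity[start + duration - 1] - intensity[start - 1]
        if cost < best_cost:
            best_index, best_cost = start, cost
    return best_index, best_cost


# Made-up six-hour forecast in g CO2e per kWh; the job needs two hours.
forecast = [500, 450, 200, 150, 300, 480]
start, cost = best_start(forecast, 2)
print(f"Start at hour {start} (window total {cost})")
```

A short job benefits most from this kind of shift because it can fit entirely inside a clean stretch of the day, which is consistent with the larger savings reported for small models.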
Energy-conscious cloud users can lower their emissions by adjusting those factors through preference settings on the three biggest cloud services (Microsoft Azure, Google Cloud, and Amazon Web Services).
But Lynn Kaack, cofounder of Climate Change AI, an organization that studies the impact of machine learning on climate change, says cloud providers should pause and restart these projects automatically to optimize for lower emissions.
“You can schedule, of course, when to run the algorithm, but it’s a lot of manual work,” says Kaack. “You need policy incentives, probably, to really do this at scale.” She says policies like carbon pricing could incentivize cloud providers to build workflows that enable automatic pauses and restarts and allow users to opt in.
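To make the pause-and-restart idea concrete, here is a minimal sketch of how a scheduler might decide when a job runs. It assumes an hourly carbon-intensity forecast, a job that can be checkpointed and resumed at hour boundaries at no cost, and a fixed deadline equal to the length of the forecast. All of these are assumptions, and the function name and numbers are hypothetical rather than any cloud provider's actual feature.

```python
def pause_resume_schedule(intensity, hours_needed):
    """Pick the cleanest hours to run in; the job is paused in all others.

    intensity: forecast carbon intensity per hour up to the deadline
        (hypothetical values).
    hours_needed: total hours of compute the job requires.
    Returns the hour indices to run in, in chronological order.
    """
    if hours_needed < 0 or hours_needed > len(intensity):
        raise ValueError("job does not fit before the deadline")
    # Rank hours from cleanest to dirtiest, then keep the cleanest ones.
    ranked = sorted(range(len(intensity)), key=lambda i: intensity[i])
    return sorted(ranked[:hours_needed])


# Made-up six-hour forecast in g CO2e per kWh; the job needs three hours.
forecast = [300, 500, 150, 480, 200, 450]
run_hours = pause_resume_schedule(forecast, 3)
print(f"Run during hours {run_hours}; pause otherwise")
```

In this made-up case the job runs in hours 0, 2, and 4 and pauses in between. A real system would also have to account for the overhead of saving and restoring state, which this sketch ignores.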
A lot more work still needs to be done to make machine learning more environmentally friendly, especially while most countries are still dependent on fossil fuels. And Dodge says that Azure’s tool only measures the electricity consumed by GPUs. A more accurate calculation of machine learning’s energy consumption would include CPU and memory usage—not to mention the energy for building and cooling the physical servers.
And changing habits can take time. Only 13% of Azure users running machine-learning programs have looked at the energy measurement tool since it debuted in October, Dodge says. And Raghavendra Selvan, who helped create Carbontracker, says even he has trouble persuading researchers to use the tool in their machine-learning research.
“I don’t think I have been able to convince my own group,” Selvan says.
But he is optimistic. More researchers are getting into the habit of reporting energy use in their papers, encouraged by major conferences like NeurIPS that suggest it. Selvan says if more people start to factor in these energy and emissions costs when planning future projects, it could start to reduce machine learning’s impact on climate change.