Introduction

We often overlook the environmental impact of our computations. However, as our dependence on digital services grows, that impact becomes increasingly significant. It is typically measured as a carbon footprint: the total amount of greenhouse gases emitted directly and indirectly as a result of an activity, expressed in tonnes of CO2 equivalent (CO2e).

Advances in hardware technology have enabled us to train deep learning models with impressive accuracy. But this comes at a cost. Long training periods result in increased energy consumption, which leads to a higher environmental impact. For instance, training BERT_base, a language model with 110 million parameters, on 64 Tesla V100 GPUs takes approximately 79 hours and generates a carbon footprint of 1438 pounds of CO2e, roughly equivalent to one trans-American flight.

In 2020, the total greenhouse gas (GHG) emissions from the Information and Communication Technology (ICT) sector were estimated to be around 764 million tonnes of CO2e. This accounted for approximately 1.4% of global GHG emissions that year. With the continued expansion of the sector, these emissions are expected to rise, and the sector could either help or hinder global efforts to achieve net-zero carbon emissions. It is thus crucial to raise awareness among researchers and industry professionals about the carbon emissions generated by their computations and to explore ways to reduce them.

The carbon footprint of a computer depends on two main factors: its hardware specifications and the power it consumes. This makes calculating carbon emissions somewhat challenging, as it requires gathering and analyzing this data.
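
To make the calculation concrete, here is a minimal sketch of a common back-of-the-envelope estimate: energy consumed (power draw multiplied by runtime) multiplied by the carbon intensity of the electricity used. It is not the codegreen-core API. The function names, the per-device power draw, and the carbon intensity value are all hypothetical, chosen only for illustration; the carbon intensity factor in particular is an assumption added here and varies by region and time of day.

```python
# Illustrative estimate only; not part of codegreen-core.
# All names and numbers below are hypothetical assumptions.

def estimate_energy_kwh(power_watts: float, hours: float, device_count: int = 1) -> float:
    """Energy in kWh for `device_count` devices drawing `power_watts` each for `hours`."""
    return power_watts * device_count * hours / 1000.0


def estimate_emissions_kg(energy_kwh: float, carbon_intensity_g_per_kwh: float) -> float:
    """Emissions in kg CO2e, given energy in kWh and grid carbon intensity in g CO2e/kWh."""
    return energy_kwh * carbon_intensity_g_per_kwh / 1000.0


if __name__ == "__main__":
    # Assumed scenario: 4 devices at 250 W each running for 10 hours,
    # on a grid with an assumed intensity of 400 g CO2e per kWh.
    energy = estimate_energy_kwh(power_watts=250, hours=10, device_count=4)
    emissions = estimate_emissions_kg(energy, carbon_intensity_g_per_kwh=400)
    print(f"Energy: {energy:.1f} kWh, emissions: {emissions:.1f} kg CO2e")
```

In practice the power draw is rarely constant, so an estimate like this depends on measured or sampled power data rather than a single nominal figure, which is part of what makes the calculation challenging.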

The codegreen-core package is a comprehensive tool that enables users to calculate the carbon emissions of a computational task and provides strategies for minimizing those emissions.