
Carbon Emissions of an ML Engineering Team

The hidden costs of development that really matter

Everybody is aware of the climate crisis caused by global warming as a result of human activities. To prevent its catastrophic consequences [1], the world needs to drastically reduce its greenhouse gas emissions, with many countries setting a target of net zero emissions by 2050.

The technology boom of AI in recent years has also raised concerns about its environmental cost. If we only look at its direct contributions, these come through the electricity used to train and power models. For example, training GPT-3 with its 175 billion parameters generated a whopping 502 tonnes of carbon dioxide equivalent (tCO2e) [2]. The new kid on the block, Llama 2, emitted a similar 539 tCO2e while training its family of four models [3]. For context, each of these is equivalent to the emissions of a passenger taking a one-way flight from New York to San Francisco 500 times.

I work in a machine learning engineering team, and these questions kept nudging me: how much carbon do we emit through our electricity consumption, and are there ways to reduce it? Thus began our first attempt at carbon accounting.

Methods

There is no single, direct way to measure our electricity consumption, and consequently our carbon impact, because of the variety of platforms and services we use. I will not dive deep into the technical implementation, but at a high level it consists of three methods.

  1. Provided: The exact carbon emission figures are already computed for us; these were given by our cloud service provider (CSP).
  2. Tools: We used software tools like powermetrics, nvidia-smi, and turbostat to measure the power draw (in watts) of the CPU and GPU compute on our laptops and on-premise server.
  3. Self-Calculated: When the above is not possible, we calculate using proxy methods. This consists of recording the duration of the compute, estimating the percentage utilisation of the chip(s), and finding the thermal design power (TDP) of each chip type to derive the power consumed. The rest of the platforms are calculated this way.

For the latter two methods, power is then converted into energy (kWh), and where available, the Power Usage Effectiveness (PUE) of the supporting data centre is used to obtain a more accurate energy consumption. The Grid Emission Factor (in kgCO2e/kWh) of the country or region is then finally used to compute the greenhouse gas emissions.
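To make the self-calculated method concrete, below is a minimal sketch in Python. All numbers in the example are illustrative assumptions, not our actual figures.

```python
# Minimal sketch of the "self-calculated" method: power -> energy -> emissions.
def emissions_kgco2e(
    tdp_watts: float,          # thermal design power of the chip
    utilisation: float,        # estimated average utilisation, 0.0 to 1.0
    hours: float,              # duration of the compute
    pue: float = 1.2,          # Power Usage Effectiveness of the data centre (assumed)
    grid_factor: float = 0.4,  # grid emission factor in kgCO2e/kWh (assumed)
) -> float:
    """Convert chip power draw into greenhouse gas emissions."""
    energy_kwh = tdp_watts * utilisation * hours / 1000  # watts -> kWh
    return energy_kwh * pue * grid_factor


# e.g. a 300 W GPU at 80% utilisation over a 48-hour training run
print(f"{emissions_kgco2e(300, 0.8, 48):.1f} kgCO2e")
```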

Results & Thoughts

The results are displayed in the pie chart below.

The ranking of the platforms by carbon usage was not particularly surprising, but the percentages were. I did not expect our development laptops and CICD service, despite very heavy usage, to produce only a minuscule amount of carbon. At the other extreme, I also did not expect our on-premise server for development and model training to burn three times more carbon than our cloud usage.

In hindsight, we recently switched our laptops to the latest Apple Silicon M2 chip, which is well known to be highly efficient. Our CICD platform, while clocking thousands of minutes of pipeline runtime, uses the lowest-compute chips and is essentially serverless, running only when necessary.

For our on-premise server, we discovered that idle Nvidia GPU chips still consume significant electricity, bloating our electricity consumption. We will need to investigate whether there were any misconfigurations and, if not, whether there is a way to manage them better.
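For anyone wanting to check their own GPUs, here is a quick sketch using the pynvml bindings, assuming NVIDIA drivers and the nvidia-ml-py package are installed.

```python
# Print the current power draw of each GPU, e.g. to spot idle cards
# that still burn significant watts. Requires `pip install nvidia-ml-py`.
import pynvml

pynvml.nvmlInit()
for i in range(pynvml.nvmlDeviceGetCount()):
    handle = pynvml.nvmlDeviceGetHandleByIndex(i)
    name = pynvml.nvmlDeviceGetName(handle)
    if isinstance(name, bytes):  # older bindings return bytes
        name = name.decode()
    power_w = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000  # mW -> W
    print(f"GPU {i} ({name}): {power_w:.1f} W")
pynvml.nvmlShutdown()
```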

Green Computing

Now that we have a better awareness of our carbon usage, how can we truly transform our development team to adopt more green solutions?

The term green computing has been around for a while, and it has been organised and classified in different ways, but I feel the six broad themes below will help my team manage the green transition with better clarity.

1. Green AI

This refers to finding ways to train and infer models more efficiently with little or acceptable loss in quality. In practice, this translates to faster training and inference times and smaller model sizes, using less compute power. This matters because ever more complex neural networks demand larger datasets and increasingly advanced, prohibitively expensive, energy-hungry GPU chips.

Luckily, this has also been a hotbed for optimisation research. Over the past few years, I have heard from my data scientist colleagues about more efficient architectures in each domain, transfer learning, compression techniques like quantisation and knowledge distillation, ONNX, and, in today’s era of large language models, DeepSpeed, PEFT and others. No doubt we will need to keep up with the latest implementations from the open-source world, as their benefits have proven significant.
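As an illustration of one such technique, below is a minimal sketch of post-training dynamic quantisation in PyTorch; the toy model is just a stand-in. Weights are stored in int8 and dequantised on the fly, shrinking the model and typically speeding up CPU inference with little quality loss.

```python
import torch
import torch.nn as nn

# A toy model standing in for a real network.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))

# Quantise only the Linear layers to int8 weights.
quantised = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 512)
print(quantised(x).shape)  # same interface, smaller and cheaper model
```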

2. Green Apps

Models are useless without the code built around them to process the data, train the model, and ultimately serve it. This requires a fundamental understanding of time and space complexity, the algorithms to implement, and the various pre-built functions that apply them. Profilers should also be used to find bottlenecks in latency and memory.
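As a starting point, here is a minimal sketch using Python’s built-in profiler; slow_pipeline is a hypothetical stand-in for your actual workload.

```python
import cProfile
import pstats


def slow_pipeline():
    # Stand-in for a real data-processing or training step.
    total = 0
    for i in range(1_000_000):
        total += i * i
    return total


profiler = cProfile.Profile()
profiler.enable()
slow_pipeline()
profiler.disable()

# Show the ten most expensive calls by cumulative time.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)
```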

Another important software engineering skill for building green apps is understanding how tasks and processes are managed, executed, and coordinated. This requires a firm grasp of concepts like parallelism, concurrency, asynchrony, multiprocessing and threading, queuing, and I/O-bound versus CPU-bound tasks.
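As a toy illustration of matching the concurrency model to the task type, the asyncio sketch below overlaps ten simulated I/O-bound calls that a sequential loop would serialise.

```python
import asyncio


async def fetch(i: int) -> int:
    await asyncio.sleep(1)  # stand-in for a network or disk call
    return i


async def main() -> None:
    # All ten "requests" wait concurrently: ~1 s total instead of ~10 s.
    results = await asyncio.gather(*(fetch(i) for i in range(10)))
    print(results)


asyncio.run(main())
```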

To take it a step further, the programming language used also comes into play. Python has emerged as one of the top languages in data science and general programming due to its widespread support and ease of use. However, as an interpreted language, it is significantly inferior to compiled cousins like Go in terms of energy and speed (roughly 20x) [4]. It is thus worth investing the effort to learn a second, compiled language for jobs that require heavy processing.

3. Green Servers

To train and serve ML applications, compute power is required. This is provided by servers hosted either on-premise or in the cloud. If possible, going cloud is the best way to stay green, since CSPs are incentivised to run their data centres efficiently and you have the flexibility to switch resources based on project demands. Either way, we should ensure two key factors: choose the right hardware for the task, and use the compute only when required.

Major CSPs all have a diverse selection of servers to choose from. For example, AWS has seven instance families, each with a range of chips, memory and other specifications, enough to cater for scenarios like GPU-, CPU- or memory-intensive processes, on either ARM or x86 architectures. We should choose the instances that best match our use case so that compute is allocated efficiently.

How do we compute only when required? For a start, stock-take your resources and turn off any that are not in use; you will be surprised how many idle services are left over from legacy projects. In terms of architecture design, we can use serverless compute like AWS Lambda, which consumes resources only when there is traffic, or provision a basic long-lived compute with horizontal scaling that responds automatically to increased load.
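As a starting point for the stock-take, here is a minimal sketch using boto3 to list running EC2 instances, assuming AWS credentials are configured.

```python
import boto3

ec2 = boto3.client("ec2")
response = ec2.describe_instances(
    Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
)

# List each running instance so idle leftovers can be reviewed and shut down.
for reservation in response["Reservations"]:
    for instance in reservation["Instances"]:
        name = next(
            (t["Value"] for t in instance.get("Tags", []) if t["Key"] == "Name"),
            "<unnamed>",
        )
        print(instance["InstanceId"], instance["InstanceType"],
              instance["LaunchTime"], name)
```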

4. Green Storage

Storage comes in many forms, e.g. object, block and file storage, container registries and databases. We can follow two general guidelines to manage storage efficiently: reduce the storage size and choose the right storage type.

The storage size of data can be reduced simply by compression, with common options like gzip, or tar.gz for archival, which can halve the size. A more efficient data format can be an even better alternative: a columnar format like Parquet not only occupies less space (>50% savings) but also enables much faster queries (around 30x) thanks to its columnar structure.
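Here is a minimal sketch of the Parquet conversion with pandas; events.csv is a hypothetical input, and a Parquet engine like pyarrow is assumed to be installed.

```python
import os

import pandas as pd

# Convert a CSV dataset to compressed, columnar Parquet.
df = pd.read_csv("events.csv")  # hypothetical input file
df.to_parquet("events.parquet", compression="snappy")

csv_mb = os.path.getsize("events.csv") / 1e6
parquet_mb = os.path.getsize("events.parquet") / 1e6
print(f"CSV: {csv_mb:.1f} MB -> Parquet: {parquet_mb:.1f} MB")
```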

Taking object storage as an example, there are storage classes that use less energy. In AWS S3, we can keep less important data in one availability zone rather than replicating it across zones. For infrequently accessed, long-term storage, we can place data in "cold storage" (S3 Glacier), which uses tape drives that consume far less energy than SSDs and HDDs. Lifecycle policies can also automate transitions between storage classes, or even delete the data when a project reaches its conclusion.
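A minimal sketch of such a lifecycle policy with boto3 follows; the bucket name, prefix and day counts are hypothetical.

```python
import boto3

s3 = boto3.client("s3")

# Move objects under experiments/ to Glacier after 90 days,
# then delete them after a year.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-ml-artifacts",  # hypothetical bucket
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-then-expire",
                "Filter": {"Prefix": "experiments/"},
                "Status": "Enabled",
                "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
                "Expiration": {"Days": 365},
            }
        ]
    },
)
```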

5. Green Transfers

Data needs to be transmitted back and forth between servers, storage and other devices. Network communication requires energy too, supporting a complex weave of networking devices, data centres, transmission infrastructure and end-user devices. Developers like us can reduce our carbon impact by using efficient transfer protocols and reducing the frequency and distance of transfers.

Where possible, the HTTP/2 transfer protocol with the gRPC framework should be considered, since it transmits a more compact binary format rather than the traditional text (JSON) payloads of HTTP/1.1. With that comes lower latency and less energy used.
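As a toy illustration (not gRPC itself), packing the same record as JSON versus a fixed binary layout shows why binary framing is more compact.

```python
import json
import struct

record = {"user_id": 123456, "score": 0.9731, "active": True}

# Text payload, as HTTP/1.1 APIs commonly send it.
json_bytes = json.dumps(record).encode()

# The same fields in a fixed binary layout: uint64 + float64 + bool.
binary_bytes = struct.pack(
    "<Qd?", record["user_id"], record["score"], record["active"]
)

print(len(json_bytes), "bytes as JSON")   # ~50 bytes
print(len(binary_bytes), "bytes packed")  # 17 bytes
```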

Bringing data closer to where it is used, and scheduling transfers, can also decrease the energy required. For example, dependencies required to run our automated test cases can be cached and rebuilt only when changes are detected. Images do not need to be pulled from Docker Hub every time; we can store them in our CSP’s registry and update them periodically when new patches are available.

6. Green Templates

This refers to the reusability and repeatability of efficient code, infrastructure and processes. In essence, it is an indirect way of reducing electricity consumption, since the actual implementations come from the previous five themes. However, I consider it the most important theme, as it is the sum of the team’s knowledge.

It can come in the form of documentation or playbooks that set the standards for how the team functions and how projects are executed, or cookie-cutter templates for repositories, CICD pipelines, infrastructure setups (e.g., Terraform) and configurations (e.g., Ansible).

Within each of the six themes, I have given some examples, but they are only the tip of the iceberg. The recommendations within each theme are numerous and can be daunting. However, a progressive transition is possible by placing each recommendation in a decision quadrant, estimating how difficult it is to implement within our existing workflows and whether its impact is significant and synergistic. This provides guidance on which ones to prioritise.

Design Principles

This transformation is neither straightforward nor easy. Even passionate tree-hugging developers like us still need to prioritise business needs. One way to deal with this is to stop thinking of sustainability and carbon efficiency in ML development as antagonistic or mutually exclusive to other needs. This ensures you stay aligned with business goals and win easier buy-in from management, who are always under delivery pressure.

We can visualise this as a Venn diagram using AWS’s six design pillars of its Well-Architected Framework, where functional or business needs overlap with sustainability.

In fact, if you think about it, the impacts are more often than not synergistic. Let’s take some examples:

  • Compressing your data storage can halve its size, saving cost and bandwidth while using less energy to store and transfer it
  • Quantising neural network models yields faster inference and hence consumes less energy
  • Removing unused dependencies from your Docker images improves security by reducing the potentially exploitable surface area, speeds up deployment thanks to the smaller size, and reduces the energy needed to store and transfer them to and from your registry.

Conclusion

All in all, this was a good kickstart to our journey of reducing the engineering team’s carbon impact. There is a lot more work ahead to identify, quantify, and standardise the recommendations under each theme of green computing, and to educate the team on them.

I would like to hear about your journey in measuring and reducing your carbon footprint in your development team. Please share in the comments below!


Acknowledgements: This carbon accounting attempt is a personal project done together with my peers, Yang Kewen and Chong-Yaw Wee.

Disclaimer: Opinions and recommendations expressed here are done in the author’s personal capacity.

References

