Date: 2023-09-11

ChatGPT has made a splash across industries due to its ability to create humanlike, conversational dialogue.

But to produce the desired output, the LLMs behind generative AI applications require a tremendous amount of energy to train, develop and expand, which can have serious adverse effects on the environment. Explore where LLMs consume the most energy and methods to begin reducing their energy consumption and environmental impact.

The problem with LLMs

The environmental problems with LLMs stem from the "large" in their name. The amount of power consumed by the current generation of LLMs scales with the size of the data sets they are trained on and with the size of the models themselves. An LLM's size can be characterized in part by the number of parameters used in its inference operations. More parameters mean more data to move around and more computations to make use of that data.

Today's LLMs have orders of magnitude more parameters than earlier models. For example, Google's Bidirectional Encoder Representations from Transformers (BERT) model, which achieved state-of-the-art performance when it was released in 2018, had 340 million parameters in its largest configuration. In contrast, GPT-3, the model family on which the LLM behind ChatGPT is based, has 175 billion.
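To make the gap in scale concrete, the short Python sketch below compares the two parameter counts cited above and estimates how much memory the weights alone would occupy. The 2-bytes-per-parameter figure (16-bit floating point) is an assumption for illustration, not something stated in this article, and the function names are hypothetical.

```python
# Parameter counts as cited in the text above.
BERT_PARAMS = 340_000_000
LARGE_MODEL_PARAMS = 175_000_000_000

# Assumption (not from the article): each parameter is stored as a
# 16-bit float, i.e. 2 bytes. Other precisions change the result.
BYTES_PER_PARAM = 2


def weight_footprint_gb(num_params: int, bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """Return the size of the model weights in gigabytes (10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


def scale_ratio(larger: int, smaller: int) -> float:
    """Return how many times more parameters the larger model has."""
    return larger / smaller


if __name__ == "__main__":
    print(f"BERT weights: {weight_footprint_gb(BERT_PARAMS):.2f} GB")
    print(f"175B-parameter weights: {weight_footprint_gb(LARGE_MODEL_PARAMS):.0f} GB")
    print(f"Parameter ratio: {scale_ratio(LARGE_MODEL_PARAMS, BERT_PARAMS):.0f}x")
```

Under that assumption, the weights grow from about 0.68 GB to about 350 GB, a roughly 515-fold increase in the data that has to be stored and moved for every pass through the model.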

Paralleling parameter counts, the power necessary to train some LLMs has jumped by four to six orders of magnitude. Power consumption has become a significant consideration when deciding how much training to perform -- along with cost, as some LLMs cost millions of dollars to train.

Training cycles consume the full attention of energy-hungry GPUs and CPUs. Extensive computational loads, plus storing and moving massive amounts of data, contribute to large electrical draw and significant heat output.

Heat load, in turn, means that more power goes toward cooling. Some data centers use water-based liquid cooling. But this method raises water temperatures, which can have adverse impacts on local ecosystems. Moreover, some water-based methods pollute the water used.

In comparison to training, the power consumed by an individual inference for a deployed model can seem minuscule. But that comparatively tiny amount must be multiplied by the number of inferences run when using that model in production.
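The multiplication described above can be sketched in a few lines of Python. Every number in the example is a hypothetical placeholder chosen to keep the arithmetic easy to follow, not a measurement of any real model, and the function names are illustrative.

```python
def cumulative_inference_energy_kwh(energy_per_inference_wh: float,
                                    num_inferences: int) -> float:
    """Total inference energy in kWh for a given number of inferences."""
    return energy_per_inference_wh * num_inferences / 1000.0


def inferences_to_match_training(training_energy_kwh: float,
                                 energy_per_inference_wh: float) -> float:
    """Number of inferences whose combined energy equals the training energy."""
    return training_energy_kwh * 1000.0 / energy_per_inference_wh


if __name__ == "__main__":
    # Hypothetical placeholder values, NOT measurements of a real system.
    training_energy_kwh = 1_000_000.0   # assumed one-time training cost
    energy_per_inference_wh = 1.0       # assumed cost of a single inference
    inferences_per_day = 10_000_000     # assumed production traffic

    daily_kwh = cumulative_inference_energy_kwh(energy_per_inference_wh,
                                                inferences_per_day)
    break_even = inferences_to_match_training(training_energy_kwh,
                                              energy_per_inference_wh)

    print(f"Inference energy per day: {daily_kwh:,.0f} kWh")
    print(f"Inferences to match training energy: {break_even:,.0f}")
    print(f"Days to match training energy: {break_even / inferences_per_day:,.0f}")
```

With these placeholder inputs, inference would draw 10,000 kWh per day and would match the assumed training energy after 100 days. Real figures vary widely by model, hardware and traffic, so the sketch only shows how a small per-inference cost accumulates.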

In addition, many deployed models can be used for only a short time -- weeks or months -- before they need to be retrained. Addressing the problem of model drift requires repeating steps from the original training process and consuming a similar amount of power each time.
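As a rough illustration of how retraining compounds the original cost, the Python sketch below estimates yearly retraining energy from a retraining interval. The training-energy value is a hypothetical placeholder, and `retrain_fraction` is an assumed knob: a value of 1.0 reflects the statement above that each retraining consumes a similar amount of power to the original training.

```python
WEEKS_PER_YEAR = 52


def retrains_per_year(weeks_between_retrains: float) -> float:
    """How many retraining runs fit in a year at the given interval."""
    if weeks_between_retrains <= 0:
        raise ValueError("weeks_between_retrains must be positive")
    return WEEKS_PER_YEAR / weeks_between_retrains


def annual_retraining_energy_kwh(training_energy_kwh: float,
                                 weeks_between_retrains: float,
                                 retrain_fraction: float = 1.0) -> float:
    """Yearly energy spent on retraining, in kWh.

    retrain_fraction is the share of the original training energy that one
    retraining run consumes (assumed; 1.0 means "a similar amount").
    """
    per_run_kwh = training_energy_kwh * retrain_fraction
    return per_run_kwh * retrains_per_year(weeks_between_retrains)


if __name__ == "__main__":
    # Hypothetical placeholder, NOT a measurement of a real model.
    training_energy_kwh = 1_000_000.0

    for weeks in (4, 13, 26):
        yearly = annual_retraining_energy_kwh(training_energy_kwh, weeks)
        print(f"Retrain every {weeks:>2} weeks: {yearly:,.0f} kWh per year")
```

Under these assumptions, a model retrained every 13 weeks would spend four times its original training energy on retraining each year, and shorter intervals push that multiple higher.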