The big picture around data center environmental impacts

The environmental impact of ever larger data centers is currently a topic of heated discussion. In this post we take a step back and survey the overall landscape of environmental reporting on digital economy footprints, sketching the big picture of what is being calculated and reported and some of the challenges associated with this task.

Concern about the environmental footprint of data center infrastructure is relatively new.1 In the early stages of the sustainability movement the environmental impact of digital infrastructure deployed in data centers was dwarfed by that of other industrial sectors (e.g., transport, manufacturing, agriculture etc.). Hence even as milestones such as the Paris Agreement (2015) accelerated worldwide efforts towards a sustainable economy, the Information and Communication Technology (ICT) sector received comparatively little attention.

An additional explanatory factor (besides relative novelty) is that, at least as far as GHG emissions are concerned, the ICT sector does not belong to the so-called hard-to-abate sectors. Utilizing renewable electricity sources can significantly reduce its environmental footprint (with caveats around the availability of grid capacity, storage technologies etc.). Finally, another unusual aspect of this domain is the expectation that digital services will exert a benign influence on the footprints of other sectors (via better data collection, automation etc.).

Yet following rapid (even explosive) growth in recent years, the study and disclosure of the environmental impact of digital services is acquiring more urgency. As a result, both the current and future impacts of data centers (and digital services more generally) are now the subject of a growing body of research and reporting.2 3 4 5

In a recent report5, the capital expenditure (CapEx) of just five digital-technology related6 companies is seen to be larger than the global investment in oil and natural gas production. The global electricity demand of data centers was 205 TWh in 2018, about 1% of total global electricity demand. It reached 485 TWh in 2025 and is projected to reach 950 TWh in 2030, accounting for around 3% of global electricity demand. Electricity consumption from so-called “AI” data centers is projected to triple in the same period. In parallel with energy consumption, e-waste is considered the fastest-growing waste stream in the world, and the rapid technology refresh rate in data centers, driven by increasing demands, contributes to this challenge. 53.6 million metric tons of e-waste were generated in 2019, only 17.4 percent of which was formally collected and recycled.7
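
As a plausibility check on these figures, the implied growth rates are easy to compute. A minimal sketch in Python, using only the TWh values quoted above:

```python
# Implied compound annual growth rates (CAGR) from the data center
# electricity demand figures quoted above (TWh); a back-of-the-envelope check.

def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values."""
    return (end / start) ** (1 / years) - 1

demand_twh = {2018: 205.0, 2025: 485.0, 2030: 950.0}

print(f"2018-2025: {cagr(demand_twh[2018], demand_twh[2025], 7):.1%}/yr")  # ~13.1%
print(f"2025-2030: {cagr(demand_twh[2025], demand_twh[2030], 5):.1%}/yr")  # ~14.4%
```

In other words, the projection assumes roughly sustained mid-teens annual growth through 2030.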

Other environmental impacts, such as water usage in water-stressed areas, are also coming into focus. In the US, direct water usage by data centers more than doubled in the decade 2014-2024, and is expected to multiply further going forward.3 Indirect water usage, required for the production of the electricity used in data centers, is not generally reported but could be significantly higher.8

In this post we take a step back and review key concepts used in establishing the environmental impact of data centers and some challenges associated with this task. We start with some terminology. As might be expected in a fast-changing domain, definitions of various key terms are not particularly sharp. This can stunt the development of meaningful practices and over time erode trust.

The Data Center Terminology Mess

A solid departure point for the terminology discussion is the definition of a Data Center. There is no formal definition (which is part of the problem) but loosely speaking it is an identifiable physical facility, typically one or more special-purpose buildings, or a designated space within a general-purpose building. The facility is used almost exclusively to house a variety of electronic equipment. Such equipment comprises computing servers that perform calculations, data storage systems that store information in digital form, and communication networks that connect storage and computation, internally within the data center and, importantly, to the outside world. These components are collectively termed IT Resources. They are the components directly involved in enabling the digital services function of the data center.

Data Center Photo By Tedder, CC BY-SA 4.0
For the avoidance of doubt about what a data center looks like, an aerial view of a large modern data center. These facilities are designed with the explicit purpose of hosting a variety of IT equipment (servers, data storage, networks). In addition, a variety of non-IT equipment (e.g., cooling) and connections to industrial-scale electricity and water supply are essential components which differentiate data center buildings from, e.g., generic commercial real estate. (Credit: Wikipedia Contributors)

IT equipment cannot function without a range of critical supporting elements. In data centers these include heating, ventilation and air-conditioning (HVAC) equipment, uninterruptible power supplies (UPS), building lighting etc., which can account for a significant share of electricity demand.9 These are collectively termed Non-IT Resources. A data center is thus a well-defined, dedicated space that brings together these two classes of elements. In addition it involves industrial-scale connections to utilities, most prominently for our purposes, electricity and water provision.

Cenk Endustri, CC BY-SA 3.0, via Wikimedia Commons
Industrial-scale cooling equipment. While computing might be perceived as abstract and ethereal, it has an unavoidable energy basis. While the energy consumption per computation has decreased systematically over the decades (Koomey's law), the amount of computation has been increasing. Energy consumption invariably requires heat dissipation (otherwise the computer melts). Data center cooling facilities are the supersized analog of the fan keeping our laptop functioning. An important technological bifurcation that became prominent with ever more intense energy consumption is the employment of liquid instead of air cooling technologies. (Credit: Wikipedia Contributors)

Despite what the name might suggest in a literal interpretation, data centers are not just Data Storage Centers. The essential purpose of data centers is the provision of a variable menu of Digital Services, which is a very dynamic and fast-changing set. Fundamentally it means using IT resources to provide digital information storage, but also to process and distribute information to outside users. Importantly, end-users of said digital services can be located elsewhere and practically anywhere. There is no limit to how far away end-users can be located, except for network latency, the time it takes for information to travel back and forth over networks. Latency does restrict the types of digital services that are viable for any given technology mix. The network and computational capability elements are thus an essential part of the data center value proposition.
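
To see how latency bounds the viable service radius, consider signal propagation alone. A minimal sketch, assuming signals travel at roughly two-thirds the speed of light in optical fiber and ignoring all switching and processing delays:

```python
# Idealized round-trip time over optical fiber as a function of distance.
# Assumption: signal speed ~ 2/3 of c in fiber; real routes add switching,
# queuing and processing delays on top of this lower bound.

KM_PER_MS = 300_000 * (2 / 3) / 1000  # ~200 km of fiber per millisecond

def round_trip_ms(one_way_km: float) -> float:
    """Lower bound on round-trip time for a given one-way distance."""
    return 2 * one_way_km / KM_PER_MS

for km in (100, 1_000, 10_000):
    print(f"{km:>6} km -> {round_trip_ms(km):6.1f} ms round trip")
# 100 km -> 1 ms; 1,000 km -> 10 ms; 10,000 km -> 100 ms.
```

Interactive services targeting total response times below roughly 100 ms are thus physically constrained in how far their data center can sit from end-users, which is one driver of the edge designs discussed further below.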

Data centers are nowadays evolving to be the main production units (factories) of the digital services available in a digitized economy. A more appropriate name for them might thus be digital service centers. But we should keep in mind that any digital service provided by a data center always has an end-user requiring additional, non-data-center infrastructure which may have a significant footprint.

Despite data centers having more or less uniform physical functions, there is no formal definition and classification of their economic role. Terms such as hyperscale, cloud (public, private and hybrid), internal, on-premise, enterprise, edge, hosting or colocation data centers are used widely and variably without clear definitions. There is also confusion between a hyperscale data center and a hyperscaler corporate entity. The result is that even simply counting how many data centers operate in a certain area is an ill-defined task. The quote below from a World Bank report captures this challenge.10

"There is little agreement on the number of data centers in the world. One reason is the variety ranging from small company owned facilities to gigantic hyperscale data centers. In respect to the ICT sector, boundaries are critical since only data centers owned and operated by companies formally part of the ICT sector should be included"

The proper accounting and attribution of environmental impacts from data center operations is not just a technical exercise in energy and water engineering. It must cope with diverse business models, regulatory obligations, substantial differences in technologies used, the diversity of environmental impacts and the strong dependence of these impacts on location. Minimally, alignment with established sustainability reporting frameworks for other business sectors will help enable like-for-like evaluation and prioritisation. In this light, proper classification of data centers becomes less of an academic exercise and more of a prerequisite for meaningful attribution of environmental impacts from digital services.

Let us review and try to streamline some of these characterisations. The “right” classification obviously depends on the use one has in mind. For our purposes this would be the logical and transparent attribution of environmental impacts to the entities responsible. The starting point for this task is the diagram below.

Data Services
The raison d'être of data centers is to provide digital services, either to the owner of the facility (internal or corporate use) and/or to other, external clients. Services are provided by aggregating within a certain perimeter (lower block) a variety of IT equipment. The count and power intensity of the IT equipment providing the digital services determines its energy use and other environmental impacts (indicated by servers of different colors, some grey, some green, some red hot!). Importantly, the lifecycle of digital services is not complete without considering a range of elements outside the perimeter of the data center, both "upstream", enabling the data center operations, and "downstream", enabling end-user utility.

The business sectors of the data center owners and, respectively, the data center users, along with the type of digital services being provided, define the business model and how these relations are embedded in an economic network. Having a faithful map of these relations is the starting point of proper attribution. The challenges start creeping in because the same physical pattern (a concentration of IT resources used by a range of end-users) can manifest in the economic realm in various ways:

  • multiple corporate entities, potentially in different business sectors and providing different types of digital services, can rent data center space from the same data center owner.
  • there can be commingling of diverse business models by large conglomerate entities.
  • even the definition of end-users can be ambiguous in an economic sense, e.g., the paying clients of data centers can be different from the actual users of services, as is the case, e.g., with social media.

Taking a step back from modern-day complexity, it is instructive to start the discussion of data center classification with the earliest such example, the so-called high performance computing (HPC) centers, which originated in a less complex era.

High Performance Computing Centers

HPC data centers (also called Supercomputing Centers) are rather unique facilities, operated and used primarily by government entities. Internally these facilities have always used the latest, highest-end computational and networking technologies to tackle exceptionally demanding computational challenges.

IBM Blue Gene Supercomputer
Depicted is the Blue Gene/P supercomputer "Intrepid" at Argonne National Laboratory (2007), featuring 164,000 processors grouped in 40 racks/cabinets connected by high-speed networks. While traditional corporate data centers must balance costs against economic benefit and are based, as much as possible, on off-the-shelf computer equipment (standard CPUs and networks), the computational needs of HPC made use from early on of advanced forms of vectorized and/or parallel computing, which also requires ultra-fast network connections, more intense power and cooling requirements etc. In this sense HPC Centers are the forerunners of the current "AI" data centers. (Credit: Argonne National Laboratory / Wikipedia)

By and large HPC centers are owned and operated by the public sector, hence attribution of environmental impact is in principle straightforward. While there is only a relatively small number of public-sector-run HPC centers per country, modern so-called accelerated data centers (“AI factories”) are similar to HPC centers in terms of their power intensity and may exceed them in size. Given that HPC centers operate under different incentives and disclosure regimes, they can in principle contribute important insights into advanced data center environmental impacts, see e.g.11. Yet at present none of the world’s Top 25 HPC systems (Top 500 list) reports its carbon footprint.12

Enterprise Data Centers

For a long time private sector (commercial) data centers were mainly of the enterprise (also called internal) type. Such internal data centers come in a very wide range of sizes, from simple server closets, to server rooms, localized data centers, mid/high-end tier data centers, and now also “hyperscale” data centers. The objective of such facilities is to serve the business needs of a single corporate entity. These can be internal needs (for the use of company employees) but also in support of company clients and other external users. To the extent that traditional business models undergo so-called digital transformation, corporate entities increasingly need to own and operate enterprise data centers.

In practice the data centers belonging to a corporate entity operating in almost any sector will be classified as enterprise data centers. The sole important exception, which we discuss separately below under Cloud Data Centers, is the provision of generic digital services as a business model. If we take the revised NACE 2.1 Classification as an illustrative guide to how entities are classified economically, this would be Group 63.1, two levels deep within the overall ICT sector Section K.

  • NACE 2.1 Section K Telecommunication, Computer Programming, Consulting, Computing Infrastructure And Other Information Service Activities
  • NACE 2.1 Division 63 Computing Infrastructure, Data Processing, Hosting And Other Information Service Activities
  • NACE 2.1 Group 63.1 Computing Infrastructure, Data Processing, Hosting And Related Activities

The size and intensity of enterprise data center operations reflect the organization’s own size and business model needs. The IT equipment processes corporate-owned data and runs company-owned applications with which external users may interact.

Data Center Interior
Historically enterprise data centers comprised aggregations of large numbers of server computers, each with one or more central processing units (CPUs), arranged vertically in racks. A large enterprise data center might host between 2,000 and 5,000 servers, its surface area might be between 20,000 and 100,000 square feet, and its energy draw around 100 MW (Source: IBM). There is no a priori restriction on how extensive these data centers might be or what power intensity profile they might have. (Credit: Wikipedia)

Colocation Data Centers

A colocation data center is a facility owned and operated by a real estate company that rents out physical data center space to third parties (e.g. corporations) to place their servers and other network equipment. These third parties owning IT equipment within such a data center are said to be colocated within the data center. This is useful to organizations that may not have the resources needed to maintain their own enterprise data center or simply find it more cost-efficient to do so externally. The non-IT equipment at the colocation center might be shared by all its customers.

In the US many colocation centers are structured as Data Center REITs (Real Estate Investment Trusts), specialized public companies that own, operate, and manage facilities designed to house critical IT infrastructure. In principle any private or public corporate entity can provide a similar service.

In practice companies both in the ICT and non-ICT sectors may utilize colocation data centers.

Edge Data Centers

An Edge Data Center13 is an emerging class of smaller-sized data center located closer to end-users, with the primary motivation of addressing network latency requirements. While the most energy-intensive applications (large LLM models etc.) are currently developed in large centralized data centers, these alternative designs may become more important in the future with the development of more localized patterns. The smaller size of such centers is correspondingly matched by their larger numbers.

Cloud Data Centers

Cloud Data Centers are data centers owned by specialized (ICT sector) entities that provide scalable (flexible as to size and/or type and intensity) digital services to others. Here the business model is focused on providing a range of generic digital services to external customers. What counts as generic is of course a gray area. Cloud services clients are numerous: individuals, corporates and the public sector. Cloud companies offer access to storage, databases, algorithms, and other computational services.

Indicatively, the NACE classification suggests the following services:

  • provision of computing infrastructure, including cloud infrastructure and platform provision (IaaS, PaaS)
  • cloud computing (except software publishing and computer systems design), whether or not in combination with infrastructure provision
  • provision of technical infrastructure related to streaming services, data processing services and related activities:
    • complete processing of data supplied by clients
    • generation of specialised reports from data supplied by clients
    • blockchain / distributed ledger technology (DLT) data processing activities
  • specialised hosting activities, e.g.:
    • web hosting
    • application hosting
  • general time-share provision of mainframe facilities to clients
  • digitalisation of files (for further processing of data)
  • provision of data entry services
  • data centre colocation activities (i.e. rental of server and networking space in data centres, including routine monitoring of servers)
  • digital data storage
  • issuing of crypto-assets without a corresponding liability (not by a monetary authority)

NB: Notice the inclusion of colocation into cloud, which is rather circular: it is typically cloud services that use colocation. Unlike the colocation data centers discussed above, clients are not tenants. Therefore, energy consumption related emissions are fully allocated to the cloud provider’s Scope 1 and 2 and the clients’ Scope 3.10 Being focused on the provision of generic digital services, such data center operators aggregate very large numbers of servers, which gives rise to the term “hyperscale”. The power intensity of such data centers is also variable: if clients require accelerated computing, cloud data centers will provide it.

We have now laid out the basic characteristics of the data center categories commonly used in environmental impact reports. The classification dimensions that seem relevant from an environmental accounting perspective are the following:

  • The Legal Ownership: The nature of the legal entity that controls the data center (one or more corporate entities, the public sector etc.). This will be the entity reporting environmental impact under Scope 1 and Scope 2.
  • The Business Model: The purpose of the facility, the type of equipment, the nature of digital services provided (and to whom). This determines the Scope 3 reporting, both upstream (embodied in IT equipment) and downstream (client use).
  • The Size: The extent of the facility, e.g., the number of servers, its surface area, its total energy and water use etc. The provision of digital services to end-users scales in line with those variables.
  • The Power Intensity: The intensity of IT resource utilization, which in turn links to digital service provision rates, e.g. the number of operations per second, the volume of data sent over the network per second, the heat that must be carried away per area or volume etc. These rates will in general depend on the technologies used and the nature of the digital service provided.

We can organize them as follows, indicating in each column the most typical situation, keeping in mind there are counter-examples in almost all cases.

| Category | IT Owner | Facility Owner | Downstream Users | Intensity | Size |
|----------|----------|----------------|------------------|-----------|------|
| HPC | Public Sector | Public Sector | Public Sector | Highest | Large |
| Enterprise | (non-ICT) Company | (non-ICT) Company | Company | Normal | Medium - Large |
| Colocation | Multiple Entities | Real Estate | Diverse | Normal | Medium - Large |
| Edge | Diverse | Diverse | Diverse | Normal | Small |
| Cloud | (ICT) Company | (ICT) Company | Diverse | Normal - High | Very Large |
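
For attribution work it can be convenient to carry this classification in machine-readable form. A minimal sketch (category names and dimension values mirror the table above; the structure itself is illustrative, not any standard schema):

```python
# Illustrative encoding of the classification table above; not a standard schema.
from dataclasses import dataclass

@dataclass(frozen=True)
class DataCenterCategory:
    name: str
    it_owner: str
    facility_owner: str
    downstream_users: str
    intensity: str
    size: str

CATEGORIES = [
    DataCenterCategory("HPC", "Public Sector", "Public Sector", "Public Sector", "Highest", "Large"),
    DataCenterCategory("Enterprise", "(non-ICT) Company", "(non-ICT) Company", "Company", "Normal", "Medium - Large"),
    DataCenterCategory("Colocation", "Multiple Entities", "Real Estate", "Diverse", "Normal", "Medium - Large"),
    DataCenterCategory("Edge", "Diverse", "Diverse", "Diverse", "Normal", "Small"),
    DataCenterCategory("Cloud", "(ICT) Company", "(ICT) Company", "Diverse", "Normal - High", "Very Large"),
]

# The entity reporting Scope 1 and 2 is the facility owner; a mismatch between
# IT owner and facility owner (as in colocation) is what complicates attribution.
mismatched = [c.name for c in CATEGORIES if c.it_owner != c.facility_owner]
print(mismatched)  # ['Colocation']
```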

In reality there are further complications, e.g. corporate entities may own enterprise data centers and cloud data centers, and make use of colocation facilities as well. The mixing of a variety of ownerships, business models and facility types by data center owners or operators means that environmental attribution to more sharply defined sectors can be highly problematic.

The above classification dimensions by no means cover all possible aspects that characterise data centers. E.g., a standard classification of data centers relates to the degree of redundancy and other protections offered, quantified as the target uptime (the fraction of time during a period that a data center operates without any hindrance), with Tier 1 data centers offering 99.671% whereas Tier 4 ensures 99.995%. We omit this aspect as it does not have obvious implications for the environmental footprint.
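
For intuition, the quoted uptime targets translate into annual downtime budgets as follows (simple arithmetic on the percentages above):

```python
# Annual downtime implied by tier uptime targets (8,766 h in an average year).
HOURS_PER_YEAR = 8766

for tier, uptime in [("Tier 1", 0.99671), ("Tier 4", 0.99995)]:
    downtime_h = (1 - uptime) * HOURS_PER_YEAR
    print(f"{tier}: {uptime:.3%} uptime -> {downtime_h:5.2f} h/year downtime")
# Tier 1: ~28.84 h/year; Tier 4: ~0.44 h/year (about 26 minutes).
```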

The Distracting Hyperscaler Pattern

The current discourse about data center impacts is dominated by the development of very large data centers operated by so-called hyperscalers. Understanding the impact of these gargantuan facilities is obviously important, not just quantitatively but also qualitatively, in terms of the trends and possibilities they reveal. Yet ultimately what matters for an informed assessment of the footprint of digital services (as those are increasingly adopted by a digital society) is having a good handle on the nature of digital services supply and demand and how it is served holistically, including by IT equipment that resides outside data centers.

As we saw with the example of edge facilities, aggregation of digital services in ever larger data centers is not the only means to organize a digital economy and may not even be the most impactful when all things are counted. High concentration of IT resources in central nodes contrasts with decentralized distribution of IT resources, where the production and consumption of digital services is spread more evenly.

Historically, computational resource concentration has oscillated. Initial room-sized computers and so-called mainframes gave way to an era of capable personal computers and workstations, alongside the maturing of the Enterprise Data Center. At the current juncture end-user devices are dominated by thinner designs (mobile phones) but there may yet be another era of more powerful personal computing.

Data Services
The optimal spatial configuration of digital service provision in a fully digitized economy is not a priori given. Some economies of scale argue for more centralization, but they must compete against decentralization advantages, such as lower latency, lower or better-allocated environmental impact, more data privacy, more resilience etc. How those factors interplay depends on ever-changing patterns of both supply and demand.

Understanding holistically how environmental impact varies with alternative topologies is not only a prerequisite for completeness in reporting. It can also guide better decisions: considering the entire system, the incentives of the different actors, properly attributing the externalities generated, and preventing "leakage" (impacts hiding elsewhere in the economy) is an important discussion that is also relevant in the digital services context14.

Practically this means that while the focus of the most urgent data collection and disclosure is on the Scope 1 and Scope 2 impacts of large data centers, taking into account the full structure of digital service provision is ultimately required.

The Secrecy about Environmental Impacts

Data centers are significant and costly industrial facilities. They have been studied extensively in terms of their energy and cooling needs, initially purely for economic reasons (to minimize costly energy consumption). Like any significant economic activity that involves equipment, data center operations leave a variety of both direct and indirect imprints on the environment. The list of noteworthy data center environmental impacts is not a priori fixed. It is determined by the technologies used to provide digital services, and may change as those evolve.

An indicative list of impacts considered in this context is provided by the ITU Life-Cycle methodology for ICT goods and services9. The list includes impacts such as:

  • Climate change (from CO2 and all other GHG emissions)
  • Ozone depletion (from CFC gases)
  • Human toxicity (from e-waste and any chemical effluent)
  • Depletion of water resources (from direct and indirect water intake)
  • Land use changes
  • Other ecosystem impacts (noise, eutrophication, acidification etc.)

For each one of these, lifecycle analysis is the detailed bottom-up approach that conceptually provides a comprehensive inventory of direct and indirect environmental impacts. In most current studies concerning data centers the focus is on what have been suggested as the most critical aspects, namely the carbon and water footprints.

The challenge is that individual data centers don’t automatically report environmental impacts (though it is not technically inconceivable that they would!). It is corporate entities who collect low level data and (optionally) report certain summaries. This brings in the complexity of commercial and regulatory systems, which determines who reports relevant metrics and how transparent and fit-for-purpose those are.

Foremost is the issue of data availability, which hinges on the ability and willingness of data center operators to disclose relevant data points15. In any case, it is a fact that current reporting does not correspond 1:1 to any discrete data center. Reported enterprise (and hyperscale) data center impacts may be just one part of the overall inventory of an entity. Corporate reports will not in general split out individual data center impacts. Colocation centers may have multiple tenants, each renting a fraction of the data center and serving diverse clientele. Reporting by the operator will in general not attribute impact to clients.

In principle both direct and indirect impacts are relevant. Direct impacts are associated with the operation of the data center (under the control of the owner of IT/non-IT equipment). Indirect impacts are associated with the construction of data centers, the impact from the production of energy used by the data center but which has been produced elsewhere, and the impacts from the downstream chain leading to the end-users of digital services.

Greenhouse Gas Protocol

The most developed means to organize these different types of direct and indirect impacts is provided by the GHG Protocol. This is a set of detailed principles and methodologies developed by the World Resources Institute (WRI) and other institutions. It forms the basis of much current GHG emissions accounting across the world, in different contexts and business sectors. The Protocol comprises a number of guidance documents that apply to general classes of circumstances (e.g., new commercial projects, regular corporate sustainability reporting, assessment of product impacts, impact of local government operations etc.).

At a high level the applicable GHG Protocol guidance for data centers is the Product Life Cycle Accounting and Reporting Standard16. More sector-specific guidance comes in the ICT sector document17, in particular Chapter 4, which focuses on Data Centers. Climate change impacts are established by assessing energy use together with the data center location, which determines the characteristics of the electrical grid (energy mix), the local climate etc.2 While GHG emissions are the environmental impact currently most discussed (see1 for an early discussion of electricity use and GHG emissions), water use is coming increasingly into focus8.

Direct and Indirect GHG Emissions and Water Usage

Following the philosophy of the GHG Protocol, from a perspective centered on the entity (or entities) owning/operating a data center, impacts can be segmented into Scopes. While the GHG Protocol focuses on climate change considerations, it is useful to adopt a parallel taxonomy for water usage. Emissions in data centers are direct, e.g., from electricity produced by on-site generators (Scope 1); indirect, from electricity procured from the grid (Scope 2); and indirect embodied emissions, e.g., from the upstream production of equipment (Scope 3). The embodied impacts from the manufacturing of IT equipment (upstream Scope 3) are not negligible and are well investigated. Over time IT equipment faces failure (it must be replaced) or obsolescence/depreciation (it is no longer competitive). Finally, there are also emissions from the use of digital services by end-users that are tightly tied to the data center; those are downstream Scope 3 impacts.

Similarly to emissions, water usage in data centers can be classified as direct, e.g., water used in cooling IT equipment (Scope 1); indirect, from water used in the production of the energy used in the data center (Scope 2); and another type of indirect usage, namely water embodied in the upstream production of the equipment (Scope 3).

We can organize all the above as follows with indicative examples:

| Impact | Scope 1 | Scope 2 | Scope 3 Upstream | Scope 3 Downstream |
|--------|---------|---------|------------------|--------------------|
| Water Usage | Direct (Cooling) | Indirect (Grid Electricity) | Indirect (Embodied) | - |
| GHG Emissions | Direct (Onsite Energy) | Indirect (Grid Electricity) | Indirect (Embodied) | Networks, End Users |
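
The table lends itself to a simple tagging scheme for raw measurements. A minimal sketch (cell labels mirror the table; the record format and numbers are hypothetical):

```python
# Sketch: tag raw (impact, scope, value) records with the taxonomy above
# and aggregate per cell. Record values here are hypothetical.
from collections import defaultdict

VALID_CELLS = {
    ("water", "scope1"), ("water", "scope2"), ("water", "scope3_upstream"),
    ("ghg", "scope1"), ("ghg", "scope2"),
    ("ghg", "scope3_upstream"), ("ghg", "scope3_downstream"),
}

def totals_by_cell(records):
    """Sum measurement records into per-impact, per-scope totals."""
    totals = defaultdict(float)
    for impact, scope, value in records:
        if (impact, scope) not in VALID_CELLS:
            raise ValueError(f"unknown taxonomy cell: {impact}/{scope}")
        totals[(impact, scope)] += value
    return dict(totals)

records = [
    ("ghg", "scope2", 120.0),   # tCO2e from grid electricity
    ("ghg", "scope2", 80.0),
    ("water", "scope1", 35.0),  # ML withdrawn for cooling
]
print(totals_by_cell(records))
# {('ghg', 'scope2'): 200.0, ('water', 'scope1'): 35.0}
```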

The alignment of Scopes between Water Usage and GHG Emissions means the reporting entity has (in principle) similar tools at its disposal for mitigating impacts. In the case of Scope 2, this might involve the selection of sites, discussions with utilities etc., while for Scope 3, discussions with upstream equipment suppliers etc. The importance of reporting indirect water usage has been emphasized18.

Inventory Boundaries and Granularity

The GHG Protocol for the ICT sector works on the basis of a product lifecycle inventory17 and stipulates that cloud and data center services create emissions in three main places:

  • Data centers: switches, routers, and servers/storage devices used for receiving, sending, and storing data and the associated critical systems, facilities, and utilities
  • Network: routers, switches, cables, and other equipment associated with the transfer of data between the data center and end user
  • End-user devices: PCs, laptops, tablets, phones, or other devices used to access cloud services

Unless the networks and end-user devices are under a reporting entity’s control, their impacts would more appropriately be Scope 3 downstream impacts. The GHG Protocol outlines a number of options for calculating impacts (always for the GHG emissions side only) for data centers, going into some detail on bottom-up approaches but leaving top-down approaches less specified. Taking into account actual practices, research and reports, the granularity could be at any of the following levels:

  • Low Level, or Bottom Up, e.g., looking at individual servers, racks etc. As mentioned already, components are split into the IT and non-IT groups. IT components are further split into servers (of different types), data storage and networks. These estimates (counts and types of equipment) are the building blocks of the so-called bottom-up approach.
  • Top Down or facility level, where each discrete data center facility is characterised by a variety of metrics: total rated power, surface area etc.
  • Top Down, but at campus level, which reflects the fact that data center facilities belonging to the same entity tend to be clustered in nearby locations. Two or more nearby data centers will share the same local grid, water inputs and other facilities.
  • Top Down, but at corporate level. This level aggregates the impact of the total data center portfolio of a corporate entity, but distinct from its other operations and their environmental impacts.

Corporate disclosures that do not isolate data center impacts (and any associated digital services) at some level of granularity do not follow the principles of the GHG Protocol ICT sector guidance and are of less utility.

Finally, there are other aggregation levels that are of potential utility but which do not follow the above sectoral / corporate hierarchies. An example is an Extended Campus aggregation, which considers data centers from multiple corporate entities, if they operate in close geographic proximity that is relevant from an environmental perspective.

Credit: OpenStreetMap
Map fragment illustrating typical clustering of data centers in extended campuses (within technology parks, industrial districts etc.). Many of the buildings on this map are data centers. Several buildings belong to one corporate entity, forming a corporate campus. Corporate sustainability reporting may or may not provide campus level granularity. There are several distinct corporate entities present forming an extended campus. GHG emission impacts are agnostic to geospatial proximity but water usage may be sensitive, especially in so-called water-stressed regions. For extended campus reports to be compilable, entities active in a particular area must report using a standardized protocol.

Ideally these different granularity levels are connected with theoretical accounting identities:

  • The top-down facility level impact is the sum of all bottom up IT and non-IT component impacts.
  • The corporate campus impact is the sum of individual buildings within said campus.
  • The extended campus is the sum of all facility impacts in a defined area.
  • The total corporate impact is the sum of all data center impacts plus any additional corporate impact that is not data center related etc.
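
Were the data available, checking these identities would be mechanical. A minimal sketch of the facility-level check, with entirely hypothetical names and numbers:

```python
# Sketch: facility-level identity check (top-down total vs. sum of bottom-up
# component estimates). All names and figures are hypothetical.

def relative_gap(top_down_total: float, bottom_up_parts: list[float]) -> float:
    """Relative gap between a top-down total and the sum of its parts."""
    return (top_down_total - sum(bottom_up_parts)) / top_down_total

metered_facility_mwh = 12_400.0
component_estimates_mwh = [7_900.0, 1_450.0, 650.0, 2_200.0]  # servers, storage, network, non-IT

gap = relative_gap(metered_facility_mwh, component_estimates_mwh)
print(f"relative gap: {gap:+.1%}")  # +1.6%; a large gap flags missing inventory items
```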

In practice, given very limited data availability, validating these identities is not realistic.

Beyond the GHG Protocol

Let us now discuss a complication that is emerging in the big picture, namely a qualitative difference between accounting for GHG emissions and accounting for water consumption. First, let us state that energy consumption and water usage inside the data center are closely related. In fact, direct water usage is in a sense a facet of the same physical system and can be modeled as such.19 The departures from typical GHG accounting arise because the specific location of a data center matters for water consumption in qualitatively different ways than it does for GHG emissions. In GHG emission accounting, emissions are a globally fungible metric that can be analysed on a marginal basis independently of any other emissions. The location of a data center is relevant for GHG emissions to the extent that it determines, e.g., the local energy mix, which in turn informs emissions intensity. In contrast, the significance of water use is determined locally, in the watershed and local water cycle where the data center operates. This means that, unless there is no real water stress in a locality, the impact of an additional data center can only be assessed by taking into account other existing data centers (and any other competing local uses of water).

The impact of a large data center (which economically may provide digital services to a much wider geographic area) can thus be disproportionate for its host location. Research2 reveals that in the US one-fifth of the direct water footprint of data centers comes from moderately to highly water-stressed watersheds, while nearly half are fully or partially powered by power plants located within water-stressed regions. In simple terms, the impact of 1 L of additional water drawn in a region must take into account that region’s conditions, whereas 1 kg of additional CO2 emissions is evaluated in the context of the much larger (essentially planetary) budget.
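
One way to express this asymmetry is through locational weighting of water withdrawals, as done in LCA water scarcity methods. A minimal sketch (the stress weights are illustrative, loosely in the spirit of AWARE-style characterization factors):

```python
# Sketch: the same physical withdrawal has very different weighted impact
# depending on local water stress; CO2 needs no such locational weight.
# Stress weights are illustrative, loosely AWARE-style (higher = scarcer).

def weighted_water_impact(litres: float, stress_weight: float) -> float:
    """Scale a raw withdrawal by a local water-stress characterization factor."""
    return litres * stress_weight

withdrawal_l = 1_000_000.0  # identical withdrawal in two locations
print(weighted_water_impact(withdrawal_l, 0.3))   # humid watershed: 3.0e5 weighted units
print(weighted_water_impact(withdrawal_l, 20.0))  # stressed watershed: 2.0e7 weighted units
# By contrast, 1 kg of CO2 carries the same weight wherever it is emitted.
```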

This is why, in reporting water usage, it is important to be able to compile a consistent and credible extended campus inventory, even when that involves multiple corporate entities, potentially with commingled business models etc. It is the cumulative impact of their combined direct and indirect water usage that is the most relevant metric.

The Uncertainty Calculus

The environmental impact of the digital services sector currently involves considerable uncertainty. Uncertainties, especially if undisclosed, can affect the usability and credibility of estimates in decision making. Therefore major efforts to produce objective and reliable estimates around GHG emissions, such as the IPCC methodologies20 and the GHG Protocol, place a lot of emphasis on uncertainty estimation. The simplest methodology propagates uncertainties in the activity data, emission factors and other estimation parameters through the error propagation equation. If correlations exist, additional terms are needed. At its most basic and simplified, environmental impact reporting frameworks decompose impact metrics as the product of activity units times the impact per unit (intensity), as per the equation:

$$ \mbox{Impact} = \mbox{Intensity} \times \mbox{Activity} $$

which can be written more succinctly as

$$ I = f \times A $$

where $f$ is the intensity of impact and $A$ is the activity producing the impact. The uncertainty estimate can thus be schematically decomposed into two underlying contributions, combined in quadrature (ignoring their possible correlation):

$$ \Delta I = \sqrt{(\Delta f \times A)^2 + (f \times \Delta A)^2} $$
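
In code, this propagation rule is a one-liner. A minimal sketch with hypothetical numbers (adding a covariance term would handle correlated inputs):

```python
# Propagate independent uncertainties through I = f * A.
from math import sqrt

def impact_uncertainty(f: float, df: float, A: float, dA: float) -> float:
    """First-order error propagation, assuming f and A are uncorrelated."""
    return sqrt((df * A) ** 2 + (f * dA) ** 2)

# Hypothetical: emission factor 0.40 +/- 0.05 kgCO2/kWh, activity 1.0e6 +/- 2.0e5 kWh.
f, df = 0.40, 0.05
A, dA = 1.0e6, 2.0e5

I = f * A
dI = impact_uncertainty(f, df, A, dA)
print(f"I = {I:,.0f} +/- {dI:,.0f} kgCO2")  # I = 400,000 +/- 94,340 kgCO2
```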

In the next step we must specify the nature of the uncertainty we want to quantify. There are two important subcategories:

  • Data quality related uncertainties. This is what we might call accounting uncertainty, introduced by deficiencies in data collection and/or disclosure practices. While theoretically this uncertainty can be minimized with better data quality, in practice it may be very significant. Estimates of this type of uncertainty will typically accompany reports of the current impact from one or more data centers.
  • Temporal uncertainties. These are intrinsic uncertainties about the trajectories of new and evolving markets and technologies. The data centers of the near future will vary in size, technical specifications, services provided and business models, and thus in the corresponding activity and impact footprints. Estimates of this uncertainty (while obviously more subjective and model-driven) provide a cone of uncertainty around best-estimate projections. In practice this function might be served by elaborating different scenarios (and assigning them probabilities).

Uncertainty Drivers

A brief review of these two main uncertainty drivers reveals a complex picture for both:

The intensity multiplier of data centers depends on complex interplays between the available underlying technologies and the requirements of new use cases. In certain technology scenarios better energy efficiency is achieved through better chip designs. Yet the utilization of higher power-density chips (accelerators) for more intense computations works in the other direction. While there are standard ways to estimate the power needs of any existing system, without a distinction between “classic” and accelerated services there can be significant accounting uncertainty.

Another looming risk regarding accounting accuracy is the extent to which environmental impact indicators cease to be good measures, following the dynamic of Goodhart’s Law. One manifestation of this would be “leakage” of environmental impact elsewhere in the digital economy, or a mutation of its nature: for example, building data centers in jurisdictions where environmental protections are less strict, or the development of more efficient energy or cooling technologies which have side-effects elsewhere, violating the Do No Significant Harm Principle.

The higher power density of accelerated computing has knock-on implications for how the equipment is cooled, driven by the growing power rating of the chips, as well as by the need to cluster more chips into a single rack for more efficient sharing of memory. NB: Multiple accelerator chips must be networked tightly together to function effectively as a single computer; this is necessary to train very-large-parameter algorithms on gigantic data sets. As an example, Nvidia’s Ampere architecture from 2020 had a rated power of 400 W per chip and clustered 32 chips per rack, resulting in a power density of around 13 kW per rack. The current Blackwell architecture has a rated power of 1000 W per chip and 72 chips per rack, with a power density of 130 kW per rack, an order of magnitude higher5. Architectural changes can also reduce energy intensity; for example, the introduction of ARM CPU architectures versus older x86 designs may (for the same computation) achieve up to a fivefold efficiency gain.
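
The rack-level figures quoted above can be reproduced from chip counts and per-chip ratings; note that a full rack rating also includes CPUs, networking and other overheads beyond the accelerators themselves:

```python
# Accelerator-chip contribution to rack power density, from the figures above.
# NB: quoted full-rack totals (e.g., ~130 kW for Blackwell racks) also include
# CPUs, networking and other components beyond the accelerator chips alone.

racks = {
    "Ampere (2020)": (400, 32),   # (watts per chip, chips per rack)
    "Blackwell": (1000, 72),
}

for name, (watts, chips) in racks.items():
    print(f"{name}: {watts * chips / 1000:.1f} kW/rack from accelerator chips alone")
# Ampere: 12.8 kW (~13 kW as quoted); Blackwell: 72.0 kW of the ~130 kW rack total.
```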

The activity multiplier (how much of each technology is used) is more open-ended. It depends on the supply and demand of digital services, which is a complex economic, even cultural, phenomenon. Dramatically larger amounts of digital services activity are plausible.

Credit: New York Times
The famous New York Times chart from 2008 that visually demonstrates the accelerated pace of adoption of technologies. The universal adoption of digital technologies (smartphones, networks) means that new activities that re-use the ever denser underlying digital infrastructure can scale dramatically in very short timescales. (Credit: New York Times).

This jump in required computational activity happened, e.g., when the processing and distribution of digital audio and video content became as prevalent as the resource-wise much more frugal processing of digital text (email, messages etc.). A simple switch of media type can mean orders of magnitude different computational requirements.
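
The gap is easy to quantify with rough, typical figures (illustrative values: a plain-text email of ~10 KB versus a video stream at ~5 Mbit/s):

```python
# Order-of-magnitude comparison of data volumes: text vs. video services.
# Figures are rough, illustrative values.

email_bytes = 10 * 1024              # ~10 KB plain-text email
video_bitrate_bps = 5_000_000        # ~5 Mbit/s HD stream
hour_of_video_bytes = video_bitrate_bps / 8 * 3600

print(f"1 h of HD video ~ {hour_of_video_bytes / 1e9:.2f} GB, "
      f"~{hour_of_video_bytes / email_bytes:,.0f}x a 10 KB email")
# ~2.25 GB, about 220,000x: five orders of magnitude.
```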

In an ever more digitally interconnected world new digital services can see extremely rapid adoption. Digital technologies have indeed exhibited some of the most accelerated adoption patterns in modern industrial history. In addition, the lowering of overall costs for digital services (an established long-running trend) may become another manifestation of the Jevons Paradox, generating further adoption and thus more environmental impact.5 We can conclude that the biggest uncertainty in future environmental impact is likely to be driven by changes in activity rather than changes in technology.

References


  1. A. Shehabi et al. Data center design and location: Consequences for electricity use and greenhouse-gas emissions. Building and Environment 46 (2011)

  2. Siddik et al. The environmental footprint of data centers in the United States. Environ. Res. Lett. 16 (2021)

  3. Lawrence Berkeley National Laboratory. United States Data Center Energy Usage Report (2024)

  4. IEA. Energy and AI, 2025 Report

  5. IEA. Key questions on Energy and AI, World Energy Outlook Special Report (2026)

  6. A poorly defined category that intermingles many business sectors besides ICT.

  7. ITU and The World Bank. Green data centers: towards a sustainable digital transformation - A practitioner's guide (2023)

  8. A. de Vries-Gao. The carbon and water footprints of data centers and what this could mean for artificial intelligence. Patterns 7 (January 9, 2026)

  9. International Telecommunication Union. Methodology for environmental life cycle assessments of information and communication technology goods, networks and services. ITU-T L.1410 (2024)

  10. The World Bank and ITU. Measuring the Emissions & Energy Footprint of the ICT Sector: Implications for Climate Action (2024)

  11. Y. Jiang et al. ThirstyFLOPS: Water Footprint Modeling and Analysis Toward Sustainable HPC Systems. 2025 ACM/IEEE International Conference for High Performance Computing (2025)

  12. V. Rao and A. Chien. Modeling the Carbon Footprint of HPC: The Top 500 and EasyC. In Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC Workshops 2025)

  13. L. Xue et al. Towards Decentralized and Sustainable Foundation Model Training with the Edge. ACM SIGENERGY Energy Informatics Review, Volume 1, Issue 1 (2021)

  14. S. Bastianoni et al. The problem of assigning responsibility for greenhouse gas emissions. Ecological Economics 49 (2004) 253-257

  15. E. Masanet et al. To better understand AI's growing energy use, analysts need a data revolution. Joule 8, 2427-2448 (2024)

  16. World Resources Institute. Product Life Cycle Accounting and Reporting Standard (2011)

  17. GHG Protocol. ICT Sector Guidance built on the GHG Protocol Product Life Cycle Accounting and Reporting Standard (2017)

  18. P. Li, J. Yang, M. A. Islam, and S. Ren. Making AI Less 'Thirsty'. Commun. ACM 68, 7 (2025)

  19. N. Lei and E. Masanet. Climate and technology specific PUE and WUE estimations for U.S. data centers using a hybrid statistical and thermodynamics-based approach. Resources, Conservation & Recycling 182 (2022)

  20. IPCC. Guidelines for National Greenhouse Gas Inventories, Chapter 3: Uncertainties (2006)