<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Geographical Server Relocation: Opportunities and Challenges</title></titleStmt>
			<publicationStmt>
				<publisher>ACM SIGEnergy Energy Informatics Review</publisher>
				<date>12/01/2024</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10661264</idno>
					<idno type="doi">10.1145/3727200.3727207</idno>
					<title level='j'>ACM SIGEnergy Energy Informatics Review</title>
<idno>27705331</idno>
<biblScope unit="volume">4</biblScope>
<biblScope unit="issue">5</biblScope>					

					<author>Yejia Liu</author><author>Pengfei Li</author><author>Daniel Wong</author><author>Shaolei Ren</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[<p>The enormous growth of AI computing has led to a surging demand for electricity. To stem the resulting energy cost and environmental impact, this paper explores opportunities enabled by the increasing hardware heterogeneity and introduces the concept of Geographical Server Relocation (GSR). Specifically, GSR <italic>physically</italic> balances the available AI servers across geographically distributed data centers subject to AI computing demand and power capacity constraints in each location. The key idea of GSR is to relocate older and less energy-efficient servers to regions with more renewables, better water efficiencies and/or lower electricity prices. Our case study demonstrates that, even with modest flexibility of relocation, GSR can substantially reduce the total operational environmental footprints and operational costs of AI computing. We conclude this paper by discussing major challenges of GSR, including service migration, software management, and algorithms.</p>]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">INTRODUCTION</head><p>As we embark on the era of artificial intelligence (AI), characterized by the widespread adoption of advanced models such as ChatGPT and MidJourney, the significant energy consumption involved in training, fine-tuning, and running inference with these AI models is increasingly worrisome. For example, training a single large language model takes millions of GPU hours and consumes electricity on the order of thousands of megawatt hours <ref type="bibr">[2,</ref><ref type="bibr">29]</ref>.</p><p>Consequently, concerns regarding the environmental footprints and energy costs of data centers housing AI servers have garnered significant attention. A recent estimate by the International Energy Agency projects a sharp increase in the global AI energy demand, reaching at least ten times the current level and exceeding the annual electricity consumption of a small country like Belgium by 2026 <ref type="bibr">[11]</ref>. In light of the surging AI demand, there is a pressing need to implement cost-efficient and eco-friendly solutions to ensure a sustainable future for AI development.</p><p>Numerous strategies have been pursued to address the huge electricity cost and environmental impacts of AI computing. For example, reducing AI model sizes through model compression, speeding up AI training and inference, and/or adopting GPUs and purpose-built accelerators <ref type="bibr">[8,</ref><ref type="bibr">16,</ref><ref type="bibr">17,</ref><ref type="bibr">21]</ref> can yield substantial energy efficiency improvements. On the other hand, different locations exhibit significant geographical heterogeneity in terms of their electricity prices, the average carbon intensities of their grids, and/or the climate conditions that affect water efficiency. 
Thus, another extensively studied line of work leverages the spatial and temporal flexibility inherent in AI computing workloads <ref type="bibr">[10]</ref>. This entails dynamically adjusting the location and timing of AI computing to better align with periods and locations where low-carbon and/or low-cost energy sources are available <ref type="bibr">[3,</ref><ref type="bibr">15]</ref>. The emergence of third-party energy information services, such as those offering real-time data on the carbon intensity of electricity at high resolutions, has lowered the barrier for this approach and made it more viable <ref type="bibr">[1,</ref><ref type="bibr">7,</ref><ref type="bibr">9]</ref>. Importantly, such AI workload shifting across different geographical locations has been increasingly adopted by major technology companies as an effective enabler for sustainable computing <ref type="bibr">[22]</ref>.</p><p>While the potential of geographically shifting AI computing workloads has been well recognized, another complementary knob, physically moving AI computing servers around geographically distributed data centers, has remained largely overlooked for sustainability and cost saving. We refer to this approach as Geographical Server Relocation (GSR).</p><p>The rationale that motivates our pursuit of GSR comes from the increasing hardware heterogeneity. Concretely, despite the development of more powerful and energy-efficient servers, the high cost remains a barrier, making it impractical or financially challenging to replace all the servers with the latest, expensive, and more energy-efficient AI hardware at once. Instead, partial refreshment is more common in the data center upgrade lifecycle <ref type="bibr">[23]</ref>. This practice has also been reinforced by the increasing emphasis on reducing the servers' embodied environmental footprint during the manufacturing process <ref type="bibr">[6]</ref>. 
As a result, today's AI data centers often feature heterogeneous architecture compositions, comprising a mix of older and newer AI servers. Therefore, the total environmental footprints and operational costs of AI computing can be reduced by strategically relocating older and less energy-efficient servers to regions where renewable energy sources are more abundant, water efficiency is higher, and/or electricity prices are lower. We show an illustrative example in Figure <ref type="figure">1</ref> as a thought experiment. Consider two AI data center locations, such as California and Virginia in Figure <ref type="figure">1</ref>, labeled as A and B, respectively. Virginia's carbon intensity is roughly three times California's, based on their average carbon intensities (around 130g/kWh for California vs. 369g/kWh for Virginia) in April 2024 according to Electricity Maps <ref type="bibr">[18]</ref>. There are two types of AI servers: the normalized performance per watt is 10 for newer servers and 1 for older ones. Suppose that each data center serves two units of AI workloads, equally split between the two types of servers. In other words, the normalized quantities of older servers and newer servers are 1 and 1/10, respectively, in each data center before relocation.</p><p>&#8226; Before relocation: The total normalized carbon emissions at both locations are (1/1 + 1/10) * 1 + (1/1 + 1/10) * 3 = 4.4.</p><p>&#8226; After relocation: In this case, we relocate the less energy-efficient older servers from Virginia to California, which has a lower carbon intensity. Meanwhile, to meet the pre-relocation AI computing demand at each location, we relocate the newer servers from California to Virginia. 
Thus, the total normalized carbon emissions of these two locations become (1/1 + 1/1) * 1 + (1/10 + 1/10) * 3 = 2.6, a reduction of about 40.9% in operational carbon emissions compared to the pre-relocation level.</p><p>Although it simplifies away many practical considerations, the illustrative example above demonstrates a clear potential of GSR to reduce AI's surging environmental footprint in light of the increasing hardware heterogeneity. In this paper, we further formalize the problem of GSR and conduct a case study to highlight the potential reductions in carbon emissions, water consumption, and electricity costs that GSR may achieve empirically. Nonetheless, compared to shifting AI computing workloads around different locations, GSR presents additional challenges in terms of service migration, software management, and algorithms, among others. Thus, to offer a more balanced view, we highlight these challenges in this paper, which we hope can shape some interesting research directions for the community to realize the full potential of GSR for sustainability and cost saving.</p></div>
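<div xmlns="http://www.tei-c.org/ns/1.0"><p>The arithmetic of the two-site thought experiment can be checked with a few lines of Python. This is only a sketch of the normalized example: the dictionary layout and the function name <code>total_emissions</code> are illustrative choices, and the carbon intensities are the rounded 1:3 ratio used in the example rather than measured values.</p>
<eg xml:space="preserve"><![CDATA[
```python
# Normalized performance per watt of the two server generations (from the example).
PERF_PER_WATT = {"old": 1.0, "new": 10.0}
# Normalized carbon intensity: California = 1, Virginia = 3 (rounded from ~130 vs. ~369 g/kWh).
CARBON_INTENSITY = {"California": 1.0, "Virginia": 3.0}


def total_emissions(placement):
    """Sum normalized carbon emissions over all sites.

    `placement` maps site -> {server type: units of AI workload served}.
    Serving one workload unit costs 1 / (performance per watt) units of energy.
    """
    total = 0.0
    for site, workload_by_type in placement.items():
        energy = sum(units / PERF_PER_WATT[kind]
                     for kind, units in workload_by_type.items())
        total += energy * CARBON_INTENSITY[site]
    return total


# Each site serves two workload units before and after relocation.
before = {"California": {"old": 1, "new": 1}, "Virginia": {"old": 1, "new": 1}}
after = {"California": {"old": 2, "new": 0}, "Virginia": {"old": 0, "new": 2}}

reduction = 1.0 - total_emissions(after) / total_emissions(before)
print(f"before={total_emissions(before):.1f}, after={total_emissions(after):.1f}, "
      f"reduction={100 * reduction:.1f}%")
```
]]></eg>
<p>Swapping the server types between the sites keeps the workload served at each site unchanged; only the energy drawn at the carbon-intensive site shrinks.</p></div>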
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">OPPORTUNITIES FOR GSR</head><p>In this section, we present the emerging opportunities for GSR enabled by the hardware and geographical heterogeneities.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Hardware Heterogeneity</head><p>Most AI computing workloads run on GPUs nowadays, whose energy efficiency has increased dramatically in recent years due to design optimization and architectural advances <ref type="bibr">[20]</ref>. Figure <ref type="figure">2</ref> shows the normalized performance per watt of data center-grade GPUs released by Nvidia from 2014 to 2023, demonstrating a more than 10x improvement in terms of GFLOPS per watt.</p><p>Despite the significant improvement, the high upfront costs compounded by supply chain constraints make it impractical or financially challenging to replace all the servers with the latest, expensive, and more energy-efficient AI hardware at once. As a result, it is a common practice for AI developers to partially refresh and upgrade their AI server fleet, resulting in a mixture of new and old AI servers <ref type="bibr">[6,</ref><ref type="bibr">23]</ref>. Further, the improved server reliability and the increasing emphasis on reducing AI servers' embodied environmental footprint during the manufacturing process have propelled a growing trend of keeping servers for a longer lifespan before retirement <ref type="bibr">[6]</ref>. More recently, composing servers using retired components (e.g., DRAMs and CPUs) has also been proven effective for cutting servers' lifecycle carbon footprints.</p><p>Fig. <ref type="figure">2</ref>. Normalized ratio of performance (GFLOPS) to power consumption (Watts) for data center-grade GPUs over the past 10 years (2014-2023) <ref type="bibr">[20]</ref>. The manufacturer-reported data points are plotted as dots and labeled as "real". We also offer three different synthetic curves (i.e., "exponential", "linear", and "sublinear") for hypothetical studies.</p><p>These practices have led to significant AI hardware heterogeneity in terms of the performance per watt in many data centers.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Geographical Heterogeneity</head><p>To serve users worldwide, AI data centers are located in different regions, which also exhibit significant geographical heterogeneities.</p><p>&#8226; Electricity price: Electricity prices vary significantly across different states and countries <ref type="bibr">[3]</ref>. For example, country-wide electricity prices can differ by more than 10x throughout the world <ref type="bibr">[24]</ref>.</p><p>&#8226; Carbon intensity: The regional differences in energy sources for electricity generation naturally result in significant disparities in carbon intensities for each kWh of electricity consumption <ref type="bibr">[18]</ref>. Even though technology companies have increasingly adopted carbon-free energy for powering their global data centers, such regional differences still persist. For example, 97% of the energy usage by Google's data center in Finland is carbon-free, whereas this number drops to 4-18% for its data centers in Asia <ref type="bibr">[6]</ref>.</p><p>&#8226; Water efficiency: In addition to carbon emissions, AI computing also has a significant water footprint, which has emerged as a hidden sustainability roadblock <ref type="bibr">[14]</ref>. Water efficiency in terms of water consumption per kWh of IT energy usage, also known as water usage effectiveness (WUE), also varies significantly across different locations (e.g., by more than 20x across Microsoft's global data center locations). Importantly, a data center with better carbon efficiency may have worse water efficiency <ref type="bibr">[14]</ref>. This makes it necessary to treat AI computing's water consumption as a separate sustainability metric.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3">Opportunities Enabled by Heterogeneity</head><p>Many AI training and inference tasks can run on a diverse set of GPUs without necessarily having to use a specific type of GPU. That is, to meet the same AI computing demand, there exists an increasingly wide set of AI servers, each with a different performance per watt.</p><p>On the other hand, geographical heterogeneities mean that even with the same utilization, the same server can have very different energy costs and environmental footprints if put in different data centers. As such, where to place the available AI servers to meet the demand in each data center becomes an important question.</p><p>This motivates our pursuit of GSR to tap into the potential opportunities enabled by hardware and geographical heterogeneities for sustainability and cost saving. For example, as illustrated in Figure <ref type="figure">1</ref>, GSR can relocate older and less energy-efficient servers to regions with more renewables, better water efficiencies and/or lower electricity prices subject to AI computing demand and power capacity constraints in each data center.</p></div>
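<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4">Problem Formulation</head><p>Suppose that there are <formula notation="tex">N</formula> data centers and up to <formula notation="tex">M</formula> types of AI hardware/servers in each data center with different performances per watt, indexed by <formula notation="tex">i \in \mathcal{N} = \{1, \dots, N\}</formula> and <formula notation="tex">j \in \mathcal{M} = \{1, \dots, M\}</formula>, respectively. We denote the default/current and the new configurations of AI servers as <formula notation="tex">\mathbf{y} = \{y_{i,j} \mid i \in \mathcal{N}, j \in \mathcal{M}\}</formula> and <formula notation="tex">\mathbf{x} = \{x_{i,j} \mid i \in \mathcal{N}, j \in \mathcal{M}\}</formula>, representing the pre-GSR and the post-GSR quantities of type-<formula notation="tex">j</formula> servers in each data center <formula notation="tex">i</formula>, respectively.</p><p>The computational capacity of type-<formula notation="tex">j</formula> AI servers in data center <formula notation="tex">i</formula> is defined as <formula notation="tex">c_j(x_{i,j})</formula>, which can be measured in terms of GFLOPS or other metrics that AI developers use for capacity planning purposes. Given the average utilization, the energy consumption of type-<formula notation="tex">j</formula> AI servers in data center <formula notation="tex">i</formula> is denoted as <formula notation="tex">e_j(x_{i,j})</formula>. Similarly, we denote their power consumption as <formula notation="tex">p_j(x_{i,j})</formula>, which indicates the required power capacity to support the server deployment.</p><p>Operational cost. The operational cost (energy cost, carbon footprint, water consumption, or a combination of them) is proportional to the total energy consumption of all the AI servers in each data center. Thus, we use a linear function <formula notation="tex">f_i\left(\sum_{j=1}^{M} e_j(x_{i,j})\right) = \alpha_i \sum_{j=1}^{M} e_j(x_{i,j})</formula> to represent the cost of data center <formula notation="tex">i</formula>, where <formula notation="tex">\alpha_i</formula> is the average electricity price, carbon intensity, WUE, or a combination. The coefficient <formula notation="tex">\alpha_i</formula> also absorbs the average power usage effectiveness (PUE) to account for non-IT energy overheads if applicable.</p><p>Relocation cost. GSR introduces server relocation costs, such as the shipping costs and carbon emission overheads due to logistics (which are usually small compared to servers' operational emissions in the lifecycle). Here, to capture the relocation costs, we use the difference <formula notation="tex">d(\mathbf{x}, \mathbf{y}) = \|\mathbf{x} - \mathbf{y}\|</formula> between the pre-GSR configuration <formula notation="tex">\mathbf{y}</formula> and the post-GSR configuration <formula notation="tex">\mathbf{x}</formula> as a proxy measure. 
Given two different configurations <formula notation="tex">\mathbf{x}</formula> and <formula notation="tex">\mathbf{y}</formula>, the actual relocation cost can be obtained by optimizing the server relocation schedule (e.g., where and which servers in a data center should be relocated).</p><p>Constraints. We introduce <formula notation="tex">\lambda_i \in [0, 1]</formula> to denote the fraction of AI computing capacity that needs to be retained in data center <formula notation="tex">i</formula>. When <formula notation="tex">\lambda_i = 1</formula>, we must ensure that the post-GSR computational capacity of AI servers is no less than the pre-GSR capacity; when there is maximum flexibility at <formula notation="tex">\lambda_i = 0</formula>, we can even shut down data center <formula notation="tex">i</formula> entirely and relocate all the AI servers elsewhere, which can apply to AI developers who rent data center spaces from third-party colocation providers (e.g., Equinix).</p><p>Additionally, we use <formula notation="tex">\gamma_i \geq 1</formula> to denote the available power capacity (including any extra capacity) normalized by the pre-GSR usage level in data center <formula notation="tex">i</formula>. Typically, data center operators reserve extra capacity to absorb additional loads and accommodate future growth. If an AI developer rents power capacity from a third-party provider, it can have even more flexibility (i.e., a larger <formula notation="tex">\gamma_i</formula>). For notational convenience, we also absorb physical space constraints into <formula notation="tex">\gamma_i</formula> for data center <formula notation="tex">i</formula>.</p><p>Next, we formalize the problem of GSR as follows:</p>
<p><formula notation="tex" n="1a">\min_{\mathbf{x}} \; \sum_{i=1}^{N} \alpha_i \sum_{j=1}^{M} e_j(x_{i,j}) + \beta \, \|\mathbf{x} - \mathbf{y}\| \qquad \text{(1a)}</formula></p>
<p><formula notation="tex" n="1b">\text{s.t.} \quad \sum_{j=1}^{M} c_j(x_{i,j}) \geq \lambda_i \sum_{j=1}^{M} c_j(y_{i,j}), \quad \forall i \in \mathcal{N}, \qquad \text{(1b)}</formula></p>
<p><formula notation="tex" n="1c">\sum_{j=1}^{M} p_j(x_{i,j}) \leq \gamma_i \sum_{j=1}^{M} p_j(y_{i,j}), \quad \forall i \in \mathcal{N}, \qquad \text{(1c)}</formula></p>
<p><formula notation="tex" n="1d">\sum_{i=1}^{N} x_{i,j} = \sum_{i=1}^{N} y_{i,j}, \quad \forall j \in \mathcal{M}. \qquad \text{(1d)}</formula></p>
<p>The objective (1a) is a weighted sum of the operational cost and the relocation cost, with the weight hyperparameter <formula notation="tex">\beta \geq 0</formula> denoting the unit relocation cost. The constraint (1b) specifies the minimum post-GSR AI computing capacity relative to the pre-GSR level, the constraint (1c) specifies the power capacity constraint, and the constraint (1d) means that GSR does not retire any available AI servers (which is a separate decision beyond the scope of GSR).</p><p>Remark. GSR only relocates existing servers that have already been purchased (or refurbished if re-built from older servers <ref type="bibr">[28]</ref>); it does not decide whether or not to buy new AI hardware, which can be an interesting future study but is beyond the current scope of GSR. 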
As such, all the potential benefits of GSR shown in this paper lie in the operational costs and environmental footprints, rather than the capital expenses and embodied footprints.</p></div>
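<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the roles of the objective (1a) and the constraints (1b)-(1d) concrete, the following Python sketch brute-forces a tiny instance with integer server counts. It is an illustration under stated assumptions, not the solver used in the case study: the names <code>solve_gsr</code>, <code>retain_frac</code>, <code>power_headroom</code>, and <code>reloc_weight</code> are hypothetical, per-server energy, capacity, and power are assumed to scale linearly with the server count, the relocation proxy is taken to be the L1 norm, and all numbers are made up.</p>
<eg xml:space="preserve"><![CDATA[
```python
from itertools import product


def splits(total, n_sites):
    """Yield every way to place `total` identical servers across `n_sites` sites."""
    if n_sites == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in splits(total - first, n_sites - 1):
            yield (first,) + rest


def operational_cost(x, energy, cost_coeff):
    """Linear operational cost: site coefficient times total energy at that site."""
    return sum(cost_coeff[i] * energy[j] * x[i][j]
               for i in range(len(x)) for j in range(len(x[0])))


def solve_gsr(y, energy, capacity, power, cost_coeff,
              retain_frac, power_headroom, reloc_weight=0.0):
    """Brute-force a tiny GSR instance; y[i][j] = pre-GSR type-j servers at site i."""
    n_sites, n_types = len(y), len(y[0])
    totals = [sum(y[i][j] for i in range(n_sites)) for j in range(n_types)]
    best_cost, best_x = float("inf"), None
    # Enumerating per-type splits enforces (1d): no server is retired or added.
    for per_type in product(*(splits(t, n_sites) for t in totals)):
        x = [[per_type[j][i] for j in range(n_types)] for i in range(n_sites)]
        feasible = True
        for i in range(n_sites):
            cap_after = sum(capacity[j] * x[i][j] for j in range(n_types))
            cap_before = sum(capacity[j] * y[i][j] for j in range(n_types))
            pow_after = sum(power[j] * x[i][j] for j in range(n_types))
            pow_before = sum(power[j] * y[i][j] for j in range(n_types))
            # (1b): retain enough computing capacity; (1c): respect power capacity.
            if cap_after < retain_frac * cap_before or pow_after > power_headroom * pow_before:
                feasible = False
                break
        if not feasible:
            continue
        # Assumption: L1 distance between configurations as the relocation proxy.
        moved = sum(abs(x[i][j] - y[i][j])
                    for i in range(n_sites) for j in range(n_types))
        cost = operational_cost(x, energy, cost_coeff) + reloc_weight * moved
        if cost < best_cost:
            best_cost, best_x = cost, x
    return best_cost, best_x


# Hypothetical instance: 2 sites, 2 server types (old, new), 2 servers of each per site.
y = [[2, 2], [2, 2]]
energy, capacity, power = [1.0, 1.0], [1.0, 10.0], [1.0, 1.0]
cost_coeff = [1.0, 3.0]  # site 1 is three times as costly per unit of energy

baseline = operational_cost(y, energy, cost_coeff)
free_cost, free_x = solve_gsr(y, energy, capacity, power, cost_coeff, 0.5, 1.5)
sticky_cost, sticky_x = solve_gsr(y, energy, capacity, power, cost_coeff, 0.5, 1.5,
                                  reloc_weight=10.0)
print(baseline, free_cost, free_x, sticky_cost, sticky_x)
```
]]></eg>
<p>With free relocation the cheaper site fills up to its power capacity while the costlier site keeps just enough capacity to satisfy (1b); with a large relocation weight the best choice is to leave the servers where they are. Exhaustive enumeration only works for toy sizes, so a realistic instance would call for an optimization solver instead.</p></div>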
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">CASE STUDY</head><p>In this section, we conduct a case study to evaluate the potential empirical effectiveness of GSR under a synthetic setting based on the reported GPU energy efficiency over the last 10 years.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Experimental Setup</head><p>We examine a set of 10 geographically distributed data centers based on Microsoft's current data center sites <ref type="bibr">[19]</ref>. This set comprises four sites in the United States (Virginia, Georgia, Texas, and Nevada), four in Europe (Belgium, the Netherlands, Germany, and Denmark), and two in Asia (Singapore and Japan).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.1">Datasets.</head><p>Data center-grade GPU energy efficiency. We use the general information regarding Nvidia GPUs tailored for data center usage over the past decade (2014-2023) as documented by <ref type="bibr">[20]</ref>. Specifically, we use the reported single-precision GFLOPS to measure the computational capacity, and the thermal design power (TDP) in watts to gauge the energy consumption, of each type of AI server. To ensure comparability across the ten-year timeframe, we normalize the performance-to-power ratios by taking 2014 as the baseline and setting its value to 1. We show the normalized performance (GFLOPS) per watt in Figure <ref type="figure">2</ref>. Based on the real manufacturer-reported data represented by the dots, we can observe a clear trend of rapid increases in performance per watt over the years.</p><p>Electricity price, WUE, PUE, and carbon intensity. We obtain the yearly average electricity price for each data center location in Europe and Asia from <ref type="bibr">[12]</ref>. For the U.S. data centers, we collect the electricity prices from their respective ISOs as documented in <ref type="bibr">[25]</ref>. In terms of environmental footprint minimization, we primarily focus on operational carbon emissions and water consumption <ref type="bibr">[14,</ref><ref type="bibr">29]</ref>. Specifically, we use the on-site cooling WUE for these 10 data centers reported by <ref type="bibr">[14]</ref>. As for the carbon intensity, we gather yearly average data across these 10 data center locations for the most recent years from <ref type="bibr">[18]</ref>. We use the annualized average PUE for each data center based on Microsoft's most recent disclosure <ref type="bibr">[19]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.2">Default configuration and metric.</head><p>For competitive reasons, there is no precise information in the public domain regarding the current configuration of AI servers in each data center. Thus, for the pre-GSR setting, we assume that the AI servers are uniformly distributed in terms of their power consumption. That is, before GSR, we assume that the AI servers purchased in each of the latest three years (2021, 2022, 2023) consume the same amount of power in each data center. We will also consider other settings such as non-uniform pre-GSR configurations and longer time scales (see Appendix A).</p><p>We evaluate the effectiveness of GSR by quantifying the percentage savings in operational electricity costs as well as the percentage reductions in the environmental footprint relative to the pre-GSR level.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Numerical Results</head><p>Our empirical results demonstrate that, even with modest flexibility of relocation, GSR can dramatically reduce the total environmental footprints and operational costs compared to the pre-GSR level.  </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.1">Results for manufacturer-reported data.</head><p>In Figure <ref type="figure">3</ref>, we present the reductions in operational carbon footprint and water consumption, as well as the electricity cost savings, achieved by GSR under various combinations of the retained-capacity fraction (in [0, 1]) and the normalized power capacity (at least 1), considering different time frames of GPU performance per watt data. Specifically, we minimize only one cost metric at a time (e.g., electricity cost) without considering the other metrics or relocation costs (i.e., setting the relocation weight to 0). Thus, the values in Figure <ref type="figure">3</ref> represent the maximum savings for the respective metrics under each combination of the two parameters.</p><p>As the retained-capacity fraction decreases, GSR can potentially relocate more AI servers since less AI computing demand needs to be processed in the same data center after GSR. Likewise, with a larger normalized power capacity, more extra power capacity is available for GSR, which enhances its flexibility. Therefore, we observe that as the flexibility of GSR increases, the potential saving becomes significantly larger. Importantly, with a modest flexibility (e.g., a retained-capacity fraction of 0.5 and a normalized power capacity of 1.5), GSR can yield roughly 20%, 50+% and 20% savings in terms of the operational carbon footprint, water consumption, and electricity cost, respectively.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.2">The impact of idle power.</head><p>Our results in Figure <ref type="figure">3</ref> focus on the dynamic GPU power only. In practice, however, active servers also draw idle power even when they are not processing any AI workloads. Thus, we now investigate the impact of such idle power on GSR. Specifically, each modern GPU-based AI server typically houses 4-8 GPUs. Considering the peak power consumption of 130W for a CPU (e.g., Intel Xeon W-2125 processor) and the typical power draw of around 5W for a 16GB DDR4 RAM, we add an effective amortized idle power of 30W to each GPU when calculating the performance-to-power value.</p><p>By using the same pre-GSR configuration as in Figure <ref type="figure">3</ref>, we show the savings for carbon, water, and electricity cost while accounting for idle power in Figure <ref type="figure">4</ref>. Despite slightly decreased savings, the carbon, water and electricity cost reductions achieved by GSR are highly similar to those in Figure <ref type="figure">3</ref>, emphasizing that the primary driver for savings is the spatial and server heterogeneity. For example, relocating servers to regions with a lower carbon intensity can significantly decrease carbon emissions, even with some idle power consumption added to the servers. Similarly, the spatial heterogeneity in water efficiency and electricity prices plays a crucial role in the savings for water and electricity cost, respectively.</p></div>
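<div xmlns="http://www.tei-c.org/ns/1.0"><p>The idle-power adjustment amounts to adding a constant to the denominator of each GPU's performance-to-power ratio before normalizing against the oldest GPU. The sketch below shows that calculation only; the (GFLOPS, TDP) pairs are hypothetical placeholders rather than values from the data set in <ref type="bibr">[20]</ref>, and the helper names are illustrative.</p>
<eg xml:space="preserve"><![CDATA[
```python
IDLE_POWER_W = 30.0  # amortized idle power added per GPU, as stated in the text


def perf_per_watt(gflops, tdp_watts, idle_watts=0.0):
    """Performance-to-power ratio, optionally including amortized idle power."""
    return gflops / (tdp_watts + idle_watts)


def normalized(values):
    """Normalize a series so that its first (oldest) entry equals 1."""
    base = values[0]
    return [v / base for v in values]


# Hypothetical (GFLOPS, TDP in watts) pairs, oldest first; placeholders only.
gpus = [(5000.0, 250.0), (15000.0, 300.0), (60000.0, 700.0)]

without_idle = [perf_per_watt(g, t) for g, t in gpus]
with_idle = [perf_per_watt(g, t, IDLE_POWER_W) for g, t in gpus]

print("normalized, dynamic power only:", normalized(without_idle))
print("normalized, with idle power:   ", normalized(with_idle))
```
]]></eg>
<p>Adding idle power lowers every absolute ratio, but how the normalized spread between generations changes depends on the TDP values, which is why the effect on savings has to be evaluated with the real data.</p></div>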
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.3">Carbon vs. water (electricity cost) tradeoffs.</head><p>Carbon efficiency, water efficiency, and electricity cost efficiency are three important, but often conflicting, objectives <ref type="bibr">[13]</ref>. For example, California has a higher water consumption rate than Virginia due to its drier and hotter climate, but its carbon intensity for electricity generation is much lower than Virginia's. Likewise, despite the cleaner energy sources for electricity generation, the electricity price in California is higher than that in many other U.S. states.</p><p>Thus, we show the tradeoffs between carbon emission reduction and water consumption reduction, and between carbon emission reduction and electricity cost reduction. Specifically, we minimize the weighted sum of carbon emission and water consumption (or electricity cost), and vary the weight. The results are shown in Figure <ref type="figure">5</ref>, where we set the retained-capacity fraction to 0.5, the normalized power capacity to 1.5, and the relocation weight to 0, based on the manufacturer-reported GPU performance-to-power data spanning the latest 3 years (2021-2023), the latest 6 years (2018-2023), and the latest 10 years (2014-2023), respectively.</p><p>While different metrics may not be perfectly aligned, GSR can still simultaneously reduce AI's carbon emission, water consumption, and electricity cost, which may not be achievable by geographical load balancing alone, which only shifts workloads across different data centers <ref type="bibr">[13]</ref>. Interestingly, when aggressively minimizing carbon emissions, we may end up with a higher electricity cost in some cases, which corroborates the prior finding that carbon-efficient locations may not be cost-effective <ref type="bibr">[4]</ref>.</p><p>Due to space limitations, we defer additional results to Appendix A, including the impact of relocation costs and the results for synthetic trends of GPU performance per watt.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">CHALLENGES FOR GSR</head><p>While GSR could potentially reduce the total environmental footprints and operational costs, it also creates new challenges.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Services Impacted by GSR</head><p>By consolidating the available hardware as a resource pool, modern cluster management can easily handle individual server replacement/installation without affecting the running services <ref type="bibr">[27]</ref>, and can even handle unexpected data center-wide failures by temporarily relocating all the impacted workloads to other data centers <ref type="bibr">[26]</ref>. The physical relocation process in GSR requires unplugging move-out servers and plugging in move-in servers, and also requires spare server capacity (which typically exists to handle workload variations and growth) to temporarily process the impacted workloads. Thus, GSR can be viewed as a planned global-level maintenance event, which presents additional systems challenges. To limit the disruption, one can optimize the server relocation schedule and execute the relocation decision for one data center after another to minimize the number of impacted AI servers as well as the hosted workloads.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Software and Workloads</head><p>GSR may present several challenges for software that is highly tuned for specific hardware features. For example, to maintain performance in virtualized environments, software runtimes may depend on hardware isolation mechanisms, such as cache partitioning with Intel Cache Allocation Technology (CAT). Similarly, certain software may rely on AVX-512 vectorization for performance, while other processors may only support AVX2 extensions. Such hardware features may exist in only certain families of CPUs, which can present challenges to the software when running on relocated servers.</p><p>Besides software challenges, workloads that run on the relocated hardware may also need to adapt due to differences in cache/memory hierarchy, processor core type (performance cores vs. efficiency cores), and parallelism available in the processors. This challenge is already commonly experienced in high-performance computing (HPC) systems. For example, thousands of man-hours are spent porting workloads from one HPC system to a new generation of HPC systems. Due to this, there is a strong focus on portability to achieve high performance across diverse hardware. To better support GSR, this focus on software portability needs to be adopted in cloud environments.</p><p>Workloads may also need to adapt to changing hardware features. For example, older GPUs may not have hardware features such as tensor cores or support for system-wide atomics. The absence of tensor cores would require algorithms to fall back to traditional arithmetic units for computation. System-wide atomics greatly simplify the implementation of distributed GPU algorithms. 
The absence of system-wide atomics would require an increased amount of synchronization, leading to programmer burden and decreased performance.</p><p>Nonetheless, these challenges may be less of an issue if the software and/or workloads closely tied to specific hardware features can be relocated together with the associated servers and run elsewhere (i.e., a retained-capacity fraction below 1 in our formulation (1b)).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Demand and Hardware Uncertainties</head><p>When planning for GSR, we need the projected AI resource demand as the input for decision optimization. For example, the parameter governing the fraction of AI demand that cannot be relocated needs to be provided for GSR optimization. Additionally, when the future GPU energy efficiency improves, we might need to relocate some of the AI servers again, potentially resulting in higher movement costs. In other words, we need to solve a sequential decision-making problem with movement costs subject to future uncertainties. This is commonly referred to as smoothed online optimization, which penalizes frequent changes in decisions <ref type="bibr">[5]</ref> and is known to be challenging even under simplified assumptions. Thus, GSR presents an interesting online optimization problem, which can be of interest to the operations research and optimization community.</p></div>
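<div xmlns="http://www.tei-c.org/ns/1.0"><p>The tension between following each period's best placement and avoiding repeated moves can be illustrated with a toy smoothed online objective. The sketch below assumes a scalar decision (e.g., the number of servers kept at one site), hypothetical quadratic per-period costs, and an absolute-difference movement penalty; the function name <code>smoothed_cost</code> and all numbers are illustrative, and no online algorithm from <ref type="bibr">[5]</ref> is implemented here.</p>
<eg xml:space="preserve"><![CDATA[
```python
def smoothed_cost(decisions, hitting_costs, initial, move_weight):
    """Total cost of a decision sequence under a smoothed online objective.

    Each period t adds the per-period (hitting) cost of decision x_t plus
    move_weight * |x_t - x_{t-1}| as a penalty for changing the decision.
    """
    total, prev = 0.0, initial
    for x, cost_fn in zip(decisions, hitting_costs):
        total += cost_fn(x) + move_weight * abs(x - prev)
        prev = x
    return total


# Hypothetical per-period optima that oscillate, e.g. as conditions change.
targets = [0, 4, 0, 4]
hitting_costs = [lambda x, t=t: (x - t) ** 2 for t in targets]

greedy = list(targets)    # chase each period's optimum, paying to move every time
lazy = [2, 2, 2, 2]       # settle on a compromise and stay there

greedy_total = smoothed_cost(greedy, hitting_costs, initial=0, move_weight=2.0)
lazy_total = smoothed_cost(lazy, hitting_costs, initial=0, move_weight=2.0)
print(greedy_total, lazy_total)
```
]]></eg>
<p>In this toy instance the compromise sequence beats chasing every per-period optimum once movement is penalized heavily enough, which is the effect that makes repeated server relocation a sequential decision problem rather than a series of independent ones.</p></div>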
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">CONCLUDING REMARKS</head><p>In this paper, we explore potential opportunities enabled by the increasing hardware heterogeneity and introduce the novel concept of GSR. By relocating older and less energy-efficient servers to regions with more renewables, better water efficiencies and/or lower electricity prices, GSR can substantially reduce the total operational environmental footprints and operational costs of AI computing. We also discuss the major challenges of GSR, including services impacted by GSR, software management, and optimization algorithms.</p><p>Being complementary to the well-studied geographic workload balancing, GSR represents an untapped knob that holds great potential to cut AI's enormous operational energy cost, carbon emissions, and/or water consumption. The challenges of implementing GSR can potentially define future research directions to realize the full potential of GSR.</p></div><note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_0"><p>Volume 4 Issue 5, December 2024</p></note>
		</body>
		</text>
</TEI>
