Creators/Authors contains: "Shenoy, Prashant"


  1. Free, publicly-accessible full text available June 4, 2025
  2. WattScope is a system for non-intrusively estimating the power consumption of individual applications using external measurements of a server's aggregate power usage and without requiring direct access to the server's operating system or applications. Our key insight is that, based on an analysis of production traces, the power characteristics of datacenter workloads, e.g., low variability, low magnitude, and high periodicity, are highly amenable to disaggregation of a server's total power consumption into application-specific values. WattScope adapts and extends a machine learning-based technique for disaggregating building power and applies it to server- and rack-level power measurements that are already available in datacenters. We evaluate WattScope's accuracy on a production workload and show that it yields high accuracy, e.g., often <∼10% normalized mean absolute error, and is thus a potentially useful tool for datacenters in externally monitoring application-level power usage.

     
    Free, publicly-accessible full text available February 22, 2025
  3. Traditionally, multi-tenant cloud and edge platforms use fair-share schedulers to fairly multiplex resources across applications. These schedulers ensure applications receive processing time proportional to a configurable share of the total time. Unfortunately, enforcing time-fairness across applications often violates energy-fairness, such that some applications consume more than their fair share of energy. This occurs because applications either do not fully utilize their resources or operate at a reduced frequency/voltage during their time-slice. The problem is particularly acute for machine learning (ML) applications using GPUs, where model size largely dictates utilization and energy usage. Enforcing energy-fairness is also important since energy is a costly and limited resource. For example, in cloud platforms, energy dominates operating costs and is limited by the power delivery infrastructure, while in edge platforms, energy is often scarce and limited by energy harvesting and battery constraints. To address the problem, we define the notion of Energy-Time Fairness (ETF), which enables a configurable tradeoff between energy and time fairness, and then design a scheduler that enforces it. We show that ETF satisfies many well-accepted fairness properties. ETF and the new tradeoff it offers are important, as some applications, especially ML models, are time/latency-sensitive and others are energy-sensitive. Thus, while enforcing pure energy-fairness starves time/latency-sensitive applications (of time) and enforcing pure time-fairness starves energy-sensitive applications (of energy), ETF is able to mind the gap between the two. We implement an ETF scheduler, and show that it improves fairness by up to 2x, incentivizes energy efficiency, and exposes a configurable knob to operate between energy- and time-fairness. 
    Free, publicly-accessible full text available December 6, 2024
  4. Cloud platforms are increasing their emphasis on sustainability and reducing their operational carbon footprint. A common approach for reducing carbon emissions is to exploit the temporal flexibility inherent to many cloud workloads by executing them in periods with the greenest energy and suspending them at other times. Since such suspend-resume approaches can incur long delays in job completion times, we present a new approach that exploits the elasticity of batch workloads in the cloud to optimize their carbon emissions. Our approach is based on the notion of carbon scaling, similar to cloud autoscaling, where a job dynamically varies its server allocation based on fluctuations in the carbon cost of the grid's energy. We develop a greedy algorithm for minimizing a job's carbon emissions via carbon scaling that is based on the well-known problem of marginal resource allocation. We implement a CarbonScaler prototype in Kubernetes using its autoscaling capabilities and an analytic tool to guide the carbon-efficient deployment of batch applications in the cloud. We then evaluate CarbonScaler using real-world machine learning training and MPI jobs on a commercial cloud platform and show that it can yield i) 51% carbon savings over carbon-agnostic execution; ii) 37% over a state-of-the-art suspend-resume policy; and iii) 8% over the best static scaling policy.

     
    Free, publicly-accessible full text available December 7, 2024
  5. Datacenter capacity is growing exponentially to satisfy the increasing demand for many emerging computationally-intensive applications, such as deep learning. This trend has led to concerns over datacenters’ increasing energy consumption and carbon footprint. The most basic prerequisite for optimizing a datacenter’s energy- and carbon-efficiency is accurately monitoring and attributing energy consumption to specific users and applications. Since datacenter servers tend to be multi-tenant, i.e., they host many applications, server- and rack-level power monitoring alone does not provide insight into the energy usage and carbon emissions of their resident applications. At the same time, current application-level energy monitoring and attribution techniques are intrusive: they require privileged access to servers and necessitate coordinated support in hardware and software, neither of which is always possible in cloud environments. To address the problem, we design WattScope, a system for non-intrusively estimating the power consumption of individual applications using external measurements of a server’s aggregate power usage and without requiring direct access to the server’s operating system or applications. Our key insight is that, based on an analysis of production traces, the power characteristics of datacenter workloads, e.g., low variability, low magnitude, and high periodicity, are highly amenable to disaggregation of a server’s total power consumption into application-specific values. WattScope adapts and extends a machine learning-based technique for disaggregating building power and applies it to server- and rack-level power meter measurements that are already available in datacenters. We evaluate WattScope’s accuracy on a production workload and show that it yields high accuracy, e.g., often <∼10% normalized mean absolute error, and is thus a potentially useful tool for datacenters in externally monitoring application-level power usage.
    Free, publicly-accessible full text available November 1, 2024
  6. We introduce and study the online pause and resume problem. In this problem, a player attempts to find the k lowest (alternatively, highest) prices in a sequence of fixed length T, which is revealed sequentially. At each time step, the player is presented with a price and decides whether to accept or reject it. The player incurs a switching cost whenever their decision changes in consecutive time steps, i.e., whenever they pause or resume purchasing. This online problem is motivated by the goal of carbon-aware load shifting, where a workload may be paused during periods of high carbon intensity and resumed during periods of low carbon intensity and incurs a cost when saving or restoring its state. It has strong connections to existing problems studied in the literature on online optimization, though it introduces unique technical challenges that prevent the direct application of existing algorithms. Extending prior work on threshold-based algorithms, we introduce double-threshold algorithms for both the minimization and maximization variants of this problem. We further show that the competitive ratios achieved by these algorithms are the best achievable by any deterministic online algorithm. Finally, we empirically validate our proposed algorithm through case studies on the application of carbon-aware load shifting using real carbon trace data and existing baseline algorithms.

     
    Free, publicly-accessible full text available December 7, 2024
  7. To reduce their environmental impact, cloud datacenters are increasingly focused on optimizing applications' carbon-efficiency, or work done per mass of carbon emitted. To facilitate such optimizations, we present Carbon Containers, a simple system-level facility, which extends prior work on power containers, that automatically regulates applications' carbon emissions in response to variations in both their workload's intensity and their energy's carbon-intensity. Specifically, Carbon Containers enable applications to specify a maximum carbon emissions rate (in g.CO2e/hr), and then transparently enforce this rate via a combination of vertical scaling, container migration, and suspend/resume while maximizing either energy-efficiency or performance. Carbon Containers are especially useful for applications that i) must continue running even during high-carbon periods, and ii) execute in regions with few variations in carbon-intensity. These low-variability regions also tend to have high average carbon-intensity, which increases the importance of regulating carbon emissions. We implement a Carbon Container prototype by extending Linux Containers to incorporate the mechanisms above and evaluate it using real workload traces and carbon-intensity data from multiple regions. We compare Carbon Containers with prior work that regulates carbon emissions by suspending/resuming applications during high/low carbon periods. We show that Carbon Containers are more carbon-efficient and improve performance while maintaining similar carbon emissions.
    Free, publicly-accessible full text available October 30, 2024
  8. The impact of human activity on the climate is a major global challenge that affects human well-being. Buildings are a major source of energy consumption and carbon emissions worldwide, especially in advanced economies such as the United States. As a result, making grids and buildings sustainable by reducing their carbon emissions is emerging as an important step toward societal decarbonization and improving overall human well-being. While prior work on demand response methods in power grids and buildings has targeted peak shaving and price arbitrage in response to price signals, it has not explicitly targeted carbon emission reductions. In this paper, we analyze the flexibility of building loads to quantify the upper limit on their potential to reduce carbon emissions, assuming perfect knowledge of future demand and carbon intensity. Our analysis leverages real-world demand patterns from 1000+ buildings and carbon-intensity traces from multiple regions. It shows that by manipulating the demand patterns of electric vehicles, heating, ventilation, and cooling (HVAC) systems, and battery storage, we can reduce carbon emissions by 26.93% on average and by 54.90% at maximum. Our work advances the understanding of sustainable infrastructure by highlighting the potential for infrastructure design and interventions to significantly reduce carbon footprints, benefiting human well-being. 
    Free, publicly-accessible full text available November 15, 2024
  9. Free, publicly-accessible full text available July 9, 2024
  10. Continued advances in technology have led to falling costs and a dramatic increase in the aggregate amount of solar capacity installed across the world. A drawback of increased solar penetration is the potential for supply-demand mismatches in the grid due to the intermittent nature of solar generation. While energy storage can be used to mask such problems, we argue that there is also a need to explicitly control the rate of solar generation of each solar array in order to achieve high penetration while also handling supply-demand mismatches. To address this issue, we present the notion of smart solar arrays that can actively modulate their solar output based on the notion of proportional fairness. We present a decentralized algorithm based on Lagrangian optimization that enables each smart solar array to make local decisions on its fair share of solar power it can inject into the grid and then present a sense-broadcast-respond protocol to implement our decentralized algorithm in smart solar arrays. We also study the benefits of using energy storage when rate-controlling solar generation. To do so, we present a decentralized algorithm to charge and discharge batteries for each smart solar array. Our evaluation on a city-scale dataset shows that our approach enables 2.6× more solar penetration while causing smart arrays to reduce their output by as little as 12.4%. By employing an adaptive gradient approach, our decentralized algorithm has 3 to 30× faster convergence. Finally, we demonstrate that energy storage can help net-meter more solar energy while ensuring fairness and grid constraints are met.
    Free, publicly-accessible full text available June 28, 2024
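
To illustrate the carbon scaling idea described in entry 4, the following is a minimal Python sketch of greedy marginal allocation. It is not the CarbonScaler implementation. The function names, the per-server marginal work values, the carbon-intensity numbers, and the assumption that every server uses one unit of energy per time slot are all hypothetical and chosen only to make the example small.

```python
def carbon_scale(carbon, marginal_work, total_work):
    """Greedily choose how many servers to run in each time slot.

    carbon[t]        : assumed carbon cost of one server in slot t
    marginal_work[j] : assumed extra work from adding the (j+1)-th server
                       to a slot; must be non-increasing (diminishing returns)
    total_work       : work the job must complete
    """
    if any(a < b for a, b in zip(marginal_work, marginal_work[1:])):
        raise ValueError("marginal_work must be non-increasing")

    # Every (slot, server index) pair is a candidate unit of allocation,
    # ranked by work gained per unit of carbon emitted. Ties are broken by
    # server index so a slot's j-th server is never taken before its (j-1)-th.
    candidates = sorted(
        (-marginal_work[j] / carbon[t], j, t)
        for t in range(len(carbon))
        for j in range(len(marginal_work))
    )

    alloc = [0] * len(carbon)
    done = 0.0
    for _, j, t in candidates:
        if done >= total_work:
            break
        alloc[t] += 1
        done += marginal_work[j]

    if done < total_work:
        raise ValueError("job cannot finish within the given slots")
    return alloc


def emissions(alloc, carbon):
    """Total carbon under the one-energy-unit-per-server-slot assumption."""
    return sum(n * c for n, c in zip(alloc, carbon))


# Hypothetical inputs: three time slots, up to three servers per slot.
carbon = [100, 300, 200]
marginal_work = [10, 6, 2]
schedule = carbon_scale(carbon, marginal_work, total_work=24)
print(schedule, emissions(schedule, carbon))
```

With these made-up inputs the sketch runs two servers in the cleanest slot, none in the dirtiest, and one in the remaining slot. A static one-server-per-slot schedule would also finish the job, but at a higher carbon cost under the same assumptions.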