Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
- Free, publicly-accessible full text available May 6, 2026
- Reducing tail latency has become a crucial issue in optimizing the performance of online cloud services and distributed applications. In distributed applications, there are many causes of high end-to-end tail latency, including operating system delays, request re-ordering due to fan-out/fan-in, and network congestion. Although recent research has focused on reducing tail latency for individual application components, such as by replicating requests and scheduling, in this paper we argue for a holistic approach that reduces end-to-end tail latency across application components. We propose TailClipper, a distributed scheduler that tags each arriving request with an arrival timestamp and propagates it across the microservices' call chain. TailClipper then uses these arrival timestamps to implement an oldest-request-first scheduler that combines global first-come, first-served ordering with a limited form of processor sharing to reduce end-to-end tail latency. In doing so, TailClipper counters the performance degradation caused by request reordering in multi-tiered and microservices-based applications. We implement TailClipper as a userspace Linux scheduler and evaluate it using cloud workload traces and a real-world microservices application. Compared to state-of-the-art schedulers, our experiments show that TailClipper improves the 99th percentile response time by up to 81%, while also improving the mean response time and system throughput by up to 54% and 29%, respectively, under high loads.
  Free, publicly-accessible full text available November 20, 2025
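The scheduling idea in the abstract above lends itself to a compact illustration. Below is a minimal Python sketch of an oldest-request-first queue, assuming the front end stamps each request once and downstream services reuse the propagated timestamp; the names (`OldestRequestFirstQueue`, `enqueue`, `dequeue`) are hypothetical, and TailClipper's limited processor-sharing component is omitted.

```python
import heapq
import itertools
import time

class OldestRequestFirstQueue:
    """Illustrative sketch of oldest-request-first scheduling: every
    request carries the timestamp of its original arrival at the
    application's front end, and each service dequeues the request
    with the globally oldest arrival time first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker for equal timestamps

    def enqueue(self, request, arrival_ts=None):
        # Stamp the request on first arrival; downstream services pass
        # the propagated timestamp instead of stamping their own.
        if arrival_ts is None:
            arrival_ts = time.monotonic()
        heapq.heappush(self._heap, (arrival_ts, next(self._counter), request))
        return arrival_ts  # propagate this in the RPC metadata

    def dequeue(self):
        arrival_ts, _, request = heapq.heappop(self._heap)
        return arrival_ts, request
```

Because the queue is keyed on the original arrival time rather than the local one, global first-come, first-served ordering is preserved even after fan-out/fan-in reshuffles requests across services.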
- Cloud platforms’ rapid growth raises significant concerns about their electricity consumption and resulting carbon emissions. Power capping is a known technique for limiting the power consumption of the data centers where workloads are hosted. Today’s data center clusters co-locate latency-sensitive web workloads and throughput-oriented batch workloads. When power capping is necessary, it is ideal to throttle only the batch tasks without restricting the latency-sensitive web workloads, because Service-Level Objectives (SLOs) demand guaranteed low response times for the latter. This paper proposes PADS, a hardware-agnostic, workload-aware power capping system. Because it does not rely on hardware mechanisms such as RAPL or DVFS, it can keep the power consumption of clusters with heterogeneous architectures, such as x86 and ARM, below the enforced power limit while minimizing the impact on latency-sensitive tasks. It uses an application-performance model of both latency-sensitive and batch workloads to ensure power safety with controllable performance. Our power capping technique uses diagonal scaling and relies on the control group (cgroup) feature of the Linux kernel. Our results indicate that PADS is highly effective at reducing power while respecting the tail latency requirements of latency-sensitive workloads. Furthermore, compared to state-of-the-art solutions, PADS achieves lower P95 latency along with 90% higher effectiveness in respecting power limits.
  Free, publicly-accessible full text available November 2, 2025
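To make the cgroup-based mechanism concrete, here is an illustrative sketch that throttles only batch cgroups through the standard cgroup v2 `cpu.max` interface, using a simple proportional controller in place of PADS's performance-model-driven diagonal scaling; the function names, controller gain, and bounds are assumptions, not PADS's actual design.

```python
import pathlib

CGROUP_ROOT = pathlib.Path("/sys/fs/cgroup")  # cgroup v2 hierarchy
PERIOD_US = 100_000                           # default cpu.max period

def read_power_watts() -> float:
    """Placeholder: PADS is hardware-agnostic, so cluster power would
    come from external telemetry rather than RAPL."""
    raise NotImplementedError

def set_cpu_quota(batch_group: str, fraction: float) -> None:
    # Throttle a batch cgroup to `fraction` of a CPU by writing the
    # standard "quota period" pair to the cgroup v2 cpu.max file.
    quota_us = max(int(fraction * PERIOD_US), 1_000)
    (CGROUP_ROOT / batch_group / "cpu.max").write_text(f"{quota_us} {PERIOD_US}\n")

def capping_step(power_cap_w: float, batch_groups: list[str],
                 fraction: float) -> float:
    # One control iteration: shrink batch quotas when power exceeds the
    # cap, restore them when there is headroom. Latency-sensitive
    # cgroups are never touched, so their SLOs are unaffected.
    error = (read_power_watts() - power_cap_w) / power_cap_w
    fraction = min(max(fraction - 0.5 * error, 0.05), 1.0)
    for group in batch_groups:
        set_cpu_quota(group, fraction)
    return fraction
```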
- The types of human activities occupants engage in within indoor spaces significantly contribute to the spread of airborne diseases by emitting aerosol particles. Today, ubiquitous computing technologies can inform users of common atmospheric pollutants affecting indoor air quality. However, they remain uninformed of the rate of aerosol generated directly by human respiratory activities, a fundamental parameter affecting the risk of airborne transmission. In this paper, we present AeroSense, a novel privacy-preserving approach that uses audio sensing to accurately predict the rate of aerosol generation by detecting the kinds of human respiratory activities and determining their loudness. Our system adopts privacy-first as a key design choice: it only extracts audio features that cannot be reconstructed into human-audible signals, using two omnidirectional microphone arrays. We employ a combination of binary classifiers based on the Random Forest algorithm to detect simultaneous occurrences of activities, with an average recall of 85%. The system determines the loudness level of each detected activity by estimating the distance between the microphone and the activity source; this level estimation technique yields an average error of 7.74%. Additionally, we developed a lightweight mask detection classifier to detect mask-wearing, which yields a recall of 75%. These intermediary outputs are the critical predictors AeroSense needs to estimate the amount of aerosol generated by an active human source. Our aerosol prediction model is a Random Forest regression model, which yields an MSE of 2.34 and an r² of 0.73. We demonstrate the accuracy of AeroSense by validating our results in a cleanroom setup using advanced microbiological technology, and we present results on its efficacy in natural settings through controlled and in-the-wild experiments. The ability to estimate aerosol emissions from detected human activities is part of a broader indoor air system integration, which can capture the rate of aerosol dissipation and inform users of airborne transmission risks in real time.
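The pipeline of binary activity detectors feeding a regression model can be sketched roughly as follows, assuming scikit-learn and illustrative activity labels; the real AeroSense operates on non-reconstructible audio features from two microphone arrays, which this sketch abstracts into a generic feature matrix.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

ACTIVITIES = ["breathing", "speaking", "coughing"]  # illustrative labels

def train_detectors(X: np.ndarray, Y: np.ndarray) -> list:
    # One binary Random Forest per activity, so simultaneous activities
    # are detected independently; Y has shape (n_samples, n_activities).
    return [RandomForestClassifier(n_estimators=100).fit(X, Y[:, i])
            for i in range(len(ACTIVITIES))]

def intermediary_predictors(detectors, x: np.ndarray, loudness: float) -> np.ndarray:
    # Per-activity detections plus the estimated loudness form the
    # intermediary predictors for the aerosol-rate regressor.
    flags = [clf.predict(x.reshape(1, -1))[0] for clf in detectors]
    return np.array(flags + [loudness], dtype=float).reshape(1, -1)

# Downstream, a Random Forest regression model trained on these
# intermediary predictors estimates the aerosol generation rate:
#   regressor = RandomForestRegressor().fit(F_train, rates_train)
#   rate = regressor.predict(intermediary_predictors(detectors, x, loudness))
```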
- Cloud platforms are increasing their emphasis on sustainability and reducing their operational carbon footprint. A common approach for reducing carbon emissions is to exploit the temporal flexibility inherent in many cloud workloads by executing them in periods with the greenest energy and suspending them at other times. Since such suspend-resume approaches can incur long delays in job completion times, we present a new approach that exploits the elasticity of batch workloads in the cloud to optimize their carbon emissions. Our approach is based on the notion of carbon scaling, similar to cloud autoscaling, where a job dynamically varies its server allocation based on fluctuations in the carbon cost of the grid's energy. We develop a greedy algorithm for minimizing a job's carbon emissions via carbon scaling that is based on the well-known problem of marginal resource allocation. We implement a CarbonScaler prototype in Kubernetes using its autoscaling capabilities, along with an analytic tool to guide the carbon-efficient deployment of batch applications in the cloud. We then evaluate CarbonScaler using real-world machine learning training and MPI jobs on a commercial cloud platform and show that it can yield i) 51% carbon savings over carbon-agnostic execution; ii) 37% over a state-of-the-art suspend-resume policy; and iii) 8% over the best static scaling policy.
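Carbon scaling reduces to the classic marginal resource allocation problem, which admits a simple greedy solution. Below is a minimal sketch, assuming a concave scaling profile `throughput[k]` (work completed per time slot with k servers, `throughput[0] == 0`) and a carbon-intensity forecast per slot; the interface is illustrative, not CarbonScaler's actual API.

```python
import heapq

def carbon_scale(work: float, intensity: list, throughput: list,
                 max_servers: int) -> list:
    # Greedy marginal allocation: repeatedly buy the server-slot with
    # the lowest carbon cost per unit of additional work until the
    # job's total work is covered.
    alloc = [0] * len(intensity)                 # servers per time slot
    heap = [(ci / (throughput[1] - throughput[0]), t)
            for t, ci in enumerate(intensity)]   # carbon per unit work
    heapq.heapify(heap)
    done = 0.0
    while done < work and heap:
        _, t = heapq.heappop(heap)
        k = alloc[t]
        done += throughput[k + 1] - throughput[k]
        alloc[t] = k + 1
        if k + 1 < max_servers:                  # re-offer next increment
            gain = throughput[k + 2] - throughput[k + 1]
            if gain > 0:
                heapq.heappush(heap, (intensity[t] / gain, t))
    return alloc

# Example: with a cheap middle slot and diminishing returns, the greedy
# allocation concentrates servers in the low-carbon slot first:
#   carbon_scale(10.0, [300, 120, 450], [0, 4, 7, 9], 3)  ->  [1, 3, 0]
```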