NSF PAR Search | NSF Public Access Repository

Apiary: An OS for the Modern FPGA

https://doi.org/10.1145/3713082.3730385

Lim, Katie; Giordano, Matthew; Zhang, Irene; Kasikci, Baris; Anderson, Thomas (May 2025, ACM)

Free, publicly-accessible full text available May 14, 2026
Beehive: A Flexible Network Stack for Direct-Attached Accelerators

https://doi.org/10.1109/MICRO61859.2024.00037

Lim, Katie; Giordano, Matthew; Stavrinos, Theano; Zhang, Irene; Nelson, Jacob; Kasikci, Baris; Anderson, Thomas (November 2024, IEEE)

Free, publicly-accessible full text available November 2, 2025
EMPower: The Case for a Cloud Power Control Plane

Park, Jonggyu; Stavrinos, Theano; Peter, Simon; Anderson, Thomas (July 2024, HotCarbon; ACM Energy Informatics Review)

Escalating application demand and the end of Dennard scaling have put energy management at the center of cloud operations. Because of the huge cost and long lead time of provisioning new data centers, operators want to squeeze as much use out of existing data centers as possible, often limited by power provisioning fixed at the time of construction. Workload demand spikes and the inherent variability of renewable energy, as well as increased power unreliability from extreme weather events and natural disasters, make the data center power management problem even more challenging. We believe it is time to build a power control plane to provide fine-grained observability and control over data center power to operators. Our goal is to help make data centers substantially more elastic with respect to dynamic changes in energy sources and application needs, while still providing good performance to applications. There are many use cases for cloud power control, including increased power oversubscription and use of green energy, resilience to power failures, large-scale power demand response, and improved energy efficiency.
Full Text Available
Can Storage Devices be Power Adaptive?

https://doi.org/10.1145/3655038.3665945

Xie, Dedong; Stavrinos, Theano; Zhu, Kan; Peter, Simon; Kasikci, Baris; Anderson, Thomas (July 2024, ACM)

Power is becoming a scarce resource for data centers, raising the need for power adaptive system design—the ability to dynamically change power consumption—to match available power. Storage makes up an increasing fraction of total data center power consumption. As such, it holds great potential to contribute to data center power adaptivity. To this end, we conduct a measurement study of power control mechanisms on a variety of modern data center storage devices. By changing device power states and shaping IO, we achieve a power dynamic range of up to 59.4% of the device’s maximum operating power. We also study power control trade-offs, including throughput and latency. Based on our observations, we construct storage device power-throughput models and discuss the implications on power adaptive storage system design.
Full Text Available
Splitwise: Efficient Generative LLM Inference Using Phase Splitting

https://doi.org/10.1109/ISCA59077.2024.00019

Patel, Pratyush; Choukse, Esha; Zhang, Chaojie; Shah, Aashaka; Goiri, Íñigo; Maleki, Saeed; Bianchini, Ricardo (June 2024, IEEE)

Full Text Available
Zoomie: A Software-like Debugging Tool for FPGAs

https://doi.org/10.1145/3620666.3651356

Wei, Tianrui; Laeufer, Kevin; Lim, Katie; Zhao, Jerry; Sen, Koushik; Balkind, Jonathan; Asanovic, Krste (April 2024, Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3)

FPGA prototyping has long been an indispensable technique in pre-silicon verification as well as enabling early-stage software development. FPGAs themselves have also gained popularity as hardware accelerators deployed in datacenters. However, FPGA development brings a plethora of problems. These issues constitute a high barrier towards mass adoption of agile development surrounding FPGA-based projects. To address these problems, we have built Zoomie for fast incremental compilation, reusing verification infrastructure, and a software-inspired approach towards open-source emulation. We show that Zoomie achieves 18× speedup over the vendor toolchain in incremental compilation time for million-gate designs. At the same time, Zoomie also provides a software-like debugging experience with breakpoints, stepping the design, and forcing values in a running design.
Full Text Available
Characterizing Power Management Opportunities for LLMs in the Cloud

https://doi.org/10.1145/3620666.3651329

Patel, Pratyush; Choukse, Esha; Zhang, Chaojie; Goiri, Íñigo; Warrier, Brijesh; Mahalingam, Nithish; Bianchini, Ricardo (April 2024, Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3)

Recent innovation in large language models (LLMs) and their myriad use cases have rapidly driven up the compute demand for datacenter GPUs. Several cloud providers and other enterprises plan to substantially grow their datacenter capacity to support these new workloads. A key bottleneck resource in datacenters is power, which LLMs are quickly saturating due to their rapidly increasing model sizes. We extensively characterize the power consumption patterns of a variety of LLMs and their configurations. We identify the differences between the training and inference power consumption patterns. Based on our analysis, we claim that the average and peak power utilization in LLM inference clusters should not be very high. Our deductions align with data from production LLM clusters, revealing that inference workloads offer substantial headroom for power oversubscription. However, the stringent set of telemetry and controls that GPUs offer in a virtualized environment make it challenging to build a reliable and robust power management framework. We leverage the insights from our characterization to identify opportunities for better power management. As a detailed use case, we propose a new framework called POLCA, which enables power oversubscription in LLM inference clouds. POLCA is robust, reliable, and readily deployable. Using open-source models to replicate the power patterns observed in production, we simulate POLCA and demonstrate that we can deploy 30% more servers in existing clusters with minimal performance loss.
Full Text Available
An Agile Pathway Towards Carbon-aware Clouds

https://doi.org/10.1145/3604930.3605722

Patel, Pratyush; Gregersen, Theo; Anderson, Thomas (July 2023, HotCarbon '23: Proceedings of the 2nd Workshop on Sustainable Computer Systems)

Climate change is a pressing threat to planetary well-being that can be addressed only by rapid near-term actions across all sectors. Yet, the cloud computing sector, with its increasingly large carbon footprint, has initiated only modest efforts to reduce emissions to date; its main approach today relies on cloud providers sourcing renewable energy from a limited global pool of options. We investigate how to accelerate cloud computing's efforts. Our approach tackles carbon reduction from a software standpoint by gradually integrating carbon awareness into the cloud abstraction. Specifically, we identify key bottlenecks to software-driven cloud carbon reduction, including (1) the lack of visibility and disaggregated control between cloud providers and users over infrastructure and applications, (2) the immense overhead presently incurred by application developers to implement carbon-aware application optimizations, and (3) the increasing complexity of carbon-aware resource management due to renewable energy variability and growing hardware heterogeneity. To overcome these barriers, we propose an agile approach that federates the responsibility and tools to achieve carbon awareness across different cloud stakeholders. As a key first step, we advocate leveraging the role of application operators in managing large-scale cloud deployments and integrating carbon efficiency metrics into their cloud usage workflow. We discuss various techniques to help operators reduce carbon emissions, such as carbon budgets, service-level visibility into emissions, and configurable-yet-centralized resource management optimizations.
Full Text Available
The Case of Unsustainable CPU Affinity

https://doi.org/10.1145/3604930.3605706

Zhao, Jiechen; Lim, Katie; Anderson, Thomas; Enright Jerger, Natalie (July 2023, Proc. 2nd ACM Workshop on Hot Topics in Sustainable Computing Systems (HotCarbon’23))

CPU affinity reduces data copies and improves data locality and has become a prevalent technique for high-performance programs in datacenters. This paper explores the tension between CPU affinity and sustainability. In particular, affinity settings can lead to significant uneven aging of cores on a CPU. We observe that infrastructure threads, used in a wide spectrum of network, storage, and virtualization sub-systems, exercise their affinitized cores up to 23× more when compared to typical μs-scale application threads. In addition, we observe that the affinitized infrastructure threads generate regional heat hot spots and preclude CPUs from being used with the expected lifetime. Finally, we discuss design options to tackle the unbalanced core-aging problem to improve the overall sustainability of CPUs and call for more attention to sustainability-aware affinity and mitigation of such problems.
Full Text Available
Towards Improved Power Management in Cloud GPUs

https://doi.org/10.1109/LCA.2023.3278652

Patel, Pratyush; Gong, Zibo; Rizvi, Syeda; Choukse, Esha; Misra, Pulkit; Anderson, Thomas; Sriraman, Akshitha (July 2023, IEEE Computer Architecture Letters)

As modern server GPUs are increasingly power intensive, better power management mechanisms can significantly reduce the power consumption, capital costs, and carbon emissions in large cloud datacenters. This letter uses diverse datacenter workloads to study the power management capabilities of modern GPUs. We find that current GPU management mechanisms have limited compatibility and monitoring support under cloud virtualization. They have sub-optimal, imprecise, and non-intuitive implementations of Dynamic Voltage and Frequency Scaling (DVFS) and power capping. Consequently, efficient GPU power management is not widely deployed in clouds today. To address these issues, we make actionable recommendations for GPU vendors and researchers.
Full Text Available
