Smart phones have revolutionized the availability of computing to the consumer. Recently, smart phones have been aggressively integrating artificial intelligence (AI) capabilities into their devices. The custom designed processors for the latest phones integrate incredibly capable and energy efficient graphics processors (GPUs) and tensor processors (TPUs) to accommodate this emerging AI workload and on-device inference. Unfor- tunately, smart phones are far from sustainable and have a substantial carbon footprint that continues to be dominated by environmental impacts from their manufacture and far less so by the energy required to power their operation. In this paper we explore the possibility of reversing the trend to increase the dedicated silicon dedicated to emerging application workloads in the phone. Instead we consider how in-memory processing using the DRAM already present in the phone could be used in place of dedicated GPU/TPU devices for AI inference. We explore the potential savings in embodied carbon that could be possible with this tradeoff and provide some analysis of the potential of in- memory computing to compete with these accelerators. While it may not be possible to achieve the same throughput, we suggest that the responsiveness to the user may be sufficient using in- memory computing, while both the embodied and operational carbon footprints could be improved. Our approach can save circa 10–15kgCO2e.
more »
« less
Amortizing Embodied Carbon Across Generations
Data centers have been relying on renewable energy integration coupled with energy efficient specialized processing units and accelerators to increase sustainability. Unfortunately, the carbon generated from manufacturing these systems is be- coming increasingly relevant due to these energy decarbonization and efficiency improvements. Furthermore, it is less clear how to mitigate this aspect of embodied carbon. As workloads continue to evolve over each hardware generation we explore the tradeoffs of fabricating new application-tuned hardware compared with more general solutions such as Field Programmable Gate Arrays (FPGAs). We also explore how REFRESH FPGAs can amortize embodied carbon investments from previous generations to meet the requirements of future generations workloads.
more »
« less
- PAR ID:
- 10553765
- Publisher / Repository:
- IEEE
- Date Published:
- ISSN:
- 2993-2084
- ISBN:
- 979-8-3315-0786-2
- Subject(s) / Keyword(s):
- sustainable computing embodied carbon cloud data center hardware accelerators machine learning
- Format(s):
- Medium: X
- Location:
- Austin, Texas
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
FPGAs offer compelling acceleration opportunities for modern applications. However compilation for FPGAs is painfully slow, potentially requiring hours or longer. We approach this problem with a solution from the software domain: the use of a JIT. Code is executed immediately in a software simulator, and compilation is performed in the background. When finished, the code is moved into hardware, and from the user's perspective it simply gets faster. We have embodied these ideas in Cascade: the first JIT compiler for Verilog. Cascade reduces the time between initiating compilation and running code to less than a second, and enables generic printf debugging from hardware. Cascade preserves program performance to within 3× in a debugging environment, and has minimal effect on a finalized design. Crucially, these properties hold even for programs that perform side effects on connected IO devices. A user study demonstrates the value to experts and non-experts alike: Cascade encourages more frequent compilation, and reduces the time to produce working hardware designs.more » « less
-
Embodied carbon has been widely reported as a significant component in the full system lifecycle of various computing systems green house gas emissions. Many efforts have been undertaken to quantify the elements that comprise this embodied carbon, from tools that evaluate semiconductor manufacturing to those that can quantify different elements of the computing system from commercial and academic sources. However, these tools cannot easily reproduce results reported by server vendors' product carbon reports and the accuracy can vary substantially due to various assumptions. Furthermore, attempts to determine green house gas contributions using bottom-up methodologies often do not agree with system-level studies and are hard to rectify. Nonetheless, given there is a need to consider all contributions to green house gas emissions in datacenters, we propose SCARIF, the Server Carbon including Accelerator Reporter with Intelligence-based Formulation tool. SCARIF has three main contributions: (1) We first collect reported carbon cost data from server vendors and design statistic models to predict the embodied carbon cost so that users can get the embodied carbon cost for their server configurations. (2) We provide embodied carbon cost if users configure servers with accelerators including GPUs, and FPGAs. (3) By using case studies, we show that certain design choices of data center management might flip by the insight and observation from using SCARIF. Thus, SCARIF provides an opportunity for large-scale datacenter and hyperscaler design. We release SCARIF as an open-source tool at https://github.com/arc-research-lab/SCARIF.more » « less
-
Field-programmable gate arrays (FPGAs) have largely been used in communication and high-performance computing, and given the recent advances in big data, machine learning and emerging trends in cloud computing (e.g., serverless [1]), FPGAs are increasingly being introduced into these domains (e.g., Microsoft’s datacenters [2] and Amazon Web Services [3]). To address these domains’ processing needs, recent research has focused on using FPGAs to accelerate workloads, ranging from analytics and machine learning to databases and network function virtualization. In this paper, we present a high-performance FPGA-as-a-microservice (FaaM) architecture for the cloud. We discuss some of the technical challenges and propose several solutions for efficiently integrating FPGAs into virtualized environments. Our case study deploying a multi-threaded, multi-user compression as a microservice using FaaM indicates that microservices-based FPGA acceleration can sustain high-performance as compared to a straightforward CPU implementation with minimal to no communication overhead despite the hardware abstraction.more » « less
-
Field-programmable gate arrays (FPGAs) have largely been used in communication and high-performance computing and given the recent advances in big data and emerging trends in cloud computing (e.g., serverless [18]), FPGAs are increasingly being introduced into these domains (e.g., Microsoft’s datacenters [6] and Amazon Web Services [10]). To address these domains’ processing needs, recent research has focused on using FPGAs to accelerate workloads, ranging from analytics and machine learning to databases and network function virtualization. In this paper, we present an ongoing effort to realize a high-performance FPGA-as-a-microservice (FaaM) architecture for the cloud. We discuss some of the technical challenges and propose several solutions for efficiently integrating FPGAs into virtualized environments. Our case study deploying a multithreaded, multi-user compression as a microservice using the FaaM architecture indicate that microservices-based FPGA acceleration can sustain high-performance compared to straightforward implementation with minimal to no communication overhead despite the hardware abstraction.more » « less
An official website of the United States government

