Search for: All records

Award ID contains: 1652132

« Prev Next »

Total Resources

12

Resource Type
Conference Paper

8

Conference Proceeding

0

Dataset

0

Journal Article

4

Workshop Report

0

Availability
Full Text / Resource Available

12

Citation Only

0

Save Results
Excel (limit 2000)
CSV (limit 5000)
XML (limit 5000)

Have feedback or suggestions for a way to improve these results?
!

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

CAMDNN: Content-Aware Mapping of a Network of Deep Neural Networks on Edge MPSoCs

https://doi.org/10.1109/TC.2022.3207137

Heidari, Soroush ; Ghasemi, Mehdi ; Kim, Young Geun ; Wu, Carole-Jean ; Vrudhula, Sarma ( December 2022 , IEEE Transactions on Computers)

Full Text Available
Understanding the Power of Evolutionary Computation for GPU Code Optimization

https://doi.org/10.1109/IISWC55918.2022.00025

Liou, Jhe-Yu ; Awan, Muaaz ; Hofmeyr, Steven ; Forrest, Stephanie ; Wu, Carole-Jean ( November 2022 , IEEE International Symposium on Workload Characterization (IISWC))

Full Text Available
FedGPO: Heterogeneity-Aware Global Parameter optimization for Efficient Federated Learning

https://doi.org/10.1109/IISWC55918.2022.00020

Kim, Young Geun ; Wu, Carole-Jean ( November 2022 , 2022 IEEE International Symposium on Workload Characterization (IISWC))

Full Text Available
EdgeWise: Energy-Efficient CNN Computation on Edge Devices under Stochastic Communication Delays

https://doi.org/10.1145/3530908

Ghasemi, Mehdi ; Rakhmatov, Daler ; Wu, Carole-Jean ; Vrudhula, Sarma ( April 2022 , ACM Transactions on Embedded Computing Systems)

This paper presents a framework to enable the energy-efficient execution of convolutional neural networks (CNNs) on edge devices. The framework consists of a pair of edge devices connected via a wireless network: a performance and energy-constrained device D as the first recipient of data, and an energy-unconstrained device N as an accelerator for D. Device D decides on-the-fly how to distribute the workload with the objective of minimizing its energy consumption while accounting for the inherent uncertainty in network delay and the overheads involved in data transfer. These challenges are tackled by adopting the data-driven modeling framework of Markov Decision Processes (MDP), whereby an optimal policy is consulted by D in O(1) time to make layer-by-layer assignment decisions. As a special case, a linear-time dynamic programming algorithm is also presented for finding optimal layer assignment at once, under the assumption that the network delay is constant throughout the execution of the application. The proposed framework is demonstrated on a platform comprised of a Raspberry PI 3 as D and an NVIDIA Jetson TX2 as N. An average improvement of 31% and 23% in energy consumption is achieved compared to the alternatives of executing the CNNs entirely on D and N. Two state-of-the-art methods were also implemented, and compared with the proposed methods.
more » « less
Full Text Available
AutoFL: Enabling Heterogeneity-Aware Energy Efficient Federated Learning

https://doi.org/10.1145/3466752.3480129

Kim, Young Geun ; Wu, Carole-Jean ( October 2021 , MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture)

Full Text Available
Energy-Efficient Mapping for a Network of DNN Models at the Edge

https://doi.org/10.1109/SMARTCOMP52413.2021.00024

Ghasemi, Mehdi ; Heidari, Soroush ; Kim, Young Geun ; Lamb, Aaron ; Wu, Carole-Jean ; Vrudhula, Sarma ( August 2021 , 2021 IEEE International Conference on Smart Computing (SMARTCOMP))

Full Text Available
Dynamic Temperature Management of Near-Sensor Processing for Energy-Efficient High-Fidelity Imaging

https://doi.org/10.3390/s21030926

Kodukula, Venkatesh ; Katrawala, Saad ; Jones, Britton ; Wu, Carole-Jean ; LiKamWa, Robert ( February 2021 , Sensors)

Vision processing on traditional architectures is inefficient due to energy-expensive off-chip data movement. Many researchers advocate pushing processing close to the sensor to substantially reduce data movement. However, continuous near-sensor processing raises sensor temperature, impairing imaging/vision fidelity. We characterize the thermal implications of using 3D stacked image sensors with near-sensor vision processing units. Our characterization reveals that near-sensor processing reduces system power but degrades image quality. For reasonable image fidelity, the sensor temperature needs to stay below a threshold, situationally determined by application needs. Fortunately, our characterization also identifies opportunities—unique to the needs of near-sensor processing—to regulate temperature based on dynamic visual task requirements and rapidly increase capture quality on demand. Based on our characterization, we propose and investigate two thermal management strategies—stop-capture-go and seasonal migration—for imaging-aware thermal management. For our evaluated tasks, our policies save up to 53% of system power with negligible performance impact and sustained image fidelity.
more » « less
Full Text Available
GEVO: GPU Code Optimization Using Evolutionary Computation

https://doi.org/10.1145/3418055

Liou, Jhe-Yu ; Wang, Xiaodong ; Forrest, Stephanie ; Wu, Carole-Jean ( December 2020 , ACM Transactions on Architecture and Code Optimization)
null (Ed.)
GPUs are a key enabler of the revolution in machine learning and high-performance computing, functioning as de facto co-processors to accelerate large-scale computation. As the programming stack and tool support have matured, GPUs have also become accessible to programmers, who may lack detailed knowledge of the underlying architecture and fail to fully leverage the GPU’s computation power. GEVO (Gpu optimization using EVOlutionary computation) is a tool for automatically discovering optimization opportunities and tuning the performance of GPU kernels in the LLVM representation. GEVO uses population-based search to find edits to GPU code compiled to LLVM-IR and improves performance on desired criteria while retaining required functionality. We demonstrate that GEVO improves the execution time of general-purpose GPU programs and machine learning (ML) models on NVIDIA Tesla P100. For the Rodinia benchmarks, GEVO improves GPU kernel runtime performance by an average of 49.48% and by as much as 412% over the fully compiler-optimized baseline. If kernel output accuracy is relaxed to tolerate up to 1% error, GEVO can find kernel variants that outperform the baseline by an average of 51.08%. For the ML workloads, GEVO achieves kernel performance improvement for SVM on the MNIST handwriting recognition (3.24×) and the a9a income prediction (2.93×) datasets with no loss of model accuracy. GEVO achieves 1.79× kernel performance improvement on image classification using ResNet18/CIFAR-10, with less than 1% model accuracy reduction.
more » « less
Full Text Available
GEVO-ML: a proposal for optimizing ML code with evolutionary computation

https://doi.org/10.1145/3377929.3398139

Liou, Jhe-Yu ; Wang, Xiaodong ; Forrest, Stephanie ; Wu, Carole-Jean ( July 2020 , GECCO Companion)

Full Text Available
AutoScale: Energy Efficiency Optimization for Stochastic Edge Inference Using Reinforcement Learning

https://doi.org/10.1109/MICRO50266.2020.00090

Kim, Young Geun ; Wu, Carole-Jean ( January 2020 , IEEE/ACM International Symposium on Microarchitecture (MICRO))

Full Text Available

« Prev Next »