NSF PAR Search | NSF Public Access Repository

Usas: A Sustainable Continuous-Learning Framework for Edge Servers

https://doi.org/10.1109/HPCA57654.2024.00073

Mishra, Cyan Subhra; Sampson, Jack; Kandemir, Mahmut Taylan; Narayanan, Vijaykrishnan; Das, Chita R (March 2024, IEEE)

Edge servers have recently become very popular for performing localized analytics, especially on video, as they reduce data traffic and protect privacy. However, due to their resource constraints, these servers often employ compressed models, which are typically prone to data drift. Consequently, for edge servers to provide cloud-comparable quality, they must also perform continuous learning to mitigate this drift. However, at expected deployment scales, performing continuous training on every edge server is not sustainable due to their aggregate power demands on grid supply and associated sustainability footprints. To address these challenges, we propose Usas, an approach combining algorithmic adjustments, hardware-software co-design, and morphable acceleration hardware to enable the training of workloads on these edge servers to be powered by renewable, but intermittent, solar power that can sustainably scale alongside data sources. Our evaluation of Usas on a real-world traffic dataset indicates that our continuous learning approach simultaneously improves both accuracy and efficiency: Usas offers a 4.96% greater mean accuracy than prior approaches while our morphable accelerator that adapts to solar variance can save up to {234.95 kWh, 2.63 MWh}/year/edge-server compared to a {DNN accelerator, data center scale GPU}, respectively.
Full Text Available
An Efficient Edge-Cloud Partitioning of Random Forests for Distributed Sensor Networks

https://doi.org/10.1109/LES.2022.3207968

Shen, Tianyi; Mishra, Cyan Subhra; Sampson, Jack; Kandemir, Mahmut Taylan; Narayanan, Vijaykrishnan (October 2022, IEEE Embedded Systems Letters)

Full Text Available
Pushing Point Cloud Compression to the Edge

https://doi.org/10.1109/MICRO56248.2022.00031

Ying, Ziyu; Zhao, Shulin; Bhuyan, Sandeepa; Mishra, Cyan Subhra; Kandemir, Mahmut T.; Das, Chita R. (October 2022, International Symposium on Microarchitecture (MICRO) 2022)

As Point Clouds (PCs) gain popularity in processing millions of data points for 3D rendering in many applications, efficient data compression becomes a critical issue. This is because compression is the primary bottleneck in minimizing the latency and energy consumption of existing PC pipelines. Data compression becomes even more critical as PC processing is pushed to edge devices with limited compute and power budgets. In this paper, we propose and evaluate two complementary schemes, intra-frame compression and inter-frame compression, to speed up the PC compression, without losing much quality or compression efficiency. Unlike existing techniques that use sequential algorithms, our first design, intra-frame compression, exploits parallelism for boosting the performance of both geometry and attribute compression. The proposed parallelism brings around 43.7× performance improvement and 96.6% energy savings at a cost of 1.01× larger compressed data size. To further improve the compression efficiency, our second scheme, inter-frame compression, considers the temporal similarity among the video frames and reuses the attribute data from the previous frame for the current frame. We implement our designs on an NVIDIA Jetson AGX Xavier edge GPU board. Experimental results with six videos show that the combined compression schemes provide 34.0× speedup compared to a state-of-the-art scheme, with minimal impact on quality and compression ratio.
Full Text Available
Kraken: Adaptive Container Provisioning for Deploying Dynamic DAGs in Serverless Platforms

https://doi.org/10.1145/3472883.3486992

Bhasi, Vivek M.; Gunasekaran, Jashwant Raj; Thinakaran, Prashanth; Mishra, Cyan Subhra; Kandemir, Mahmut Taylan; Das, Chita (November 2021, SoCC '21: Proceedings of the ACM Symposium on Cloud Computing)

Full Text Available
Origin: Enabling On-Device Intelligence for Human Activity Recognition Using Energy Harvesting Wireless Sensor Networks

https://doi.org/10.23919/DATE51398.2021.9474017

Mishra, Cyan Subhra; Sampson, Jack; Kandemir, Mahmut Taylan; Narayanan, Vijaykrishnan (February 2021, 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE))
There is an increasing demand for performing machine learning tasks, such as human activity recognition (HAR), on emerging ultra-low-power internet of things (IoT) platforms. Recent works show substantial efficiency boosts from performing inference tasks directly on the IoT nodes rather than merely transmitting raw sensor data. However, the computation and power demands of deep neural network (DNN) based inference pose significant challenges when executed on the nodes of an energy-harvesting wireless sensor network (EH-WSN). Moreover, managing inferences requiring responses from multiple energy-harvesting nodes imposes challenges at the system level in addition to the constraints at each node. This paper presents a novel scheduling policy along with an adaptive ensemble learner to efficiently perform HAR on a distributed energy-harvesting body area network. Our proposed policy, Origin, strategically ensures efficient and accurate individual inference execution at each sensor node by using a novel activity-aware scheduling approach. It also leverages the continuous nature of human activity when coordinating and aggregating results from all the sensor nodes to improve final classification accuracy. Further, Origin proposes an adaptive ensemble learner to personalize the optimizations based on each individual user. Experimental results using two different HAR datasets show Origin, while running on harvested energy, to be at least 2.5% more accurate than a classical battery-powered energy-aware HAR classifier continuously operating at the same average power.
Full Text Available
ResiRCA: A Resilient Energy Harvesting ReRAM Crossbar-Based Accelerator for Intelligent Embedded Processors

https://doi.org/10.1109/HPCA47549.2020.00034

Qiu, Keni; Jao, Nicholas; Zhao, Mengying; Mishra, Cyan Subhra; Gudukbay, Gulsum; Jose, Sethu; Sampson, Jack; Kandemir, Mahmut Taylan; Narayanan, Vijaykrishnan (February 2020, 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA))

Many recent works have shown substantial efficiency boosts from performing inference tasks on Internet of Things (IoT) nodes rather than merely transmitting raw sensor data. However, such tasks, e.g., convolutional neural networks (CNNs), are very compute intensive. They are therefore challenging to complete at sensing-matched latencies in ultra-low-power and energy-harvesting IoT nodes. ReRAM crossbar-based accelerators (RCAs) are an ideal candidate to perform the dominant multiplication-and-accumulation (MAC) operations in CNNs efficiently, but conventional, performance-oriented RCAs, while energy-efficient, are power hungry and ill-optimized for the intermittent and unstable power supply of energy-harvesting IoT nodes. This paper presents the ResiRCA architecture that integrates a new, lightweight, and configurable RCA suitable for energy harvesting environments as an opportunistically executing augmentation to a baseline sense-and-transmit battery-powered IoT node. To maximize ResiRCA throughput under different power levels, we develop the ResiSchedule approach for dynamic RCA reconfiguration. The proposed approach uses loop tiling-based computation decomposition, model duplication within the RCA, and inter-layer pipelining to reduce RCA activation thresholds and more closely track execution costs with dynamic power income. Experimental results show that ResiRCA together with ResiSchedule achieve average speedups and energy efficiency improvements of 8× and 14× respectively compared to a baseline RCA with intermittency-unaware scheduling.
Full Text Available
