skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Design Insights of Non-volatile Processors and Accelerators in Energy Harvesting Systems
There is growing interest in deploying energy harvesting processors and accelerators in Internet of Things (IoT). Energy harvesting harnesses the energy scavenged from the environment to power a system. Although it has many advantages over battery-operated systems such as lightweight, compact size, and no necessity of recharging and maintenance, it may suffer frequently power-down and a fluctuating power supply even with power on. Non-volatile processor (NVP) is a promising architecture for effective computing in energy harvesting scenarios. Recently, non-volatile accelerators (NVA) have been proposed to perform computations of deep learning algorithms. In this paper, we overview the recent studies of NVP and NVA across the layers of hardware, architecture, software and their co-design. Especially, we present the design insights of how the state-of-the-art works adapt their specific designs to the intermittent and fluctuating power conditions with the energy harvesting technology. Finally, we discuss recent trends using NVP and NVA in energy harvesting scenarios.  more » « less
Award ID(s):
1822923
PAR ID:
10193315
Author(s) / Creator(s):
; ; ; ; ; ; ; ;
Date Published:
Journal Name:
Proceedings of the 2020 on Great Lakes Symposium on VLSI
Page Range / eLocation ID:
369 to 374
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    There is an ongoing trend to increasingly offload inference tasks, such as CNNs, to edge devices in many IoT scenarios. As energy harvesting is an attractive IoT power source, recent ReRAM-based CNN accelerators have been designed for operation on harvested energy. When addressing the instability problems of harvested energy, prior optimization techniques often assume that the load is fixed, overlooking the close interactions among input power, computational load, and circuit efficiency, or adapt the dynamic load to match the just-in-time incoming power under a simple harvesting architecture with no intermediate energy storage. Targeting a more efficient harvesting architecture equipped with both energy storage and energy delivery modules, this paper is the first effort to target whole system, end-to-end efficiency for an energy harvesting ReRAM-based accelerator. First, we model the relationships among ReRAM load power, DC-DC converter efficiency, and power failure overhead. Then, a maximum computation progress tracking scheme ( MaxTracker ) is proposed to achieve a joint optimization of the whole system by tuning the load power of the ReRAM-based accelerator. Specifically, MaxTracker accommodates both continuous and intermittent computing schemes and provides dynamic ReRAM load according to harvesting scenarios. We evaluate MaxTracker over four input power scenarios, and the experimental results show average speedups of 38.4%/40.3% (up to 51.3%/84.4%), over a full activation scheme (with energy storage) and order-of-magnitude speedups over the recently proposed (energy storage-less) ResiRCA technique. Furthermore, we also explore MaxTracker in combination with the Capybara reconfigurable capacitor approach to offer more flexible tuners and thus further boost the system performance. 
    more » « less
  2. Convolutional Neural Networks (CNNs), due to their recent successes, have gained lots of attention in various vision-based applications. They have proven to produce incredible results, especially on big data, that require high processing demands. However, CNN processing demands have limited their usage in embedded edge devices with constrained energy budgets and hardware. This paper proposes an efficient new architecture, namely Ocelli includes a ternary compute pixel (TCP) consisting of a CMOS-based pixel and a compute add-on. The proposed Ocelli architecture offers several features; (I) Because of the compute add-on, TCPs can produce ternary values (i.e., −1, 0, +1) regarding the light intensity as pixels’ inputs; (II) Ocelli realizes analog convolutions enabling low-precision ternary weight neural networks. Since the first layer’s convolution operations are the performance bottleneck of accelerators, Ocelli mitigates the overhead of analog buffers and analog-to-digital converters. Moreover, our design supports a zero-skipping scheme to further power reduction; (III) Ocelli exploits non-volatile magnetic RAMs to store CNN’s weights, which remarkably reduces the static power consumption; and finally, (IV) Ocelli has two modes, including sensing and processing. Once the object is detected, the architecture switches to the typical sensing mode to capture the image. Compared to the conventional pixels, it achieves an average 10% efficiency on its lane detection power consumption compared with existing edge detection algorithms. Moreover, considering different CNN workloads, our design shows more than 23% power efficiency over conventional designs, while it can achieve better accuracy. 
    more » « less
  3. Sodium ion batteries are an emerging candidate to replace lithium ion batteries in large-scale electrical energy storage systems due to the abundance and widespread distribution of sodium. Despite the growing interest, the development of high-performance sodium cathode materials remains a challenge. In particular, polyanionic compounds are considered as a strong cathode candidate owing to their better cycling stability, a flatter voltage profile, and stronger thermal stability compared to other cathode materials. Here, we report the rational design of a biomimetic bone-inspired polyanionic Na3V2(PO4)3-reduced graphene oxide composite (BI-NVP) cathode that achieves ultrahigh rate charging and ultralong cycling life in a sodium ion battery. At a charging rate of 1 C, BI-NVP delivers 97% of its theoretical capacity and is able to retain a voltage plateau even at the ultra-high rate of 200 C. It also shows long cycling life with capacity retention of 91% after 10 000 cycles at 50 C. The sodium ion battery cells with a BI-NVP cathode and Na metal anode were able to deliver a maximum specific energy of 350 W h kg−1 and maximum specific power of 154 kW kg−1. In situ and postmortem analyses of cycled BI-NVP (including by Raman and XRD spectra) HRTEM, and STEM-EELS, indicate highly reversible dilation–contraction, negligible electrode pulverization, and a stable NVP-reduced graphene oxide layer interface. The results presented here provide a rational and biomimetic material design for the electrode architecture for ultrahigh power and ultralong cyclability of the sodium ion battery full cells when paired with a sodium metal anode. 
    more » « less
  4. null (Ed.)
    With the growing performance and wide application of deep neural networks (DNNs), recent years have seen enormous efforts on DNN accelerator hardware design for platforms from mobile devices to data centers. The systolic array has been a popular architectural choice for many proposed DNN accelerators with hundreds to thousands of processing elements (PEs) for parallel computing. Systolic array-based DNN accelerators for datacenter applications have high power consumption and nonuniform workload distribution, which makes power delivery network (PDN) design challenging. Server-class multicore processors have benefited from distributed on-chip voltage regulation and heterogeneous voltage regulation (HVR) for improving energy efficiency while guaranteeing power delivery integrity. This paper presents the first work on HVR-based PDN architecture and control for systolic array-based DNN accelerators. We propose to employ a PDN architecture comprising heterogeneous on-chip and off-chip voltage regulators and multiple power domains. By analyzing patterns of typical DNN workloads via a modeling framework, we propose a DNN workload-aware dynamic PDN control policy to maximize system energy efficiency while ensuring power integrity. We demonstrate significant energy efficiency improvements brought by the proposed PDN architecture, dynamic control, and power gating, which lead to a more than five-fold reduction of leakage energy and PDN energy overhead for systolic array DNN accelerators. 
    more » « less
  5. Many recent works have shown substantial efficiency boosts from performing inference tasks on Internet of Things (IoT) nodes rather than merely transmitting raw sensor data. However, such tasks, e.g., convolutional neural networks (CNNs), are very compute intensive. They are therefore challenging to complete at sensing-matched latencies in ultra-low-power and energy-harvesting IoT nodes. ReRAM crossbar-based accelerators (RCAs) are an ideal candidate to perform the dominant multiplication-and-accumulation (MAC) operations in CNNs efficiently, but conventional, performance-oriented RCAs, while energy-efficient, are power hungry and ill-optimized for the intermittent and unstable power supply of energy-harvesting IoT nodes. This paper presents the ResiRCA architecture that integrates a new, lightweight, and configurable RCA suitable for energy harvesting environments as an opportunistically executing augmentation to a baseline sense-and-transmit battery-powered IoT node. To maximize ResiRCA throughput under different power levels, we develop the ResiSchedule approach for dynamic RCA reconfiguration. The proposed approach uses loop tiling-based computation decomposition, model duplication within the RCA, and inter-layer pipelining to reduce RCA activation thresholds and more closely track execution costs with dynamic power income. Experimental results show that ResiRCA together with ResiSchedule achieve average speedups and energy efficiency improvements of 8× and 14× respectively compared to a baseline RCA with intermittency-unaware scheduling. 
    more » « less