Many recent works have shown substantial efficiency boosts from performing inference tasks on Internet of Things (IoT) nodes rather than merely transmitting raw sensor data. However, such tasks, e.g., convolutional neural networks (CNNs), are very compute intensive. They are therefore challenging to complete at sensing-matched latencies in ultra-low-power and energy-harvesting IoT nodes. ReRAM crossbar-based accelerators (RCAs) are an ideal candidate to perform the dominant multiplication-and-accumulation (MAC) operations in CNNs efficiently, but conventional, performance-oriented RCAs, while energy-efficient, are power hungry and ill-optimized for the intermittent and unstable power supply of energy-harvesting IoT nodes. This paper presents the ResiRCA architecture that integrates a new, lightweight, and configurable RCA suitable for energy harvesting environments as an opportunistically executing augmentation to a baseline sense-and-transmit battery-powered IoT node. To maximize ResiRCA throughput under different power levels, we develop the ResiSchedule approach for dynamic RCA reconfiguration. The proposed approach uses loop tiling-based computation decomposition, model duplication within the RCA, and inter-layer pipelining to reduce RCA activation thresholds and more closely track execution costs with dynamic power income. Experimental results show that ResiRCA together with ResiSchedule achieve average speedups and energy efficiency improvements of 8× and 14× respectively compared to a baseline RCA with intermittency-unaware scheduling.
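The loop tiling-based decomposition mentioned in the abstract above can be illustrated with a minimal sketch. It is not the ResiSchedule algorithm itself: the function names and the cost model (activation power proportional to the number of active crossbar cells) are assumptions made here for illustration. The sketch shows only that a crossbar MAC can be split into smaller tiles that each fit a power budget and still accumulate to the same result.

```python
def tiled_matvec(weights, x, tile_rows, tile_cols):
    """Compute y[j] = sum_i x[i] * weights[i][j] one tile at a time.

    Each (tile_rows x tile_cols) tile stands in for one crossbar activation.
    Returns the result vector and the number of activations used.
    """
    n_rows, n_cols = len(weights), len(weights[0])
    y = [0] * n_cols
    activations = 0
    for r0 in range(0, n_rows, tile_rows):
        for c0 in range(0, n_cols, tile_cols):
            # Accumulate the partial sums contributed by this tile only.
            for j in range(c0, min(c0 + tile_cols, n_cols)):
                for i in range(r0, min(r0 + tile_rows, n_rows)):
                    y[j] += x[i] * weights[i][j]
            activations += 1
    return y, activations


def largest_affordable_tile(budget_uw, cost_per_cell_uw, options):
    """Pick the largest tile whose (assumed) activation power fits the budget.

    Hypothetical cost model: power = rows * cols * cost_per_cell_uw.
    Returns None when no tile is affordable.
    """
    affordable = [t for t in options if t[0] * t[1] * cost_per_cell_uw <= budget_uw]
    return max(affordable, key=lambda t: t[0] * t[1]) if affordable else None


W = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
x = [1, 0, 2, 1]
tile = largest_affordable_tile(10.0, 1.0, [(1, 1), (2, 2), (4, 4)])
y, steps = tiled_matvec(W, x, *tile)
```

Under this toy cost model, a smaller tile lowers the power needed per activation at the price of more activations, which is the trade-off the abstract describes as reducing activation thresholds.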
MaxTracker: Continuously Tracking the Maximum Computation Progress for Energy Harvesting ReRAM-based CNN Accelerators
There is an ongoing trend to increasingly offload inference tasks, such as CNNs, to edge devices in many IoT scenarios. As energy harvesting is an attractive IoT power source, recent ReRAM-based CNN accelerators have been designed for operation on harvested energy. When addressing the instability problems of harvested energy, prior optimization techniques often assume that the load is fixed, overlooking the close interactions among input power, computational load, and circuit efficiency, or adapt the dynamic load to match the just-in-time incoming power under a simple harvesting architecture with no intermediate energy storage. Targeting a more efficient harvesting architecture equipped with both energy storage and energy delivery modules, this paper is the first effort to target whole-system, end-to-end efficiency for an energy harvesting ReRAM-based accelerator. First, we model the relationships among ReRAM load power, DC-DC converter efficiency, and power failure overhead. Then, a maximum computation progress tracking scheme (MaxTracker) is proposed to achieve a joint optimization of the whole system by tuning the load power of the ReRAM-based accelerator. Specifically, MaxTracker accommodates both continuous and intermittent computing schemes and provides dynamic ReRAM load according to harvesting scenarios. We evaluate MaxTracker over four input power scenarios, and the experimental results show average speedups of 38.4%/40.3% (up to 51.3%/84.4%) over a full activation scheme (with energy storage) and order-of-magnitude speedups over the recently proposed (energy storage-less) ResiRCA technique. Furthermore, we also explore MaxTracker in combination with the Capybara reconfigurable capacitor approach to offer more flexible tuners and thus further boost the system performance.
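The interaction the abstract describes among load power, DC-DC converter efficiency, and power failure overhead can be sketched with a toy model. Everything below is an assumption made for illustration, not the paper's model: the efficiency curve, the restart overhead, the load levels, and all names are hypothetical. The sketch only shows why the load that maximizes computation progress can change with input power.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadLevel:
    name: str
    load_power_mw: float  # power drawn by the accelerator at this level
    macs_per_ms: float    # throughput while the accelerator is running


def converter_efficiency(load_power_mw: float) -> float:
    """Hypothetical DC-DC efficiency curve: peaks at 4 mW, drops on both sides."""
    peak_mw, peak_eff, width_mw = 4.0, 0.85, 6.0
    eff = peak_eff - 0.5 * ((load_power_mw - peak_mw) / width_mw) ** 2
    return max(0.3, eff)


def progress_rate(level: LoadLevel, harvest_mw: float, restart_overhead: float = 0.1) -> float:
    """Average MACs per ms, assuming stored energy is spent in duty-cycled bursts."""
    delivered_mw = harvest_mw * converter_efficiency(level.load_power_mw)
    duty = min(1.0, delivered_mw / level.load_power_mw)
    if duty < 1.0:
        # Intermittent operation: assume a fixed fraction of on-time is lost
        # to power-failure handling (backup and restore).
        duty *= 1.0 - restart_overhead
    return level.macs_per_ms * duty


def best_level(levels, harvest_mw: float) -> LoadLevel:
    """Pick the load level with the highest modelled progress rate."""
    return max(levels, key=lambda lv: progress_rate(lv, harvest_mw))


LEVELS = [
    LoadLevel("low", 1.0, 10.0),
    LoadLevel("mid", 2.0, 20.0),
    LoadLevel("high", 4.0, 40.0),
    LoadLevel("max", 8.0, 80.0),
]
```

In this toy model, full activation wins only when input power is plentiful. At lower input power a smaller load that runs continuously, or one that sits nearer the converter's efficiency peak, makes more progress.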
- Award ID(s):
- 1822923
- PAR ID:
- 10296301
- Date Published:
- Journal Name:
- ACM Transactions on Embedded Computing Systems
- Volume:
- 20
- Issue:
- 5s
- ISSN:
- 1539-9087
- Page Range / eLocation ID:
- 1 to 23
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Deconvolution is a key component in contemporary neural networks, especially generative adversarial networks (GANs) and fully convolutional networks (FCNs). Because deconvolution requires extra operations compared to convolution, implementing it on existing resistive random access memory (ReRAM)-based processing-in-memory (PIM) accelerators incurs considerable degradation in both performance and energy efficiency. In this work, we propose a ReRAM-based accelerator design, RED, for providing high-performance and low-energy deconvolution. We analyze the deconvolution execution on the existing ReRAM-based PIMs and utilize its interior computation pattern for design optimization. RED includes two major contributions: a pixel-wise mapping scheme and a zero-skipping data flow. The pixel-wise mapping scheme removes the zero insertion and performs convolutions over several ReRAM arrays, and thus enables parallel computations with non-zero inputs. The zero-skipping data flow, assisted by a customized input buffer design, enhances the computation parallelism and input data reuse. In evaluation, we compare RED against the existing ReRAM-based PIMs and a CMOS-based counterpart with a variety of GAN and FCN models, each of which contains multiple deconvolution layers. The experimental results show that RED achieves a 4.0×-56.16× speedup and a 1.05×-18.17× energy efficiency improvement over previous related accelerator designs.
-
This paper presents a step-up DC-DC converter that uses a stepwise gate-drive technique to reduce the power FET gate-drive energy by 82%, allowing positive efficiency down to an input voltage of ±0.5 mV, the lowest input voltage ever achieved for a DC-DC converter as far as we know. Below ±0.5 mV the converter automatically hibernates, reducing quiescent power consumption to just 255 pW. The converter has an efficiency of 63% at ±1 mV and 84% at ±6 mV. The input impedance is programmable from 1 Ω to 600 Ω to achieve maximum power extraction. A novel delay line circuit controls the stepwise gate-drive timing, programmable input impedance, and hibernation behavior. Bipolar input voltage is supported by using a flyback converter topology with two secondary windings. A generated power good signal enables the load when the output voltage has charged above 2.7 V and disables it when the output voltage has discharged below 2.5 V. The DC-DC converter was used in a thermoelectric energy harvesting system that effectively harvests energy from small indoor temperature fluctuations of less than 1°C. Also, an analytical model with unprecedented accuracy of the stepwise gate-drive energy is presented.
-
This paper presents the integration of an AC-DC rectifier and a DC-DC boost converter circuit designed in a 180 nm CMOS process for ultra-low frequency (< 10 Hz) energy harvesting applications. The proposed rectifier is a very low voltage CMOS rectifier circuit that rectifies the low-frequency signal of 100-250 mV amplitude and 1-10 Hz frequency into DC voltage. In this work, the energy is harvested from the REWOD (reverse electrowetting-on-dielectric) generator, which is a reverse electrowetting technique that converts mechanical vibrations to electrical energy. The objective is to develop REWOD-based, self-powered wearable sensors that track motion (such as walking, running, and jogging), thus harvesting energy from regular activities. To this end, the proposed circuits are designed in such a way that the output from the REWOD is rectified and regulated using a DC-DC converter, which is a 5-stage cross-coupled switching circuit. Simulation results show a voltage range of 1.1 V-2.1 V, i.e., 850-1200% voltage conversion efficiency (VCE) and 30% power conversion efficiency (PCE) for low input signal in the range 100-250 mV in the low-frequency range. This performance verifies the integration of the rectifier and DC-DC boost converter, which makes it highly suitable for various motion-based energy harvesting applications.
-
There is growing interest in deploying energy harvesting processors and accelerators in the Internet of Things (IoT). Energy harvesting harnesses the energy scavenged from the environment to power a system. Although it has many advantages over battery-operated systems, such as light weight, compact size, and no need for recharging or maintenance, it may suffer frequent power-downs and a fluctuating power supply even while powered on. The non-volatile processor (NVP) is a promising architecture for effective computing in energy harvesting scenarios. Recently, non-volatile accelerators (NVAs) have been proposed to perform computations of deep learning algorithms. In this paper, we overview the recent studies of NVPs and NVAs across the layers of hardware, architecture, software, and their co-design. In particular, we present the design insights of how the state-of-the-art works adapt their specific designs to the intermittent and fluctuating power conditions with the energy harvesting technology. Finally, we discuss recent trends using NVPs and NVAs in energy harvesting scenarios.
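The power good behavior described in the step-up DC-DC converter abstract in the list above (enable the load above 2.7 V, disable it below 2.5 V) is a hysteresis comparator. A minimal software sketch follows. The two thresholds come from that abstract, while the function name and the sampled-voltage interface are assumptions made here for illustration.

```python
def power_good_trace(voltages, enable_v=2.7, disable_v=2.5):
    """Return the power-good state after each sampled output voltage.

    The signal turns on once the voltage rises above enable_v and stays on
    until the voltage falls below disable_v, so small ripple between the
    two thresholds does not toggle the load.
    """
    good = False
    trace = []
    for v in voltages:
        if not good and v > enable_v:
            good = True
        elif good and v < disable_v:
            good = False
        trace.append(good)
    return trace


samples = [2.0, 2.6, 2.8, 2.6, 2.4, 2.6, 2.75]
states = power_good_trace(samples)
```

Note that the sample at 2.6 V reads as good or not good depending on which threshold was crossed last, which is the point of the 0.2 V hysteresis band.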