Vision processing on traditional architectures is inefficient due to energy-expensive off-chip data movement. Many researchers advocate pushing processing close to the sensor to substantially reduce data movement. However, continuous near-sensor processing raises sensor temperature, impairing imaging/vision fidelity. We characterize the thermal implications of using 3D stacked image sensors with near-sensor vision processing units. Our characterization reveals that near-sensor processing reduces system power but degrades image quality. For reasonable image fidelity, the sensor temperature needs to stay below a threshold, situationally determined by application needs. Fortunately, our characterization also identifies opportunities—unique to the needs of near-sensor processing—to regulate temperature based on dynamic visual task requirements and rapidly increase capture quality on demand. Based on our characterization, we propose and investigate two thermal management strategies—stop-capture-go and seasonal migration—for imaging-aware thermal management. For our evaluated tasks, our policies save up to 53% of system power with negligible performance impact and sustained image fidelity.
more »
« less
Squint: A Framework for Dynamic Voltage Scaling of Image Sensors Towards Low Power IoT Vision
Energy-efficient visual sensing is of paramount importance to enable battery-backed low power IoT and mobile applications. Unfortunately, modern image sensors still consume hundreds of milliwatts of power, mainly due to analog readout. This is because current systems always supply a fixed voltage to the sensor’s analog circuitry, leading to higher power profiles. In this work, we propose to aggressively scale the analog voltage supplied to the camera as a means to significantly reduce sensor power consumption. To that end, we characterize the power and fidelity implications of analog voltage scaling on three off-the-shelf image sensors. Our characterization reveals that analog voltage scaling reduces sensor power but also degrades image quality. Furthermore, the degradation in image quality situationally affects the task accuracy of vision applications. We develop a visual streaming pipeline that flexibly allows application developers to dynamically adapt sensor voltage on a frame-by-frame basis. We develop a voltage controller that programmatically generates desired sensor voltage based on application request. We integrate our voltage controller into the existing RPi-based video streaming IoT pipeline. On top of this, we develop runtime support for flexible voltage specification from vision applications. Evaluating the system over a wide range of voltage scaling policies on popular vision tasks reveals that Squint imaging can deliver up to 73% sensor power savings, while maintaining reasonable task fidelity. Our artifacts are available at: https://gitlab.com/squint1/squint-ae-public
more »
« less
- Award ID(s):
- 1909663
- PAR ID:
- 10466003
- Date Published:
- Journal Name:
- The 29th Annual International Conference on Mobile Computing and Networking (ACM MobiCom ’23)
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
null (Ed.)High spatiotemporal resolution can offer high precision for vision applications, which is particularly useful to capture the nuances of visual features, such as for augmented reality. Unfortunately, capturing and processing high spatiotemporal visual frames generates energy-expensive memory traffic. On the other hand, low resolution frames can reduce pixel memory throughput, but reduce also the opportunities of high-precision visual sensing. However, our intuition is that not all parts of the scene need to be captured at a uniform resolution. Selectively and opportunistically reducing resolution for different regions of image frames can yield high-precision visual computing at energy-efficient memory data rates. To this end, we develop a visual sensing pipeline architecture that flexibly allows application developers to dynamically adapt the spatial resolution and update rate of different “rhythmic pixel regions” in the scene. We develop a system that ingests pixel streams from commercial image sensors with their standard raster-scan pixel read-out patterns, but only encodes relevant pixels prior to storing them in the memory. We also present streaming hardware to decode the stored rhythmic pixel region stream into traditional frame-based representations to feed into standard computer vision algorithms. We integrate our encoding and decoding hardware modules into existing video pipelines. On top of this, we develop runtime support allowing developers to flexibly specify the region labels. Evaluating our system on a Xilinx FPGA platform over three vision workloads shows 43 − 64% reduction in interface traffic and memory footprint, while providing controllable task accuracy.more » « less
-
Advances in vision processing have ignited a proliferation of mobile vision applications, including augmented reality. However, limited by the inability to rapidly reconfigure sensor operation for performance-efficiency tradeoffs, high power consumption causes vision applications to drain the device's battery. To explore the potential impact of enabling rapid reconfiguration, we use a case study around marker-based pose estimation to understand the relationship between image frame resolution, task accuracy, and energy efficiency. Our case study motivates that to balance energy efficiency and task accuracy, the application needs to dynamically and frequently reconfigure sensor resolution. To explore the latency bottlenecks to sensor resolution reconfiguration, we define and profile the end-to-end reconfiguration latency and frame-to-frame latency of changing capture resolution on a Google LG Nexus 5X device. We identify three major sources of sensor resolution reconfiguration latency in current Android systems: (i) sequential configuration patterns, (ii) expensive system calls, and (iii) imaging pipeline delay. Based on our intuitions, we propose a redesign of the Android camera system to mitigate the sources of latency. Enabling smooth transitions between sensor configurations will unlock new classes of adaptive-resolution vision applications.more » « less
-
Abstract As machine vision technology generates large amounts of data from sensors, it requires efficient computational systems for visual cognitive processing. Recently, in-sensor computing systems have emerged as a potential solution for reducing unnecessary data transfer and realizing fast and energy-efficient visual cognitive processing. However, they still lack the capability to process stored images directly within the sensor. Here, we demonstrate a heterogeneously integrated 1-photodiode and 1 memristor (1P-1R) crossbar for in-sensor visual cognitive processing, emulating a mammalian image encoding process to extract features from the input images. Unlike other neuromorphic vision processes, the trained weight values are applied as an input voltage to the image-saved crossbar array instead of storing the weight value in the memristors, realizing the in-sensor computing paradigm. We believe the heterogeneously integrated in-sensor computing platform provides an advanced architecture for real-time and data-intensive machine-vision applications via bio-stimulus domain reduction.more » « less
-
In the Artificial Intelligence of Things (AIoT) era, always-on intelligent and self-powered visual perception systems have gained considerable attention and are widely used. Thus, this paper proposes TizBin, a low-power processing in-sensor scheme with event and object detection capabilities to eliminate power costs of data conversion and transmission and enable data-intensive neural network tasks. Once the moving object is detected, TizBin architecture switches to the high-power object detection mode to capture the image. TizBin offers several unique features, such as analog convolutions enabling low-precision ternary weight neural networks (TWNN) to mitigate the overhead of analog buffer and analog-to-digital converters. Moreover, TizBin exploits non-volatile magnetic RAMs to store NN’s weights, remarkably reducing static power consumption. Our circuit-to-application co-simulation results for TWNNs demonstrate minor accuracy degradation on various image datasets, while TizBin achieves a frame rate of 1000 and efficiency of ∼1.83 TOp/s/W.more » « less
An official website of the United States government

