Event cameras have emerged as a revolutionary technology with a temporal resolution that far surpasses standard active-pixel cameras, drawing biological inspiration from photoreceptors and the first retinal synapse. This research showcases the potential of additional retinal functionalities to extract visual features. We provide a domain-agnostic and efficient algorithm for ego-motion compensation based on Object Motion Sensitivity (OMS), one of the multiple features computed within the mammalian retina. Building on experimental neuroscience, we translate OMS's biological circuitry into a low-overhead algorithm that suppresses camera motion, bypassing the need for deep networks and learning. Our system processes event data from dynamic scenes to perform pixel-wise object motion segmentation, evaluated on both real and synthetic datasets. This paper introduces a bio-inspired computer vision method that reduces the number of parameters by three to six orders of magnitude (10^3 to 10^6) compared to previous approaches. Our work paves the way for robust, high-speed, and low-bandwidth decision-making through in-sensor computation.
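To make the mechanism concrete, below is a minimal sketch of an OMS-style center-surround comparison on accumulated event frames. The function name, window sizes, and threshold are illustrative assumptions, not the parameters used in the paper.

```python
# Minimal OMS sketch: flag pixels whose local (center) event activity deviates
# from the wider surround average, i.e. motion not explained by ego-motion.
# `center_size`, `surround_size`, and `threshold` are illustrative assumptions.
import numpy as np
from scipy.ndimage import uniform_filter

def oms_segment(event_frame, center_size=3, surround_size=15, threshold=0.2):
    f = event_frame.astype(float)
    center = uniform_filter(f, size=center_size)      # local event activity
    surround = uniform_filter(f, size=surround_size)  # background (ego-motion) activity
    return (center - surround) > threshold            # object-motion mask

# Usage: accumulate DVS events into a per-pixel count frame over a short
# time window, then segment object motion.
frame = (np.random.rand(128, 128) < 0.05).astype(float)  # stand-in event frame
mask = oms_segment(frame)
```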
A Retina-Inspired Pathway to Real-Time Motion Prediction inside Image Sensors for Extreme-Edge Intelligence
Abstract: The ability to predict motion in real time is fundamental to many maneuvering activities in animals, particularly those critical for survival, such as attack and escape responses. Given its significance, it is no surprise that motion prediction (MP) in animals begins in the retina. Similarly, autonomous systems utilizing computer vision could greatly benefit from the capability to predict motion in real time; for such applications, motion prediction should therefore be integrated directly at the camera pixel level. Towards that end, we present a retina-inspired neuromorphic framework capable of performing real-time, energy-efficient MP directly within camera pixels. Our hardware-algorithm framework, implemented using GlobalFoundries' 22nm FDSOI technology, integrates key retinal MP compute blocks, including a biphasic filter, a spike adder, a nonlinear circuit, and a 2D array for multi-directional motion prediction. Additionally, integrating the sensor and MP compute dies using a 3D Cu-Cu hybrid bonding approach improves design compactness by minimizing area usage and simplifying routing complexity. Validated on real-world object stimuli, the model delivers efficient, low-latency MP for decision-making scenarios that rely on predictive visual computation, while consuming only 18.56 pJ/MP in our mixed-signal hardware implementation.
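As a rough behavioral illustration (not the mixed-signal circuit itself), the biphasic filter stage can be sketched in software as a difference of two exponentials followed by rectification. The time constants, weighting, and function names below are assumptions chosen for illustration only.

```python
# Behavioral sketch of the biphasic temporal filter + nonlinearity stages;
# tau values and the 0.6 weighting are illustrative assumptions.
import numpy as np

def biphasic_kernel(t, tau_fast=5.0, tau_slow=15.0):
    """Difference of two exponentials: an early positive lobe followed by a
    delayed negative lobe, which shifts the response peak ahead of the
    stimulus -- the basis of retinal motion anticipation."""
    k = np.exp(-t / tau_fast) - 0.6 * np.exp(-t / tau_slow)
    return k / np.abs(k).sum()

def predict_response(spikes):
    """Filter a per-pixel spike train, then half-wave rectify (standing in
    for the nonlinear circuit). Summing rectified outputs over a pixel
    neighborhood would play the role of the spike adder."""
    t = np.arange(64.0)
    filtered = np.convolve(spikes, biphasic_kernel(t), mode="same")
    return np.maximum(filtered, 0.0)

spike_train = (np.random.rand(200) < 0.1).astype(float)  # stand-in spike train
response = predict_response(spike_train)
```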
- Award ID(s): 2319617
- PAR ID: 10615232
- Publisher / Repository: IOP Publishing
- Date Published:
- Journal Name: Neuromorphic Computing and Engineering
- ISSN: 2634-4386
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Poor-quality facial images pose challenges in biometric authentication, especially in passport photo acquisition and recognition. This study proposes a novel, open-source solution to address these issues by introducing real-time facial image quality analysis using computer vision technology on a low-power single-board computer. We present a complete open-source hardware solution that consists of a Jetson processor, a 16 MP autofocus RGB camera, a custom enclosure, and a touch-sensitive LCD for user interaction. To ensure the integrity and confidentiality of captured facial data, the Advanced Encryption Standard (AES) is used for secure image storage (sketched below). In a pilot data collection, the system demonstrated its ability to capture high-quality images, achieving 98.98% accuracy in storing images of acceptable quality. This open-source, readily deployable, secure system offers promising potential for diverse real-time applications such as passport verification and security systems.
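The abstract does not specify the AES mode or key length, so the following sketch assumes AES-256-GCM via Python's `cryptography` package to illustrate the secure-storage step; the file names are hypothetical.

```python
# Illustrative AES-256-GCM secure image storage; mode and key size are
# assumptions, since the paper only states that AES is used.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def store_encrypted(image_bytes: bytes, path: str, key: bytes) -> None:
    nonce = os.urandom(12)                       # unique nonce per image
    ciphertext = AESGCM(key).encrypt(nonce, image_bytes, None)
    with open(path, "wb") as f:
        f.write(nonce + ciphertext)              # prepend nonce for decryption

key = AESGCM.generate_key(bit_length=256)
store_encrypted(b"raw image bytes go here",      # placeholder for a capture
                "face_capture.enc", key)         # hypothetical output path
```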
- Recent advances in retinal neuroscience have fueled various hardware and algorithmic efforts to develop retina-inspired solutions for computer vision tasks. In this work, we focus on a fundamental visual feature within the mammalian retina, Object Motion Sensitivity (OMS). Using DVS data from the EV-IMO dataset, we analyze the performance of an algorithmic implementation of OMS circuitry for motion segmentation in the presence of ego-motion. This holistic analysis considers the underlying constraints arising from the hardware circuit implementation. We present novel CMOS circuits that implement OMS functionality inside image sensors while providing run-time re-configurability for key algorithmic parameters; a software-level sketch of this re-configurability appears below. In-sensor technologies for dynamic environment adaptation are crucial for ensuring high system performance. Finally, we verify the functionality and re-configurability of the proposed CMOS circuit designs through Cadence simulations in 180nm technology. In summary, the presented work lays the foundation for hardware-algorithm re-engineering of known biological circuits to suit application needs.
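As referenced above, the run-time re-configurability might be modeled in software as a parameter object passed to the OMS computation; the parameter names and values are hypothetical, and in the actual system these knobs are programmable in the CMOS circuit itself.

```python
# Sketch of run-time re-configurable OMS parameters; defaults are illustrative.
from dataclasses import dataclass
import numpy as np
from scipy.ndimage import uniform_filter

@dataclass
class OMSConfig:
    center_size: int = 3       # center receptive-field size (pixels)
    surround_size: int = 15    # surround receptive-field size (pixels)
    threshold: float = 0.2     # firing threshold of the OMS unit

def oms(frame, cfg: OMSConfig):
    f = frame.astype(float)
    c = uniform_filter(f, size=cfg.center_size)
    s = uniform_filter(f, size=cfg.surround_size)
    return (c - s) > cfg.threshold

# Re-configure at run time, e.g. when scene statistics change:
cfg = OMSConfig()
frame = (np.random.rand(128, 128) < 0.05).astype(float)
mask = oms(frame, cfg)
cfg.threshold = 0.35           # adapt threshold to a busier environment
mask_adapted = oms(frame, cfg)
```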
- Cameras are deployed at scale with the purpose of searching for and tracking objects of interest (e.g., a suspected person) through the camera network on live videos. Such cross-camera analytics is data- and compute-intensive, with costs that grow with the number of cameras and with time. We present Spatula, a cost-efficient system that enables scaling cross-camera analytics on edge compute boxes to large camera networks by leveraging spatial and temporal cross-camera correlations. While such correlations have been used in the computer vision community, Spatula uses them to drastically reduce communication and computation costs by pruning the search space of a query identity (e.g., ignoring frames not correlated with the query identity's current position); a toy sketch of this pruning follows. Spatula provides the first system substrate on which cross-camera analytics applications can be built to efficiently harness the cross-camera correlations that are abundant in large camera deployments. Spatula reduces compute load by 8.3× on an 8-camera dataset, and by 23×–86× on two datasets with hundreds of cameras (simulated from real vehicle/pedestrian traces). We have also implemented Spatula on a testbed of 5 AWS DeepLens cameras.
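As a toy illustration of the pruning idea referenced above, the snippet below restricts a query to cameras whose travel-time window from the last sighting contains the current time; the correlation table is a hypothetical stand-in for statistics Spatula would derive from historical cross-camera traces.

```python
# Toy sketch of correlation-based search-space pruning.
# For each camera: which cameras an identity plausibly reaches next,
# with a (min, max) travel time in seconds. Values are hypothetical.
correlations = {
    "cam1": {"cam2": (5, 30), "cam3": (20, 90)},
    "cam2": {"cam1": (5, 30), "cam4": (10, 60)},
}

def cameras_to_query(last_cam: str, last_seen_t: float, now: float) -> list[str]:
    """Return only cameras whose travel-time window from the last sighting
    contains the current time; all other feeds are skipped entirely."""
    return [
        cam
        for cam, (t_min, t_max) in correlations.get(last_cam, {}).items()
        if last_seen_t + t_min <= now <= last_seen_t + t_max
    ]

print(cameras_to_query("cam1", last_seen_t=100.0, now=110.0))  # ['cam2']
```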
- Our brains are "prediction machines": we continuously compare our surroundings with predictions from internal models generated by our brains. This is demonstrated by our basic low-level sensory systems and how they predict environmental changes as we move through space and time. Indeed, we predict even at higher cognitive levels: we can predict how the laws of physics affect people, places, and things, and even predict the end of someone's sentence. In our work, we sought to create an artificial model that mimics early, low-level biological predictive behavior in a computer vision system. Our predictive vision model uses spatiotemporal sequence memories learned from deep sparse coding. The model is implemented using a biologically inspired architecture, one that utilizes sequence memories, lateral inhibition, and top-down feedback in a generative framework. It learns the causes of the data in a completely unsupervised manner, simply by observing and learning about the world. Spatiotemporal features are learned by minimizing a reconstruction error convolved over space and time (sketched below), and can subsequently be used for recognition, classification, and future video prediction. Our experiments show that we are able to accurately predict what will happen in the future; furthermore, we can use our predictions to detect anomalous, unexpected events in both synthetic and real video sequences.
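For a flavor of the underlying objective, here is a minimal (non-convolutional) sparse-coding sketch that infers a code a minimizing ||x - Da||^2 + lambda*||a||_1 via ISTA. The paper's model is convolutional over space and time and learns D as well; this toy version omits both, and all sizes and step parameters are illustrative.

```python
# Minimal ISTA sparse-coding sketch: infer sparse code a so that the
# dictionary D reconstructs input x. Sizes and hyperparameters are arbitrary.
import numpy as np

def ista(x, D, lam=0.1, lr=0.01, steps=200):
    a = np.zeros(D.shape[1])
    for _ in range(steps):
        grad = D.T @ (D @ a - x)            # gradient of reconstruction error
        a = a - lr * grad                   # gradient step
        a = np.sign(a) * np.maximum(np.abs(a) - lr * lam, 0.0)  # soft threshold
    return a

rng = np.random.default_rng(0)
D = rng.normal(size=(64, 128))
D /= np.linalg.norm(D, axis=0)              # unit-norm dictionary atoms
x = D @ (rng.random(128) * (rng.random(128) < 0.05))  # sparse ground truth
a = ista(x, D)
print("reconstruction error:", np.linalg.norm(x - D @ a))
```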