Biological neural computation is inherently asynchronous due to large variations in neuronal spike timing and transmission delays. Sofar, most theoretical work on neural networks assumes the synchronous setting where neurons fire simultaneously in discrete rounds. In this work we aim at understanding the barriers of asynchronous neural computation from an algorithmic perspective. We consider an extension of the widely studied model of synchronized spiking neurons [Maass, Neural Networks 97] to the asynchronous setting by taking into account edge and node delays.  Edge Delays: We define an asynchronous model for spiking neurons in which the latency values (i.e., transmission delays) of non selfloop edges vary adversarially over time. This extends the recent work of [Hitron and Parter, ESA'19] in which the latency values are restricted to be fixed over time. Our first contribution is an impossibility result that implies that the assumption that selfloop edges have no delays (as assumed in Hitron and Parter) is indeed necessary. Interestingly, in real biological networks selfloop edges (a.k.a. autapse) are indeed free of delays, and the latter has been noted by neuroscientists to be crucial for network synchronization. To capture the computational challenges in this setting, we first consider the implementation of a single NOT gate. This simple function already captures the fundamental difficulties in the asynchronous setting. Our key technical results are space and time upper and lower bounds for the NOT function, our time bounds are tight. In the spirit of the distributed synchronizers [Awerbuch and Peleg, FOCS'90] and following [Hitron and Parter, ESA'19], we then provide a general synchronizer machinery. Our construction is very modular and it is based on efficient circuit implementation of threshold gates. The complexity of our scheme is measured by the overhead in the number of neurons and the computation time, both are shown to be polynomial in the largest latency value, and the largest incoming degree Δ of the original network.  Node Delays: We introduce the study of asynchronous communication due to variations in the response rates of the neurons in the network. In real brain networks, the round duration varies between different neurons in the network. Our key result is a simulation methodology that allows one to transform the above mentioned synchronized solution under edge delays into a synchronized under node delays while incurring a small overhead w.r.t space and time.
more »
« less
The Computational Cost of Asynchronous Neural Communication
Biological neural computation is inherently asynchronous due to large variations in neuronal spike timing and transmission delays. Sofar, most theoretical work on neural networks assumes the synchronous setting where neurons fire simultaneously in discrete rounds. In this work we aim at understanding the barriers of asynchronous neural computation from an algorithmic perspective. We consider an extension of the widely studied model of synchronized spiking neurons [Maass, Neural= Networks 97] to the asynchronous setting by taking into account edge and node delays.
Edge Delays: We define an asynchronous model for spiking neurons in which the latency values (i.e., transmission delays) of non selfloop edges vary adversarially over time. This extends the recent work of [Hitron and Parter, ESA’19] in which the latency values are restricted to be fixed over time. Our first contribution is an impossibility result that implies that the assumption that selfloop edges have no delays (as assumed in Hitron and Parter) is indeed necessary. Interestingly, in real biological networks selfloop edges (a.k.a. autapse) are indeed free of delays, and the latter has been noted by neuroscientists to be crucial for network synchronization. To capture the computational challenges in this setting, we first consider the implementation of a single NOT gate. This simple function already captures the fundamental difficulties in the asynchronous setting. Our key technical results are space and time upper and lower bounds for the NOT function, our time bounds are tight. In the spirit of the distributed synchronizers [Awerbuch and Peleg, FOCS’90] and following [Hitron and Parter, ESA’19], we then provide a general synchronizer machinery. Our construction is very modular and it is based on efficient circuit implementation of threshold gates. The complexity of our scheme is measured by the overhead in the number of neurons and the computation time, both are shown to be polynomial in the largest latency value, and the largest incoming degree ∆ of the original network.
Node Delays: We introduce the study of asynchronous communication due to variations in the response rates of the neurons in the network. In real brain networks, the round duration varies between different neurons in the network. Our key result is a simulation methodology that allows one to transform the above mentioned synchronized solution under edge delays into a synchronized under node delays while incurring a small overhead w.r.t space and time.
more »
« less
 Award ID(s):
 1810758
 NSFPAR ID:
 10324574
 Date Published:
 Journal Name:
 11th Innovations in Theoretical Computer Science Conference (ITCS 2020)
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this


We consider the task of measuring time with probabilistic threshold gates implemented by bioinspired spiking neurons. In the model of spiking neural networks, network evolves in discrete rounds, where in each round, neurons fire in pulses in response to a sufficiently high membrane potential. This potential is induced by spikes from neighboring neurons that fired in the previous round, which can have either an excitatory or inhibitory effect. Discovering the underlying mechanisms by which the brain perceives the duration of time is one of the largest open enigma in computational neuroscience. To gain a better algorithmic understanding onto these processes, we introduce the neural timer problem. In this problem, one is given a time parameter t, an input neuron x, and an output neuron y. It is then required to design a minimum sized neural network (measured by the number of auxiliary neurons) in which every spike from x in a given round i, makes the output y fire for the subsequent t consecutive rounds.We first consider a deterministic implementation of a neural timer and show that Θ(logt)(deterministic) threshold gates are both sufficient and necessary. This raised the question of whether randomness can be leveraged to reduce the number of neurons. We answer this question in the affirmative by considering neural timers with spiking neurons where the neuron y is required to fire for t consecutive rounds with probability at least 1−δ, and should stop firing after at most 2 t rounds with probability 1−δ for some input parameter δ∈(0,1). Our key result is a construction of a neural timer with O(log log 1/δ) spiking neurons. Interestingly, this construction uses only one spiking neuron, while the remaining neurons can be deterministic threshold gates. We complement this construction with a matching lower bound of Ω(min{log log 1/δ,logt}) neurons. This provides the first separation between deterministic and randomized constructions in the setting of spiking neural networks.Finally, we demonstrate the usefulness of compressed counting networks for synchronizing neural networks. In the spirit of distributed synchronizers [AwerbuchPeleg, FOCS’90], we provide a general transformation (or simulation) that can take any synchronized network solution and simulate it in an asynchronous setting (where edges have arbitrary response latencies) while incurring a small overhead w.r.t the number of neurons and computation time.more » « less

We initiate the study of biologicallyinspired spiking neural networks from the perspective of streaming algorithms. Like computers, human brains face memory limitations, which pose a significant obstacle when processing large scale and dynamically changing data. In computer science, these challenges are captured by the wellknown streaming model, which can be traced back to Munro and Paterson `78 and has had significant impact in theory and beyond. In the classical streaming setting, one must compute a function f of a stream of updates 𝒮 = {u₁,…,u_m}, given restricted singlepass access to the stream. The primary complexity measure is the space used by the algorithm. In contrast to the large body of work on streaming algorithms, relatively little is known about the computational aspects of data processing in spiking neural networks. In this work, we seek to connect these two models, leveraging techniques developed for streaming algorithms to better understand neural computation. Our primary goal is to design networks for various computational tasks using as few auxiliary (noninput or output) neurons as possible. The number of auxiliary neurons can be thought of as the "space" required by the network. Previous algorithmic work in spiking neural networks has many similarities with streaming algorithms. However, the connection between these two spacelimited models has not been formally addressed. We take the first steps towards understanding this connection. On the upper bound side, we design neural algorithms based on known streaming algorithms for fundamental tasks, including distinct elements, approximate median, and heavy hitters. The number of neurons in our solutions almost match the space bounds of the corresponding streaming algorithms. As a general algorithmic primitive, we show how to implement the important streaming technique of linear sketching efficiently in spiking neural networks. On the lower bound side, we give a generic reduction, showing that any spaceefficient spiking neural network can be simulated by a spaceefficient streaming algorithm. This reduction lets us translate streamingspace lower bounds into nearly matching neuralspace lower bounds, establishing a close connection between the two models.more » « less

Evolution has honed predatory skills in the natural world where localizing and intercepting fastmoving prey is required. The current generation of robotic systems mimics these biological systems using deep learning. Highspeed processing of the camera frames using convolutional neural networks (CNN) (frame pipeline) on such constrained aerial edgerobots gets resourcelimited. Adding more compute resources also eventually limits the throughput at the frame rate of the camera as frameonly traditional systems fail to capture the detailed temporal dynamics of the environment. Bioinspired event cameras and spiking neural networks (SNN) provide an asynchronous sensorprocessor pair (event pipeline) capturing the continuous temporal details of the scene for highspeed but lag in terms of accuracy. In this work, we propose a target localization system combining eventcamera and SNNbased highspeed target estimation and framebased camera and CNNdriven reliable object detection by fusing complementary spatiotemporal prowess of event and frame pipelines. One of our main contributions involves the design of an SNN filter that borrows from the neural mechanism for egomotion cancelation in houseflies. It fuses the vestibular sensors with the vision to cancel the activity corresponding to the predator's selfmotion. We also integrate the neuroinspired multipipeline processing with taskoptimized multineuronal pathway structure in primates and insects. The system is validated to outperform CNNonly processing using preypredator drone simulations in realistic 3D virtual environments. The system is then demonstrated in a realworld multidrone setup with emulated event data. Subsequently, we use recorded actual sensory data from multicamera and inertial measurement unit (IMU) assembly to show desired working while tolerating the realistic noise in vision and IMU sensors. We analyze the design space to identify optimal parameters for spiking neurons, CNN models, and for checking their effect on the performance metrics of the fused system. Finally, we map the throughput controlling SNN and fusion network on edgecompatible Zynq7000 FPGA to show a potential 264 outputs per second even at constrained resource availability. This work may open new research directions by coupling multiple sensing and processing modalities inspired by discoveries in neuroscience to break fundamental tradeoffs in framebased computer vision 1 .more » « less

null (Ed.)Deep neural networks (DNNs) are increasingly used for realtime inference, requiring low latency, but require significant computational power as they continue to increase in complexity. Edge clouds promise to offer lower latency due to their proximity to endusers and having powerful accelerators like GPUs to provide the computation power needed for DNNs. But it is also important to ensure that the edgecloud resources are utilized well. For this, multiplexing several DNN models through spatial sharing of the GPU can substantially improve edgecloud resource usage. Typical GPU runtime environments have significant interactions with the CPU, to transfer data to the GPU, for CPUGPU synchronization on inference task completions, etc. These result in overheads. We present a DNN inference framework with a set of software primitives that reduce the overhead for DNN inference, increase GPU utilization and improve performance, with lower latency and higher throughput. Our first primitive uses the GPU DMA effectively, reducing the CPU cycles spent to transfer the data to the GPU. A second primitive uses asynchronous ‘events’ for faster task completion notification. GPU runtimes typically preclude finegrained user control on GPU resources, causing long GPU downtimes when adjusting resources. Our third primitive supports overlapping of modelloading and execution, thus allowing GPU resource reallocation with very little GPU idle time. Our other primitives increase inference throughput by improving scheduling and processing more requests. Overall, our primitives decrease inference latency by more than 35% and increase DNN throughput by 23×.more » « less