skip to main content


Title: Densified Winner Take All (WTA) Hashing for Sparse Datasets
WTA (Winner Take All) hashing has been successfully applied in many large-scale vision applications. This hashing scheme was tailored to take advantage of the comparative reasoning (or order based information), which showed significant accuracy improvements. In this paper, we identify a subtle issue with WTA, which grows with the sparsity of the datasets. This issue limits the discriminative power of WTA. We then propose a solution to this problem based on the idea of Densification which makes use of 2-universal hash functions in a novel way. Our experiments show that Densified WTA Hashing outperforms Vanilla WTA Hashing both in image retrieval and classification tasks consistently and significantly  more » « less
Award ID(s):
1718478 1652131
NSF-PAR ID:
10065989
Author(s) / Creator(s):
Date Published:
Journal Name:
Uncertainty in artificial intelligence
ISSN:
1525-3384
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Winner-take-all (WTA) refers to the neural operation that selects a (typically small) group of neurons from a large neuron pool. It is conjectured to underlie many of the brain’s fundamental computational abilities. However, not much is known about the robustness of a spike-based WTA network to the inherent randomness of the input spike trains.In this work, we consider a spike-based k–WTA model wherein n randomly generated input spike trains compete with each other based on their underlying firing rates and k winners are supposed to be selected. We slot the time evenly with each time slot of length 1 ms and model then input spike trains as n independent Bernoulli processes. We analytically characterize the minimum waiting time needed so that a target minimax decision accuracy (success probability) can be reached. We first derive an information-theoretic lower bound on the waiting time. We show that to guarantee a (minimax) decision error≤δ(whereδ∈(0,1)), the waiting time of any WTA circuit is at least ((1−δ)log(k(n−k)+1)−1)TR,where R⊆(0,1)is a finite set of rates and TR is a difficulty parameter of a WTA task with respect to set R for independent input spike trains. Additionally,TR is independent of δ,n,and k. We then design a simple WTA circuit whose waiting time is 2524L. Su, C.-J. Chang, and N. Lynch O((log(1δ)+logk(n−k))TR),provided that the local memory of each output neuron is sufficiently long. It turns out that for any fixed δ, this decision time is order-optimal (i.e., it matches the above lower bound up to a multiplicative constant factor) in terms of its scaling inn,k, and TR. 
    more » « less
  2. Winner-Take-All (WTA) refers to the neural operation that selects a (typically small) group of neurons from a large neuron pool. It is conjectured to underlie many of the brain’s fundamental computational abilities.However, not much is known about the robustness of a spike-based WTA network to the inherent randomness of the input spike trains. In this work, we consider a spike-based k–WTA model where in n randomly generated input spike trains compete with each other based on their underlying firing rates, and k winners are supposed to be selected. We slot the time evenly with each time slot of length 1ms, and model then input spike trains as n independent Bernoulli processes. We analytically characterize the minimum waiting time needed so that a target minimax decision accuracy (success probability) can be reached.We first derive an information-theoretic lower bound on the decision time. We show that to guarantee a (minimax) decision error≤δ(whereδ∈(0,1)), the waiting time of any WTA circuit is at least((1−δ) log(k(n−k) + 1)−1)TR,whereR ⊆(0,1) is a finite set of rates, and TR is a difficulty parameter of a WTA task with respect to setRfor independent input spike trains.Additionally,TR is independent ofδ,n, andk. We then design a simple WTA circuit whose waiting time isO((log(1δ)+ logk(n−k))TR), provided that the local memory of each output neuron is sufficiently long. It turns out that for any fixed δ, this decision time is order-optimal (i.e., it 2 matches the above lower bound up to a multiplicative constant factor) in terms of its scaling inn,k, and TR. 
    more » « less
  3. We propose a simple change to existing neural network structures for better defending against gradient-based adversarial attacks. Instead of using popular activation functions (such as ReLU), we advocate the use of k-Winners-Take-All (k-WTA) activation, a C0 discontinuous function that purposely invalidates the neural network model's gradient at densely distributed input data points. The proposed k-WTA activation can be readily used in nearly all existing networks and training methods with no significant overhead. Our proposal is theoretically rationalized. We analyze why the discontinuities in k-WTA networks can largely prevent gradient-based search of adversarial examples and why they at the same time remain innocuous to the network training. This understanding is also empirically backed. We test k-WTA activation on various network structures optimized by a training method, be it adversarial training or not. In all cases, the robustness of k-WTA networks outperforms that of traditional networks under white-box attacks. 
    more » « less
  4. null (Ed.)
    An elemental computation in the brain is to identify the best in a set of options and report its value. It is required for inference, decision-making, optimization, action selection, consensus, and foraging. Neural computing is considered powerful because of its parallelism; however, it is unclear whether neurons can perform this max-finding operation in a way that improves upon the prohibitively slow optimal serial max-finding computation (which takes ∼ N ⁡ log ( N ) time for N noisy candidate options) by a factor of N, the benchmark for parallel computation. Biologically plausible architectures for this task are winner-take-all (WTA) networks, where individual neurons inhibit each other so only those with the largest input remain active. We show that conventional WTA networks fail the parallelism benchmark and, worse, in the presence of noise, altogether fail to produce a winner when N is large. We introduce the nWTA network, in which neurons are equipped with a second nonlinearity that prevents weakly active neurons from contributing inhibition. Without parameter fine-tuning or rescaling as N varies, the nWTA network achieves the parallelism benchmark. The network reproduces experimentally observed phenomena like Hick’s law without needing an additional readout stage or adaptive N-dependent thresholds. Our work bridges scales by linking cellular nonlinearities to circuit-level decision-making, establishes that distributed computation saturating the parallelism benchmark is possible in networks of noisy, finite-memory neurons, and shows that Hick’s law may be a symptom of near-optimal parallel decision-making with noisy input. 
    more » « less
  5. Neuromorphic computing is a promising candidate for beyond-von Neumann computer architectures, featuring low power consumption and high parallelism. Lateral inhibition and winner-take-all (WTA) features play a crucial role in neuronal competition of the nervous system as well as neuromorphic hardwares. The domain wall - magnetic tunnel junction (DWMTJ) neuron is an emerging spintronic artificial neuron device exhibiting intrinsic lateral inhibition. In this paper we show that lateral inhibition parameters modulate the neuron firing statistics in a DW-MTJ neuron array, thus emulating soft-winner-take-all (WTA) and firing group selection. 
    more » « less