skip to main content


Search for: All records

Award ID contains: 1734910

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. The macaque middle temporal (MT) area is well known for its visual motion selectivity and relevance to motion perception, but the possibility of it also reflecting higher-level cognitive functions has largely been ignored. We tested for effects of task performance distinct from sensory encoding by manipulating subjects' temporal evidence-weighting strategy during a direction discrimination task while performing electrophysiological recordings from groups of MT neurons in rhesus macaques (one male, one female). This revealed multiple components of MT responses that were, surprisingly, not interpretable as behaviorally relevant modulations of motion encoding, or as bottom-up consequences of the readout of motion direction from MT. The time-varying motion-driven responses of MT were strongly affected by our strategic manipulation—but with time courses opposite the subjects' temporal weighting strategies. Furthermore, large choice-correlated signals were represented in population activity distinct from its motion responses, with multiple phases that lagged psychophysical readout and even continued after the stimulus (but which preceded motor responses). In summary, a novel experimental manipulation of strategy allowed us to control the time course of readout to challenge the correlation between sensory responses and choices, and population-level analyses of simultaneously recorded ensembles allowed us to identify strong signals that were so distinct from direction encoding that conventional, single-neuron-centric analyses could not have revealed or properly characterized them. Together, these approaches revealed multiple cognitive contributions to MT responses that are task related but not functionally relevant to encoding or decoding of motion for psychophysical direction discrimination, providing a new perspective on the assumed status of MT as a simple sensory area.

    SIGNIFICANCE STATEMENTThis study extends understanding of the middle temporal (MT) area beyond its representation of visual motion. Combining multineuron recordings, population-level analyses, and controlled manipulation of task strategy, we exposed signals that depended on changes in temporal weighting strategy, but did not manifest as feedforward effects on behavior. This was demonstrated by (1) an inverse relationship between temporal dynamics of behavioral readout and sensory encoding, (2) a choice-correlated signal that always lagged the stimulus time points most correlated with decisions, and (3) a distinct choice-correlated signal after the stimulus. These findings invite re-evaluation of MT for functions outside of its established sensory role and highlight the power of experimenter-controlled changes in temporal strategy, coupled with recording and analysis approaches that transcend the single-neuron perspective.

     
    more » « less
  2. Nonlinear state-space models are powerful tools to describe dynamical structures in complex time series. In a streaming setting where data are processed one sample at a time, simultaneous inference of the state and its nonlinear dynamics has posed significant challenges in practice. We develop a novel online learning framework, leveraging variational inference and sequential Monte Carlo, which enables flexible and accurate Bayesian joint filtering. Our method provides an approximation of the filtering posterior which can be made arbitrarily close to the true filtering distribution for a wide class of dynamics models and observation models. Specifically, the proposed framework can efficiently approximate a posterior over the dynamics using sparse Gaussian processes, allowing for an interpretable model of the latent dynamics. Constant time complexity per sample makes our approach amenable to online learning scenarios and suitable for real-time applications. 
    more » « less
  3. null (Ed.)
  4. Marinazzo, Daniele (Ed.)
  5. null (Ed.)
    The standard approach to fitting an autoregressive spike train model is to maximize the likelihood for one-step prediction. This maximum likelihood estimation (MLE) often leads to models that perform poorly when generating samples recursively for more than one time step. Moreover, the generated spike trains can fail to capture important features of the data and even show diverging firing rates. To alleviate this, we propose to directly minimize the divergence between neural recorded and model generated spike trains using spike train kernels. We develop a method that stochastically optimizes the maximum mean discrepancy induced by the kernel. Experiments performed on both real and synthetic neural data validate the proposed approach, showing that it leads to well-behaving models. Using different combinations of spike train kernels, we show that we can control the trade-off between different features which is critical for dealing with model-mismatch. 
    more » « less
  6. null (Ed.)
    Understanding the nature of representation in neural networks is a goal shared by neuroscience and machine learning. It is therefore exciting that both fields converge not only on shared questions but also on similar approaches. A pressing question in these areas is understanding how the structure of the representation used by neural networks affects both their generalization, and robustness to perturbations. In this work, we investigate the latter by juxtaposing experimental results regarding the covariance spectrum of neural representations in the mouse V1 (Stringer et al) with artificial neural networks. We use adversarial robustness to probe Stringer et al's theory regarding the causal role of a 1/n covariance spectrum. We empirically investigate the benefits such a neural code confers in neural networks, and illuminate its role in multi-layer architectures. Our results show that imposing the experimentally observed structure on artificial neural networks makes them more robust to adversarial attacks. Moreover, our findings complement the existing theory relating wide neural networks to kernel methods, by showing the role of intermediate representations. 
    more » « less
  7. null (Ed.)
    Recently mean field theory has been successfully used to analyze properties of wide, random neural networks. It gave rise to a prescriptive theory for initializing feed-forward neural networks with orthogonal weights, which ensures that both the forward propagated activations and the backpropagated gradients are near isometries and as a consequence training is orders of magnitude faster. Despite strong empirical performance, the mechanisms by which critical initializations confer an advantage in the optimization of deep neural networks are poorly understood. Here we show a novel connection between the maximum curvature of the optimization landscape (gradient smoothness) as measured by the Fisher information matrix (FIM) and the spectral radius of the input-output Jacobian, which partially explains why more isometric networks can train much faster. Furthermore, given that orthogonal weights are necessary to ensure that gradient norms are approximately preserved at initialization, we experimentally investigate the benefits of maintaining orthogonality throughout training, and we conclude that manifold optimization of weights performs well regardless of the smoothness of the gradients. Moreover, we observe a surprising yet robust behavior of highly isometric initializations --- even though such networks have a lower FIM condition number \emph{at initialization}, and therefore by analogy to convex functions should be easier to optimize, experimentally they prove to be much harder to train with stochastic gradient descent. We conjecture the FIM condition number plays a non-trivial role in the optimization. 
    more » « less
  8. Exploding and vanishing gradient are both major problems often faced when an artificial neural network is trained with gradient descent. Inspired by the ubiquity and robustness of nonlinear oscillations in biological neural systems, we investigate the properties of their artificial counterpart, the stable limit cycle neural networks. Using a continuous time dynamical system interpretation of neural networks and backpropagation, we show that stable limit cycle neural networks have non-exploding gradients, and at least one effective non-vanishing gradient dimension. We conjecture that limit cycles can support the learning of long temporal dependence in both biological and artificial neural networks. 
    more » « less
  9. Many real-world systems studied are governed by complex, nonlinear dynamics. By modeling these dynamics, we can gain insight into how these systems work, make predictions about how they will behave, and develop strategies for controlling them. While there are many methods for modeling nonlinear dynamical systems, existing techniques face a trade off between offering interpretable descriptions and making accurate predictions. Here, we develop a class of models that aims to achieve both simultaneously, smoothly interpolating between simple descriptions and more complex, yet also more accurate models. Our probabilistic model achieves this multi-scale property through a hierarchy of locally linear dynamics that jointly approximate global nonlinear dynamics. We call it the tree-structured recurrent switching linear dynamical system. To fit this model, we present a fully-Bayesian sampling procedure using Polya-Gamma data augmentation to allow for fast and conjugate Gibbs sampling. Through a variety of synthetic and real examples, we show how these models outperform existing methods in both interpretability and predictive capability. 
    more » « less