Much of sensory neuroscience focuses on sensory features chosen by the experimenter because they are thought to be behaviorally relevant to the organism. However, it is not generally known what these features are in complex, natural scenes. This work uses the retinal encoding of natural movies to determine the presumably behaviorally relevant features that the brain represents. Because fully parameterizing a natural movie and its retinal encoding is prohibitive, we use time within a natural movie as a proxy for the whole suite of features evolving across the scene. We then use a task-agnostic deep architecture, an encoder-decoder, to model the retinal encoding process and characterize its representation of "time in the natural scene" in a compressed latent space. In our end-to-end training, an encoder learns a compressed latent representation from a large population of salamander retinal ganglion cells responding to natural movies, while a decoder samples from this compressed latent space to generate the appropriate movie frame. By comparing latent representations of retinal activity from three movies, we find that the retina performs transfer learning to encode time: the precise, low-dimensional representation of time learned from one movie can be used to represent time in a different movie, with up to 17 ms resolution. We then show that static textures and velocity features of a natural movie are synergistic: the retina simultaneously encodes both to establish a generalizable, low-dimensional representation of time in the natural scene.
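To make the architecture concrete, here is a minimal sketch of such an encoder-decoder in PyTorch. Everything in it (the class name RetinalEncoderDecoder, layer widths, number of cells, latent dimensionality, frame size) is an illustrative assumption, not the paper's actual model.

```python
# Minimal encoder-decoder sketch: spikes -> latent -> movie frame.
# All dimensions and layer sizes are assumptions for illustration.
import torch
import torch.nn as nn

class RetinalEncoderDecoder(nn.Module):
    def __init__(self, n_cells=200, latent_dim=8, frame_hw=(64, 64)):
        super().__init__()
        h, w = frame_hw
        # Encoder: population spike counts -> compressed latent code
        self.encoder = nn.Sequential(
            nn.Linear(n_cells, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )
        # Decoder: latent code -> reconstructed movie frame
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, h * w), nn.Sigmoid(),
        )
        self.frame_hw = frame_hw

    def forward(self, spikes):
        z = self.encoder(spikes)          # compressed latent representation
        frame = self.decoder(z)           # generated frame (flattened)
        return z, frame.view(-1, *self.frame_hw)

model = RetinalEncoderDecoder()
spikes = torch.rand(32, 200)              # fake population responses
target = torch.rand(32, 64, 64)           # fake movie frames
z, recon = model(spikes)
loss = nn.functional.mse_loss(recon, target)
loss.backward()
```

Training end-to-end on (spike, frame) pairs forces the latent code z to carry whatever stimulus information the population response preserves, which is the sense in which the latent space characterizes the encoding.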
Stimulus invariant aspects of the retinal code drive discriminability of natural scenes
Everything that the brain sees must first be encoded by the retina, which maintains a reliable representation of the visual world across many different, complex natural scenes while also adapting to stimulus changes. This study quantifies whether and how the retina selectively encodes stimulus features about scene identity in complex naturalistic environments. While a wealth of previous work has examined the static and dynamic features of the population code in retinal ganglion cells, less is known about how populations form encodings that are both flexible and reliable in natural moving scenes. We record from the larval salamander retina responding to five different natural movies, over many repeats, and use these data to characterize the population code in terms of single-cell fluctuations in rate and pairwise couplings between cells. Decomposing the population code into independent single-cell terms and cell-cell interactions reveals how broad scene structure is encoded in the retinal output: while the single-cell activity adapts to different stimuli, the population structure captured in the sparse, strong couplings is consistent across natural movies as well as synthetic stimuli. We show that these interactions contribute to encoding scene identity. We also demonstrate that this structure likely arises in part from shared bipolar cell input as well as from gap junctions between retinal ganglion cells and amacrine cells.
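As a hedged illustration of splitting a population code into independent rates and sparse pairwise couplings, the sketch below uses scikit-learn's sparse inverse-covariance estimator as a stand-in; the study's actual population model is not specified in the abstract, and the spike data here are simulated.

```python
# Rate/coupling decomposition sketch on simulated spike trains.
# GraphicalLassoCV is a stand-in for the paper's model, not its method.
import numpy as np
from sklearn.covariance import GraphicalLassoCV

rng = np.random.default_rng(0)
# Fake binarized spike trains: n_bins time bins x n_cells neurons
spikes = (rng.random((5000, 40)) < 0.1).astype(float)

# Independent part of the code: single-cell firing rates
rates = spikes.mean(axis=0)

# Interaction part: sparse pairwise couplings (off-diagonal precision);
# the L1 penalty keeps only the strongest cell-cell interactions
model = GraphicalLassoCV().fit(spikes)
couplings = -model.precision_.copy()
np.fill_diagonal(couplings, 0.0)

# Per the abstract, `rates` would adapt across movies while the sparse,
# strong entries of `couplings` would stay consistent.
```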
- Award ID(s): 2235451
- PAR ID: 10609193
- Publisher / Repository: bioRxiv
- Date Published:
- Format(s): Medium: X
- Institution: bioRxiv
- Sponsoring Org: National Science Foundation
More Like this
We have created encoding manifolds to reveal the overall responses of a brain area to a variety of stimuli. Encoding manifolds organize response properties globally: each point on an encoding manifold is a neuron, and nearby neurons respond similarly to the stimulus ensemble in time. We previously found, using a large stimulus ensemble including optic flows, that encoding manifolds for the retina were highly clustered, with each cluster corresponding to a different ganglion cell type. In contrast, the topology of the V1 manifold was continuous. Now, using responses of individual neurons from the Allen Institute Visual Coding-Neuropixels dataset in the mouse, we infer encoding manifolds for V1 and for five higher cortical visual areas (VISam, VISal, VISpm, VISlm, and VISrl). We show here that the encoding manifold topology computed only from responses to various grating stimuli is also continuous, not only for V1 but also for the higher visual areas, with smooth coordinates spanning it that include, among others, orientation selectivity and firing-rate magnitude. Surprisingly, the encoding manifold for gratings also provides information about natural scene responses. To investigate whether neurons respond more strongly to gratings or natural scenes, we plot the log ratio of natural scene responses to grating responses (mean firing rates) on the encoding manifold. This reveals a global coordinate axis organizing neurons' preferences between these two stimuli. This coordinate is orthogonal (i.e., uncorrelated) to the one organizing firing-rate magnitudes in VISp. Analyzing responses by layer shows that a preference for gratings is concentrated in layer 6, whereas preference for natural scenes tends to be higher in layers 2/3 and 4. We also find that preference for natural scenes dominates the responses of neurons that prefer low (0.02 cpd) and high (0.32 cpd) spatial frequencies, rather than intermediate ones (0.04 to 0.16 cpd). Conclusion: while gratings seem limited and natural scenes unconstrained, machine learning algorithms can reveal subtle relationships between them beyond linear techniques.
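The preference coordinate itself is a simple per-neuron computation. The sketch below illustrates it on fabricated firing rates (not the Allen Institute data) and checks the orthogonality claim as a correlation:

```python
# Log-ratio preference coordinate on simulated per-neuron mean rates.
# All values are fabricated placeholders for illustration only.
import numpy as np

rng = np.random.default_rng(1)
n_neurons = 500
ns_rates = rng.gamma(2.0, 2.0, n_neurons)       # natural-scene responses
grating_rates = rng.gamma(2.0, 2.0, n_neurons)  # grating responses

# Per-neuron preference between the two stimulus ensembles
log_ratio = np.log(ns_rates / grating_rates)

# Orthogonality check: correlate the preference axis with overall
# firing-rate magnitude (the other manifold coordinate mentioned above)
magnitude = np.log(ns_rates + grating_rates)
r = np.corrcoef(log_ratio, magnitude)[0, 1]
print(f"correlation with rate magnitude: {r:.3f}")
```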
Two interleaved stimulus sets were identical except for the background. In one, the flow stimuli background was the mid-gray of the interstimulus interval (equal background, eqbg), leading to a change of 9-10% in the space-average luminance. In the other, the space-average luminance of the entire stimulus field was adjusted to a constant (equal luminance, eqlum) within 0.5%; i.e., the background was slightly lightened when the dots in the flow were dark, and darkened when the dots were bright. Most cortical cells appeared to respond similarly to the two stimulus sets, as if stimulus structure mattered but not the background change, while the responses of most retinal ganglion cells appeared to differ between the two conditions. Machine learning algorithms confirmed this quantitatively. A manifold embedding of neurons' responses to the two stimulus sets was constructed using diffusion maps. In this manifold, the responses of the same cell to eqlum and eqbg stimuli were significantly closer to one another for V1 than for the retina. Geometrically, the median ratio of the distance between the responses of each cell to the two stimulus sets, compared to the distance to the closest cell on the manifold, was 3.5 for V1 versus 12.7 for retina. Topologically, the fraction of cells for which the responses of the same cell to the two stimulus sets were connected in the diffusion map datagraph was 53% for V1 but only 9% for retina; when retina and cortex were co-embedded in the manifold, these fractions were 44% and 6%. While retina and cortex differ on average, it will be intriguing to determine whether particular classes of retinal cells behave more like V1 neurons, and vice versa.
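For readers unfamiliar with the embedding step, a minimal diffusion map can be written directly in NumPy. The response matrix, kernel bandwidth rule, and component count below are illustrative assumptions, not the study's settings:

```python
# Minimal diffusion-map embedding of neurons' response vectors.
# Data and bandwidth choice are assumptions for illustration only.
import numpy as np

def diffusion_map(X, n_components=2):
    # Pairwise squared distances between neurons' response vectors
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    eps = np.median(d2)                   # bandwidth: median squared distance
    K = np.exp(-d2 / eps)                 # Gaussian affinity kernel
    P = K / K.sum(axis=1, keepdims=True)  # row-normalized Markov matrix
    evals, evecs = np.linalg.eig(P)
    order = np.argsort(-evals.real)
    # Skip the trivial constant eigenvector, keep the next components
    idx = order[1:n_components + 1]
    return evecs[:, idx].real * evals[idx].real

rng = np.random.default_rng(2)
responses = rng.standard_normal((100, 50))  # 100 neurons x 50 time bins
embedding = diffusion_map(responses)
# Distances between a cell's eqlum and eqbg responses would be measured
# in this embedding, as in the comparison described above.
```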
Convolutional neural networks (CNNs), a class of deep learning models, have experienced recent success in modeling sensory cortices and retinal circuits through optimizing performance on machine learning tasks, otherwise known as task optimization. Previous research has shown task-optimized CNNs to be capable of explaining why the retina efficiently encodes natural stimuli and how certain retinal cell types are involved in efficient encoding. In our work, we sought to use task-optimized CNNs as a means of explaining computational mechanisms responsible for motion-selective retinal circuits. We designed a biologically constrained CNN and optimized its performance on a motion-classification task. We drew inspiration from psychophysics, deep learning, and systems neuroscience literature to develop a toolbox of methods to reverse engineer the computational mechanisms learned in our model. Through reverse engineering our model, we proposed a computational mechanism in which direction-selective ganglion cells and starburst amacrine cells, both experimentally observed retinal cell types, emerge in our model to discriminate among moving stimuli. This emergence suggests that direction-selective circuits in the retina are ecologically designed to robustly discriminate among moving stimuli. Our results and methods also provide a framework for how to build more interpretable deep learning models and how to understand them.
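A hedged sketch of what a task-optimized, motion-classifying CNN might look like follows; the layer sizes, clip length, and number of motion classes are assumptions, and the biological constraints described in the abstract are not reproduced here:

```python
# Toy motion-classification CNN; all hyperparameters are assumptions.
import torch
import torch.nn as nn

class MotionNet(nn.Module):
    def __init__(self, n_frames=10, n_classes=8):
        super().__init__()
        # Treat the clip's frames as input channels, so early filters can
        # combine information across time (direction selectivity requires
        # spatiotemporal filtering)
        self.features = nn.Sequential(
            nn.Conv2d(n_frames, 16, kernel_size=9), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, clip):
        return self.classifier(self.features(clip).flatten(1))

clips = torch.rand(4, 10, 64, 64)   # batch of fake movie clips
logits = MotionNet()(clips)         # per-clip motion-direction scores
```

Reverse engineering then asks which learned units behave like direction-selective ganglion cells or starburst amacrine cells after training on the classification objective.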
Decoding sensory stimuli from neural activity can provide insight into how the nervous system might interpret the physical environment, and facilitates the development of brain-machine interfaces. Nevertheless, the neural decoding problem remains a significant open challenge. Here, we present an efficient nonlinear decoding approach for inferring natural scene stimuli from the spiking activities of retinal ganglion cells (RGCs). Our approach uses neural networks to improve on existing decoders in both accuracy and scalability. Trained and validated on real retinal spike data from more than 1000 simultaneously recorded macaque RGC units, the decoder demonstrates the necessity of nonlinear computations for accurate decoding of the fine structures of visual stimuli. Specifically, high-pass spatial features of natural images can only be decoded using nonlinear techniques, while low-pass features can be extracted equally well by linear and nonlinear methods. Together, these results advance the state of the art in decoding natural stimuli from large populations of neurons.
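The linear/nonlinear contrast can be made concrete with a toy comparison; below, both decoders map fake spike counts to image pixels, with all sizes assumed rather than taken from the paper:

```python
# Toy contrast between a linear readout and a nonlinear (MLP) decoder
# mapping RGC spike counts to image pixels. Sizes and data are fabricated.
import torch
import torch.nn as nn

n_cells, n_pixels = 1000, 64 * 64

linear_decoder = nn.Linear(n_cells, n_pixels)
nonlinear_decoder = nn.Sequential(
    nn.Linear(n_cells, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, n_pixels),
)

spikes = torch.rand(16, n_cells)     # fake spike counts per time bin
images = torch.rand(16, n_pixels)    # fake target image patches
for name, decoder in [("linear", linear_decoder),
                      ("nonlinear", nonlinear_decoder)]:
    loss = nn.functional.mse_loss(decoder(spikes), images)
    print(name, loss.item())
# Per the abstract, the nonlinear decoder's advantage should concentrate
# in high-pass spatial features of the reconstructed images.
```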