We propose SinGAN-GIF, an extension of the image-based SinGAN [27] to GIFs or short video snippets. Our method learns the distribution of both the image patches in the GIF and their motion patterns. We do so by using a pyramid of 3D and 2D convolutional networks to model temporal information while reducing model parameters and training time, along with an image discriminator and a video discriminator. SinGAN-GIF can generate similar-looking video samples of natural scenes at different spatial resolutions or temporal frame rates, and can be extended to other video applications such as video editing, super-resolution, and motion transfer. The project page, with supplementary video results, is: https://rajat95.github.io/singan-gif/
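The coarse-to-fine pyramid at the heart of this approach can be illustrated with a minimal sketch. This is not the authors' code: the pooling factors and the schedule of when temporal resolution is reduced are illustrative assumptions, and the generator/discriminator networks themselves are omitted.

```python
import numpy as np

def downsample(video, s_factor=2, t_factor=1):
    """Average-pool a (T, H, W) video by t_factor in time and s_factor in space."""
    T, H, W = video.shape
    T2, H2, W2 = T // t_factor, H // s_factor, W // s_factor
    v = video[:T2 * t_factor, :H2 * s_factor, :W2 * s_factor]
    return v.reshape(T2, t_factor, H2, s_factor, W2, s_factor).mean(axis=(1, 3, 5))

def build_pyramid(video, n_levels=3):
    """Build a spatiotemporal pyramid from a single video.

    Spatial resolution is halved at every level; temporal resolution is
    halved only once, toward the coarser levels (a hypothetical schedule).
    Returned coarsest-first, the order in which SinGAN-style training proceeds.
    """
    levels = [video]
    for i in range(n_levels - 1):
        t = 2 if i == 0 else 1  # reduce frame rate once, then keep it fixed
        levels.append(downsample(levels[-1], s_factor=2, t_factor=t))
    return levels[::-1]
```

In a SinGAN-style setup, each pyramid level would have its own small generator trained against patch discriminators at that scale; the coarse levels capture global motion while fine levels add spatial detail.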
Dynamic Electrode-to-Image (DETI) mapping reveals the human brain’s spatiotemporal code of visual information
A number of neuroimaging techniques have been employed to understand how visual information is transformed along the visual pathway. Although each technique has spatial and temporal limitations, each can provide important insights into the visual code. While the BOLD signal of fMRI can be quite informative, the visual code is not static, and its dynamics are obscured by fMRI's poor temporal resolution. In this study, we leveraged the high temporal resolution of EEG to develop an encoding technique based on the distribution of responses generated by a population of real-world scenes. This approach maps neural signals to each pixel within a given image and reveals location-specific transformations of the visual code, providing a spatiotemporal signature for the image at each electrode. Our analyses of the mapping results revealed that scenes undergo a series of nonuniform transformations that prioritize different spatial frequencies at different regions of scenes over time. This mapping technique offers a potential avenue for future studies to explore how dynamic feedforward and recurrent processes inform and refine high-level representations of our visual world.
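The core idea of mapping a neural signal back onto individual pixels can be sketched in a simplified form. The sketch below is an assumption-laden stand-in for the paper's distribution-based encoding: it uses a plain Pearson correlation between each pixel's luminance across a scene set and a single electrode's response at each timepoint, which is much cruder than the actual DETI procedure.

```python
import numpy as np

def pixel_encoding_map(scenes, eeg, eps=1e-12):
    """Correlate each pixel's luminance across scenes with EEG amplitude
    at each timepoint, yielding one spatiotemporal map per electrode.

    scenes: (N, H, W) luminance images shown to the observer.
    eeg:    (N, T) single-electrode responses, one row per scene.
    Returns an (H, W, T) map of Pearson correlations: a location-specific
    signature of when and where the electrode's signal tracks the image.
    """
    N, H, W = scenes.shape
    X = scenes.reshape(N, -1)
    X = (X - X.mean(0)) / (X.std(0) + eps)   # z-score each pixel across scenes
    Y = (eeg - eeg.mean(0)) / (eeg.std(0) + eps)  # z-score each timepoint
    r = X.T @ Y / N                           # (H*W, T) correlation matrix
    return r.reshape(H, W, eeg.shape[1])
```

Reading the map along its time axis shows, for each image location, when the electrode's signal carries information about that location, which is the kind of spatiotemporal signature the abstract describes.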
- Award ID(s):
- 1736274
- PAR ID:
- 10382439
- Editor(s):
- Schyns, Philippe George
- Date Published:
- Journal Name:
- PLOS Computational Biology
- Volume:
- 17
- Issue:
- 9
- ISSN:
- 1553-7358
- Page Range / eLocation ID:
- e1009456
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
Abstract: How the brain encodes, recognizes, and memorizes general visual objects is a fundamental question in neuroscience. Here, we investigated the neural processes underlying visual object perception and memory by recording from 3173 single neurons in the human amygdala and hippocampus across four experiments. We employed both passive-viewing and recognition memory tasks involving a diverse range of naturalistic object stimuli. Our findings reveal a region-based feature code for general objects, where neurons exhibit receptive fields in the high-level visual feature space. This code can be validated by independent new stimuli and replicated across all experiments, including fixation-based analyses with large natural scenes. This region code explains the long-standing visual category selectivity, preferentially enhances memory of encoded stimuli, predicts memory performance, encodes image memorability, and exhibits an intricate interplay with memory contexts. Together, region-based feature coding provides an important mechanism for visual object processing in the human brain.
The availability of massive earth-observing satellite data provides huge opportunities for land use and land cover mapping. However, such mapping efforts are challenging due to the existence of various land cover classes, noisy data, and the lack of proper labels. Also, each land cover class typically has its own unique temporal pattern and can be identified only during certain periods. In this article, we introduce a novel architecture that incorporates the UNet structure with a Bidirectional LSTM and an Attention mechanism to jointly exploit the spatial and temporal nature of satellite data and to better identify the unique temporal pattern of each land cover class. We compare our method with other state-of-the-art methods both quantitatively and qualitatively on two real-world datasets that involve multiple land cover classes. We also visualize the attention weights to study their effectiveness in mitigating noise and in identifying the discriminative time periods of different classes. The code and dataset used in this work are made publicly available for reproducibility.
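The role of the attention mechanism in this architecture can be sketched in isolation. The snippet below is not the paper's implementation: the BiLSTM that would produce the per-timestep embeddings and the trained scoring vector are replaced by placeholder inputs, and only the attention pooling over the time axis is shown.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def temporal_attention_pool(features, w):
    """Attention pooling over the time axis of a satellite image time series.

    features: (T, D) per-timestep embeddings for one pixel (in the paper these
              would come from the BiLSTM); w: (D,) learned scoring vector.
    Returns the (D,) pooled feature and the (T,) attention weights, which can
    be visualized to see which acquisition dates the model relies on.
    """
    scores = features @ w       # (T,) unnormalized relevance per timestep
    alpha = softmax(scores)     # emphasize discriminative periods, damp noisy ones
    return alpha @ features, alpha
```

Because the weights `alpha` sum to one over timesteps, plotting them per class is exactly the kind of visualization the abstract describes: peaks mark the periods (e.g., growing season) in which a land cover class is identifiable.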
Biological visual systems have evolved to process natural scenes. A full understanding of visual cortical functions requires a comprehensive characterization of how neuronal populations in each visual area encode natural scenes. Here, we utilized widefield calcium imaging to record V4 cortical responses to tens of thousands of natural images in male macaques. Using this large dataset, we developed a deep-learning digital twin of V4 that allowed us to map the natural image preferences of the neural population at 100-μm scale. This detailed map revealed a diverse set of functional domains in V4, each encoding distinct natural image features. We validated these model predictions using additional widefield imaging and single-cell-resolution two-photon imaging. Feature attribution analysis revealed that these domains lie along a continuum from preferring spatially localized shape features to preferring spatially dispersed surface features. These results provide insights into the organizing principles that govern natural scene encoding in V4.

