


Search: All records where Creators/Authors contains "Wang, C."

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full-text articles may not yet be available free of charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from those of this site.

  1. We use multimodal deep neural networks to identify sites of multimodal integration in the human brain and investigate how well these networks model integration in the brain. Sites of multimodal integration are regions where a multimodal language-vision model predicts neural recordings (stereoelectroencephalography, SEEG) better than a unimodal language model, a unimodal vision model, or a linearly integrated language-vision model. We use a range of state-of-the-art models spanning different architectures, including Transformers and CNNs, with different multimodal integration approaches to model the SEEG signal recorded while subjects watched movies. As a key enabling step, we first demonstrate that the approach has the resolution to distinguish trained from randomly initialized models for both language and vision; the inability to do so would fundamentally hinder further analysis. We show that trained models systematically outperform randomly initialized models in their ability to predict the SEEG signal. We then compare unimodal and multimodal models against one another. Since the models differ in architecture, parameter count, and training set, all of which can obscure the results, we also carry out a test between two controlled models, SLIP-Combo and SLIP-SimCLR, which hold all of these attributes fixed aside from the multimodal input. Our first key contribution identifies neural sites (on average 141 out of 1090 total sites, or 12.94%) and brain regions where multimodal integration is occurring. Our second key contribution, from analyzing different methods of multimodal integration and how they model the brain, finds that CLIP-style training is best suited for modeling multimodal integration in the brain.
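The comparison described above can be sketched as follows. This is a minimal illustration with synthetic data, not the paper's pipeline: it fits a closed-form ridge regression from model embeddings to a neural signal and compares held-out predictivity of a "multimodal" versus a "unimodal" feature space. All feature dimensions, the synthetic signal, and the decision rule are invented for illustration.

```python
import numpy as np
from numpy.linalg import solve

rng = np.random.default_rng(0)

def ridge_fit_predict(X_train, y_train, X_test, alpha=1.0):
    """Closed-form ridge regression: w = (X^T X + alpha*I)^{-1} X^T y."""
    d = X_train.shape[1]
    w = solve(X_train.T @ X_train + alpha * np.eye(d), X_train.T @ y_train)
    return X_test @ w

def held_out_correlation(X, y, n_train):
    """Correlation between ridge predictions and the signal on held-out bins."""
    pred = ridge_fit_predict(X[:n_train], y[:n_train], X[n_train:])
    return np.corrcoef(pred, y[n_train:])[0, 1]

# Synthetic stand-ins: 400 time bins; the "SEEG" signal is driven by a
# 32-dimensional "multimodal" feature space, of which the "unimodal" space
# covers only a noisy half.
n, n_train = 400, 300
X_multi = rng.standard_normal((n, 32))
X_uni = X_multi[:, :16] + 0.5 * rng.standard_normal((n, 16))
y = X_multi @ rng.standard_normal(32) + 0.1 * rng.standard_normal(n)

r_multi = held_out_correlation(X_multi, y, n_train)
r_uni = held_out_correlation(X_uni, y, n_train)

# In this sketch, a site would be flagged as integrative when the multimodal
# features predict the held-out signal better than the unimodal ones; the
# paper applies this comparison per electrode with statistical controls.
is_integration_site = r_multi > r_uni
```

In practice the regression is run per electrode and per model, and the "multimodal site" label requires the multimodal model to beat all three baselines (language-only, vision-only, linearly integrated), not a single pairwise comparison as here.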
  2. Free, publicly-accessible full text available April 1, 2025
  3. Free, publicly-accessible full text available March 1, 2025
  4. Free, publicly-accessible full text available December 15, 2024
  5. Free, publicly-accessible full text available March 1, 2025
  6. Free, publicly-accessible full text available October 23, 2024
  7. We create a reusable Transformer, BrainBERT, for intracranial field potential recordings, bringing modern representation-learning approaches to neuroscience. Much as in NLP and speech recognition, this Transformer enables classifying complex concepts, i.e., decoding neural data, with higher accuracy and much less data, because it is pretrained in an unsupervised manner on a large corpus of unannotated neural recordings. Our approach generalizes to new subjects with electrodes in new positions and to unrelated tasks, showing that the representations robustly disentangle the neural signal. Just as in NLP, where one can study language by investigating what a language model learns, this approach enables investigating the brain by studying what a model of the brain learns. As a first step along this path, we demonstrate a new analysis of the intrinsic dimensionality of the computations in different areas of the brain. To construct BrainBERT, we combine super-resolution spectrograms of neural data with an approach designed for generating contextual representations of audio by masking. In the future, far more concepts will be decodable from neural recordings by using representation learning, potentially unlocking the brain the way language models unlocked language.
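The masked-spectrogram objective behind this style of pretraining can be sketched as below. This is an illustrative toy, not BrainBERT itself: random time-frequency bins of a spectrogram are hidden, and the model is scored only on how well it reconstructs the hidden entries. The "model" here is a trivial mean-filler standing in for the Transformer encoder, and all shapes and the masking fraction are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def mask_spectrogram(spec, mask_frac=0.15):
    """Zero out a random fraction of time-frequency bins; also return the mask."""
    mask = rng.random(spec.shape) < mask_frac
    masked = spec.copy()
    masked[mask] = 0.0
    return masked, mask

def masked_reconstruction_loss(pred, target, mask):
    """Mean squared error computed only over the masked positions."""
    return float(np.mean((pred[mask] - target[mask]) ** 2))

# Fake spectrogram: 64 frequency bins x 128 time steps.
spec = rng.standard_normal((64, 128))
masked, mask = mask_spectrogram(spec)

# Stand-in "reconstruction": fill each masked bin with the mean of the visible
# bins in its time step; a real encoder would predict these from context.
col_means = masked.sum(axis=0) / np.maximum((~mask).sum(axis=0), 1)
pred = np.where(mask, col_means[None, :], masked)

loss = masked_reconstruction_loss(pred, spec, mask)
```

Pretraining would minimize this loss over many unannotated recordings by gradient descent; the learned encoder is then reused, with a small labeled set, for downstream decoding tasks.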
  8. Free, publicly-accessible full text available August 1, 2024