skip to main content

Title: Dual-Attention Recurrent Networks for Affine Registration of Neuroimaging Data
Neuroimaging data typically undergoes several preprocessing steps before further analysis and mining can be done. Affine image registration is one of the important tasks during preprocessing. Recently, several image registration methods which are based on Convolutional Neural Networks have been proposed. However, due to the high computational and memory requirements of CNNs, these methods cannot be used in real-time for large neuroimaging data like fMRI. In this paper, we propose a Dual-Attention Recurrent Network (DRN) which uses a hard attention mechanism to allow the model to focus on small, but task-relevant, parts of the input image – thus reducing computational and memory costs. Furthermore, DRN naturally supports inhomogeneity between the raw input image (e.g., functional MRI) and the image we want to align it to (e.g., anatomical MRI) so it can be applied to harder registration tasks such as fMRI coregistration and normalization. Extensive experiments on two different datasets demonstrate that DRN significantly reduces the computational and memory costs compared with other neural network-based methods without sacrificing the quality of image registration
; ; ; ;
Award ID(s):
Publication Date:
Journal Name:
Proceedings of the 2020 SIAM International Conference on Data Mining, SDM 2020
Page Range or eLocation-ID:
Sponsoring Org:
National Science Foundation
More Like this
  1. It is becoming increasingly common to collect multiple related neuroimaging datasets either from different modalities or from different tasks and conditions. In addition, we have non-imaging data such as cognitive or behavioral variables, and it is through the association of these two sets of data—neuroimaging and non-neuroimaging—that we can understand and explain the evolution of neural and cognitive processes, and predict outcomes for intervention and treatment. Multiple methods for the joint analysis or fusion of multiple neuroimaging datasets or modalities exist; however, methods for the joint analysis of imaging and non-imaging data are still in their infancy. Current approaches for identifying brain networks related to cognitive assessments are still largely based on simple one-to-one correlation analyses and do not use the cross information available across multiple datasets. This work proposes two approaches based on independent vector analysis (IVA) to jointly analyze the imaging datasets and behavioral variables such that multivariate relationships across imaging data and behavioral features can be identified. The simulation results show that our proposed methods provide better accuracy in identifying associations across imaging and behavioral components than current approaches. With functional magnetic resonance imaging (fMRI) task data collected from 138 healthy controls and 109 patients with schizophrenia,more »results reveal that the central executive network (CEN) estimated in multiple datasets shows a strong correlation with the behavioral variable that measures working memory, a result that is not identified by traditional approaches. Most of the identified fMRI maps also show significant differences in activations across healthy controls and patients potentially providing a useful signature of mental disorders.« less
  2. Cardiac Cine Magnetic Resonance (CMR) Imaging has made a significant paradigm shift in medical imaging technology, thanks to its capability of acquiring high spatial and temporal resolution images of different structures within the heart that can be used for reconstructing patient-specific ventricular computational models. In this work, we describe the development of dynamic patient-specific right ventricle (RV) models associated with normal subjects and abnormal RV patients to be subsequently used to assess RV function based on motion and kinematic analysis. We first constructed static RV models using segmentation masks of cardiac chambers generated from our accurate, memory-efficient deep neural architecture - CondenseUNet - featuring both a learned group structure and a regularized weight-pruner to estimate the motion of the right ventricle. In our study, we use a deep learning-based deformable network that takes 3D input volumes and outputs a motion field which is then used to generate isosurface meshes of the cardiac geometry at all cardiac frames by propagating the end-diastole (ED) isosurface mesh using the reconstructed motion field. The proposed model was trained and tested on the Automated Cardiac Diagnosis Challenge (ACDC) dataset featuring 150 cine cardiac MRI patient datasets. The isosurface meshes generated using the proposed pipeline weremore »compared to those obtained using motion propagation via traditional non-rigid registration based on several performance metrics, including Dice score and mean absolute distance (MAD).« less
  3. Brain large-scale dynamics is constrained by the heterogeneity of intrinsic anatomical substrate. Little is known how the spatiotemporal dynamics adapt for the heterogeneous structural connectivity (SC). Modern neuroimaging modalities make it possible to study the intrinsic brain activity at the scale of seconds to minutes. Diffusion magnetic resonance imaging (dMRI) and functional MRI reveals the large-scale SC across different brain regions. Electrophysiological methods (i.e. MEG/EEG) provide direct measures of neural activity and exhibits complex neurobiological temporal dynamics which could not be solved by fMRI. However, most of existing multimodal analytical methods collapse the brain measurements either in space or time domain and fail to capture the spatio-temporal circuit dynamics. In this paper, we propose a novel spatio-temporal graph Transformer model to integrate the structural and functional connectivity in both spatial and temporal domain. The proposed method learns the heterogeneous node and graph representation via contrastive learning and multi-head attention based graph Transformer using multimodal brain data (i.e. fMRI, MRI, MEG and behavior performance). The proposed contrastive graph Transformer representation model incorporates the heterogeneity map constrained by T1-to-T2-weighted (T1w/T2w) to improve the model fit to structurefunction interactions. The experimental results with multimodal resting state brain measurements demonstrate the proposed method couldmore »highlight the local properties of large-scale brain spatio-temporal dynamics and capture the dependence strength between functional connectivity and behaviors. In summary, the proposed method enables the complex brain dynamics explanation for different modal variants.« less
  4. Deep learning now offers state-of-the-art accuracy for many prediction tasks. A form of deep learning called deep convolutional neural networks (CNNs) are especially popular on image, video, and time series data. Due to its high computational cost, CNN inference is often a bottleneck in analytics tasks on such data. Thus, a lot of work in the computer architecture, systems, and compilers communities study how to make CNN inference faster. In this work, we show that by elevating the abstraction level and re-imagining CNN inference as queries , we can bring to bear database-style query optimization techniques to improve CNN inference efficiency. We focus on tasks that perform CNN inference repeatedly on inputs that are only slightly different . We identify two popular CNN tasks with this behavior: occlusion-based explanations (OBE) and object recognition in videos (ORV). OBE is a popular method for “explaining” CNN predictions. It outputs a heatmap over the input to show which regions (e.g., image pixels) mattered most for a given prediction. It leads to many re-inference requests on locally modified inputs. ORV uses CNNs to identify and track objects across video frames. It also leads to many re-inference requests. We cast such tasks in a unifiedmore »manner as a novel instance of the incremental view maintenance problem and create a comprehensive algebraic framework for incremental CNN inference that reduces computational costs. We produce materialized views of features produced inside a CNN and connect them with a novel multi-query optimization scheme for CNN re-inference. Finally, we also devise novel OBE-specific and ORV-specific approximate inference optimizations exploiting their semantics. We prototype our ideas in Python to create a tool called Krypton that supports both CPUs and GPUs. Experiments with real data and CNNs show that Krypton reduces runtimes by up to 5× (respectively, 35×) to produce exact (respectively, high-quality approximate) results without raising resource requirements.« less
  5. We consider an MRI reconstruction problem with input of k-space data at a very low undersampled rate. This can prac- tically benefit patient due to reduced time of MRI scan, but it is also challenging since quality of reconstruction may be compromised. Currently, deep learning based methods dom- inate MRI reconstruction over traditional approaches such as Compressed Sensing, but they rarely show satisfactory performance in the case of low undersampled k-space data. One explanation is that these methods treat channel-wise fea- tures equally, which results in degraded representation ability of the neural network. To solve this problem, we propose a new model called MRI Cascaded Channel-wise Attention Network (MICCAN), highlighted by three components: (i) a variant of U-net with Channel-wise Attention (UCA) mod- ule, (ii) a long skip connection and (iii) a combined loss. Our model is able to attend to salient information by filtering irrelevant features and also concentrate on high-frequency in- formation by enforcing low-frequency information bypassed to the final output. We conduct both quantitative evaluation and qualitative analysis of our method on a cardiac dataset. The experiment shows that our method achieves very promis- ing results in terms of three common metrics on the MRI reconstruction withmore »low undersampled k-space data. Code is public available« less