Semantic segmentation methods are typically designed for RGB color images, which are interpolated from raw Bayer images. While RGB images provide abundant color information and are easily understood by humans, they also add storage and computational burden for neural networks. Raw Bayer images, by contrast, preserve the primitive color information in a single channel, potentially increasing segmentation accuracy while significantly decreasing storage and computation time. In this paper, we propose RawSeg-Net to segment single-channel raw Bayer images directly. Unlike pixels in RGB images, which already incorporate neighboring context through ISP color interpolation, each pixel in a raw Bayer image carries no context clues. Based on Bayer pattern properties, RawSeg-Net assigns dynamic attention to the spectral frequency and spatial locations of Bayer images to mitigate classification confusion, and adopts a re-sampling strategy to capture both global and local contextual information.
Efficient Diffeomorphic Image Registration using Multi-Scale Dual-Phased Learning
Diffeomorphic registration faces challenges for high-dimensional images, especially in terms of memory limits. Existing approaches either downsample or crop the original images, or approximate the underlying transformations, to reduce the model size. To address these limitations, we propose a Dividing and Down-sampling mixed Registration network (DDR-Net), a general architecture that preserves most of the image information at multiple scales while reducing memory cost. DDR-Net leverages the global context by downsampling the input and utilizes local details by dividing the input images into subvolumes. This design fuses global and local information and obtains both coarse- and fine-level alignments in the final deformation fields. We apply DDR-Net to the OASIS dataset. The proposed architecture is simple yet effective, and as a general method it could be extended to other registration architectures for better performance with limited computing resources.
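The two input views described in the abstract, a downsampled whole volume for global context and full-resolution subvolumes for local detail, can be illustrated with a minimal NumPy sketch. This is not the DDR-Net implementation: the function names `downsample` and `divide`, the strided downsampling (no anti-aliasing), the 2×2×2 split, and the 64³ test volume are all assumptions made for illustration.

```python
import numpy as np

def downsample(volume, factor):
    """Global view: keep every `factor`-th voxel along each axis (naive striding)."""
    return volume[::factor, ::factor, ::factor]

def divide(volume, parts):
    """Local view: split the volume into parts**3 equal, non-overlapping subvolumes."""
    d, h, w = (s // parts for s in volume.shape)
    return [
        volume[i * d:(i + 1) * d, j * h:(j + 1) * h, k * w:(k + 1) * w]
        for i in range(parts)
        for j in range(parts)
        for k in range(parts)
    ]

# Hypothetical 64^3 volume standing in for a 3D scan.
volume = np.random.rand(64, 64, 64).astype(np.float32)
coarse = downsample(volume, factor=2)  # one coarse volume covering the whole image
blocks = divide(volume, parts=2)       # eight full-resolution subvolumes
```

In this sketch every array handed to a network would hold one eighth of the voxels of the original volume, which is the kind of memory reduction the abstract refers to; how the two branches are fused is specific to the paper and is not shown here.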
- Award ID(s): 1755970
- PAR ID: 10350886
- Date Published:
- Journal Name: IEEE 19th International Symposium on Biomedical Imaging (ISBI)
- Page Range / eLocation ID: 1 to 5
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Graph neural networks (GNNs) have emerged as a powerful tool for tasks such as node classification and graph classification. However, much less work has been done on signal classification, where the data consists of many functions (referred to as signals) defined on the vertices of a single graph. These tasks require networks designed differently from those designed for traditional GNN tasks. Indeed, traditional GNNs rely on localized low-pass filters, whereas signals of interest may have intricate multi-frequency behavior and exhibit long-range interactions. This motivates us to introduce BLIS-Net (Bi-Lipschitz Scattering Net), a novel GNN that builds on the previously introduced geometric scattering transform. Our network captures both local and global signal structure, as well as both low-frequency and high-frequency information. We make several crucial changes to the original geometric scattering architecture, which we prove increase the ability of our network to capture information about the input signal, and show that BLIS-Net achieves superior performance on both synthetic and real-world data sets based on traffic flow and fMRI data.
-
This paper extends the reach of general-purpose GPU programming by presenting a software architecture that supports efficient fine-grained synchronization over global memory. The key idea is to transform global synchronization into global communication so that conflicts are serialized at the thread block level. With this structure, the threads within each thread block can synchronize using low-latency, high-bandwidth local scratchpad memory. To enable this architecture, we implement a scalable and efficient message passing library. Using Nvidia GTX 1080 Ti GPUs, we evaluate our new software architecture by using it to solve a set of five irregular problems on a variety of workloads. We find that, on average, our solutions improve performance over carefully tuned state-of-the-art solutions by 3.6×.
-
In accelerated MRI reconstruction, the anatomy of a patient is recovered from a set of under-sampled and noisy measurements. Deep learning approaches have proven successful in solving this ill-posed inverse problem and are capable of producing very high quality reconstructions. However, current architectures rely heavily on convolutions, which are content-independent and have difficulty modeling long-range dependencies in images. Recently, Transformers, the workhorse of contemporary natural language processing, have emerged as powerful building blocks for a multitude of vision tasks. These models split input images into non-overlapping patches, embed the patches into lower-dimensional tokens, and utilize a self-attention mechanism that does not suffer from the aforementioned weaknesses of convolutional architectures. However, Transformers incur extremely high compute and memory cost when 1) the input image resolution is high and 2) the image needs to be split into a large number of patches to preserve fine detail. Both conditions are typical in low-level vision problems such as MRI reconstruction, and their effects compound. To tackle these challenges, we propose HUMUS-Net, a hybrid architecture that combines the beneficial implicit bias and efficiency of convolutions with the power of Transformer blocks in an unrolled and multi-scale network. HUMUS-Net extracts high-resolution features via convolutional blocks and refines low-resolution features via a novel Transformer-based multi-scale feature extractor. Features from both levels are then synthesized into a high-resolution output reconstruction. Our network establishes a new state of the art on the largest publicly available MRI dataset, the fastMRI dataset. We further demonstrate the performance of HUMUS-Net on two other popular MRI datasets and perform fine-grained ablation studies to validate our design.
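The compounding cost mentioned in this abstract can be made concrete with a small calculation. The sketch below assumes plain global self-attention over non-overlapping square patches, where the attention matrix has one entry per pair of tokens; it does not describe the attention blocks actually used in HUMUS-Net, and the 320×320 image size and the helper names are hypothetical.

```python
def num_patches(height, width, patch):
    """Number of non-overlapping patch x patch tokens in a height x width image."""
    return (height // patch) * (width // patch)

def attention_entries(height, width, patch):
    """Entries in one global self-attention matrix: one per pair of tokens."""
    n = num_patches(height, width, patch)
    return n * n

# Hypothetical 320 x 320 image, tokenized coarsely and finely.
coarse_tokens = num_patches(320, 320, patch=16)       # 20 * 20 = 400 tokens
fine_tokens = num_patches(320, 320, patch=4)          # 80 * 80 = 6400 tokens
coarse_cost = attention_entries(320, 320, patch=16)   # 400**2 entries
fine_cost = attention_entries(320, 320, patch=4)      # 6400**2 entries
```

Under these assumptions, shrinking the patch side by a factor of 4 multiplies the token count by 16 and the attention matrix size by 256, which is why fine-detail tasks at high resolution are expensive for this kind of model.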
-
Major semantic segmentation approaches are designed for RGB color images, which are interpolated from raw Bayer images. RGB images provide abundant scene color information and are easy for human users to interpret. The continuity of RGB color also makes it easier for researchers to design segmentation algorithms, an advantage that becomes unnecessary in end-to-end learning. More importantly, the use of three channels adds extra storage and computation burden for neural networks. In contrast, raw Bayer images preserve the primitive color information to the greatest extent with just a single channel. The compact design of the Bayer pattern not only potentially increases segmentation accuracy by avoiding interpolation, but also significantly decreases the storage requirement and computation time in comparison with standard R, G, B images. In this paper, we propose BayerSeg-Net to segment single-channel raw Bayer images directly. Unlike pixels in RGB color images, which already incorporate neighboring context through ISP color interpolation, each pixel in a raw Bayer image carries no context clues. Based on Bayer pattern properties, BayerSeg-Net assigns dynamic attention to the spectral frequency and spatial locations of Bayer images to mitigate classification confusion, and adopts a re-sampling strategy to capture both global and local contextual information. We demonstrate the usability of raw Bayer images in segmentation tasks and the efficiency of BayerSeg-Net on multiple datasets.
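To make the single-channel layout concrete, the following sketch rearranges a Bayer mosaic into four half-resolution color planes without any interpolation. It is not BayerSeg-Net's re-sampling strategy, which the abstract does not specify. The RGGB layout, the function name `pack_bayer_rggb`, and the 4×4 test mosaic are assumptions chosen for illustration.

```python
import numpy as np

def pack_bayer_rggb(raw):
    """Rearrange an (H, W) RGGB mosaic into a (4, H/2, W/2) stack of color planes.

    Assumes the 2x2 tile is [[R, G], [G, B]]; other sensors use other layouts.
    """
    if raw.ndim != 2 or raw.shape[0] % 2 or raw.shape[1] % 2:
        raise ValueError("expected a 2-D mosaic with even height and width")
    return np.stack([
        raw[0::2, 0::2],  # R samples
        raw[0::2, 1::2],  # G samples on red rows
        raw[1::2, 0::2],  # G samples on blue rows
        raw[1::2, 1::2],  # B samples
    ])

# Hypothetical 4 x 4 mosaic with values 0..15 so each sample is easy to trace.
raw = np.arange(16, dtype=np.uint16).reshape(4, 4)
planes = pack_bayer_rggb(raw)

# Sample counts: the mosaic stores H*W values, a demosaiced RGB image stores 3*H*W.
raw_values = raw.size
rgb_values = 3 * raw.size
```

The packed stack holds exactly the same samples as the mosaic, so the threefold difference in stored values relative to an interpolated RGB image is what the abstract's storage argument rests on.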