- Award ID(s):
- NSF-PAR ID:
- Date Published:
- Journal Name:
- Proceedings of the 38th International Conference on Machine Learning
- Page Range / eLocation ID:
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
Deep neural networks have emerged as very successful tools for image restoration and reconstruction tasks. These networks are often trained end-to-end to directly reconstruct an image from a noisy or corrupted measurement of that image. To achieve state-of-the-art performance, training on large and diverse sets of images is considered critical. However, it is often difficult and/or expensive to collect large amounts of training images. Inspired by the success of Data Augmentation (DA) for classification problems, in this paper, we propose a pipeline for data augmentation for accelerated MRI reconstruction and study its effectiveness at reducing the required training data in a variety of settings. Our DA pipeline, MRAugment, is specifically designed to utilize the invariances present in medical imaging measurements as naive DA strategies that neglect the physics of the problem fail. Through extensive studies on multiple datasets we demonstrate that in the low-data regime DA prevents overfitting and can match or even surpass the state of the art while using significantly fewer training data, whereas in the high-data regime it has diminishing returns. Furthermore, our findings show that DA can improve the robustness of the model against various shifts in the test distribution.more » « less
Data augmentation (DA) is an essential technique for training state-of-the-art deep learning systems. In this paper, we empirically show that the standard data augmentation methods may introduce distribution shift and consequently hurt the performance on unaugmented data during inference. To alleviate this issue, we propose a simple yet effective approach, dubbed KeepAugment, to increase the fidelity of augmented images. The idea is to use the saliency map to detect important regions on the original images and preserve these informative regions during augmentation. This information-preserving strategy allows us to generate more faithful training examples. Empirically, we demonstrate that our method significantly improves upon a number of prior art data augmentation schemes, e.g. AutoAugment, Cutout, random erasing, achieving promising results on image classification, semi-supervised image classification, multi-view multi-camera tracking and object detection.more » « less
Morphological analyses with flatmount fluorescent images are essential to retinal pigment epithelial (RPE) aging studies and thus require accurate RPE cell segmentation. Although rapid technology advances in deep learning semantic segmentation have achieved great success in many biomedical research, the performance of these supervised learning methods for RPE cell segmentation is still limited by inadequate training data with high-quality annotations.
To address this problem, we develop a Self-Supervised Semantic Segmentation (S4) method that utilizes a self-supervised learning strategy to train a semantic segmentation network with an encoder–decoder architecture. We employ a reconstruction and a pairwise representation loss to make the encoder extract structural information, while we create a morphology loss to produce the segmentation map. In addition, we develop a novel image augmentation algorithm (AugCut) to produce multiple views for self-supervised learning and enhance the network training performance. To validate the efficacy of our method, we applied our developed S4 method for RPE cell segmentation to a large set of flatmount fluorescent microscopy images, we compare our developed method for RPE cell segmentation with other state-of-the-art deep learning approaches. Compared with other state-of-the-art deep learning approaches, our method demonstrates better performance in both qualitative and quantitative evaluations, suggesting its promising potential to support large-scale cell morphological analyses in RPE aging investigations.
Availability and implementation
The codes and the documentation are available at: https://github.com/jkonglab/S4_RPE.
Andreas Krause, Emma Brunskill (Ed.)Differentially private (DP) machine learning techniques are notorious for their degradation of model utility (e.g., they degrade classification accuracy). A recent line of work has demonstrated that leveraging public data can improve the trade-off between privacy and utility when training models with DP guaranteed. In this work, we further explore the potential of using public data in DP models, showing that utility gains can in fact be significantly higher than what shown in prior works. Specifically, we introduce DOPE-SGD, a modified DP-SGD algorithm that leverages public data during its training. DOPE-SGD uses public data in two complementary ways: (1) it uses advance augmentation techniques that leverages public data to generate synthetic data that is effectively embedded in multiple steps of the training pipeline; (2) it uses a modified gradient clipping mechanism (which is a standard technique in DP training) to change the origin of gradient vectors using the information inferred from available public and synthetic data, therefore boosting utility. We also introduce a technique to ensemble intermediate DP models by leveraging the post processing property of differential privacy to further improve the accuracy of the predictions. Our experimental results demonstrate the effectiveness of our approach in improving the state-of-the-art in DP machine learning across multiple datasets, network architectures, and application domains. For instance, assuming access to 2,000 public images, and for a privacy budget of 𝜀=2,𝛿=10−5, our technique achieves an accuracy of 75.1 on CIFAR10, significantly higher than 68.1 achieved by the state of the art.more » « less
null (Ed.)The classification of Antibody Mediated Rejection (AMR) in kidney transplant remains challenging even for experienced nephropathologists; this is partly because histological tissue stain analysis is often characterized by low inter-observer agreement and poor reproducibility. One of the implicated causes for inter-observer disagreement is the variability of tissue stain quality between (and within) pathology labs, coupled with the gradual fading of archival sections. Variations in stain colors and intensities can make tissue evaluation difficult for pathologists, ultimately affecting their ability to describe relevant morphological features. Being able to accurately predict the AMR status based on kidney histology images is crucial for improving patient treatment and care. We propose a novel pipeline to build robust deep neural networks for AMR classification based on StyPath, a histological data augmentation technique that leverages a light weight style-transfer algorithm as a means to reduce sample-specific bias. Each image was generated in 1.84 ± 0.03 s using a single GTX TITAN V gpu and pytorch, making it faster than other popular histological data augmentation techniques. We evaluated our model using a Monte Carlo (MC) estimate of Bayesian performance and generate an epistemic measure of uncertainty to compare both the baseline and StyPath augmented models. We also generated Grad-CAM representations of the results which were assessed by an experienced nephropathologist; we used this qualitative analysis to elucidate on the assumptions being made by each model. Our results imply that our style-transfer augmentation technique improves histological classification performance (reducing error from 14.8% to 11.5%) and generalization ability.more » « less