Title: The risk of bias in denoising methods: Examples from neuroimaging
Experimental datasets are growing rapidly in size, scope, and detail, but the value of these datasets is limited by unwanted measurement noise. It is therefore tempting to apply analysis techniques that attempt to reduce noise and enhance signals of interest. In this paper, we draw attention to the possibility that denoising methods may introduce bias and lead to incorrect scientific inferences. To present our case, we first review the basic statistical concepts of bias and variance. Denoising techniques typically reduce variance observed across repeated measurements, but this can come at the expense of introducing bias to the average expected outcome. We then conduct three simple simulations that provide concrete examples of how bias may manifest in everyday situations. These simulations reveal several findings that may be surprising and counterintuitive: (i) different methods can be equally effective at reducing variance but some incur bias while others do not, (ii) identifying methods that better recover ground truth does not guarantee the absence of bias, (iii) bias can arise even if one has specific knowledge of properties of the signal of interest. We suggest that researchers should consider and possibly quantify bias before deploying denoising methods on important research data.
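The trade-off between variance and bias described in the abstract can be illustrated with a toy simulation. The sketch below is not one of the paper's three simulations; the signal shape, the noise level, and the choice of a 3-point moving average as the "denoiser" are all assumptions made purely for illustration. It measures bias and variance at a single sharp peak across repeated noisy measurements, with and without smoothing.

```python
import random
import statistics


def moving_average(x, width=3):
    """Centered moving average; windows are truncated at the edges."""
    half = width // 2
    out = []
    for i in range(len(x)):
        window = x[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


def simulate(n_repeats=2000, noise_sd=1.0, seed=0):
    """Estimate bias and variance at a sharp peak, raw vs. smoothed.

    All values here (truth, noise_sd, n_repeats) are hypothetical.
    """
    rng = random.Random(seed)
    truth = [0.0, 0.0, 0.0, 3.0, 0.0, 0.0, 0.0]  # assumed ground truth
    peak = 3  # index of the sharp peak
    raw, smoothed = [], []
    for _ in range(n_repeats):
        # One noisy measurement of the whole signal
        measurement = [t + rng.gauss(0.0, noise_sd) for t in truth]
        raw.append(measurement[peak])
        smoothed.append(moving_average(measurement)[peak])
    return {
        "raw_bias": statistics.fmean(raw) - truth[peak],
        "raw_var": statistics.variance(raw),
        "smoothed_bias": statistics.fmean(smoothed) - truth[peak],
        "smoothed_var": statistics.variance(smoothed),
    }


result = simulate()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions, smoothing should cut the variance at the peak to roughly a third of the raw value, while pulling the average estimate well below the true peak height. The raw measurements stay close to unbiased. Averaging over more repeats would not remove the smoothed estimate's offset, which is what distinguishes bias from variance.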
Award ID(s):
1822929 1822683
PAR ID:
10390352
Author(s) / Creator(s):
Editor(s):
Yap, Pew-Thian
Date Published:
Journal Name:
PLOS ONE
Volume:
17
Issue:
7
ISSN:
1932-6203
Page Range / eLocation ID:
e0270895
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Yap, Pew-Thian (Ed.)
    Diffusion weighted imaging (DWI) with multiple, high b-values is critical for extracting tissue microstructure measurements; however, high b-value DWI images contain high noise levels that can overwhelm the signal of interest and bias microstructural measurements. Here, we propose a simple denoising method that can be applied to any dataset, provided a low-noise, single-subject dataset is acquired using the same DWI sequence. The denoising method uses a one-dimensional convolutional neural network (1D-CNN) and deep learning to learn from a low-noise dataset, voxel-by-voxel. The trained model can then be applied to high-noise datasets from other subjects. We validated the 1D-CNN denoising method by first demonstrating that 1D-CNN denoising resulted in DWI images that were more similar to the noise-free ground truth than comparable denoising methods, e.g., MP-PCA, using simulated DWI data. Using the same DWI acquisition but reconstructed with two common reconstruction methods, i.e. SENSE1 and sum-of-square, to generate a pair of low-noise and high-noise datasets, we then demonstrated that 1D-CNN denoising of high-noise DWI data collected from human subjects showed promising results in three domains: DWI images, diffusion metrics, and tractography. In particular, the denoised images were very similar to a low-noise reference image of that subject, more than the similarity between repeated low-noise images (i.e. computational reproducibility). Finally, we demonstrated the use of the 1D-CNN method in two practical examples to reduce noise from parallel imaging and simultaneous multi-slice acquisition. We conclude that the 1D-CNN denoising method is a simple, effective denoising method for DWI images that overcomes some of the limitations of current state-of-the-art denoising methods, such as the need for a large number of training subjects and the need to account for the rectified noise floor. 
  2. Segata, Nicola (Ed.)
    The ability to predict human phenotypes and identify biomarkers of disease from metagenomic data is crucial for the development of therapeutics for microbiome-associated diseases. However, metagenomic data is commonly affected by technical variables unrelated to the phenotype of interest, such as sequencing protocol, which can make it difficult to predict phenotype and find biomarkers of disease. Supervised methods to correct for background noise, originally designed for gene expression and RNA-seq data, are commonly applied to microbiome data but may be limited because they cannot account for unmeasured sources of variation. Unsupervised approaches address this issue, but current methods are limited because they are ill-equipped to deal with the unique aspects of microbiome data, which is compositional, highly skewed, and sparse. We compare different denoising transformations combined with supervised correction methods, as well as an unsupervised principal component correction approach that is used in other domains but has not previously been applied to microbiome data. We find that the unsupervised principal component correction approach is comparable to the supervised approaches in reducing false discovery of biomarkers, with the added benefit of not needing to know the sources of variation a priori. However, in prediction tasks, it appears to improve prediction only when technical variables contribute the majority of variance in the data. As new and larger metagenomic datasets become increasingly available, background noise correction will become essential for generating reproducible microbiome analyses.
  3. SUMMARY Infrasound sensors are deployed in a variety of spatial configurations and scales for geophysical monitoring, including networks of single sensors and networks of multisensor infrasound arrays. Infrasound signal detection strategies exploiting these data commonly make use of intersensor correlation and coherence (array processing, multichannel correlation); network-based tracking of signal features (e.g. reverse time migration); or a combination of these such as backazimuth cross-bearings for multiple arrays. Single-sensor trace-based denoising techniques offer significant potential to improve all of these various infrasound data processing strategies, but have not previously been investigated in detail. Single-sensor denoising represents a pre-processing step that could reduce the effects of ambient infrasound and wind noise in infrasound signal association and location workflows. We systematically investigate the utility of a range of single-sensor denoising methods for infrasound data processing, including noise gating, non-negative matrix factorization, and data-adaptive Wiener filtering. For the data testbed, we use the relatively dense regional infrasound network in Alaska, which records a high rate of volcanic eruptions with signals varying in power, duration, and waveform and spectral character. We primarily use data from the 2016–2017 Bogoslof volcanic eruption, which included multiple explosions, and synthetics. The Bogoslof volcanic sequence provides an opportunity to investigate regional infrasound detection, association, and location for a set of real sources with varying source spectra subject to anisotropic atmospheric propagation and varying noise levels (both incoherent wind noise and coherent ambient infrasound, primarily microbaroms). We illustrate the advantages and disadvantages of the different denoising methods in categories such as event detection, waveform distortion, the need for manual data labelling, and computational cost. 
For all approaches, denoising generally performs better for signals with higher signal-to-noise ratios and with less spectral and temporal overlap between signals and noise. Microbaroms are the most globally pervasive and repetitive coherent ambient infrasound noise source, with such noise often referred to as clutter or interference. We find that denoising offers significant potential for microbarom clutter reduction. Single-channel denoising of microbaroms prior to standard array processing enhances both the quantity and bandwidth of detectable volcanic events. We find that reduction of incoherent wind noise is more challenging using the denoising methods we investigate; thus, station hardware (wind noise reduction systems) and site selection remain critical and cannot be replaced by currently available digital denoising methodologies. Overall, we find that adding single-channel denoising as a component in the processing workflow can benefit a variety of infrasound signal detection, association, and location schemes. The denoising methods can also isolate the noise itself, with utility in statistically characterizing ambient infrasound noise. 
  4. While 2D diffusion models generate realistic, high-detail images, 3D shape generation methods like Score Distillation Sampling (SDS) built on these 2D diffusion models produce cartoon-like, over-smoothed shapes. To help explain this discrepancy, we show that the image guidance used in Score Distillation can be understood as the velocity field of a 2D denoising generative process, up to the choice of a noise term. In particular, after a change of variables, SDS resembles a high-variance version of Denoising Diffusion Implicit Models (DDIM) with a differently sampled noise term: SDS introduces noise i.i.d. randomly at each step, while DDIM infers it from the previous noise predictions. This excessive variance can lead to over-smoothing and unrealistic outputs. We show that a better noise approximation can be recovered by inverting DDIM in each SDS update step. This modification makes SDS's generative process for 2D images almost identical to DDIM. In 3D, it removes over-smoothing, preserves higher-frequency detail, and brings the generation quality closer to that of 2D samplers. Experimentally, our method achieves better or similar 3D generation quality compared to other state-of-the-art Score Distillation methods, all without training additional neural networks or requiring multi-view supervision, while providing useful insights into the relationship between 2D and 3D asset generation with diffusion models.
  5. To understand the delay characteristics of the Internet, a myriad of measurement tools and techniques have been proposed by researchers in academia and industry. Datasets from such measurement tools are curated to facilitate analyses at a later time. Despite the benefits of these tools and datasets, the systematic interpretation of measurements in the face of measurement noise remains difficult. Unfortunately, state-of-the-art denoising techniques are labor-intensive and ineffective. To tackle this problem, we develop NoMoNoise, an open-source framework for denoising latency measurements by leveraging recent advancements in weakly supervised learning. NoMoNoise can generate measurement noise labels that could be integrated into the inference and control logic to remove and/or repair noisy measurements in an automated and rapid fashion. We evaluate the efficacy of NoMoNoise in a lab-based setting and a real-world setting by applying it to CAIDA's Ark dataset and show that NoMoNoise can remove noisy measurements effectively with high accuracy.