

Title: An Error Analysis Toolkit for Binned Counting Experiments
We introduce the MINERvA Analysis Toolkit (MAT), a utility for centralizing the handling of systematic uncertainties in HEP analyses. The fundamental utilities of the toolkit are the MnvHnD, a powerful histogram container class, and the systematic Universe classes, which provide a modular implementation of the many-universe error analysis approach. These products can be used stand-alone or as part of a complete error analysis prescription. They support the propagation of systematic uncertainty through all stages of analysis and provide flexibility for an arbitrary level of user customization. This extensible solution to error analysis enables the standardization of systematic uncertainty definitions across an experiment and provides a transparent user interface that lowers the barrier to entry for new analyzers.
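As a rough illustration of the many-universe technique that the Universe classes implement, the sketch below (plain NumPy, not the MAT/MnvHnD API; the event sample, binning, and 5% energy-scale shift are invented for the example) fills one histogram per systematically shifted universe and forms a covariance matrix from the spread of the universes about the central value.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy "event sample": reconstructed energies (GeV) and a fixed binning.
    energies = rng.exponential(2.0, size=10_000)
    bins = np.linspace(0.0, 10.0, 21)

    # Central-value histogram.
    cv, _ = np.histogram(energies, bins=bins)

    # Many-universe method: refill the histogram once per universe, each time
    # with the systematic parameter (here a toy 5% energy scale) drawn anew.
    n_universes = 100
    universes = np.empty((n_universes, len(bins) - 1))
    for u in range(n_universes):
        scale = rng.normal(1.0, 0.05)            # shifted energy scale in this universe
        universes[u] = np.histogram(energies * scale, bins=bins)[0]

    # Systematic covariance from the spread of universes about the central value.
    diffs = universes - cv
    cov_syst = diffs.T @ diffs / n_universes
    err_syst = np.sqrt(np.diag(cov_syst))        # per-bin systematic uncertainty

In the toolkit itself, this bookkeeping is encapsulated by the MnvHnD container and the Universe classes described above, rather than being rewritten by hand for each analysis.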
Award ID(s): 1806849
NSF-PAR ID: 10356300
Author(s) / Creator(s): ; ;
Editor(s): Biscarat, C.; Campana, S.; Hegner, B.; Roiser, S.; Rovelli, C.I.; Stewart, G.A.
Date Published:
Journal Name: EPJ Web of Conferences
Volume: 251
ISSN: 2100-014X
Page Range / eLocation ID: 03046
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract

    We present constraints on cosmological parameters from the Pantheon+ analysis of 1701 light curves of 1550 distinct Type Ia supernovae (SNe Ia) ranging in redshift from z = 0.001 to 2.26. This work features an increased sample size from the addition of multiple cross-calibrated photometric systems of SNe covering an increased redshift span, and improved treatments of systematic uncertainties in comparison to the original Pantheon analysis, which together result in a factor of 2 improvement in cosmological constraining power. For a flat ΛCDM model, we find ΩM = 0.334 ± 0.018 from SNe Ia alone. For a flat w0CDM model, we measure w0 = −0.90 ± 0.14 from SNe Ia alone, H0 = 73.5 ± 1.1 km s⁻¹ Mpc⁻¹ when including the Cepheid host distances and covariance (SH0ES), and w0 = −0.978 (+0.024, −0.031) when combining the SN likelihood with Planck constraints from the cosmic microwave background (CMB) and baryon acoustic oscillations (BAO); both w0 values are consistent with a cosmological constant. We also present the most precise measurements to date of the evolution of dark energy in a flat w0waCDM universe, measuring wa = −0.1 (+0.9, −2.0) from Pantheon+ SNe Ia alone, H0 = 73.3 ± 1.1 km s⁻¹ Mpc⁻¹ when including SH0ES Cepheid distances, and wa = −0.65 (+0.28, −0.32) when combining Pantheon+ SNe Ia with CMB and BAO data. Finally, we find that systematic uncertainties in the use of SNe Ia along the distance ladder comprise less than one-third of the total uncertainty in the measurement of H0 and cannot explain the present “Hubble tension” between local measurements and early-universe predictions from the cosmological model.
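    For context, the kind of flat-ΛCDM distance-modulus prediction that such a supernova fit constrains can be sketched in a few lines, here evaluated with the quoted ΩM = 0.334 and an SH0ES-anchored H0 of 73.5 km s⁻¹ Mpc⁻¹ (a minimal NumPy/SciPy illustration, not the Pantheon+ likelihood code):

        import numpy as np
        from scipy.integrate import quad

        C_KM_S = 299792.458       # speed of light in km/s
        OMEGA_M = 0.334           # flat-LCDM matter density from the SN-only fit above
        H0 = 73.5                 # km/s/Mpc, in line with the SH0ES-anchored values above

        def e_inv(z):
            """1/E(z) for a flat LCDM universe."""
            return 1.0 / np.sqrt(OMEGA_M * (1.0 + z) ** 3 + (1.0 - OMEGA_M))

        def distance_modulus(z):
            """mu = 5 log10(d_L / 10 pc), the quantity compared with SN Ia magnitudes."""
            d_c = (C_KM_S / H0) * quad(e_inv, 0.0, z)[0]   # comoving distance (Mpc)
            d_l = (1.0 + z) * d_c                          # luminosity distance (Mpc)
            return 5.0 * np.log10(d_l * 1.0e6 / 10.0)

        for z in (0.01, 0.1, 0.5, 1.0, 2.26):
            print(f"z = {z:5.2f}   mu = {distance_modulus(z):.2f}")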
  2. Abstract

    Satellite precipitation products, like all quantitative estimates, come with some inherent degree of uncertainty. To associate a quantitative value of the uncertainty to each individual estimate, error modeling is necessary. Most of the error models proposed so far compute the uncertainty as a function of precipitation intensity only, and only at one specific spatiotemporal scale. We propose a spectral error model that incorporates the neighboring space–time dynamics of precipitation into the uncertainty quantification. Systematic distortions of the precipitation signal and random errors are characterized separately in every frequency–wavenumber band of the Fourier domain, to accurately describe the error across scales. The systematic distortions are represented as a deterministic space–time linear filtering term. The random errors are represented as a nonstationary additive noise. The spectral error model is applied to the IMERG multisatellite precipitation product, and its parameters are estimated empirically through a system identification approach, using the GV-MRMS gauge–radar measurements as reference (“truth”) over the eastern United States. The filtering term is found to be essentially low-pass (attenuating the fine-scale variability). While traditional error models attribute most of the error variance to random errors, it is found here that the systematic filtering term explains 48% of the error variance at the native resolution of IMERG. This fact confirms that, at high resolution, filtering effects in satellite precipitation products cannot be ignored, and that the error cannot be represented as a purely random additive or multiplicative term. An important consequence is that precipitation estimates derived from different sources should not be expected to automatically have statistically independent errors.

    Significance Statement

    Satellite precipitation products are nowadays widely used for climate and environmental research, water management, risk analysis, and decision support at the local, regional, and global scales. For all these applications, knowledge about the accuracy of the products is critical for their usability. However, products are not systematically provided with a quantitative measure of the uncertainty associated with each individual estimate. Various parametric error models have been proposed for uncertainty quantification, mostly assuming that the uncertainty is only a function of the precipitation intensity at the pixel and time of interest. By projecting satellite precipitation fields and their retrieval errors into the Fourier frequency–wavenumber domain, we show that we can explicitly take into account the neighboring space–time multiscale dynamics of precipitation and compute a scale-dependent uncertainty.
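    To make the filtering-plus-noise structure of the model concrete, the toy one-dimensional sketch below builds an "estimate" as a low-pass filtered version of the "truth" plus additive noise, and splits the error variance into the two corresponding terms (NumPy only; the spectrum, filter cutoff, and noise level are made up, and this is not the fitted IMERG/GV-MRMS model):

        import numpy as np

        rng = np.random.default_rng(1)

        # Toy 1-D "true" precipitation field with more power at large scales.
        n = 512
        freqs = np.fft.rfftfreq(n, d=1.0)
        spec = rng.normal(size=freqs.size) + 1j * rng.normal(size=freqs.size)
        spec /= (1.0 + 50.0 * freqs)
        truth = np.fft.irfft(spec, n=n)

        # Spectral error model: deterministic low-pass filtering (systematic
        # distortion) plus additive noise (random error; stationary here for
        # simplicity, nonstationary in the real model).
        cutoff = 0.05                                  # hypothetical cutoff (cycles/sample)
        H = 1.0 / (1.0 + (freqs / cutoff) ** 2)        # low-pass transfer function
        filtered = np.fft.irfft(H * np.fft.rfft(truth), n=n)
        estimate = filtered + rng.normal(0.0, 0.05, size=n)

        # Share of the error variance explained by each term.
        err = estimate - truth
        print("systematic (filtering):", np.var(filtered - truth) / np.var(err))
        print("random (additive noise):", np.var(estimate - filtered) / np.var(err))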

     
  3. Abstract

    Cosmological analyses of samples of photometrically identified type Ia supernovae (SNe Ia) depend on understanding the effects of ‘contamination’ from core-collapse and peculiar SN Ia events. We employ a rigorous analysis using the photometric classifier SuperNNova on state-of-the-art simulations of SN samples to determine cosmological biases due to such ‘non-Ia’ contamination in the Dark Energy Survey (DES) 5-yr SN sample. Depending on the non-Ia SN models used in the SuperNNova training and testing samples, contamination ranges from 0.8 to 3.5 per cent, with a classification efficiency of 97.7–99.5 per cent. Using the Bayesian Estimation Applied to Multiple Species (BEAMS) framework and its extension BBC (‘BEAMS with Bias Correction’), we produce a redshift-binned Hubble diagram marginalized over contamination and corrected for selection effects, and use it to constrain the dark energy equation-of-state, w. Assuming a flat universe with Gaussian ΩM prior of 0.311 ± 0.010, we show that biases on w are <0.008 when using SuperNNova, with systematic uncertainties associated with contamination around 10 per cent of the statistical uncertainty on w for the DES-SN sample. An alternative approach of discarding contaminants using outlier rejection techniques (e.g. Chauvenet’s criterion) in place of SuperNNova leads to biases on w that are larger but still modest (0.015–0.03). Finally, we measure biases due to contamination on w0 and wa (assuming a flat universe), and find these to be <0.009 in w0 and <0.108 in wa, 5 to 10 times smaller than the statistical uncertainties for the DES-SN sample.
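    The BEAMS idea itself fits in a few lines: each event contributes a mixture of an SN Ia term and a contaminant term to the likelihood, weighted by the classifier's Type Ia probability. The sketch below is only schematic; the Gaussian population terms, the contaminant offset and width, and the function name are placeholders, not the BBC implementation used for the DES sample.

        import numpy as np

        def beams_loglike(mu_model, mu_obs, sigma_obs, p_ia):
            """Schematic BEAMS mixture log-likelihood over a set of supernovae.

            p_ia is the photometric classification probability (e.g. from
            SuperNNova); the two population terms are placeholder Gaussians.
            """
            # SN Ia population: scatter about the cosmological prediction.
            l_ia = (np.exp(-0.5 * ((mu_obs - mu_model) / sigma_obs) ** 2)
                    / (np.sqrt(2.0 * np.pi) * sigma_obs))
            # Contaminant population: broader and offset (placeholder values).
            sigma_cc, offset_cc = 1.0, 0.5
            l_cc = (np.exp(-0.5 * ((mu_obs - mu_model - offset_cc) / sigma_cc) ** 2)
                    / (np.sqrt(2.0 * np.pi) * sigma_cc))
            return np.sum(np.log(p_ia * l_ia + (1.0 - p_ia) * l_cc))

    Maximizing such a likelihood over the cosmological parameters is what allows the Hubble diagram to be marginalized over contamination rather than relying on hard classification cuts.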

     
  4. Abstract

    We evaluate the consistency between lensing and clustering based on measurements from Baryon Oscillation Spectroscopic Survey combined with galaxy–galaxy lensing from Dark Energy Survey (DES) Year 3, Hyper Suprime-Cam Subaru Strategic Program (HSC) Year 1, and Kilo-Degree Survey (KiDS)-1000. We find good agreement between these lensing data sets. We model the observations using the Dark Emulator and fit the data at two fixed cosmologies: Planck (S8 = 0.83), and a Lensing cosmology (S8 = 0.76). For a joint analysis limited to large scales, we find that both cosmologies provide an acceptable fit to the data. Full utilization of the higher signal-to-noise small-scale measurements is hindered by uncertainty in the impact of baryon feedback and assembly bias, which we account for with a reasoned theoretical error budget. We incorporate a systematic inconsistency parameter for each redshift bin, A, that decouples the lensing and clustering. With a wide range of scales, we find different results for the consistency between the two cosmologies. Limiting the analysis to the bins for which the impact of the lens sample selection is expected to be minimal, for the Lensing cosmology, the measurements are consistent with A = 1; A = 0.91 ± 0.04 (A = 0.97 ± 0.06) using DES+KiDS (HSC). For the Planck case, we find a discrepancy: A = 0.79 ± 0.03 (A = 0.84 ± 0.05) using DES+KiDS (HSC). We demonstrate that a kinematic Sunyaev–Zeldovich-based estimate for baryonic effects alleviates some of the discrepancy in the Planck cosmology. This analysis demonstrates the statistical power of small-scale measurements; however, caution is still warranted given modelling uncertainties and foreground sample selection effects.
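    Operationally, the consistency test reduces to fitting one amplitude A that rescales the predicted signal in each bin; a generic sketch of such a fit (generalized least squares with a placeholder model vector and covariance, not the DES/HSC/KiDS pipeline or the Dark Emulator) is:

        import numpy as np

        def fit_amplitude(data, model, cov):
            """Best-fit A and its error for data ~ A * model,
            minimizing chi^2 = (d - A m)^T C^(-1) (d - A m)."""
            cinv = np.linalg.inv(cov)
            denom = model @ cinv @ model
            a_hat = (model @ cinv @ data) / denom
            return a_hat, 1.0 / np.sqrt(denom)

        # Placeholder example: a fixed-cosmology prediction and a noisy "measurement".
        rng = np.random.default_rng(2)
        model = np.linspace(1.0, 0.2, 10)            # hypothetical signal vs. scale
        cov = np.diag((0.05 * model) ** 2)           # 5% diagonal errors (placeholder)
        data = 0.8 * model + rng.multivariate_normal(np.zeros(10), cov)
        print(fit_amplitude(data, model, cov))       # recovers A near 0.8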

     
  5. Sparse tensor factorization is a popular tool in multi-way data analysis and is used in applications such as cybersecurity, recommender systems, and social network analysis. In many of these applications, the tensor is not known a priori and instead arrives in a streaming fashion for a potentially unbounded amount of time. Existing approaches for streaming sparse tensors are not practical for unbounded streaming because they rely on maintaining the full factorization of the data, which grows linearly with time. In this work, we present CP-stream, an algorithm for streaming factorization in the model of the canonical polyadic decomposition which does not grow linearly in time or space, and is thus practical for long-term streaming. Additionally, CP-stream incorporates user-specified constraints such as non-negativity which aid in the stability and interpretability of the factorization. An evaluation of CP-stream demonstrates that it converges faster than state-of-the-art streaming algorithms while achieving lower reconstruction error by an order of magnitude. We also evaluate it on real-world sparse datasets and demonstrate its usability in both network traffic analysis and discussion tracking. Our evaluation uses exclusively public datasets and our source code is released to the public as part of SPLATT, an open source high-performance tensor factorization toolkit. 
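    For readers less familiar with the underlying model, the sketch below shows a plain rank-R canonical polyadic (CP) decomposition of a small dense 3-way tensor by alternating least squares. It only illustrates the factor-matrix model that CP-stream updates incrementally; it is not the CP-stream algorithm, it handles neither sparsity nor streaming, and it is unrelated to the SPLATT implementation.

        import numpy as np

        rng = np.random.default_rng(3)

        def khatri_rao(U, V):
            """Column-wise Kronecker product: (m x r), (n x r) -> (m*n x r)."""
            return np.einsum('ir,jr->ijr', U, V).reshape(-1, U.shape[1])

        def unfold(X, mode):
            """Matricize a 3-way tensor along the given mode."""
            return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

        def cp_als(X, rank, n_iter=50):
            """Rank-`rank` CP decomposition of a dense 3-way tensor via ALS."""
            A = rng.standard_normal((X.shape[0], rank))
            B = rng.standard_normal((X.shape[1], rank))
            C = rng.standard_normal((X.shape[2], rank))
            for _ in range(n_iter):
                A = unfold(X, 0) @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
                B = unfold(X, 1) @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
                C = unfold(X, 2) @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
            return A, B, C

        # Toy rank-3 tensor plus noise; report the relative reconstruction error.
        A0, B0, C0 = (rng.standard_normal((d, 3)) for d in (20, 15, 10))
        X = np.einsum('ir,jr,kr->ijk', A0, B0, C0) + 0.01 * rng.standard_normal((20, 15, 10))
        A, B, C = cp_als(X, rank=3)
        X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)
        print(np.linalg.norm(X - X_hat) / np.linalg.norm(X))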