Title: Noise-marginalized optimal statistic: A robust hybrid frequentist-Bayesian statistic for the stochastic gravitational-wave background in pulsar timing arrays
NSF-PAR ID:
10066554
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
American Physical Society
Date Published:
Journal Name:
Physical Review D
Volume:
98
Issue:
4
ISSN:
2470-0010
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Dimensionality-reduction methods are a fundamental tool in the analysis of large datasets. These algorithms work on the assumption that the "intrinsic dimension" of the data is generally much smaller than the ambient dimension in which it is collected. Alongside their usual purpose of mapping data into a lower-dimensional space with minimal information loss, dimensionality-reduction techniques implicitly or explicitly provide information about the dimension of the dataset. In this paper, we propose a new statistic, the kappa-profile, for the analysis of large datasets. The kappa-profile arises from a dimensionality-reduction optimization problem: finding a projection that optimally preserves the secants between points in the dataset. From this optimal projection we extract kappa, the norm of the shortest projected secant among the set of all normalized secants. Kappa can be computed for any projection dimension k; the tuple of kappa values, indexed by dimension, forms the kappa-profile. Algorithms such as the Secant-Avoidance Projection algorithm and the Hierarchical Secant-Avoidance Projection algorithm provide a computationally feasible means of estimating the kappa-profile for large datasets, and thus a method of understanding and monitoring their behavior. As we demonstrate in this paper, the kappa-profile serves as a useful statistic in several representative settings: weather data, soundscape data, and dynamical systems data.
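The kappa statistic described above can be sketched directly: normalize all pairwise secants of the dataset, project them to k dimensions, and take the minimum projected norm. The sketch below uses nested PCA projections as a computationally cheap stand-in for the secant-optimal projection found by the Secant-Avoidance Projection algorithm; this substitution, and the function names, are assumptions for illustration only.

```python
import numpy as np

def kappa(data, P):
    """Shortest projected secant norm for a projection P (k x d rows,
    assumed orthonormal). data is an (n, d) array of points."""
    n = data.shape[0]
    # All pairwise difference vectors (secants), upper triangle only.
    diffs = data[None, :, :] - data[:, None, :]
    secants = diffs[np.triu_indices(n, k=1)]
    # Normalize each secant to unit length.
    secants = secants / np.linalg.norm(secants, axis=1, keepdims=True)
    # Kappa is the smallest norm among the projected unit secants.
    return float(np.linalg.norm(secants @ P.T, axis=1).min())

def kappa_profile(data, dims):
    """Approximate kappa-profile over the given dimensions, using the
    top-k principal directions as the projection for each k (a stand-in
    for the secant-optimal projection, not SAP itself)."""
    centered = data - data.mean(axis=0)
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return [kappa(data, Vt[:k]) for k in dims]
```

Because the PCA projections are nested, this approximate profile is nondecreasing in k and reaches 1 at the full ambient dimension, where every unit secant is preserved exactly.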
  2. Statistical prediction plays an important role in many decision processes, such as university budgeting (depending on the number of students who will enroll), capital budgeting (depending on the remaining lifetime of a fleet of systems), the amount of cash reserves needed for warranty expenses (depending on the number of warranty returns), and whether a product recall is needed (depending on the number of potentially life-threatening product failures). In statistical inference, likelihood ratios have a long history of use for decision making relating to model parameters (e.g., in evidence-based medicine and forensics). We propose a general prediction method based on a likelihood ratio (LR) involving both the data and a future random variable. This general approach provides a way to identify prediction interval methods with excellent statistical properties. For example, if a prediction method can be based on a pivotal quantity, our LR-based method will often identify it. For applications where a pivotal quantity does not exist, the LR-based method provides a procedure with good coverage properties for both continuous- and discrete-data prediction applications.
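A concrete instance of the pivotal-quantity case mentioned above is the classical prediction interval for one future draw from a normal sample, built from the pivot (X_new − x̄) / (s·sqrt(1 + 1/n)) ~ t with n−1 degrees of freedom. The sketch below shows only this standard pivotal interval, which is the kind of method the LR approach can recover; it is not the authors' general LR procedure.

```python
import numpy as np
from scipy import stats

def normal_prediction_interval(x, alpha=0.05):
    """Two-sided 1-alpha prediction interval for a single future
    observation from a normal sample, via the pivotal quantity
    (X_new - xbar) / (s * sqrt(1 + 1/n)) ~ t_{n-1}."""
    x = np.asarray(x, dtype=float)
    n = x.size
    xbar = x.mean()
    s = x.std(ddof=1)  # sample standard deviation
    half = stats.t.ppf(1 - alpha / 2, df=n - 1) * s * np.sqrt(1 + 1 / n)
    return xbar - half, xbar + half
```

Note the sqrt(1 + 1/n) factor: the interval must cover the variability of the future observation itself plus the uncertainty in the estimated mean, so it is wider than a confidence interval for the mean.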
  3. Variational autoencoders have recently been proposed for the problem of process monitoring. While these works show impressive results over classical methods, the proposed monitoring statistics often ignore inconsistencies in the learned lower-dimensional representations and computational limitations in high-dimensional approximations. In this work, we first demonstrate these issues and then overcome them with a novel statistic formulation that increases out-of-control detection accuracy without compromising computational efficiency. We demonstrate our results on a simulation study with explicit control over latent variations, and on a real-life example of image profiles obtained from a hot steel rolling process.
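To make the latent-space monitoring idea concrete, a generic baseline is a Hotelling-style T² statistic computed on encoder outputs: fit the in-control mean and covariance of the latent codes, then score new samples by their Mahalanobis distance. This is a standard illustrative statistic, not the paper's specific formulation, and the function names are assumptions.

```python
import numpy as np

def fit_in_control(Z):
    """Estimate in-control mean and inverse covariance from latent
    encodings Z (n samples x k latent dimensions)."""
    mean = Z.mean(axis=0)
    cov = np.cov(Z, rowvar=False)
    return mean, np.linalg.inv(cov)

def t2_statistic(z, mean, cov_inv):
    """Hotelling-style T^2 monitoring statistic for one new latent
    encoding z; large values flag out-of-control behavior."""
    d = z - mean
    return float(d @ cov_inv @ d)
```

In a monitoring loop, each incoming sample is encoded, scored with `t2_statistic`, and compared against a control limit (e.g., a chi-squared quantile under approximate latent normality).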
  4. Joint ranking statistics are used to distinguish real from random coincidences, ideally considering whether shared parameters are consistent with each other as well as whether the individual candidates are distinguishable from noise. We expand on previous work to include additional shared parameters, use galaxy catalogues as priors for sky localization and distance, and avoid some approximations used previously. We develop methods to calculate this statistic both in low latency using HEALPix sky maps and with posterior samples. We show that these changes yield an improvement of one to two orders of magnitude for GW170817-GRB 170817A, depending on the method used, placing this significant event further into the foreground. We also examine the more tenuous joint candidate GBM-GW150914, which is largely penalized by these methods. Finally, we perform a simplified simulation suggesting that these changes could better distinguish real from random coincidences in searches, although more realistic simulations are needed to confirm this.
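The structure of such a joint ranking statistic can be sketched as a product of the candidates' individual signal-versus-noise likelihood ratios and consistency factors for the shared parameters (time offset, sky location). This toy version, with its Gaussian time prior and all parameter names, is an assumption for illustration and not the statistic developed in the paper.

```python
import numpy as np

def joint_ranking(lr_a, lr_b, dt, sky_overlap, dt_window=10.0):
    """Toy joint ranking statistic for a two-messenger coincidence.

    lr_a, lr_b: individual signal-vs-noise likelihood ratios.
    dt: observed time offset between the candidates (seconds).
    sky_overlap: overlap of the two sky-localization posteriors in [0, 1].
    dt_window: coincidence search window (seconds).
    """
    # Time factor: Gaussian prior on the offset relative to a uniform
    # background window, normalized so chance coincidences score near 1.
    sigma = dt_window / 3.0
    time_factor = (dt_window / (np.sqrt(2.0 * np.pi) * sigma)
                   * np.exp(-0.5 * (dt / sigma) ** 2))
    # Consistent shared parameters boost the ranking; inconsistent ones
    # (large dt, disjoint sky maps) penalize it.
    return lr_a * lr_b * time_factor * sky_overlap
```

This multiplicative form reproduces the qualitative behavior described above: a pair with a small time offset and overlapping sky maps is promoted, while a pair with inconsistent shared parameters is pushed toward the background.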