NSF PAR Search | NSF Public Access Repository

Asynchronous and Distributed Data Augmentation for Massive Data Settings

https://doi.org/10.1080/10618600.2022.2130928

Zhou, Jiayuan; Khare, Kshitij; Srivastava, Sanvesh (July 2023, Journal of Computational and Graphical Statistics)

Full Text Available
Divide-and-conquer Bayesian inference in hidden Markov models

https://doi.org/10.1214/23-EJS2118

Wang, Chunlei; Srivastava, Sanvesh (January 2023, Electronic Journal of Statistics)

Full Text Available
Distributed Bayesian Inference in Massive Spatial Data

https://doi.org/10.1214/22-STS868

Guhaniyogi, Rajarshi; Li, Cheng; Savitsky, Terrance; Srivastava, Sanvesh (January 2023, Statistical Science)

Full Text Available
An algorithm for distributed Bayesian inference

https://doi.org/10.1002/sta4.432

Shyamalkumar, Nariankadu D.; Srivastava, Sanvesh (April 2022, Stat)

Monte Carlo algorithms, such as Markov chain Monte Carlo (MCMC) and Hamiltonian Monte Carlo (HMC), are routinely used for Bayesian inference; however, these algorithms are prohibitively slow in massive data settings because they require multiple passes through the full data in every iteration. Addressing this problem, we develop a scalable extension of these algorithms using the divide‐and‐conquer (D&C) technique that divides the data into a sufficiently large number of subsets, draws parameters in parallel on the subsets using a powered likelihood, and produces Monte Carlo draws of the parameter by combining parameter draws obtained from each subset. The combined parameter draws play the role of draws from the original sampling algorithm. Our main contributions are twofold. First, we demonstrate through diverse simulated and real data analyses focusing on generalized linear models (GLMs) that our distributed algorithm delivers results comparable to those of the current state‐of‐the‐art D&C algorithms in terms of statistical accuracy and computational efficiency. Second, providing theoretical support for our empirical observations, we identify regularity assumptions under which the proposed algorithm leads to asymptotically optimal inference. We also provide illustrative examples focusing on normal linear and logistic regressions where parts of our D&C algorithm are analytically tractable.
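The divide, sample, and combine steps described in this abstract can be illustrated on a toy model where everything is available in closed form. The sketch below is only an illustration under stated assumptions, not the paper's algorithm: it assumes a normal mean with known standard deviation `sigma` and a flat prior, and it combines subset draws by averaging sorted draws (a one-dimensional Wasserstein barycenter), which is a swapped-in combination rule that may differ from the one the authors use. The function name `dc_normal_mean` and all of its parameters are hypothetical.

```python
import numpy as np


def dc_normal_mean(x, k, sigma=1.0, n_draws=2000, seed=0):
    """Toy divide-and-conquer sampler for a normal mean (flat prior, known sigma).

    Assumption: subset draws are combined by quantile averaging, which is an
    illustrative choice and not necessarily the paper's combination step.
    """
    rng = np.random.default_rng(seed)
    n = x.size
    # Divide: shuffle the data and split it into k subsets.
    subsets = np.array_split(rng.permutation(x), k)

    sorted_draws = []
    for s in subsets:
        # Powered likelihood: raise the subset likelihood to the power n / m so
        # that a subset of size m carries the weight of the full sample.
        power = n / s.size
        # With a flat prior the powered subset posterior is
        # N(mean(s), sigma^2 / (power * m)), i.e. variance sigma^2 / n.
        post_sd = sigma / np.sqrt(power * s.size)
        draws = rng.normal(s.mean(), post_sd, size=n_draws)
        sorted_draws.append(np.sort(draws))

    # Combine: average the i-th smallest draw across subsets.
    return np.mean(sorted_draws, axis=0)
```

In this toy setting the loop over subsets is embarrassingly parallel, and with equal-sized subsets the combined draws are centered at the full-data sample mean with spread close to `sigma / sqrt(n)`, which is the full-data posterior under the same assumptions.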
Inbreeding Depression in Genotypically Matched Diploid and Tetraploid Maize

https://doi.org/10.3389/fgene.2020.564928

Yao, Hong; Srivastava, Sanvesh; Swyers, Nathan; Han, Fangpu; Doerge, Rebecca W.; Birchler, James A. (November 2020, Frontiers in Genetics)

The genetic and molecular basis of heterosis has long been studied but without a consensus about mechanism. The opposite effect, inbreeding depression, results from repeated self-pollination and leads to a reduction in vigor. A popular explanation for this reaction is the homozygosis of recessive, slightly deleterious alleles upon inbreeding. However, extensive studies in alfalfa indicated that inbreeding depression in diploids and autotetraploids was similar despite the fact that homozygosis of alleles would be dramatically different. The availability of tetraploid lines of maize generated directly from various inbred lines provided the opportunity to examine this issue in detail in perfectly matched diploid and tetraploid hybrids and their parallel inbreeding regimes. Identical hybrids at the diploid and tetraploid levels were inbred in triplicate for seven generations. At the conclusion of this regime, F1 hybrids and selected representative generations (S1, S3, S5, S7) were characterized phenotypically in randomized blocks under the same field conditions. Quantitative measures of the multiple generations of inbreeding provided little evidence for a distinction in the decline of vigor between the diploids and the tetraploids. The results suggest that the homozygosis of completely recessive, slightly deleterious alleles is an inadequate hypothesis to explain inbreeding depression in general.
Full Text Available
Robust and Scalable Bayes via a Median of Subset Posterior Measures

Minsker, Stanislav; Srivastava, Sanvesh; Lin, Lizhen; Dunson, David (January 2017, Journal of Machine Learning Research)

We propose a novel approach to Bayesian analysis that is provably robust to outliers in the data and often has computational advantages over standard methods. Our technique is based on splitting the data into non-overlapping subgroups, evaluating the posterior distribution given each independent subgroup, and then combining the resulting measures. The main novelty of our approach is the proposed aggregation step, which is based on the evaluation of a median in the space of probability measures equipped with a suitable collection of distances that can be quickly and efficiently evaluated in practice. We present both theoretical and numerical evidence illustrating the improvements achieved by our method.
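The aggregation step described in this abstract, a median in a space of probability measures, can be sketched with a Weiszfeld-type iteration. The sketch below rests on assumptions the abstract does not state: each subset posterior is represented by a set of one-dimensional draws, the distance between measures is the one induced by a Gaussian (RBF) kernel embedding with a hypothetical `bandwidth` parameter, and the median is expressed as a weighted mixture of the subset posteriors. The function names `gram_matrix` and `median_weights` are illustrative only.

```python
import numpy as np


def gram_matrix(draw_sets, bandwidth=1.0):
    """Kernel inner products between the empirical subset posteriors.

    G[i, j] is the average Gaussian kernel value over all pairs of draws
    taken from subset posterior i and subset posterior j.
    """
    k = len(draw_sets)
    G = np.empty((k, k))
    for i in range(k):
        for j in range(i, k):
            d = draw_sets[i][:, None] - draw_sets[j][None, :]
            G[i, j] = G[j, i] = np.exp(-d**2 / (2 * bandwidth**2)).mean()
    return G


def median_weights(G, n_iter=200, eps=1e-10, tol=1e-9):
    """Weiszfeld iteration for the geometric median of the embedded measures.

    Returns mixture weights w; the median is sum_j w[j] * P_j.
    """
    k = G.shape[0]
    w = np.full(k, 1.0 / k)  # start from the plain average
    for _ in range(n_iter):
        Gw = G @ w
        # Squared distance from the current mixture Q to each P_j:
        # <Q, Q> - 2 <Q, P_j> + <P_j, P_j>.
        sq_dist = w @ Gw - 2 * Gw + np.diag(G)
        inv_dist = 1.0 / np.sqrt(np.maximum(sq_dist, eps))
        w_new = inv_dist / inv_dist.sum()
        converged = np.abs(w_new - w).max() < tol
        w = w_new
        if converged:
            break
    return w
```

Under these assumptions, a subset posterior that sits far from the others (for example one computed from a subset contaminated by outliers) ends up with a small weight, which is the intuition behind the robustness claim. Draws from the combined measure can then be obtained by picking a subset with probability `w[j]` and sampling one of its draws.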
Full Text Available
