-
Marschall, Tobias (Ed.)
Abstract
Motivation: In a genome-wide association study, analyzing multiple correlated traits simultaneously is potentially superior to analyzing the traits one by one. Standard methods for multivariate genome-wide association studies operate marker by marker and are computationally intensive.
Results: We present a sparsity-constrained regression algorithm for multivariate genome-wide association studies based on iterative hard thresholding and implement it in the convenient Julia package MendelIHT.jl. In simulation studies with up to 100 quantitative traits, iterative hard thresholding exhibits true positive rates similar to those of GEMMA's linear mixed models and mv-PLINK's canonical correlation analysis, with smaller false positive rates and faster execution times. On UK Biobank data with 470,228 variants, MendelIHT completed a three-trait joint analysis (n = 185,656) in 20 h and an 18-trait joint analysis (n = 104,264) in 53 h with an 80 GB memory footprint. In short, MendelIHT enables geneticists to fit a single regression model that simultaneously considers the effects of all SNPs and dozens of traits.
Availability and implementation: Software, documentation, and scripts to reproduce our results are available from https://github.com/OpenMendel/MendelIHT.jl.
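To make the core idea concrete, here is a minimal single-trait iterative hard thresholding sketch in Python. MendelIHT itself is a Julia package that handles multivariate traits, generalized linear models, and far larger data; the function name, step-size rule, and toy data below are illustrative assumptions, not the package's API.

```python
import numpy as np

def iht(X, y, k, mu=None, iters=200):
    """Minimal single-trait iterative hard thresholding sketch.

    Each pass takes a gradient step on the least-squares loss
    0.5 * ||y - X @ beta||^2 and then keeps only the k
    largest-magnitude coefficients (the hard-thresholding projection).
    """
    n, p = X.shape
    if mu is None:
        mu = 1.0 / np.linalg.norm(X, 2) ** 2    # conservative fixed step size
    beta = np.zeros(p)
    for _ in range(iters):
        beta = beta + mu * (X.T @ (y - X @ beta))   # gradient step
        support = np.argsort(np.abs(beta))[-k:]     # indices of the k largest entries
        pruned = np.zeros(p)
        pruned[support] = beta[support]             # project onto k-sparse vectors
        beta = pruned
    return beta

# Toy usage: recover 5 causal SNPs out of 1000 (simulated data, not UK Biobank).
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 1000))
beta_true = np.zeros(1000)
beta_true[:5] = 1.0
y = X @ beta_true + 0.1 * rng.standard_normal(500)
print(np.flatnonzero(iht(X, y, k=5)))   # expected to print indices near [0 1 2 3 4]
```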
-
Csikász-Nagy, Attila (Ed.)
Differential sensitivity analysis is indispensable in fitting parameters, understanding uncertainty, and forecasting the results of both thought and lab experiments. Although there are many methods currently available for performing differential sensitivity analysis of biological models, it can be difficult to determine which method is best suited for a particular model. In this paper, we explain a variety of differential sensitivity methods and assess their value in some typical biological models. First, we explain the mathematical basis for three numerical methods: adjoint sensitivity analysis, complex perturbation sensitivity analysis, and forward mode sensitivity analysis. We then carry out four instructive case studies. (a) The CARRGO model for tumor-immune interaction highlights the additional information that differential sensitivity analysis provides beyond traditional naive sensitivity methods, (b) the deterministic SIR model demonstrates the value of using second-order sensitivity in refining model predictions, (c) the stochastic SIR model shows how differential sensitivity can be attacked in stochastic modeling, and (d) a discrete birth-death-migration model illustrates how the complex perturbation method of differential sensitivity can be generalized to a broader range of biological models. Finally, we compare the speed, accuracy, and ease of use of these methods. We find that forward mode automatic differentiation has the quickest computational time, while the complex perturbation method is the simplest to implement and the most generalizable.
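As an illustration of the complex perturbation idea, the sketch below differentiates a generic smooth function by the complex-step rule Im f(x + ih)/h. This is a minimal Python sketch with an illustrative test function, not the models or code from the paper.

```python
import numpy as np

def complex_step_grad(f, x, h=1e-20):
    """Complex perturbation (complex-step) derivative sketch.

    For an analytic f, Im(f(x + i*h)) / h approximates df/dx_j to near
    machine precision, because no difference of nearby function values
    is formed and hence there is no subtractive cancellation.
    """
    x = np.asarray(x, dtype=complex)
    grad = np.empty(x.size)
    for j in range(x.size):
        xp = x.copy()
        xp[j] += 1j * h                  # perturb one parameter along the imaginary axis
        grad[j] = f(xp).imag / h         # the imaginary part carries the derivative
    return grad

# Toy usage on a smooth test function (illustrative, not a model from the paper).
f = lambda p: np.exp(p[0]) * np.sin(p[1]) + p[0] * p[1] ** 2
print(complex_step_grad(f, np.array([0.3, 1.2])))
```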
-
This paper discusses algorithms for phase retrieval where the measurements follow independent Poisson distributions. We developed an optimization problem based on maximum likelihood estimation (MLE) for the Poisson model and applied the Wirtinger flow algorithm to solve it. Simulation results with a random Gaussian sensing matrix and Poisson measurement noise demonstrated that the Wirtinger flow algorithm based on the Poisson model produced higher-quality reconstructions than algorithms derived from Gaussian noise models (Wirtinger flow, Gerchberg–Saxton) applied to such data, with significantly improved computational efficiency.
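A minimal sketch of a Wirtinger-flow update for the Poisson maximum-likelihood objective is shown below. The fixed step size, iteration count, and initialization near the truth (rather than a spectral initialization) are simplifications for the toy example; this is not the authors' implementation.

```python
import numpy as np

def poisson_wirtinger_flow(A, y, x0, step=0.05, iters=500, eps=1e-12):
    """Sketch of Wirtinger-flow gradient descent for the Poisson model.

    Measurements are modeled as y_i ~ Poisson(|a_i^H x|^2).  Each update
    descends the negative Poisson log-likelihood using the Wirtinger
    gradient  sum_i (1 - y_i / |a_i^H x|^2) a_i a_i^H x.
    In practice a spectral initialization and a tuned step size are used.
    """
    x = x0.astype(complex).copy()
    for _ in range(iters):
        Ax = A @ x
        b = np.abs(Ax) ** 2 + eps                     # model intensities
        grad = A.conj().T @ ((1.0 - y / b) * Ax)      # Wirtinger gradient of the negative log-likelihood
        x = x - step * grad / len(y)                  # scaled gradient step
    return x

# Toy usage with a random Gaussian sensing matrix and Poisson measurement noise.
rng = np.random.default_rng(1)
n, m = 32, 256
x_true = rng.standard_normal(n) + 1j * rng.standard_normal(n)
A = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2)
y = rng.poisson(np.abs(A @ x_true) ** 2)
x_hat = poisson_wirtinger_flow(A, y, x0=x_true + 0.1 * rng.standard_normal(n))
```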
-
Summary: Nan Laird has an enormous and growing impact on computational statistics. Her paper with Dempster and Rubin on the expectation-maximisation (EM) algorithm is the second most cited paper in statistics. Her papers and book on longitudinal modelling are nearly as impressive. In this brief survey, we revisit the derivation of some of her most useful algorithms from the perspective of the minorisation-maximisation (MM) principle. The MM principle generalises the EM principle and frees it from the shackles of missing data and conditional expectations. Instead, the focus shifts to the construction of surrogate functions via standard mathematical inequalities. The MM principle can deliver a classical EM algorithm with less fuss or an entirely new algorithm with a faster rate of convergence. In any case, the MM principle enriches our understanding of the EM principle and suggests new algorithms of considerable potential in high-dimensional settings where standard algorithms such as Newton's method and Fisher scoring falter.
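For readers unfamiliar with the principle, the defining majorization conditions are standard and not specific to this survey; they are stated here for minimization (maximization instead minorizes, and the inequalities flip, which is the EM case).

```latex
\begin{align*}
  &\text{Surrogate (majorization) conditions at the current iterate } \theta_n: \\
  &\qquad g(\theta \mid \theta_n) \ge f(\theta) \ \text{for all } \theta,
   \qquad g(\theta_n \mid \theta_n) = f(\theta_n). \\
  &\text{Update: } \theta_{n+1} = \arg\min_{\theta} \, g(\theta \mid \theta_n). \\
  &\text{Descent property: } f(\theta_{n+1}) \le g(\theta_{n+1} \mid \theta_n)
   \le g(\theta_n \mid \theta_n) = f(\theta_n).
\end{align*}
```

The sandwich inequality in the last line is what guarantees that every MM (and hence every EM) update drives the objective in the right direction.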
-
Kelso, Janet (Ed.)
Abstract
Motivation: Current methods for genotype imputation and phasing exploit the volume of data in haplotype reference panels and rely on hidden Markov models (HMMs). Existing programs all have essentially the same imputation accuracy, are computationally intensive, and generally require prephasing the typed markers.
Results: We introduce a novel data-mining method for genotype imputation and phasing that substitutes highly efficient linear algebra routines for HMM calculations. This strategy, embodied in our Julia program MendelImpute.jl, avoids explicit assumptions about recombination and population structure while delivering similar prediction accuracy, better memory usage, and run times that are an order of magnitude or more faster than the fastest competing method. MendelImpute operates on both dosage data and unphased genotype data and simultaneously imputes missing genotypes and phase at both the typed and untyped SNPs (single nucleotide polymorphisms). Finally, MendelImpute naturally extends to global and local ancestry estimation and lends itself to new strategies for data compression and hence faster data transport and sharing.
Availability and implementation: Software, documentation, and scripts to reproduce our results are available from https://github.com/OpenMendel/MendelImpute.jl.
Supplementary information: Supplementary data are available at Bioinformatics online.
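The abstract does not spell out the linear algebra, but one simple way to see how matrix products can stand in for HMM calculations is a least-squares haplotype-pair search. The Python sketch below is an illustrative stand-in under that assumption, not MendelImpute's actual routine; the function name and toy data are hypothetical.

```python
import numpy as np

def best_haplotype_pair(x, H):
    """Illustrative least-squares haplotype-pair search (not MendelImpute's code).

    Given a genotype vector x (dosages over typed SNPs) and a reference
    haplotype matrix H (haplotypes in columns, entries 0/1), find the pair
    (i, j) minimizing ||x - H[:, i] - H[:, j]||^2 using only matrix
    products, with no HMM forward-backward pass.
    """
    # Expand ||x - h_i - h_j||^2 = ||x||^2 - 2 x'(h_i + h_j) + ||h_i + h_j||^2
    # and drop the constant ||x||^2 term.
    q = H.T @ x                              # x'h_i for every haplotype
    G = H.T @ H                              # Gram matrix of haplotype inner products
    norms = np.diag(G)
    # score[i, j] = ||h_i||^2 + ||h_j||^2 + 2 h_i'h_j - 2 x'h_i - 2 x'h_j
    score = norms[:, None] + norms[None, :] + 2 * G - 2 * q[:, None] - 2 * q[None, :]
    i, j = np.unravel_index(np.argmin(score), score.shape)   # i == j means a homozygous pair
    return i, j

# Toy usage: 4 reference haplotypes over 6 SNPs; the true pair is (0, 2).
H = np.array([[0, 1, 0, 1],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [1, 0, 1, 0],
              [0, 1, 1, 0],
              [1, 1, 1, 1]], dtype=float)
x = H[:, 0] + H[:, 2]
print(best_haplotype_pair(x, H))   # expected to print (0, 2)
```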
-
Ndeffo Mbah, Martial L (Ed.)
The SARS-CoV-2 pandemic led to the closure of nearly all K-12 schools in the United States of America in March 2020. Although reopening K-12 schools for in-person schooling is desirable for many reasons, officials understand that risk-reduction strategies and detection of cases are imperative in creating a safe return to school. Furthermore, the consequences of reclosing recently opened schools are substantial and impact teachers, parents, and ultimately the educational experiences of children. To address the competing interests of meeting educational needs and ensuring public safety, we compare the impact of physical separation through school cohorts on SARS-CoV-2 infections against policies acting at the level of individual contacts within classrooms. Using an age-stratified Susceptible-Exposed-Infected-Removed model, we explore the influences of reduced class density, transmission mitigation, and viral detection on cumulative prevalence. We consider several scenarios over a 6-month period, including (1) multiple rotating cohorts in which students cycle through in-person instruction on a weekly basis, (2) parallel cohorts with in-person and remote learning tracks, (3) the impact of a hypothetical testing program with ideal and imperfect detection, and (4) varying levels of aggregate transmission reduction. Our mathematical model predicts that reducing the number of contacts through cohorts produces a larger effect than diminishing transmission rates per contact. Specifically, the latter approach requires a dramatic reduction in transmission rates in order to achieve a comparable effect in minimizing infections over time. Further, our model indicates that surveillance programs using less sensitive tests may be adequate for monitoring infections within a school community, both by keeping infections low and by allowing for a longer period of instruction. Lastly, we underscore the importance of factoring infection prevalence into decisions about when a local outbreak of infection is serious enough to require reverting to remote learning.
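To illustrate the compartmental bookkeeping behind an age-stratified Susceptible-Exposed-Infected-Removed model, here is a minimal ODE sketch in Python. The two-group structure, contact matrix, and rate constants are illustrative placeholders and do not reproduce the cohorting, mitigation, or testing components analyzed in the paper.

```python
import numpy as np
from scipy.integrate import solve_ivp

def seir_rhs(t, u, C, beta, sigma, gamma, N):
    """Minimal age-stratified SEIR right-hand side (illustrative rates).

    u stacks S, E, I, R for each of the len(N) age groups; C is a contact
    matrix, beta the transmission probability per contact, sigma the rate
    of leaving the exposed state, and gamma the recovery rate.
    """
    k = len(N)
    S, E, I, R = u[:k], u[k:2*k], u[2*k:3*k], u[3*k:]
    force = beta * (C @ (I / N))              # force of infection per age group
    dS = -force * S
    dE = force * S - sigma * E
    dI = sigma * E - gamma * I
    dR = gamma * I
    return np.concatenate([dS, dE, dI, dR])

# Toy usage: two age groups (students, adults) followed over 180 days.
N = np.array([800.0, 1200.0])
C = np.array([[10.0, 3.0], [2.0, 8.0]])       # illustrative daily contact rates
u0 = np.concatenate([N - [1, 1], [0, 0], [1, 1], [0, 0]])   # one initial infection per group
sol = solve_ivp(seir_rhs, (0, 180), u0, args=(C, 0.02, 1/3, 1/7, N))
print(sol.y[2*len(N):3*len(N), -1])           # infectious counts at day 180
```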