-
Plasmodium parasites infect thousands of species and provide an exceptional system for studying host-pathogen dynamics, especially for multi-host pathogens. However, understanding these interactions requires an accurate assay of infection. Assessing Plasmodium infections using microscopy on blood smears often misses infections with low parasitemias (the fractions of cells infected), and biases in malaria prevalence estimates will differ among hosts that differ in mean parasitemias. We examined Plasmodium relictum infection and parasitemia using both microscopy of blood smears and quantitative polymerase chain reaction (qPCR) on 299 samples from multiple bird species in Hawai’i and fit models to predict parasitemias from qPCR cycle threshold (Ct) values. We used these models to quantify the extent to which microscopy underestimated infection prevalence and to more accurately estimate infection patterns for each species for a large historical study done by microscopy. We found that most qPCR-positive wild-caught birds in Hawaii had low parasitemias (Ct scores 35), which were rarely detected by microscopy. The fraction of infections missed by microscopy differed substantially among eight species due to differences in species’ parasitemia levels. Infection prevalence was likely 4–5-fold higher than previous microscopy estimates for three introduced species, including Zosterops japonicus, Hawaii’s most abundant forest bird, which had low average parasitemias. In contrast, prevalence was likely only 1.5–2.3-fold higher than previous estimates for Himatione sanguinea and Chlorodrepanis virens, two native species with high average parasitemias. Our results indicate that relative patterns of infection among species differ substantially from those observed in previous microscopy studies, and that differences depend on variation in parasitemias among species.
Although microscopy of blood smears is useful for estimating the frequency of different Plasmodium stages and host attributes, more sensitive quantitative methods, including qPCR, are needed to accurately estimate and compare infection prevalence among host species.
Free, publicly-accessible full text available February 1, 2025
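The abstract above describes fitting models that predict parasitemia from qPCR Ct values and using them to correct microscopy-based prevalence. A minimal sketch of that idea follows. It assumes a log-linear relationship between Ct and parasitemia (a common form for qPCR calibration, but not confirmed as the authors' model) and a single per-species detection probability. The function names and all numbers are hypothetical illustrations, not data from the study.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict_parasitemia(ct, a, b):
    """Predicted fraction of cells infected, assuming log10(parasitemia) = a + b * Ct."""
    return 10 ** (a + b * ct)


def corrected_prevalence(microscopy_prevalence, detection_prob):
    """Scale microscopy prevalence by the assumed probability that microscopy
    detects an infection that is truly present; capped at 1."""
    return min(1.0, microscopy_prevalence / detection_prob)


# Hypothetical calibration data: samples scored by both methods.
cts = [20.0, 25.0, 30.0, 35.0]
log10_parasitemia = [-1.0, -2.5, -4.0, -5.5]

a, b = fit_line(cts, log10_parasitemia)
low_infection = predict_parasitemia(35.0, a, b)

# Hypothetical species in which microscopy detects 25% of true infections.
adjusted = corrected_prevalence(0.10, 0.25)
```

Under these made-up inputs, a species whose infections are mostly at high Ct would have a low detection probability and therefore a large upward correction, which is the qualitative pattern the abstract reports for low-parasitemia hosts.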
-
Graph processes that unfold in continuous time are of obvious theoretical and practical interest. Particularly useful are those whose long-term behavior converges to a graph distribution of known form. Here, we review some of the conditions for such convergence, and provide examples of novel and/or known processes that do so. These include subfamilies of the well-known stochastic actor-oriented models, as well as continuum extensions of temporal and separable temporal exponential family random graph models. We also comment on some related threads in the broader work on network dynamics, which provide additional context for the continuous time case.
-
De Vico Fallani, Fabrizio (Ed.) The exponential family random graph modeling (ERGM) framework provides a highly flexible approach for the statistical analysis of networks (i.e., graphs). As ERGMs with dyadic dependence involve normalizing factors that are extremely costly to compute, practical strategies for ERGM inference generally employ a variety of approximations or other workarounds. Markov chain Monte Carlo maximum likelihood (MCMC MLE) provides a powerful tool to approximate the maximum likelihood estimator (MLE) of ERGM parameters, and is generally feasible for typical models on single networks with as many as a few thousand nodes. MCMC-based algorithms for Bayesian analysis are more expensive, and high-quality answers are challenging to obtain on large graphs. For both strategies, extension to the pooled case, in which we observe multiple networks from a common generative process, adds further computational cost, with both time and memory scaling linearly in the number of graphs. This becomes prohibitive for large networks, or cases in which large numbers of graph observations are available. Here, we exploit some basic properties of the discrete exponential families to develop an approach for ERGM inference in the pooled case that (where applicable) allows an arbitrarily large number of graph observations to be fit at no additional computational cost beyond preprocessing the data itself. Moreover, a variant of our approach can also be used to perform Bayesian inference under conjugate priors, again with no additional computational cost in the estimation phase. The latter can be employed either for single graph observations, or for observations from graph sets. As we show, the conjugate prior is easily specified, and is well-suited to applications such as regularization. Simulation studies show that the pooled method leads to estimates with good frequentist properties, and posterior estimates under the conjugate prior are well-behaved.
We demonstrate the usefulness of our approach with applications to pooled analysis of brain functional connectivity networks and to replicated x-ray crystal structures of hen egg-white lysozyme.
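The "basic properties of the discrete exponential families" mentioned above can be illustrated with the standard exponential-family identities below. This is a sketch in generic notation (the symbols $h$, $t$, $\kappa$, $\chi$, $\nu$ are assumptions introduced here), not necessarily the exact formulation used by the authors. It shows why, for $m$ independent graphs from a common ERGM, the pooled likelihood depends on the data only through the mean sufficient statistic, and why a conjugate prior updates by simple addition.

```latex
% ERGM for a single graph y with sufficient statistics t(y):
\Pr(Y = y \mid \theta) = \frac{h(y)\,\exp\{\theta^{\top} t(y)\}}{\kappa(\theta)},
\qquad
\kappa(\theta) = \sum_{y'} h(y')\,\exp\{\theta^{\top} t(y')\}.

% Pooled log-likelihood for independent graphs y_1, \dots, y_m:
\ell(\theta)
  = \sum_{i=1}^{m} \left[ \theta^{\top} t(y_i) - \log \kappa(\theta) + \log h(y_i) \right]
  = m \left[ \theta^{\top} \bar{t} - \log \kappa(\theta) \right] + \text{const},
\qquad
\bar{t} = \frac{1}{m} \sum_{i=1}^{m} t(y_i).

% Conjugate prior with hyperparameters (\chi, \nu) and its posterior:
p(\theta \mid \chi, \nu) \propto \exp\{\theta^{\top} \chi - \nu \log \kappa(\theta)\},
\qquad
p(\theta \mid y_{1:m}) \propto
  \exp\Big\{\theta^{\top} \Big(\chi + \sum_{i=1}^{m} t(y_i)\Big) - (\nu + m) \log \kappa(\theta)\Big\}.
```

Because the maximizer of $\ell(\theta)$ is unchanged when the factor $m$ is dropped, the estimation problem has the same form as a single-graph fit with observed statistic $\bar{t}$, which is consistent with the abstract's claim that the number of graphs adds cost only in preprocessing.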
-
Despite the vital role innovation plays in scientific advancement, opportunities to develop innovation skills remain limited, especially for low-income students. Training in innovation principles and processes frequently takes the form of extra-curricular pursuits, such as unpaid internships with start-up organizations, shadowing innovation professionals, or obtaining an additional business degree that covers innovation principles. These pursuits often require financial means or connections in the field, both of which are often unavailable to low-income students. Without an academic route in which STEM degree programs are embedded with innovation instruction and exercises, innovation training will remain out of reach for most low-income students. The bridge program engages students in a specially designed 3-credit-hour course in which 2 credit hours are dedicated to teaching students about innovation and developing their innovative thinking and behaviors. One credit hour is devoted to student success strategies and developing feelings of being welcome at the university through guest speakers. Outside of class, bridge students participate in cohort building and mentoring activities. The bridge program included 12 NSF S-STEM students as well as 12 non-STEM students, all of whom are participating in the Honors College Path Program, which is designed to increase retention of underrepresented students. This allowed multidisciplinary collaboration for diversity of thought.
-
Signal maps are essential for the planning and operation of cellular networks. However, the measurements needed to create such maps are expensive and often biased, do not always reflect the performance metrics of interest, and pose privacy risks. In this paper, we develop a unified framework for predicting cellular performance maps from limited available measurements. Our framework builds on a state-of-the-art random-forest predictor, or any other base predictor. We propose and combine three mechanisms that deal with the fact that not all measurements are equally important for a particular prediction task. First, we design quality-of-service functions (Q), including signal strength (RSRP) but also other metrics of interest to operators, such as number of bars, coverage (improving recall by 76%-92%) and call drop probability (reducing error by as much as 32%). By implicitly altering the loss function employed in learning, quality functions can also improve prediction for RSRP itself where it matters (e.g., MSE reduction up to 27% in the low signal strength regime, where high accuracy is critical). Second, we introduce weight functions (W) to specify the relative importance of prediction at different locations and other parts of the feature space. We propose re-weighting based on importance sampling to obtain unbiased estimators when the sampling and target distributions are different. This yields improvements up to 20% for targets based on spatially uniform loss or losses based on user population density. Third, we apply the Data Shapley framework for the first time in this context: to assign values (ϕ) to individual measurement points, which capture the importance of their contribution to the prediction task. This can improve prediction (e.g., from 64% to 94% in recall for coverage loss) by removing points with negative values and storing only the remaining data points (i.e., as low as 30%), which also has the side-benefit of helping privacy.
We evaluate our methods and demonstrate significant improvement in prediction performance, using several real-world datasets.
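The importance-sampling re-weighting mentioned above can be sketched as follows. This is a minimal illustration, assuming measurements fall into discrete regions with known sampling and target probabilities, and it uses a self-normalized weighted mean squared error. The function names, regions, and RSRP values are hypothetical and are not taken from the paper.

```python
def importance_weights(sample_probs, target_probs):
    """Weight for each measurement: target density divided by sampling density."""
    return [t / s for s, t in zip(sample_probs, target_probs)]


def weighted_mse(y_true, y_pred, weights):
    """Self-normalized importance-weighted mean squared error."""
    total = sum(weights)
    weighted_sq_err = sum(
        w * (yt - yp) ** 2 for w, yt, yp in zip(weights, y_true, y_pred)
    )
    return weighted_sq_err / total


# Hypothetical data: four urban measurements and one rural measurement (RSRP, dBm).
y_true = [-90.0, -92.0, -88.0, -91.0, -110.0]
y_pred = [-91.0, -91.0, -89.0, -90.0, -107.0]

# Measurements were collected with probability 0.8 in the urban region and 0.2 in
# the rural region, but the target loss treats both regions equally (0.5 each).
sample_probs = [0.8, 0.8, 0.8, 0.8, 0.2]
target_probs = [0.5, 0.5, 0.5, 0.5, 0.5]

weights = importance_weights(sample_probs, target_probs)
plain = weighted_mse(y_true, y_pred, [1.0] * len(y_true))
reweighted = weighted_mse(y_true, y_pred, weights)
```

In this toy case the unweighted error is dominated by the over-sampled urban points, while the re-weighted error matches what a spatially uniform target would report (the average of the per-region squared errors). The same weights could in principle be passed as per-sample weights when training a base predictor.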
-
Soil science is one of the least diverse subdisciplines within the agricultural, earth, and natural sciences. Representation within soil science does not currently reflect demographic trends in the United States. We synthesize available data on the representation of historically marginalized groups in soil science in the United States and identify historical mechanisms contributing to these trends. We review education and employment information within academia and the federal government, land-grant university participation, and available Soil Science Society of America (SSSA) membership data to gain insight into the current state of representation within soil sciences and implications for the future of this discipline. Across all domains of diversity, historically marginalized groups are under-represented in soil science. We provide recommendations toward recognizing diversity within the field and improving and encouraging diversity within the SSSA, and suggested responses for both individuals and institutions toward improving diversity, equity, and inclusion.
-
The uneven spread of COVID-19 has resulted in disparate experiences for marginalized populations in urban centers. Using computational models, we examine the effects of local cohesion on COVID-19 spread in social contact networks for the city of San Francisco, finding that more early COVID-19 infections occur in areas with strong local cohesion. This spatially correlated process tends to affect Black and Hispanic communities more than their non-Hispanic White counterparts. Local social cohesion thus acts as a potential source of hidden risk for COVID-19 infection.
-
Coarse-graining is a powerful tool for extending the reach of dynamic models of proteins and other biological macromolecules. Topological coarse-graining, in which biomolecules or sets thereof are represented via graph structures, is a particularly useful way of obtaining highly compressed representations of molecular structures, and simulations operating via such representations can achieve substantial computational savings. A drawback of coarse-graining, however, is the loss of atomistic detail, an effect that is especially acute for topological representations such as protein structure networks (PSNs). Here, we introduce an approach based on a combination of machine learning and physically-guided refinement for inferring atomic coordinates from PSNs. This “neural upscaling” procedure exploits the constraints implied by PSNs on possible configurations, as well as differences in the likelihood of observing different configurations with the same PSN. Using a 1 μs atomistic molecular dynamics trajectory of Aβ1–40, we show that neural upscaling is able to effectively recapitulate detailed structural information for intrinsically disordered proteins, being particularly successful in recovering features such as transient secondary structure. These results suggest that scalable network-based models for protein structure and dynamics may be used in settings where atomistic detail is desired, with upscaling employed to impute atomic coordinates from PSNs.