NSF PAR Search | NSF Public Access Repository

Clustering future scenarios based on predicted range maps

https://doi.org/10.1111/2041-210X.14080

Davidow, Matthew; Schafer, Toryn L J; Merow, Cory; Che‐Castaldo, Judy; Düker, Marie‐Christine; Feng, Emily; Matteson, David S (May 2023, Methods in Ecology and Evolution)

Abstract Predictions of biodiversity trajectories under climate change are crucial in order to act effectively in maintaining the diversity of species. In many ecological applications, future predictions are made under various global warming scenarios, as described by a range of different climate models. We propose a clustering methodology to synthesize and interpret the outputs of these various predictions. We propose an interpretable and flexible two‐step methodology to measure the similarity between predicted species range maps and to cluster the future scenario predictions utilizing a spectral clustering technique. We implement and provide code for this method. We find that clustering based on predicted species range maps is mainly driven by the amount of warming rather than climate model or future scenario. We contrast this with clustering based only on predicted climate variables, which is driven primarily by climate models, that is, scenarios of the same climate model are clustered together, even when the amount of warming input to the models is varied. The differences between species‐based and climate‐based clusterings illustrate that it is crucial to incorporate ecological information to understand the relevant differences between climate models. Our findings can be used to better synthesize forecasts of biodiversity change under the wide spectrum of results that emerge when considering potential future scenarios.
Full Text Available
Factor analysis of mixed data for anomaly detection

https://doi.org/10.1002/sam.11585

Davidow, Matthew; Matteson, David S (August 2022, Statistical Analysis and Data Mining: The ASA Data Science Journal)

Abstract Anomaly detection aims to identify observations that deviate from the typical pattern of data. Anomalous observations may correspond to financial fraud, health risks, or incorrectly measured data in practice. We focus on unsupervised detection and the continuous and categorical (mixed) variable case. We show that detecting anomalies in mixed data is enhanced through first embedding the data then assessing an anomaly scoring scheme. We propose a kurtosis‐weighted Factor Analysis of Mixed Data for anomaly detection to obtain a continuous embedding for anomaly scoring. We illustrate that anomalies are highly separable in the first and last few ordered dimensions of this space, and test various anomaly scoring experiments within this subspace. Results are illustrated for both simulated and real datasets, and the proposed approach is highly accurate for mixed data throughout these diverse scenarios.
Full Text Available
Thyroid hormone replacement therapy patterns in pregnant women and perinatal outcomes in the offspring

https://doi.org/10.1002/pds.4927

Frank, Anna S; Lupattelli, Angela; Matteson, David S; Meltzer, Helle Margrete; Nordeng, Hedvig (January 2020, Pharmacoepidemiology and Drug Safety)

Abstract Purpose: It remains unknown to what degree thyroid hormone replacement therapy (THRT) during and initiation after pregnancy determines pregnancy outcomes. The present study primarily aimed to quantify the impact of THRT patterns (including trajectories) on gestational age, birth weight, and head circumference of infants. The secondary aim was to compare results of trajectory with traditional analysis. Methods: We combined data from the Norwegian Mother, Father and Child Cohort Study (MoBa) to other Norwegian registry data and the Norwegian Environmental Biobank. The study population included 54 020 women enrolled in MoBa in 2005 to 2008. On the basis of prescription records, we classified women into nonhypothyroid (n = 51 390; reference group), THRT after delivery (n = 1397), or medicated (n = 1233) groups. Applying Group‐Based‐Trajectory Models (GBTMs), we determined THRT trajectories among women in the medicated group. Propensity score weighting linked multiple treatment groups to pregnancy outcomes. Results: Patterns were identified among women using medication during (Decreasing‐Low, Increasing‐Medium, Constant‐Medium, and Constant‐High) and after pregnancy. Women in the Increasing‐Medium (adjusted Odds Ratio [aOR] = 1.69; 95% Confidence Interval [CI], 1.06‐2.73) and the THRT after delivery (aOR = 1.19; 95% CI, 1.01‐1.42) groups had increased risk of giving birth to an LGA infant. In the traditional analysis, only women in the THRT after delivery group showed increased risk for an LGA infant (aOR = 1.19; 95% CI, 1.00‐1.42). We found no other differential effect among the five THRT patterns on the other outcomes. Conclusions: Women with THRT after delivery or late onset THRT treatment showed increased risk of LGA infants.
Full Text Available
Extending balance assessment for the generalized propensity score under multiple imputation

https://doi.org/10.1515/em-2019-0003

Frank, Anna-Simone J; Matteson, David S; Solvang, Hiroko K; Lupattelli, Angela; Nordeng, Hedvig (January 2020, Epidemiologic Methods)

Abstract This manuscript extends the definition of the Absolute Standardized Mean Difference (ASMD) for binary exposure (M = 2) to cases for M > 2 on multiple imputed data sets. The Maximal Maximized Standardized Difference (MMSD) and the Maximal Averaged Standardized Difference (MASD) were proposed. For different percentages, missing data were introduced in covariates in the simulated data based on the missing at random (MAR) assumption. We then investigate the performance of these two metric definitions using simulated data of full and imputed data sets. The performance of the MASD and the MMSD were validated by relating the balance metrics to estimation bias. The results show that there is an association between the balance metrics and bias. The proposed balance diagnostics seem therefore appropriate to assess balance for the generalized propensity score (GPS) under multiple imputation.
Full Text Available
Dynamic Shrinkage Processes

https://doi.org/10.1111/rssb.12325

Kowal, Daniel R; Matteson, David S; Ruppert, David (May 2019, Journal of the Royal Statistical Society Series B: Statistical Methodology)

Summary We propose a novel class of dynamic shrinkage processes for Bayesian time series and regression analysis. Building on a global–local framework of prior construction, in which continuous scale mixtures of Gaussian distributions are employed for both desirable shrinkage properties and computational tractability, we model dependence between the local scale parameters. The resulting processes inherit the desirable shrinkage behaviour of popular global–local priors, such as the horseshoe prior, but provide additional localized adaptivity, which is important for modelling time series data or regression functions with local features. We construct a computationally efficient Gibbs sampling algorithm based on a Pólya–gamma scale mixture representation of the process proposed. Using dynamic shrinkage processes, we develop a Bayesian trend filtering model that produces more accurate estimates and tighter posterior credible intervals than do competing methods, and we apply the model for irregular curve fitting of minute-by-minute Twitter central processor unit usage data. In addition, we develop an adaptive time varying parameter regression model to assess the efficacy of the Fama–French five-factor asset pricing model with momentum added as a sixth factor. Our dynamic analysis of manufacturing and healthcare industry data shows that, with the exception of the market risk, no other risk factors are significant except for brief periods.
Full Text Available
Cell Line Classification Using Electric Cell-Substrate Impedance Sensing (ECIS)

https://doi.org/10.1515/ijb-2018-0083

Gelsinger, Megan L; Tupper, Laura L; Matteson, David S (June 2019, The International Journal of Biostatistics)

Abstract We present new methods for cell line classification using multivariate time series bioimpedance data obtained from electric cell-substrate impedance sensing (ECIS) technology. The ECIS technology, which monitors the attachment and spreading of mammalian cells in real time through the collection of electrical impedance data, has historically been used to study one cell line at a time. However, we show that if applied to data from multiple cell lines, ECIS can be used to classify unknown or potentially mislabeled cells, factors which have previously been associated with the reproducibility crisis in the biological literature. We assess a range of approaches to this new problem, testing different classification methods and deriving a dictionary of 29 features to characterize ECIS data. Most notably, our analysis enriches the current field by making use of simultaneous multi-frequency ECIS data, where previous studies have focused on only one frequency; using classification methods to distinguish multiple cell lines, rather than simple statistical tests that compare only two cell lines; and assessing a range of features derived from ECIS data based on their classification performance. In classification tests on fifteen mammalian cell lines, we obtain very high out-of-sample predictive accuracy. These preliminary findings provide a baseline for future large-scale studies in this field.
Full Text Available
Trend and Variance Adaptive Bayesian Changepoint Analysis and Local Outlier Scoring

https://doi.org/10.1080/07350015.2024.2362269

Wu, Haoxuan; Schafer, Toryn L J; Matteson, David S (April 2025, Journal of Business & Economic Statistics)

Free, publicly-accessible full text available April 3, 2026
Likelihood Inference for Possibly Nonstationary Processes via Adaptive Overdifferencing

https://doi.org/10.1080/00401706.2025.2453207

Griffin, Maryclare; Samorodnitsky, Gennady; Matteson, David S (March 2025, Technometrics)

Free, publicly-accessible full text available March 10, 2026
Drift versus Shift: Decoupling Trends and Changepoint Analysis

https://doi.org/10.1080/00401706.2024.2365730

Wu, Haoxuan; Schafer, Toryn L J; Ryan, Sean; Matteson, David S (January 2025, Technometrics)

Full Text Available
Classifying contaminated cell cultures using time series features

https://doi.org/10.1080/02664763.2023.2248413

Tupper, Laura L; Keese, Charles R; Matteson, David S (April 2024, Journal of Applied Statistics)

Full Text Available
