

This content will become publicly available on April 11, 2026

Title: Aligning Time-series by Local Trends: Applications in Public Health
Individual models of infectious diseases, or trajectories from different simulations, may vary considerably, which complicates public communication and policy support. It is therefore common in public health to first create a consensus across multiple models and simulations through ensembling. However, current methods are limited to mean and median ensembles that aggregate scale (cases, hospitalizations, deaths) along the time axis, which often misrepresents the underlying trajectories; e.g., they underrepresent the peak. Instead, we wish to create an ensemble that aggregates over both time and scale simultaneously and thus better preserves the properties of the trajectories. This is particularly useful in public health, where time-series consist of a sequence of meaningful, ordered local trends, e.g., a surge, then an increase, then a peak, then a decrease. We propose a novel alignment method, DTW+SBA, which combines a representation of local trends with dynamic time warping barycenter averaging. We prove key properties of this method that ensure appropriate alignment based on local trends. We demonstrate on real multi-model outputs that our approach preserves the properties of the underlying trajectories. We also show that our alignment leads to a more sensible clustering of epidemic trajectories.
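The abstract's claim that a time-axis mean ensemble underrepresents the peak can be illustrated with a small sketch. It is not the paper's DTW+SBA method. The bell-shaped curves, the `epidemic_curve` helper, and all parameter values are hypothetical, chosen only to show what happens when trajectories of equal peak size differ in peak timing.

```python
import math


def epidemic_curve(peak_day, peak_height=100.0, width=5.0, n_days=60):
    """Hypothetical bell-shaped trajectory (e.g., daily hospitalizations)."""
    return [
        peak_height * math.exp(-((t - peak_day) ** 2) / (2 * width ** 2))
        for t in range(n_days)
    ]


# Three hypothetical model outputs: same peak size, different peak timing.
trajectories = [epidemic_curve(peak_day) for peak_day in (20, 30, 40)]

# Pointwise (time-axis) mean ensemble: average the values at each time step.
mean_ensemble = [sum(values) / len(values) for values in zip(*trajectories)]

individual_peaks = [max(traj) for traj in trajectories]
ensemble_peak = max(mean_ensemble)

print("individual peaks:", individual_peaks)
print("mean-ensemble peak:", round(ensemble_peak, 1))
```

In this toy setting every individual trajectory peaks at the same height, yet the pointwise mean peaks well below it because the peaks fall on different days. An ensemble that first aligns trajectories in time, as the abstract proposes, is meant to avoid this flattening.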
Award ID(s):
2333494 2223933 2135784
PAR ID:
10590699
Author(s) / Creator(s):
Publisher / Repository:
Proceedings of the AAAI Conference on Artificial Intelligence
Date Published:
Journal Name:
Proceedings of the AAAI Conference on Artificial Intelligence
Volume:
39
Issue:
27
ISSN:
2159-5399
Page Range / eLocation ID:
28405 to 28412
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Phenological shifts due to climate change have been extensively studied in plants and animals. Yet, the responses of fungal spores—organisms important to ecosystems and major airborne allergens—remain understudied. This knowledge gap limits our understanding of their ecological and public health implications. To address this, we analyzed a long‐term (2003–2022), large‐scale (the continental US) data set of airborne fungal spores collected by the US National Allergy Bureau. We first pre‐processed the spore data by gap‐filling and smoothing. Afterward, we extracted 10 metrics describing the phenology (e.g., start and end of season) and intensity (e.g., peak concentration and integral) of fungal spore seasons. These metrics were derived using two complementary but not mutually exclusive approaches—ecological and public health approaches, defined as percentiles of total spore concentration and allergenic thresholds of spore concentration, respectively. Using linear mixed‐effects models, we quantified annual shifts in these metrics across the continental US. We revealed a significant advancement in the onset of the spore seasons defined in both ecological (11 days, 95% confidence interval: 0.4–23 days) and public health (22 days, 6–38 days) approaches over two decades. Meanwhile, total spore concentrations in an annual cycle and in a spore allergy season tended to decrease over time. The earlier start of the spore season was significantly correlated with climatic variables, such as warmer temperatures and altered precipitations. Overall, our findings suggest possible climate‐driven advanced fungal spore seasons, highlighting the importance of climate change mitigation and adaptation in public health decision‐making. 
  2. Probabilistic predictions support public health planning and decision making, especially in infectious disease emergencies. Aggregating outputs from multiple models yields more robust predictions of outcomes and associated uncertainty. While the selection of an aggregation method can be guided by retrospective performance evaluations, this is not always possible. For example, if predictions are conditional on assumptions about how the future will unfold (e.g. possible interventions), these assumptions may never materialize, precluding any direct comparison between predictions and observations. Here, we summarize literature on aggregating probabilistic predictions, illustrate various methods for infectious disease predictions via simulation, and present a strategy for choosing an aggregation method when empirical validation cannot be used. We focus on the linear opinion pool (LOP) and Vincent average, common methods that make different assumptions about between-prediction uncertainty. We contend that assumptions of the aggregation method should align with a hypothesis about how uncertainty is expressed within and between predictions from different sources. The LOP assumes that between-prediction uncertainty is meaningful and should be retained, while the Vincent average assumes that between-prediction uncertainty is akin to sampling error and should not be preserved. We provide an R package for implementation. Given the rising importance of multi-model infectious disease hubs, our work provides useful guidance on aggregation and a deeper understanding of the benefits and risks of different approaches. 
  3. Data visualizations can reveal trends and patterns that are not otherwise obvious from the raw data or summary statistics. While visualizing low-dimensional data is relatively straightforward (for example, plotting the change in a variable over time as (x,y) coordinates on a graph), it is not always obvious how to visualize high-dimensional datasets in a similarly intuitive way. Here we present HyperTools, a Python toolbox for visualizing and manipulating large, high-dimensional datasets. Our primary approach is to use dimensionality reduction techniques (Pearson, 1901; Tipping & Bishop, 1999) to embed high-dimensional datasets in a lower-dimensional space, and plot the data using a simple (yet powerful) API with many options for data manipulation [e.g. hyperalignment (Haxby et al., 2011), clustering, normalizing, etc.] and plot styling. The toolbox is designed around the notion of data trajectories and point clouds. Just as the position of an object moving through space can be visualized as a 3D trajectory, HyperTools uses dimensionality reduction algorithms to create similar 2D and 3D trajectories for time series of high-dimensional observations. The trajectories may be plotted as interactive static plots or visualized as animations. These same dimensionality reduction and alignment algorithms can also reveal structure in static datasets (e.g. collections of observations or attributes). We present several examples showcasing how using our toolbox to explore data through trajectories and low-dimensional embeddings can reveal deep insights into datasets across a wide variety of domains. 
  4. This paper develops the partial trajectory method to align the views from successive fixed cameras that are used for video-based vehicle tracking across multiple camera views. The method is envisioned to serve as a validation tool of whatever alignment has already been performed between the cameras to ensure high fidelity with the actual vehicle movements as they cross the boundaries between cameras. The strength of the method is that it operates on the output of vehicle tracking in each camera rather than secondary features visible in the camera view that are unrelated to the traffic dynamics (e.g., fixed fiducial points), thereby providing a direct feedback path from the tracking to ensure the quality of the alignment in the context of the traffic dynamics. The method uses vehicle trajectories within successive camera views along a freeway to deduce the presence of an overlap or a gap between those cameras and quantify how large the overlap or gap is. The partial trajectory method can also detect scale factor errors between successive cameras. If any error is detected, ideally one would redo the original camera alignment; if that is not possible, one could use the calculations from the algorithm to address the existing alignment post hoc. This research manually re-extracted the individual vehicle trajectories within each of the seven camera views from the NGSIM I-80 dataset. These trajectories are simply an input to the algorithm. The resulting method transcends the dataset and should be applicable to most methods that seek to extract vehicle trajectories across successive cameras. That said, the results reveal fundamental errors in the NGSIM dataset, including unaccounted-for overlap at the boundaries between successive cameras, which leads to systematic speed and acceleration errors at the six camera interfaces. This method also found scale factor errors in the original NGSIM homographies. 
In response to these findings, we identified a new aerial photo of the NGSIM site and generated new homographies. To evaluate the impact of the partial trajectory method on the actual trajectory data, the manually re-extracted data were projected into the new coordinate system and smoothed. The re-extracted data shows much greater fidelity to the actual vehicle motion. The re-extracted data also tracks the vehicles over a 14% longer distance and adds 23% more vehicles compared to the original NGSIM dataset. As of publication, the re-extracted data from this paper will be released to the research community. 
  5. Abstract Traditional health surveillance methods play a critical role in public health safety but are limited by the data collection speed, coverage, and resource requirements. Wastewater‐based epidemiology (WBE) has emerged as a cost‐effective and rapid tool for detecting infectious diseases through sewage analysis of disease biomarkers. Recent advances in big data analytics have enhanced public health monitoring by enabling predictive modeling and early risk detection. This paper explores the application of machine learning (ML) in WBE data analytics, with a focus on infectious disease surveillance and forecasting. We highlight the advantages of ML‐driven WBE prediction models, including their ability to process multimodal data, predict disease trends, and evaluate policy impacts through scenario simulations. We also examine challenges such as data quality, model interpretability, and integration with existing public health infrastructure. The integration of ML WBE data analytics enables rapid health data collection, analysis, and interpretation that are not feasible in current surveillance approaches. By leveraging ML and WBE, decision makers can reduce cognitive biases and enhance data‐driven responses to public health threats. As global health risks evolve, the synergy between WBE, ML, and data‐driven decision‐making holds significant potential for improving public health outcomes. 