This content will become publicly available on May 1, 2026

Title: A Co‐Segmentation Algorithm to Predict Emotional Stress From Passively Sensed mHealth Data
ABSTRACT We develop a data‐driven cosegmentation algorithm for passively sensed variables and self‐reported active variables collected through smartphones, with the goal of identifying emotionally stressful states in middle‐aged and older patients with mood disorders undergoing therapy, some of whom also have chronic pain. Our method leverages the association between the different types of time series. These data are typically nonstationary, with meaningful associations often occurring only over short time windows. Traditional machine learning (ML) methods, when applied globally to the entire time series, often fail to capture these time‐varying local patterns. Our approach first segments the passive sensing variables by detecting their change points, then examines segment‐specific associations with the active variable to identify cosegmented periods that exhibit distinct relationships between stress and passively sensed measures. We then use these periods to predict future emotional stress states using standard ML methods. By shifting the unit of analysis from individual time points to data‐driven segments of time and allowing for different associations in different segments, our algorithm helps detect patterns that exist only within short time windows. We apply our method to detect periods of stress in patient data collected during the ALACRITY Phase I study. Our findings indicate that the data‐driven segmentation algorithm identifies stress periods more accurately than traditional ML methods that do not incorporate segmentation.
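The two steps described in the abstract (segment the passive series at its change points, then measure the association with the active series inside each segment) can be illustrated with a minimal sketch. This is not the authors' algorithm: binary segmentation on mean shifts and the Pearson correlation are stand-ins chosen for brevity, and the function names, the `min_len` and `min_gain` parameters, and the synthetic data are all hypothetical.

```python
import numpy as np

def best_split(x, min_len):
    """Return (index, gain) of the single mean-shift split that most reduces squared error."""
    n = len(x)
    total = np.sum((x - x.mean()) ** 2)
    best_idx, best_gain = None, 0.0
    for k in range(min_len, n - min_len + 1):
        left, right = x[:k], x[k:]
        cost = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
        gain = total - cost
        if gain > best_gain:
            best_idx, best_gain = k, gain
    return best_idx, best_gain

def change_points(x, min_len=5, min_gain=10.0):
    """Binary segmentation (assumed stand-in): keep splitting while the gain is at least min_gain."""
    x = np.asarray(x, dtype=float)
    found = []
    stack = [(0, len(x))]
    while stack:
        lo, hi = stack.pop()
        if hi - lo < 2 * min_len:
            continue
        k, gain = best_split(x[lo:hi], min_len)
        if k is not None and gain >= min_gain:
            found.append(lo + k)
            stack.append((lo, lo + k))
            stack.append((lo + k, hi))
    return sorted(found)

def segment_associations(passive, active, cps):
    """Pearson correlation between the passive and active series within each segment."""
    passive = np.asarray(passive, dtype=float)
    active = np.asarray(active, dtype=float)
    bounds = [0, *cps, len(passive)]
    out = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        p, a = passive[lo:hi], active[lo:hi]
        if p.std() == 0 or a.std() == 0:
            r = float("nan")  # correlation is undefined for a constant segment
        else:
            r = float(np.corrcoef(p, a)[0, 1])
        out.append((lo, hi, r))
    return out

# Synthetic illustration (hypothetical data): the passive signal shifts level at t = 30,
# and its relationship with the active signal flips sign at the same point.
t = np.arange(60)
passive = np.where(t < 30, 0.0, 5.0) + 0.5 * np.sin(t)
active = np.where(t < 30, -np.sin(t), np.sin(t))

cps = change_points(passive)
segments = segment_associations(passive, active, cps)
global_r = float(np.corrcoef(passive, active)[0, 1])
```

In this toy series the per-segment correlations have opposite signs, which a single correlation computed over the whole series (`global_r`) cannot represent. That is the kind of local pattern the abstract says global methods tend to miss.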
Award ID(s):
2239102
PAR ID:
10615719
Author(s) / Creator(s):
; ;
Publisher / Repository:
Wiley
Date Published:
Journal Name:
Statistics in Medicine
Volume:
44
Issue:
10-12
ISSN:
0277-6715
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Change point detection is widely used for finding transitions between states of data generation within a time series. Methods for change point detection currently assume this transition is instantaneous and therefore focus on finding a single point of data to classify as a change point. However, this assumption is flawed because many time series actually display short periods of transitions between different states of data generation. Previous work has shown Bayesian Online Change Point Detection (BOCPD) to be the most effective method for change point detection on a wide range of different time series. This paper explores adapting the change point detection algorithms to detect abrupt changes over short periods of time. We design a segment-based mechanism to examine a window of data points within a time series, rather than a single data point, to determine if the window captures abrupt change. We test our segment-based Bayesian change detection algorithm on 36 different time series and compare it to the original BOCPD algorithm. Our results show that, for some of these 36 time series, the segment-based approach for detecting abrupt changes can much more accurately identify change points based on standard metrics. 
  2. Traffic shockwaves demonstrate the formation and spreading of traffic fluctuation on roads. Existing methods mainly detect the shockwaves and their propagation by estimating traffic density and flow, which presents weaknesses in applications when traffic data is only partially or locally collected. This paper proposes a four-step data-driven approach that integrates machine learning with traffic features to detect shockwaves and estimate their propagation speeds using only partial vehicle trajectory data. Specifically, we first denoise the speed data derived from trajectory data with the Fast Fourier Transform (FFT) to mitigate the effect of spontaneous random speed fluctuation. Next, we identify the turning points of trajectory curves, where a vehicle runs into a shockwave and its speed shows a high standard deviation within a short interval. The Density-based Spatial Clustering of Applications with Noise (DBSCAN) algorithm, combined with traffic flow features, is then adopted to split the turning points into different clusters, each corresponding to a shockwave with constant speed. Last, the one-norm distance regression method is used to estimate the propagation speed of the detected shockwaves. The proposed framework was applied to field data collected from the I-80 and US-101 freeways by the Next Generation Simulation (NGSIM) program. The results show that this four-step data-driven method can efficiently detect the shockwaves and their propagation speeds without estimating the nearby traffic densities and flows. It performs well for both homogeneous and nonhomogeneous road segments with trajectory data collected from total or partial traffic flow.
  3. Modern smart cities need smart transportation solutions to quickly detect various traffic emergencies and incidents in the city to avoid cascading traffic disruptions. To this end, roadside units and ambient transportation sensors are being deployed to collect speed data that enables the monitoring of traffic conditions on each road segment. In this paper, we first propose a scalable data-driven anomaly-based traffic incident detection framework for a city-scale smart transportation system. Specifically, we propose an incremental region growing approximation algorithm for optimal spatio-temporal clustering of road segments and their data, such that road segments are strategically divided into highly correlated clusters. The highly correlated clusters enable identifying a Pythagorean Mean-based invariant as an anomaly detection metric that is highly stable under no incidents but shows a deviation in the presence of incidents. We learn the bounds of the invariants in a robust manner such that anomaly detection can generalize to unseen events, even when learning from real noisy data. Second, using cluster-level detection, we propose a folded Gaussian classifier to automatically pinpoint the particular segment in a cluster where the incident happened. We perform extensive experimental validation using mobility data collected from four cities in Tennessee and compare against state-of-the-art ML methods to show that our method can detect incidents within each cluster in real time and outperforms known ML methods.
  4. This work proposes an Adaptive Fuzzy Prediction (AFP) method for the attenuation time series in Commercial Microwave Links (CMLs). Time-series forecasting models regularly rely on the assumption that the entire data set follows the same Data Generating Process (DGP). However, the signals in wireless microwave links are severely affected by the varying weather conditions in the channel. Consequently, the attenuation time series might change its characteristics significantly at different periods. We suggest an adaptive framework that makes better use of the training data by grouping sequences with related temporal patterns, which accounts for the non-stationary nature of the signals. The focus of this work is twofold. The first is to explore the integration of static data of the CMLs as exogenous variables for the attenuation time series models to adopt diverse link characteristics. This extension allows attenuation datasets obtained from additional CMLs to be included in the training process, dramatically increasing the available training data. The second is to develop an adaptive framework for short-term attenuation forecasting by employing an unsupervised fuzzy clustering procedure and supervised learning models. We empirically analyze our framework for model and data-driven approaches with Recurrent Neural Network (RNN) and Autoregressive Integrated Moving Average (ARIMA) variations. We evaluate the proposed extensions on real-world measurements collected from 4G backhaul networks, considering dataset availability and the accuracy of 60-second prediction. We show that our framework can significantly improve conventional models' accuracy and that incorporating data from various CMLs is essential to the AFP framework. The proposed methods have been shown to enhance the forecasting model's performance by 30-40%, depending on the specific model and the data availability.