Title: Monitoring Substance Use with Fitbit Biosignals: A Case Study on Training Deep Learning Models Using Ecological Momentary Assessments and Passive Sensing
Substance use disorders affect 17.3% of Americans. Digital health solutions that use machine learning to detect substance use from wearable biosignal data could eventually pave the way for real-time digital interventions. However, difficulties in addressing severe between-subject data heterogeneity have hampered the adoption of machine learning approaches for substance use detection, necessitating more robust technological solutions. We tested the utility of personalized machine learning using participant-specific convolutional neural networks (CNNs) enhanced with self-supervised learning (SSL) to detect drug use. In a pilot feasibility study, we collected data from 9 participants using Fitbit Charge 5 devices, supplemented by ecological momentary assessments that provided real-time labels of substance use. We implemented a baseline 1D-CNN model with traditional supervised learning and an experimental SSL-enhanced model to improve individualized feature extraction under limited-label conditions. Results: Across the 9 participants, the average area under the receiver operating characteristic curve (AUROC) was 0.695 for the supervised CNNs and 0.729 for the SSL-enhanced models. Strategic selection of an optimal threshold allowed us to optimize either sensitivity or specificity while maintaining reasonable performance on the other metric. Conclusion: These findings suggest that Fitbit data have the potential to enhance substance use monitoring systems. However, the small sample size limits generalizability to diverse populations, so we call for future research that explores SSL-powered personalization at a larger scale.
Award ID(s): 2406251, 2516767
PAR ID: 10596261
Publisher / Repository: MDPI
Journal Name: AI
Volume: 5
Issue: 4
ISSN: 2673-2688
Page Range / eLocation ID: 2725 to 2738
Format(s): Medium: X
Sponsoring Org: National Science Foundation
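As a rough illustration of the approach in the abstract above, the PyTorch sketch below pairs a small participant-specific 1D-CNN with a masked-reconstruction pretext task for pre-training on unlabeled biosignal windows. The channel count (e.g., heart rate plus one other stream), window length, masking ratio, and choice of pretext task are illustrative assumptions, not the authors' published design.

```python
import torch
import torch.nn as nn

class BiosignalCNN(nn.Module):
    """Participant-specific 1D-CNN over fixed-length biosignal windows.
    Channel count and layer sizes are illustrative assumptions."""
    def __init__(self, n_channels=2, n_classes=2):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=7, padding=3), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.head = nn.Linear(64, n_classes)

    def forward(self, x):                      # x: (batch, channels, time)
        return self.head(self.encoder(x).squeeze(-1))

def masked_reconstruction_loss(encoder, decoder, x, p=0.15):
    """Hypothetical SSL pretext task: zero out random samples of an
    unlabeled window and train encoder+decoder to reconstruct them."""
    mask = (torch.rand_like(x) > p).float()
    z = encoder(x * mask).squeeze(-1)          # (batch, 64) embedding
    recon = decoder(z).view_as(x)
    return nn.functional.mse_loss(recon, x)

# Usage sketch: pre-train on unlabeled windows, then fine-tune the head
# on the participant's EMA-labeled windows.
T = 256                                        # assumed window length
model = BiosignalCNN()
decoder = nn.Linear(64, 2 * T)
loss = masked_reconstruction_loss(model.encoder, decoder, torch.randn(8, 2, T))
```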
More Like This
  1. High impedance faults (HIFs) have random, irregular, and asymmetrical characteristics, making them difficult to detect in distribution grids via conventional relay measurements with relatively low resolution and accuracy. This paper proposes a stochastic HIF monitoring and location scheme that uses high-resolution, time-synchronized μ-PMU data for distribution network protection. Specifically, we systematically design a process based on feature selection, semi-supervised learning (SSL), and probabilistic learning for fault detection and location. For example, a wrapper method is proposed that leverages output data in feature selection to avoid overfitting and reduce communication demand. To utilize unlabeled data and quantify uncertainty, an information-theoretic SSL method is proposed for fault detection. For location, a probabilistic analysis via moving-window total least squares is proposed, based on the probability distribution of the fault impedance. For numerical validation, we set up an experimental platform on a real-time simulator so that the real-time properties of μ-PMUs could be examined. These experiments show enhanced HIF detection and location compared with traditional methods.
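The paper's information-theoretic detector is not reproduced here, but as a rough stand-in, the sketch below shows how unlabeled μ-PMU windows can refine a fault classifier via self-training in scikit-learn. The feature set, label fraction, and confidence threshold are invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

# Hypothetical features per measurement window (e.g., harmonic magnitudes,
# negative-sequence components); -1 marks unlabeled windows.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))
y = np.full(500, -1)
y[:50] = rng.integers(0, 2, size=50)      # only 10% of windows labeled

# Self-training: iteratively pseudo-label high-confidence unlabeled windows.
detector = SelfTrainingClassifier(LogisticRegression(), threshold=0.9)
detector.fit(X, y)
print(detector.predict_proba(X[-1:]))     # probabilistic fault score
```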
  2. Insect pests cause significant damage to food production, so early detection and efficient mitigation strategies are crucial. There is a continual shift toward machine learning (ML)-based approaches for automating agricultural pest detection. Although supervised learning has achieved remarkable progress in this regard, it is impeded by the need for significant expert involvement in labeling the data used for model training. This makes real-world applications tedious and oftentimes infeasible. Recently, self-supervised learning (SSL) approaches have provided a viable alternative to training ML models with minimal annotations. Here, we present an SSL approach to classify 22 insect pests. The framework was assessed on raw and segmented field-captured images using three different SSL methods: Nearest Neighbor Contrastive Learning of Visual Representations (NNCLR), Bootstrap Your Own Latent, and Barlow Twins. SSL pre-training was done on ResNet-18 and ResNet-50 models using all three SSL methods on the original RGB images and foreground-segmented images. The performance of SSL pre-training methods was evaluated using linear probing of SSL representations and end-to-end fine-tuning approaches. The SSL-pre-trained convolutional neural network models were able to perform annotation-efficient classification. NNCLR was the best-performing SSL method for both linear probing and full model fine-tuning. With just 5% annotated images, transfer learning with ImageNet initialization obtained 74% accuracy, whereas NNCLR achieved an improved classification accuracy of 79% for end-to-end fine-tuning. Models created using SSL pre-training consistently performed better, especially under very low annotation, and were robust to object class imbalances. These approaches help overcome annotation bottlenecks and are resource efficient.
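For context, the linear probing evaluated above typically means freezing the SSL-pretrained backbone and training only a linear classifier on its features. A minimal PyTorch sketch, assuming a ResNet-18 backbone and the paper's 22 pest classes (loading of the actual SSL weights is left as a placeholder):

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

# Freeze the SSL-pretrained backbone; train only a linear head.
backbone = resnet18(weights=None)   # placeholder: load NNCLR/BYOL/Barlow
backbone.fc = nn.Identity()         # Twins weights here in practice
for p in backbone.parameters():
    p.requires_grad = False
backbone.eval()

probe = nn.Linear(512, 22)          # 22 insect pest classes
opt = torch.optim.Adam(probe.parameters(), lr=1e-3)

def probe_step(images, labels):
    with torch.no_grad():           # features only, no backbone gradients
        feats = backbone(images)
    loss = nn.functional.cross_entropy(probe(feats), labels)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# e.g. probe_step(torch.randn(4, 3, 224, 224), torch.tensor([0, 3, 7, 21]))
```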
  3. Recent studies have demonstrated the effectiveness of fine-tuning self-supervised speech representation models for speech emotion recognition (SER). However, applying SER in real-world environments remains challenging due to pervasive noise, and relying on low-accuracy predictions on noisy speech can undermine users' trust. This paper proposes a unified self-supervised speech representation framework for enhanced speech emotion recognition, designed to increase noise robustness in SER while also generating enhanced speech. Our framework integrates speech enhancement (SE) and SER tasks, leveraging shared self-supervised learning (SSL)-derived features to improve emotion classification in noisy environments. This strategy encourages the SE module to enhance information that is discriminative for SER. Additionally, we introduce a cascade unfrozen training strategy, in which the SSL model is gradually unfrozen and fine-tuned alongside the SE and SER heads, ensuring training stability and preserving the generalizability of the SSL representations. This approach improves SER performance under unseen noisy conditions without compromising SE quality. When tested at a 0 dB signal-to-noise ratio (SNR), our proposed method outperforms the original baseline by 3.7% in F1-Macro and 2.7% in F1-Micro, with statistically significant differences.
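A schematic version of the gradual-unfreezing idea behind cascade unfrozen training, assuming one top-level layer group is released every few epochs; the paper's exact schedule and layer granularity are not reproduced here.

```python
import torch.nn as nn

def cascade_unfreeze(backbone: nn.Module, epoch: int, every: int = 2):
    """Unfreeze one additional top-level layer group every `every` epochs,
    working from the output side toward the input. The schedule and
    granularity here are assumptions, not the paper's exact recipe."""
    groups = list(backbone.children())
    n_unfrozen = min(len(groups), epoch // every + 1)
    for i, group in enumerate(groups):
        trainable = i >= len(groups) - n_unfrozen
        for p in group.parameters():
            p.requires_grad = trainable

# Usage per epoch, before rebuilding the optimizer's parameter list:
# cascade_unfreeze(ssl_backbone, epoch)
```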
  4. Understanding the behavioral and neural dynamics of social interactions is a goal of contemporary neuroscience. Many machine learning methods have emerged in recent years to make sense of the complex video and neurophysiological data that result from these experiments. Less focus has been placed on understanding how animals process acoustic information, including social vocalizations. A critical step toward bridging this gap is determining the senders and receivers of acoustic information in social interactions. While sound source localization (SSL) is a classic problem in signal processing, existing approaches are limited in their ability to localize animal-generated sounds in standard laboratory environments. Advances in deep learning methods for SSL are likely to help address these limitations; however, there are currently no publicly available models, datasets, or benchmarks for systematically evaluating SSL algorithms in the domain of bioacoustics. Here, we present the VCL Benchmark: the first large-scale dataset for benchmarking SSL algorithms in rodents. We acquired synchronized video and multi-channel audio recordings of 767,295 sounds with annotated ground-truth sources across 9 conditions. The dataset provides benchmarks that evaluate SSL performance on real data, simulated acoustic data, and a mixture of real and simulated data. We intend for this benchmark to facilitate knowledge transfer between the neuroscience and acoustic machine learning communities, which have had limited overlap.
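Note that SSL in this entry means sound source localization, not self-supervised learning. As a point of reference for the classic signal-processing baselines such a benchmark would compare against, here is a standard GCC-PHAT time-difference-of-arrival estimator for one microphone pair; the sampling rate and signals are whatever the recording rig provides.

```python
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the time difference of arrival (TDOA) between two
    microphone channels using the GCC-PHAT estimator, a classic
    building block of sound source localization pipelines."""
    n = len(sig) + len(ref)
    S = np.fft.rfft(sig, n=n)
    R = np.fft.rfft(ref, n=n)
    cross = S * np.conj(R)
    cross /= np.abs(cross) + 1e-12             # PHAT weighting
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2 if max_tau is None else min(int(fs * max_tau), n // 2)
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    tau = (np.argmax(np.abs(cc)) - max_shift) / fs
    return tau                                  # seconds; sign gives lead/lag
```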
  5. Inspired by the success of Self-Supervised Learning (SSL) in learning visual representations from unlabeled data, a few recent works have studied SSL in the context of Continual Learning (CL), where multiple tasks are learned sequentially, giving rise to a new paradigm: Self-Supervised Continual Learning (SSCL). It has been shown that SSCL outperforms Supervised Continual Learning (SCL), as the learned representations are more informative and more robust to catastrophic forgetting. However, building on the training process of SSL, prior SSCL studies train all parameters for each task, resulting in prohibitively high training costs. In this work, we first analyze training time and memory consumption and reveal that the backward gradient calculation is the bottleneck. Moreover, by investigating task correlations in SSCL, we discover an interesting phenomenon: with the SSL-learned backbone, the intermediate features are highly correlated between tasks. Based on these new findings, we propose a new SSCL method with layer-wise freezing, which progressively freezes the partial layers with the highest correlation ratios for each task to improve training computation and memory efficiency. Extensive experiments across multiple datasets show that our proposed method outperforms state-of-the-art SSCL methods under various SSL frameworks. For example, compared to LUMP, our method achieves 1.18x, 1.15x, and 1.2x GPU training time reduction; 1.65x, 1.61x, and 1.6x memory reduction; 1.46x, 1.44x, and 1.46x backward-FLOPs reduction; and 1.31%/1.98%/1.21% forgetting reduction without accuracy degradation on three datasets, respectively.
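A minimal sketch of the correlation-then-freeze idea from the entry above: estimate how correlated each layer's features are across two consecutive tasks, then freeze the most correlated layers before training the next task. The correlation measure and freezing ratio below are assumptions for illustration, not the paper's exact criterion.

```python
import torch

def cross_task_correlation(feats_a, feats_b):
    """Mean absolute per-dimension Pearson correlation between one layer's
    features on two tasks (same probe inputs assumed, batched on dim 0)."""
    a, b = feats_a.flatten(1), feats_b.flatten(1)
    a = (a - a.mean(0)) / (a.std(0) + 1e-8)
    b = (b - b.mean(0)) / (b.std(0) + 1e-8)
    return (a * b).mean(0).abs().mean().item()

def freeze_most_correlated(layers, corrs, ratio=0.5):
    """Freeze the `ratio` fraction of layers with the highest cross-task
    correlation so their backward passes are skipped on the next task."""
    k = int(len(layers) * ratio)
    for idx in sorted(range(len(corrs)), key=lambda i: -corrs[i])[:k]:
        for p in layers[idx].parameters():
            p.requires_grad = False
```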