

Title: Overheard: Audio-based Integral Event Inference
The popularity of smart devices and advances in deep learning models have brought individuals great convenience. However, malicious attackers can also perform unexpected privacy inferences on data sensed by smart devices using advanced deep-learning tools. Up to now, however, no work has investigated the possibility of a riskier form of overhearing: inferring an integral event about humans by analyzing polyphonic audio. To this end, we propose an Audio-based integraL evenT infERence (ALTER) model and two upgraded models (ALTER-p and ALTER-pp) to achieve integral event inference. Specifically, ALTER applies a link-like multi-label inference scheme that considers the short-term co-occurrence dependency among multiple labels for event inference. Moreover, ALTER-p uses a newly designed attention mechanism, which fully exploits audio information and the importance of all data points, to mitigate information loss during audio feature learning and thereby improve event inference performance. Furthermore, ALTER-pp accounts for the long-term co-occurrence dependency among labels to infer events with more diverse elements, using another devised attention mechanism to conduct a graph-like multi-label inference. Finally, extensive experiments on real data demonstrate that our models are effective for integral event inference and outperform state-of-the-art models.
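The link-like multi-label inference described above can be pictured as a classifier chain, where each label decision feeds into the next so that short-term co-occurrence between adjacent labels is captured. The sketch below is a minimal illustration of that idea in plain Python, assuming hypothetical per-label scoring functions; it is not the authors' actual ALTER implementation.

```python
# Hypothetical sketch of link-like (chain) multi-label inference: each
# label's scorer also sees the previous label's decision, modeling the
# short-term co-occurrence dependency. Scorers here are toy functions.

def chain_inference(features, label_scorers, threshold=0.5):
    """Predict labels one at a time along the chain; the previous
    label's decision is passed to the next scorer."""
    predictions = []
    prev = 0.0  # no previous label for the first link in the chain
    for scorer in label_scorers:
        score = scorer(features, prev)
        label = 1 if score >= threshold else 0
        predictions.append(label)
        prev = float(label)
    return predictions

# Toy scorers: label k fires if feature k is high, boosted when the
# previous label in the chain was present.
scorers = [lambda x, p, k=k: min(1.0, x[k] + 0.3 * p) for k in range(3)]
print(chain_inference([0.6, 0.3, 0.4], scorers))  # -> [1, 1, 1]
```

Note how the middle label (raw score 0.3) is pulled above threshold only because its chain predecessor was predicted positive.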
Award ID(s):
2416872
PAR ID:
10634559
Author(s) / Creator(s):
Publisher / Repository:
ACM
Date Published:
Journal Name:
Journal of Data and Information Quality
ISSN:
1936-1955
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like This
  1. In the era of big data, data-driven classification has become an essential method in smart manufacturing to guide production and optimize inspection. The industrial data obtained in practice is usually time-series data collected by soft sensors, which is highly nonlinear, nonstationary, imbalanced, and noisy. Most existing soft-sensing machine learning models focus on capturing either intra-series temporal dependencies or pre-defined inter-series correlations, while ignoring the correlation between labels, as each instance is associated with multiple labels simultaneously. In this paper, we propose a novel graph-based soft-sensing neural network (GraSSNet) for multivariate time-series classification of noisy and highly imbalanced soft-sensing data. The proposed GraSSNet is able to 1) capture the inter-series and intra-series dependencies jointly in the spectral domain; 2) exploit label correlations by superimposing a label graph built from statistical co-occurrence information; 3) learn features with an attention mechanism from both the textual and numerical domains; and 4) leverage unlabeled data and mitigate data imbalance through semi-supervised learning. Comparative studies with other commonly used classifiers are carried out on Seagate soft-sensing data, and the experimental results validate the competitive performance of our proposed method.
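The label graph this abstract describes is built from statistical co-occurrence of labels in the training data. A minimal sketch of such a construction, with toy label sets (the function name and inputs are illustrative, not GraSSNet's code):

```python
# Build a symmetric label co-occurrence matrix from per-instance label
# sets, the kind of statistic a label graph can be superimposed from.
from collections import Counter
from itertools import combinations

def label_cooccurrence(label_sets, num_labels):
    """Return a matrix A where A[i][j] counts how often labels i and j
    appear together on the same instance."""
    counts = Counter()
    for labels in label_sets:
        for i, j in combinations(sorted(set(labels)), 2):
            counts[(i, j)] += 1
    A = [[0] * num_labels for _ in range(num_labels)]
    for (i, j), c in counts.items():
        A[i][j] = A[j][i] = c
    return A

# Three toy instances, each tagged with multiple labels at once.
A = label_cooccurrence([{0, 1}, {0, 1, 2}, {1, 2}], num_labels=3)
print(A)  # A[0][1] == 2: labels 0 and 1 co-occur in two instances
```

In practice the counts would be normalized (e.g., into conditional probabilities) before being used as graph edge weights.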
  2. In this work we explore confidence elicitation methods for crowdsourcing "soft" labels, e.g., probability estimates, to reduce annotation costs for domains with ambiguous data. Machine learning research has shown that such "soft" labels are more informative and can reduce the data requirements when training supervised machine learning models. By reducing the number of required labels, we can reduce the costs of slow annotation processes such as audio annotation. In our experiments we evaluated three confidence elicitation methods: 1) "No Confidence" elicitation, 2) "Simple Confidence" elicitation, and 3) a "Betting" mechanism for confidence elicitation, at both the individual (i.e., per participant) and aggregate (i.e., crowd) levels. In addition, we evaluated the interaction between confidence elicitation methods, annotation types (binary, probability, and z-score-derived probability), and "soft" versus "hard" (i.e., binarized) aggregate labels. Our results show that both confidence elicitation mechanisms yield higher annotation quality than the "No Confidence" mechanism for binary annotations at both the participant and recording levels. In addition, when aggregating labels at the recording level, the results indicate that we can achieve results comparable to 10-participant aggregate annotations using fewer annotators if we aggregate "soft" labels instead of "hard" labels. These results suggest that, for binary audio annotation, using a confidence elicitation mechanism and aggregating continuous labels yields higher annotation quality and more informative labels, with the quality differences more pronounced when fewer participants are available. Finally, we propose a way of integrating these confidence elicitation methods into a two-stage, multi-label annotation pipeline.
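The soft-versus-hard aggregation contrast above comes down to when the binarization happens: averaging raw probabilities keeps each annotator's graded estimate, while binarizing first discards it. A toy illustration (annotator values are made up):

```python
# Compare aggregating "soft" probability annotations (average, then
# threshold once) with "hard" ones (binarize each annotator first,
# then vote). Annotator estimates below are illustrative.

def aggregate_soft(annotations):
    """Average raw probability estimates across annotators."""
    return sum(annotations) / len(annotations)

def aggregate_hard(annotations, threshold=0.5):
    """Binarize each annotator's estimate first, then average votes."""
    votes = [1 if a >= threshold else 0 for a in annotations]
    return sum(votes) / len(votes)

probs = [0.45, 0.55, 0.9]     # three annotators' probability estimates
print(aggregate_soft(probs))  # ~0.633: keeps the graded information
print(aggregate_hard(probs))  # ~0.667: each margin was thrown away
```

The hard aggregate treats the borderline 0.45 and 0.55 as a full disagreement, which is the information loss soft aggregation avoids.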
  3. Abstract: The purpose of this research is to build an operational model for predicting wildfire occurrence for the contiguous United States (CONUS) in the 1–10-day range using the U-Net 3+ machine learning model. This paper illustrates the range of model performance resulting from choices made in the modeling process, such as how labels are defined for the model and how input variables are codified for it. By combining the capabilities of the U-Net 3+ model with a neighborhood loss function, the fractions skill score (FSS), we can quantify model success by predictions made both in and around the location of the original fire occurrence label. The model is trained on weather, weather-derived fuel, and topography observational inputs and on labels representing fire occurrence. Observational weather, weather-derived fuel, and topography data are sourced from the gridded surface meteorological (gridMET) dataset, a daily, CONUS-wide, high-spatial-resolution dataset of surface meteorological variables. Fire occurrence labels are sourced from the U.S. Department of Agriculture's Fire Program Analysis Fire-Occurrence Database (FPA-FOD), which contains spatial wildfire occurrence data for CONUS, combining data from the reporting systems of federal, state, and local organizations. By exploring the many aspects of the modeling process with the added context of model performance, this work builds understanding around the use of deep learning to predict fire occurrence in CONUS. Significance Statement: Our work seeks to explore the limits to which deep learning can predict wildfire occurrence in CONUS, with the ultimate goal of providing decision support to those allocating fire resources during high fire seasons. By exploring the accuracy and lead time with which we can provide such insights, we hope to reduce loss of life, reduce damage to property, and improve future event preparedness.
We compare two models, one trained on all fires in the continental United States and the other on only large lightning fires, and found that the model trained on all fires produced higher predicted probabilities of fire.
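The fractions skill score mentioned above gives credit for predictions near, not only exactly at, an observed event cell, by comparing neighborhood fractions rather than individual cells. A minimal sketch on a tiny grid, assuming a simple square neighborhood (the toy grids and radius are illustrative, not the paper's configuration):

```python
# Minimal fractions skill score (FSS) sketch: compare the fraction of
# positive cells in a window around each cell, so a spatially close
# miss still earns partial credit. Grids and radius are toy inputs.

def neighborhood_fraction(grid, r):
    """Fraction of positive cells in the (2r+1)x(2r+1) window
    (clipped at the grid edges) around each cell."""
    n = len(grid)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            cells = [grid[a][b]
                     for a in range(max(0, i - r), min(n, i + r + 1))
                     for b in range(max(0, j - r), min(n, j + r + 1))]
            out[i][j] = sum(cells) / len(cells)
    return out

def fss(forecast, observed, r=1):
    f = neighborhood_fraction(forecast, r)
    o = neighborhood_fraction(observed, r)
    mse = sum((fi - oi) ** 2 for fr, orow in zip(f, o)
              for fi, oi in zip(fr, orow))
    ref = sum(fi ** 2 + oi ** 2 for fr, orow in zip(f, o)
              for fi, oi in zip(fr, orow))
    return 1.0 - mse / ref if ref else 1.0

obs = [[0, 1, 0], [0, 1, 0], [0, 0, 0]]
shifted = [[1, 0, 0], [1, 0, 0], [0, 0, 0]]
print(fss(shifted, obs))  # > 0: neighborhood credit for a near-miss
```

A cell-by-cell score would rate the shifted forecast as a total miss; FSS rewards it because the neighborhoods overlap, which is why it suits a fire-occurrence label whose exact cell is uncertain.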
  4. Data collected from real-world environments often contain multiple objects, scenes, and activities. In comparison to single-label problems, where each data sample defines only one concept, multi-label problems allow the co-existence of multiple concepts. To exploit the rich semantic information in real-world data, multi-label classification has seen many applications in a variety of domains. Traditional approaches to multi-label problems tend to have the side effects of increased memory usage, slow model inference speed, and, most importantly, under-utilization of the dependencies across concepts. In this paper, we adopt multi-task learning to address these challenges. Multi-task learning treats the learning of each concept as a separate task while leveraging the shared representations among all tasks. We also propose a dynamic task balancing method to automatically adjust the task weight distribution by taking both sample-level and task-level learning complexities into consideration. Our framework is evaluated on a disaster video dataset and its performance is compared with several state-of-the-art multi-label and multi-task learning techniques. The results demonstrate the effectiveness and superiority of our approach.
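Dynamic task balancing, as described above, adjusts per-task loss weights during training. One common family of schemes up-weights tasks whose loss is decreasing slowly, using the loss ratio as a proxy for learning complexity; the sketch below illustrates that idea and is not the paper's exact formulation.

```python
# Hedged sketch of dynamic task balancing: take a softmax over each
# task's loss ratio L_t / L_{t-1}, so tasks that are improving slowly
# (ratio near 1) receive a larger share of the combined loss.
import math

def task_weights(curr_losses, prev_losses, temperature=1.0):
    """Softmax over per-task loss ratios; returns weights summing to 1."""
    ratios = [c / p for c, p in zip(curr_losses, prev_losses)]
    exps = [math.exp(r / temperature) for r in ratios]
    total = sum(exps)
    return [e / total for e in exps]

# Task 0 barely improved (ratio 0.95); task 1 improved a lot (ratio
# 0.5), so task 0 gets the larger weight in the next training step.
w = task_weights([0.95, 0.40], [1.0, 0.8])
print(w)
```

The temperature controls how aggressively the weighting reacts to complexity differences; a per-sample version of the same ratio would add the sample-level signal the abstract mentions.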
  5. We explore the effect of auxiliary labels in improving the classification accuracy of wearable sensor-based human activity recognition (HAR) systems, which are primarily trained with the supervision of activity labels (e.g., running, walking, jumping). Supplemental meta-data are often available during the data collection process, such as the body positions of the wearable sensors, subjects' demographic information (e.g., gender, age), and the type of wearable used (e.g., smartphone, smartwatch). This information, while not directly related to the activity classification task, can nonetheless provide auxiliary supervision and has the potential to significantly improve HAR accuracy by providing extra guidance on how to handle the sample heterogeneity introduced by changes in domain (i.e., positions, persons, or sensors), especially in the presence of limited activity labels. However, integrating such meta-data into the classification pipeline is non-trivial: (i) the complex interaction between the activity and domain label spaces is hard to capture with a simple multi-task and/or adversarial learning setup, and (ii) meta-data and activity labels might not be simultaneously available for all collected samples. To address these issues, we propose a novel framework, Conditional Domain Embeddings (CoDEm). From the available unlabeled raw samples and their domain meta-data, we first learn a set of domain embeddings using a contrastive learning methodology to handle inter-domain variability and inter-domain similarity. To classify the activities, CoDEm then learns label embeddings in a contrastive fashion, conditioned on the domain embeddings with a novel attention mechanism, forcing the model to learn the complex domain-activity relationships. We extensively evaluate CoDEm on three benchmark datasets against a number of multi-task and adversarial learning baselines and achieve state-of-the-art performance in each.
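The contrastive step described above pulls embeddings of samples from the same domain together and pushes different-domain pairs apart. A minimal pairwise version of that objective (the cosine formulation, margin, and toy vectors are assumptions for illustration, not CoDEm's actual loss):

```python
# Illustrative pairwise contrastive objective for domain embeddings:
# same-domain pairs are rewarded for high cosine similarity, and
# different-domain pairs are penalized above a margin. All parameters
# here are hypothetical.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def contrastive_loss(u, v, same_domain, margin=0.5):
    sim = cosine(u, v)
    if same_domain:
        return 1.0 - sim           # pull same-domain embeddings together
    return max(0.0, sim - margin)  # push different domains below margin

print(contrastive_loss([1, 0], [1, 0], same_domain=True))   # identical: 0.0
print(contrastive_loss([1, 0], [0, 1], same_domain=False))  # orthogonal: 0.0
```

Batch-level formulations (e.g., InfoNCE-style losses over many negatives) are the more common modern choice, but the pairwise form above shows the same pull/push structure.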