skip to main content

Title: Relation Inference among Sensor Time Series in Smart Buildings with Metric Learning
Smart Building Technologies hold promise for better livability for residents and lower energy footprints. Yet, the rollout of these technologies, from demand response controls to fault detection and diagnosis, significantly lags behind and is impeded by the current practice of manual identification of sensing point relationships, e.g., how equipment is connected or which sensors are co-located in the same space. This manual process is still error-prone, albeit costly and laborious.We study relation inference among sensor time series. Our key insight is that, as equipment is connected or sensors co-locate in the same physical environment, they are affected by the same real-world events, e.g., a fan turning on or a person entering the room, thus exhibiting correlated changes in their time series data. To this end, we develop a deep metric learning solution that first converts the primitive sensor time series to the frequency domain, and then optimizes a representation of sensors that encodes their relations. Built upon the learned representation, our solution pinpoints the relationships among sensors via solving a combinatorial optimization problem. Extensive experiments on real-world buildings demonstrate the effectiveness of our solution.  more » « less
Award ID(s):
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Proceedings of the AAAI Conference on Artificial Intelligence
Page Range / eLocation ID:
4683 to 4690
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Modern buildings produce thousands of data streams, and the ability to automatically infer the physical context of such data is the key to enabling building analytics at scale. As acquiring this contextual information is currently a time-consuming and error-prone manual process, in this study we make the first attempt at automatically inferring one important contextual aspect of the equipment in buildings --- how each equipment is functionally connected with another. The main insight behind our solution is that functionally connected equipment is exposed to the same events in the physical world, creating correlated changes in the time series data of both equipment. Because events are of indeterminate length in time series, however, identifying them requires solving a non-polynomial combinatorial data segmentation problem. We present a solution that first extracts latent events from the sensory time series data, and then sifts out coincident events with a customized correlation procedure to identify the relationship between equipment. We evaluated our approach on data collected from over 1,000 pieces of equipment from 5 commercial buildings of various sizes located in different geographical regions in the US. Results show that this approach achieves 94.38% accuracy in relation inference, compared to 85.49% by the best baseline. 
    more » « less
  2. A key barrier to applying any smart technology to a building is the requirement of locating and connecting to the necessary resources among the thousands of sensing and control points, i.e., the metadata mapping problem. Existing solutions depend on exhaustive manual annotation of sensor metadata - a laborious, costly, and hardly scalable process. To reduce the amount of manual effort required, this paper presents a multi-oracle selective sampling framework to leverage noisy labels from information sources with unknown reliability such as existing buildings, which we refer to as weak oracles, for metadata mapping. This framework involves an interactive process, where a small set of sensor instances are progressively selected and labeled for it to learn how to aggregate the noisy labels as well as to predict sensor types. Two key challenges arise in designing the framework, namely, weak oracle reliability estimation and instance selection for querying. To address the first challenge, we develop a clustering-based approach for weak oracle reliability estimation to capitalize on the observation that weak oracles perform differently in different groups of instances. For the second challenge, we propose a disagreement-based query selection strategy to combine the potential effect of a labeled instance on both reducing classifier uncertainty and improving the quality of label aggregation. We evaluate our solution on a large collection of real-world building sensor data from 5 buildings with more than 11, 000 sensors of 18 different types. The experiment results validate the effectiveness of our solution, which outperforms a set of state-of-the-art baselines. 
    more » « less
  3. null (Ed.)
    Modern physical systems deploy large numbers of sensors to record at different time-stamps the status of different systems components via measurements such as temperature, pressure, speed, but also the component's categorical state. Depending on the measurement values, there are two kinds of sequences: continuous and discrete. For continuous sequences, there is a host of state-of-the-art algorithms for anomaly detection based on time-series analysis, but there is a lack of effective methodologies that are tailored specifically to discrete event sequences. This paper proposes an analytics framework for discrete event sequences for knowledge discovery and anomaly detection. During the training phase, the framework extracts pairwise relationships among discrete event sequences using a neural machine translation model by viewing each discrete event sequence as a "natural language". The relationship between sequences is quantified by how well one discrete event sequence is "translated" into another sequence. These pairwise relationships among sequences are aggregated into a multivariate relationship graph that clusters the structural knowledge of the underlying system and essentially discovers the hidden relationships among discrete sequences. This graph quantifies system behavior during normal operation. During testing, if one or more pairwise relationships are violated, an anomaly is detected. The proposed framework is evaluated on two real-world datasets: a proprietary dataset collected from a physical plant where it is shown to be effective in extracting sensor pairwise relationships for knowledge discovery and anomaly detection, and a public hard disk drive dataset where its ability to effectively predict upcoming disk failures is illustrated. 
    more » « less
  4. Online underground forums have been widely used by cybercriminals to trade the illicit products, resources and services, which have played a central role in the cybercrim-inal ecosystem. Unfortunately, due to the number of forums, their size, and the expertise required, it's infeasible to perform manual exploration to understand their behavioral processes. In this paper, we propose a novel framework named iDetector to automate the analysis of underground forums for the detection of cybercrime-suspected threads. In iDetector, to detect whether the given threads are cybercrime-suspected threads, we not only analyze the content in the threads, but also utilize the relations among threads, users, replies, and topics. To model this kind of rich semantic relationships (i.e., thread-user, thread-reply, thread-topic, reply-user and reply-topic relations), we introduce a structured heterogeneous information network (HIN) for representation, which is capable to be composed of different types of entities and relations. To capture the complex relationships (e.g., two threads are relevant if they were posted by the same user and discussed the same topic), we use a meta-structure based approach to characterize the semantic relatedness over threads. As different meta-structures depict the relatedness over threads at different views, we then build a classifier using Laplacian scores to aggregate different similarities formulated by different meta-structures to make predictions. To the best of our knowledge, this is the first work to use structural HIN to automate underground forum analysis. Comprehensive experiments on real data collections from underground forums (e.g., Hack Forums) are conducted to validate the effectiveness of our developed system iDetector in cybercrime-suspected thread detection by comparisons with other alternative methods. 
    more » « less
  5. null (Ed.)
    Breathing biomarkers, such as breathing rate, fractional inspiratory time, and inhalation-exhalation ratio, are vital for monitoring the user's health and well-being. Accurate estimation of such biomarkers requires breathing phase detection, i.e., inhalation and exhalation. However, traditional breathing phase monitoring relies on uncomfortable equipment, e.g., chestbands. Smartphone acoustic sensors have shown promising results for passive breathing monitoring during sleep or guided breathing. However, detecting breathing phases using acoustic data can be challenging for various reasons. One of the major obstacles is the complexity of annotating breathing sounds due to inaudible parts in regular breathing and background noises. This paper assesses the potential of using smartphone acoustic sensors for passive unguided breathing phase monitoring in a natural environment. We address the annotation challenges by developing a novel variant of the teacher-student training method for transferring knowledge from an inertial sensor to an acoustic sensor, eliminating the need for manual breathing sound annotation by fusing signal processing with deep learning techniques. We train and evaluate our model on the breathing data collected from 131 subjects, including healthy individuals and respiratory patients. Experimental results show that our model can detect breathing phases with 77.33% accuracy using acoustic sensors. We further present an example use-case of breathing phase-detection by first estimating the biomarkers from the estimated breathing phases and then using these biomarkers for pulmonary patient detection. Using the detected breathing phases, we can estimate fractional inspiratory time with 92.08% accuracy, the inhalation-exhalation ratio with 86.76% accuracy, and the breathing rate with 91.74% accuracy. Moreover, we can distinguish respiratory patients from healthy individuals with up to 76% accuracy. This paper is the first to show the feasibility of detecting regular breathing phases towards passively monitoring respiratory health and well-being using acoustic data captured by a smartphone. 
    more » « less