Influence-Aware Attention for Multivariate Temporal Point Processes
Identifying the subset of events that influence events of interest in continuous-time datasets is of great interest in various applications. Existing methods, however, often fail to produce accurate and interpretable results in a time-efficient manner. In this paper, we propose a neural model, Influence-Aware Attention for Multivariate Temporal Point Processes (IAA-MTPP), which leverages the attention mechanism in transformers to capture temporal dynamics between event types rather than between event instances, as existing instance-to-instance attention does, using variational inference while maintaining interpretability. Given event sequences and a prior influence matrix, IAA-MTPP efficiently learns an approximate posterior through an Attention-to-Influence mechanism and then models the conditional likelihood of the sequences given a sampled influence through an Influence-to-Attention formulation. Both steps are completed inside a B-block multi-head self-attention layer, so end-to-end training with a parallelizable transformer architecture is faster than with sequential models such as RNNs. We demonstrate strong empirical performance compared to existing baselines on multiple synthetic and real benchmarks, including a qualitative analysis for an application in decentralized finance.
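The idea of letting a type-level influence matrix restrict attention can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `influence_masked_attention`, the binary `influence` matrix, the causal convention (an event may attend to itself and to earlier events), and the plain softmax are all assumptions made for illustration only.

```python
import math


def influence_masked_attention(scores, event_types, influence):
    """Row-wise softmax over attention scores, restricted by event-type influence.

    scores[i][j]    raw attention score of event i (query) on event j (key)
    event_types[i]  integer type of the i-th event, in time order
    influence[a][b] 1 if type a may influence type b, else 0 (hypothetical,
                    e.g. one sample from an approximate posterior)
    """
    n = len(event_types)
    weights = []
    for i in range(n):
        # Event i may attend to event j only if j is not later than i (causal)
        # and the type of j is allowed to influence the type of i.
        allowed = [
            j for j in range(i + 1)
            if influence[event_types[j]][event_types[i]] == 1
        ]
        row = [0.0] * n
        if allowed:
            # Numerically stable softmax over the allowed positions only.
            top = max(scores[i][j] for j in allowed)
            exps = {j: math.exp(scores[i][j] - top) for j in allowed}
            total = sum(exps.values())
            for j in allowed:
                row[j] = exps[j] / total
        weights.append(row)
    return weights


# Three events of types 0, 1, 0; type 1 is not allowed to influence type 0.
event_types = [0, 1, 0]
influence = [[1, 1],
             [0, 1]]
scores = [[0.0] * 3 for _ in range(3)]
weights = influence_masked_attention(scores, event_types, influence)
```

In this toy input the third event (type 0) places no weight on the second event (type 1), because the assumed influence matrix has a zero in that entry; the remaining weight is shared between the first event and the third event itself.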
- Award ID(s):
- 2113906
- PAR ID:
- 10600784
- Editor(s):
- van der Schaar, M; Janzing, D; Zhang, C
- Publisher / Repository:
- 2nd Conference on Causal Learning and Reasoning
- Date Published:
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Deep learning models have been studied to forecast human events using vast volumes of data, yet they still cannot be trusted in certain applications such as healthcare and disaster assistance due to their lack of interpretability. Providing explanations for event predictions not only helps practitioners understand the underlying mechanism of prediction behavior but also enhances the robustness of event analysis. Improving the transparency of event prediction models is challenging for the following reasons: (i) event data contain multilevel features, which makes it difficult to use data across levels; (ii) features across different levels and time steps are heterogeneous and dependent; and (iii) static model-level interpretations cannot be easily adapted to event forecasting given the dynamic and temporal characteristics of the data. Recent interpretation methods have proven their capabilities in tasks that deal with graph-structured or relational data. In this paper, we present a Contextualized Multilevel Feature learning framework, CMF, for interpretable temporal event prediction. It consists of a predictor for forecasting events of interest and an explanation module for interpreting model predictions. We design a new context-based feature fusion method to integrate multiple levels of heterogeneous features. We also introduce a temporal explanation module to determine sequences of text and subgraphs that have crucial roles in a prediction. We conduct extensive experiments on several real-world datasets of political and epidemic events. We demonstrate that the proposed method is competitive with state-of-the-art models while possessing favorable interpretation capabilities.
-
Adverse clinical events related to unsafe care are among the top ten causes of death in the U.S. Accurate modeling and prediction of clinical events from electronic health records (EHRs) play a crucial role in enhancing patient safety. An example is modeling de facto care pathways that characterize common step-by-step plans for treatment or care. However, clinical event data pose several unique challenges, including the irregularity of time intervals between consecutive events, the existence of cycles, periodicity, multi-scale event interactions, and the high computational costs associated with long event sequences. Existing neural temporal point process (TPP) methods do not effectively capture the multi-scale nature of event interactions, which is common in many real-world clinical applications. To address these issues, we propose the cross-temporal-scale transformer (XTSFormer), specifically designed for irregularly timed event data. Our model consists of two vital components: a novel Feature-based Cycle-aware Time Positional Encoding (FCPE) that captures the cyclical nature of time, and a hierarchical multi-scale temporal attention mechanism, where different temporal scales are determined by a bottom-up clustering approach. Extensive experiments on several real-world EHR datasets show that our XTSFormer outperforms multiple baseline methods.
-
This paper introduces a novel transformer network tailored to skeleton-based action detection in untrimmed long video streams. Our approach centers around three mechanisms that collectively enhance the network’s temporal analysis capabilities. First, a new predictive attention mechanism incorporates future frame data into the sequence analysis during the training phase. This mechanism addresses a central issue of current action detection models: incomplete temporal modeling in long action sequences, particularly for boundary frames that lie outside the network’s immediate temporal receptive field, while maintaining computational efficiency. Second, we integrate a new adaptive weighted temporal attention system that dynamically evaluates the importance of each frame within an action sequence. In contrast to existing approaches, the proposed weighting strategy is both adaptive and interpretable, making it highly effective in handling long sequences with numerous non-informative frames. Third, the network incorporates a regression technique that independently identifies the start and end frames based on their relevance to different frames. Unlike existing homogeneous regression methods, the proposed regression method is heterogeneous and based on various temporal relationships, including those in future frames in actions, making it more effective for action detection. Extensive experiments on prominent untrimmed skeleton-based action datasets, including PKU-MMD, OAD, and the Charade dataset, demonstrate the effectiveness of this network.
-
Natural language often describes events at different granularities, such that more coarse-grained (goal) events can often be decomposed into fine-grained sequences of (step) events. A critical but overlooked challenge in understanding an event process lies in the fact that the step events are not equally important to the central goal. In this paper, we seek to fill this gap by studying how well current models can understand the essentiality of different step events towards a goal event. As discussed in cognitive studies, such an ability enables the machine to mimic humans’ commonsense reasoning about the preconditions and necessary efforts of daily-life tasks. Our work contributes a high-quality corpus of (goal, step) pairs from the community guideline website WikiHow, where the steps are manually annotated with their essentiality w.r.t. the goal. The high IAA indicates that humans have a consistent understanding of the events. After evaluating various statistical and massive pre-trained NLU models, we observe that existing SOTA models all fall far behind humans, indicating the need for future investigation of this crucial yet challenging task.

