Title: Koopman Invertible Autoencoder: Leveraging Forward and Backward Dynamics for Temporal Modeling (Selected as one of the best-ranked papers for possible publication in the journal Knowledge and Information Systems.)
Accurate long-term predictions are the foundation of many machine learning applications and decision-making processes. However, building accurate long-term prediction models remains challenging due to the limitations of existing temporal models like recurrent neural networks (RNNs), as they capture only the statistical connections in the training data and may fail to learn the underlying dynamics of the target system. To tackle this challenge, we propose a novel machine learning model based on Koopman operator theory, which we call Koopman Invertible Autoencoders (KIA), that captures the inherent characteristics of the system by modeling both forward and backward dynamics in the infinite-dimensional Hilbert space. This enables us to efficiently learn low-dimensional representations, resulting in more accurate predictions of long-term system behavior. Moreover, our method’s invertibility design enforces reversibility and consistency in both forward and inverse operations. We illustrate the utility of KIA on pendulum and climate datasets, demonstrating a 300% improvement in long-term prediction capability for the pendulum while maintaining robustness against noise. Additionally, our method demonstrates the ability to better comprehend the intricate dynamics of the climate system when compared to existing Koopman-based methods.
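The idea of pairing forward and backward dynamics can be illustrated with a small linear example. The sketch below is not the KIA architecture (which uses an invertible autoencoder); it assumes a frictionless small-angle pendulum, which is already linear, and fits forward and backward operators by plain least squares. The function names and parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def simulate_oscillator(n_steps, dt=0.1, omega=1.0):
    """Snapshots of a frictionless linear oscillator (small-angle pendulum).

    Assumed toy system, not the dataset used in the paper.
    """
    t = np.arange(n_steps) * dt
    # State = (angle, angular velocity) for initial angle 1 and velocity 0.
    return np.stack([np.cos(omega * t), -omega * np.sin(omega * t)], axis=0)

def fit_forward_backward(X):
    """Least-squares fit of a forward operator C and a backward operator D."""
    past, future = X[:, :-1], X[:, 1:]
    C = future @ np.linalg.pinv(past)    # future ≈ C @ past
    D = past @ np.linalg.pinv(future)    # past ≈ D @ future
    return C, D

def consistency_error(C, D):
    """Distance of D @ C from the identity; zero when D inverts C exactly."""
    return np.linalg.norm(D @ C - np.eye(C.shape[0]))

def rollout(C, x0, n_steps):
    """Predict n_steps states by repeatedly applying the forward operator."""
    states = [x0]
    for _ in range(n_steps - 1):
        states.append(C @ states[-1])
    return np.stack(states, axis=1)

X = simulate_oscillator(200)
C, D = fit_forward_backward(X)
pred = rollout(C, X[:, 0], X.shape[1])

print("consistency error:", consistency_error(C, D))
print("max rollout error:", np.max(np.abs(pred - X)))
```

For this noise-free linear system both errors should sit near machine precision. With noisy or nonlinear data the two operators are generally no longer exact inverses, which is the situation a consistency or invertibility constraint is meant to address.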
Award ID(s):
1934721
PAR ID:
10481369
Publisher / Repository:
IEEE
Journal Name:
IEEE International Conference on Data Mining (ICDM)
Subject(s) / Keyword(s):
Koopman operator, Autoencoder, temporal modeling
Format(s):
Medium: X
Location:
Beijing, China
Sponsoring Org:
National Science Foundation
More Like this
  2. Recent work has shown that machine learning (ML) models can be trained to accurately forecast the dynamics of unknown chaotic dynamical systems. Short-term predictions of the state evolution and long-term predictions of the statistical patterns of the dynamics ("climate") can be produced by employing a feedback loop, whereby the model is trained to predict forward one time step, then the model output is used as input for multiple time steps. In the absence of mitigating techniques, however, this technique can result in artificially rapid error growth. In this article, we systematically examine the technique of adding noise to the ML model input during training to promote stability and improve prediction accuracy. Furthermore, we introduce Linearized Multi-Noise Training (LMNT), a regularization technique that deterministically approximates the effect of many small, independent noise realizations added to the model input during training. Our case study uses reservoir computing, a machine-learning method using recurrent neural networks, to predict the spatiotemporal chaotic Kuramoto-Sivashinsky equation. We find that reservoir computers trained with noise or with LMNT produce climate predictions that appear to be indefinitely stable and have a climate very similar to the true system, while reservoir computers trained without regularization are unstable. Compared with other regularization techniques that yield stability in some cases, we find that both short-term and climate predictions from reservoir computers trained with noise or with LMNT are substantially more accurate. Finally, we show that the deterministic aspect of our LMNT regularization facilitates fast hyperparameter tuning when compared to training with noise.
  3. Base metal electrode (BME) multilayer ceramic capacitors (MLCCs) are widely used in aerospace, medical, military, and communication applications, emphasizing the need for high reliability. The ongoing advancements in BaTiO3-based MLCC technology have facilitated further miniaturization and improved capacitive volumetric density for both low and high voltage devices. However, concerns persist regarding infant mortality failures and long-term reliability under higher fields and temperatures. To address these concerns, a comprehensive understanding of the mechanisms underlying insulation resistance degradation is crucial. Furthermore, there is a need to develop effective screening procedures during MLCC production and improve the accuracy of mean time to failure (MTTF) predictions. This article reviews our findings on the effect of the burn-in test, a common quality control process, on the dynamics of oxygen vacancies within BME MLCCs. These findings reveal that the burn-in test has a negative impact on the lifetime and reliability of BME MLCCs. Moreover, the limitations of existing lifetime prediction models for BME MLCCs are discussed, emphasizing the need for improved MTTF predictions by employing a physics-based machine learning model to overcome the existing models’ limitations. The article also discusses the new physics-based machine learning model that has been developed. While data limitations remain a challenge, the physics-based machine learning approach offers promising results for MTTF prediction in MLCCs, contributing to improved lifetime predictions. Furthermore, the article acknowledges the limitations of relying solely on MTTF to predict MLCCs’ lifetime and emphasizes the importance of developing comprehensive prediction models that predict the entire distribution of failures.
  4. Abstract

    Nucleic acid-binding proteins (NABPs), including DNA-binding proteins (DBPs) and RNA-binding proteins (RBPs), play important roles in essential biological processes. To facilitate functional annotation and accurate prediction of different types of NABPs, many machine learning-based computational approaches have been developed. However, the datasets used for training and testing as well as the prediction scopes in these studies have limited their applications. In this paper, we developed new strategies to overcome these limitations by generating more accurate and robust datasets and developing deep learning-based methods including both hierarchical and multi-class approaches to predict the types of NABPs for any given protein. The deep learning models employ two layers of convolutional neural network and one layer of long short-term memory. Our approaches outperform existing DBP and RBP predictors with a balanced prediction between DBPs and RBPs, and are more practically useful in identifying novel NABPs. The multi-class approach greatly improves the prediction accuracy of DBPs and RBPs, especially for the DBPs with ~12% improvement. Moreover, we explored the prediction accuracy of single-stranded DNA binding proteins and their effect on the overall prediction accuracy of NABP predictions.

     
  5. Abstract

    Objective

    We aim to develop a hybrid model for earlier and more accurate predictions for the number of infected cases in pandemics by (1) using patients’ claims data from different counties and states that capture local disease status and medical resource utilization; (2) utilizing demographic similarity and geographical proximity between locations; and (3) integrating pandemic transmission dynamics into a deep learning model.

    Materials and Methods

    We proposed a spatio-temporal attention network (STAN) for pandemic prediction. It uses a graph attention network to capture spatio-temporal trends of disease dynamics and to predict the number of cases for a fixed number of days into the future. We also designed a dynamics-based loss term for enhancing long-term predictions. STAN was tested using both real-world patient claims data and COVID-19 statistics over time across US counties.

    Results

    STAN outperforms traditional epidemiological models, such as susceptible-infectious-recovered (SIR) and susceptible-exposed-infectious-recovered (SEIR), as well as deep learning models, on both long-term and short-term predictions, achieving up to 87% reduction in mean squared error compared to the best baseline prediction model.

    Conclusions

    By combining information from real-world claims data and disease case counts data, STAN can better predict disease status and medical resource utilization.

     