This content will become publicly available on November 6, 2026

Title: An attention-based spatio-temporal neural operator for evolving physics
Abstract: In scientific machine learning (SciML), a key challenge is learning unknown, evolving physical processes and making predictions across spatio-temporal scales. For example, in real-world manufacturing problems such as additive manufacturing, users adjust known machine settings while unknown environmental parameters fluctuate simultaneously. To make reliable predictions, a model should not only capture long-range spatio-temporal interactions from data but also adapt to new and unknown environments; traditional machine learning models excel at the first task but often lack physical interpretability and struggle to generalize under varying environmental conditions. To tackle these challenges, we propose the attention-based spatio-temporal neural operator (ASNO), a novel architecture that combines separable attention mechanisms for spatial and temporal interactions and adapts to unseen physical parameters. Inspired by the backward differentiation formula (BDF), ASNO learns a transformer for temporal prediction and extrapolation together with an attention-based neural operator for handling varying external loads. Isolating the contributions of historical states from those of external forces enhances interpretability, enabling the discovery of underlying physical laws and generalization to unseen physical environments. Empirical results on SciML benchmarks demonstrate that ASNO outperforms existing models, establishing its potential for engineering applications, physics discovery, and interpretable machine learning.
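The abstract describes ASNO as inspired by the backward differentiation formula, splitting the next state into a history extrapolation (learned by a transformer) and a load-driven correction (learned by a neural operator). The classical BDF2 scheme itself exhibits exactly this split; the sketch below shows it on a linear test problem (the test problem, step size, and variable names are illustrative, not from the paper):

```python
import numpy as np

# Classical BDF2 step for u' = a*u + g(t) (linear test problem).
# The "hist" term extrapolates from past states (the role ASNO assigns
# to its transformer); the g(t_next) term is the external-load correction
# (the role of the attention-based neural operator). This sketch only
# illustrates the numerical formula, not the learned architecture.

def bdf2_step(u_n, u_nm1, t_next, h, a, g):
    hist = (4.0 * u_n - u_nm1) / 3.0                 # history extrapolation
    # Implicit update solved in closed form since the problem is linear:
    # u_{n+1} = hist + (2h/3) * (a*u_{n+1} + g(t_{n+1}))
    return (hist + (2.0 * h / 3.0) * g(t_next)) / (1.0 - 2.0 * h * a / 3.0)

a, h = -1.0, 0.01
g = np.sin                                           # external load g(t) = sin(t)
ts = np.linspace(0.0, 2.0, 201)
u = np.empty_like(ts)
u[0] = 1.0
u[1] = u[0] + h * (a * u[0] + g(ts[0]))              # one Euler step to bootstrap
for n in range(1, len(ts) - 1):
    u[n + 1] = bdf2_step(u[n], u[n - 1], ts[n + 1], h, a, g)
```

For u' = -u + sin(t) with u(0) = 1, the computed trajectory matches the closed-form solution 1.5·e^(−t) + (sin t − cos t)/2 to second-order accuracy in h.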
Award ID(s):
2227641
PAR ID:
10654358
Publisher / Repository:
Purpose Led Publishing
Date Published:
Journal Name:
Machine Learning: Science and Technology
Volume:
6
Issue:
4
ISSN:
2632-2153
Page Range / eLocation ID:
045036
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Arai, Kohei (Ed.)
    Scientific Machine Learning (SciML) is a multidisciplinary methodology that combines data-driven machine learning models with principle-based computational models to improve simulations of scientific phenomena and to uncover new scientific rules from existing measurements. This article recounts the experience of using the SciML method to discover nonlinear dynamics that may be hard to model or unknown in real-world scenarios. The SciML method solves traditional principle-based differential equations by integrating a neural network that accurately models the nonlinear dynamics while respecting scientific constraints and principles. The paper discusses the latest SciML models and applies them to oscillator simulations and experiments. Beyond a better capacity to simulate and to match observations, the results also demonstrate a successful discovery of the hidden physics in pendulum dynamics using SciML.
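The core SciML idea in the abstract above — keep the known physics and learn only the unknown part of the dynamics — can be sketched in a few lines. Here the known physics is the pendulum's −sin(θ) restoring force, the hidden physics is a damping term −c·ω, and the "learned" component is a least-squares fit rather than a neural network purely to keep the sketch self-contained (the model, coefficients, and data are illustrative, not from the paper):

```python
import numpy as np

# Ground-truth pendulum dynamics: theta'' = -sin(theta) - c*omega,
# with c = 0.1 the *hidden* damping coefficient to be discovered.
def true_accel(theta, omega, c=0.1):
    return -np.sin(theta) - c * omega

# Collect (state, acceleration) samples
rng = np.random.default_rng(0)
theta = rng.uniform(-1.0, 1.0, size=200)
omega = rng.uniform(-1.0, 1.0, size=200)
accel = true_accel(theta, omega)

# Subtract the known physics; what remains is the unknown residual
residual = accel - (-np.sin(theta))        # equals -c*omega here

# "Discover" the hidden damping law by fitting residual ~ coef * omega
coef = np.linalg.lstsq(omega[:, None], residual, rcond=None)[0][0]
# coef recovers -c, i.e. approximately -0.1
```

The same physics-plus-residual decomposition is what makes the discovered term interpretable: the fitted coefficient directly reads off the damping law.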
  3. Process-based modelling offers interpretability and physical consistency in many domains of geosciences but struggles to leverage large datasets efficiently. Machine-learning methods, especially deep networks, have strong predictive skills yet are unable to answer specific scientific questions. In this Perspective, we explore differentiable modelling as a pathway to dissolve the perceived barrier between process-based modelling and machine learning in the geosciences and demonstrate its potential with examples from hydrological modelling. ‘Differentiable’ refers to accurately and efficiently calculating gradients with respect to model variables or parameters, enabling the discovery of high-dimensional unknown relationships. Differentiable modelling involves connecting (flexible amounts of) prior physical knowledge to neural networks, pushing the boundary of physics-informed machine learning. It offers better interpretability, generalizability, and extrapolation capabilities than purely data-driven machine learning, achieving a similar level of accuracy while requiring less training data. Additionally, the performance and efficiency of differentiable models scale well with increasing data volumes. Under data-scarce scenarios, differentiable models have outperformed machine-learning models in producing short-term dynamics and decadal-scale trends owing to the imposed physical constraints. Differentiable modelling approaches are primed to enable geoscientists to ask questions, test hypotheses, and discover unrecognized physical relationships. Future work should address computational challenges, reduce uncertainty, and verify the physical significance of outputs. 
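"Differentiable" in the Perspective above means computing exact, efficient gradients with respect to parameters of a process-based model. A minimal sketch, using a toy linear-reservoir hydrology model (not taken from the Perspective): the sensitivity dS/dk is propagated alongside the state, giving an exact gradient of a fit loss with respect to the recession parameter k, verified against a finite difference:

```python
import numpy as np

# Toy process-based model: a linear reservoir
#   S[t+1] = S[t] + P[t] - k*S[t],   Q[t] = k*S[t]
# Forward-mode differentiation: carry dS/dk along with S.

def simulate(k, precip, s0=10.0):
    s, ds = s0, 0.0                 # state and its sensitivity dS/dk
    q, dq = [], []
    for p in precip:
        q.append(k * s)
        dq.append(s + k * ds)       # product rule: d(k*S)/dk = S + k*dS/dk
        ds = (1.0 - k) * ds - s     # sensitivity recursion (uses old s)
        s = s + p - k * s           # state update
    return np.array(q), np.array(dq)

def loss_and_grad(k, precip, q_obs):
    q, dq = simulate(k, precip)
    r = q - q_obs
    return np.sum(r**2), np.sum(2.0 * r * dq)

rng = np.random.default_rng(0)
precip = rng.uniform(0.0, 2.0, size=50)
q_obs, _ = simulate(0.3, precip)            # synthetic "observations"
loss, grad = loss_and_grad(0.2, precip, q_obs)

# Sanity check against a central finite difference
eps = 1e-6
lp, _ = loss_and_grad(0.2 + eps, precip, q_obs)
lm, _ = loss_and_grad(0.2 - eps, precip, q_obs)
fd = (lp - lm) / (2 * eps)
```

In practice the gradients come from an automatic-differentiation framework rather than hand-derived recursions, but the principle — exact gradients flowing through the physical state update — is the same.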
  4. Despite the recent popularity of attention-based neural architectures in core AI fields like natural language processing (NLP) and computer vision (CV), their potential in modeling complex physical systems remains underexplored. Learning problems in physical systems are often characterized as discovering operators that map between function spaces based on a few instances of function pairs. This task frequently presents a severely ill-posed PDE inverse problem. In this work, we propose a novel neural operator architecture based on the attention mechanism, which we refer to as the Nonlocal Attention Operator (NAO), and explore its capability in developing a foundation physical model. In particular, we show that the attention mechanism is equivalent to a double integral operator that enables nonlocal interactions among spatial tokens, with a data-dependent kernel characterizing the inverse mapping from data to the hidden parameter field of the underlying operator. As such, the attention mechanism extracts global prior information from training data generated by multiple systems, and suggests the exploratory space in the form of a nonlinear kernel map. Consequently, NAO can address ill-posedness and rank deficiency in inverse PDE problems by encoding regularization and achieving generalizability. We empirically demonstrate the advantages of NAO over baseline neural models in terms of generalizability to unseen data resolutions and system states. Our work not only suggests a novel neural operator architecture for learning interpretable foundation models of physical systems, but also offers a new perspective towards understanding the attention mechanism. Our code and data accompanying this paper are available at https://github.com/fishmoon1234/NAO. 
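The NAO abstract above reads attention as a double integral operator: the softmax matrix is an n-by-n data-dependent kernel acting on the value tokens, a discretized integral over spatial tokens. The sketch below is generic scaled dot-product attention written to expose that kernel, not NAO's actual implementation (which is in the linked repository):

```python
import numpy as np

# Scaled dot-product attention with the kernel matrix made explicit:
# K[i, j] weights the contribution of token j to token i, so
# out = K @ V is a discretized integral operator over spatial tokens.

def attention_kernel(X, Wq, Wk):
    d = Wq.shape[1]
    scores = (X @ Wq) @ (X @ Wk).T / np.sqrt(d)
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    K = np.exp(scores)
    return K / K.sum(axis=1, keepdims=True)       # each row sums to 1

rng = np.random.default_rng(0)
n, d = 8, 4                        # n spatial tokens of dimension d
X = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
K = attention_kernel(X, Wq, Wk)    # data-dependent kernel over tokens
out = K @ (X @ Wv)                 # kernel applied to the value tokens
```

Because K is built from the data itself, a model trained across multiple systems can encode a shared, regularizing prior in this kernel map — the property the abstract credits for NAO's handling of ill-posed inverse problems.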
  5. Time series forecasting with additional spatial information has attracted a tremendous amount of attention in recent research, due to its importance in various real-world applications in social studies, such as conflict prediction and pandemic forecasting. Conventional machine learning methods either consider temporal dependencies only, or treat spatial and temporal relations as two separate autoregressive models, namely space-time autoregressive models. Such methods suffer in long-term forecasting or predictions for large-scale areas, due to the high nonlinearity and complexity of spatio-temporal data. In this paper, we propose to address these challenges using spatio-temporal graph neural networks. Empirical results on the Violence Early Warning System (ViEWS) dataset and the U.S. Covid-19 dataset indicate that our method significantly improves performance over baseline approaches.
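A spatio-temporal graph neural network of the kind described above combines spatial mixing over a graph with a temporal update. A minimal, generic GCN-style step (illustrative only; not the architecture used in the paper): node features are mixed by a symmetrically normalized adjacency, transformed, and passed through a nonlinearity to produce the next time step's features.

```python
import numpy as np

# One spatio-temporal graph step: X_{t+1} = tanh(A_norm @ X_t @ W),
# where A_norm mixes information between neighboring nodes (spatial)
# and W transforms the node features (learned in a real model).

def normalized_adjacency(A):
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt    # symmetric normalization

def st_step(X_t, A_norm, W):
    return np.tanh(A_norm @ X_t @ W)          # spatial mixing + feature map

rng = np.random.default_rng(1)
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)     # 4-node ring graph
A_norm = normalized_adjacency(A)
X = rng.normal(size=(4, 3))                   # 4 nodes, 3 features each
W = rng.normal(size=(3, 3))
X_next = st_step(X, A_norm, W)                # features at the next time step
```

Stacking such steps lets information propagate jointly across the graph and through time, which is what lets these models capture the nonlinear spatio-temporal couplings that separate space-time autoregressive models miss.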