Abstract: We compare the performance of energy-based and entropy-conserving schemes for modeling nonthermal energy components, such as unresolved turbulence and cosmic rays, using idealized fluid dynamics tests and isolated galaxy simulations. While both methods aim to model advection and adiabatic compression or expansion of different energy components, the energy-based scheme numerically solves the nonconservative equation for the energy density evolution, while the entropy-conserving scheme uses a conservative equation for modified entropy. Using the standard shock tube and Zel’dovich pancake tests, we show that the energy-based scheme results in a spurious generation of nonthermal energy on shocks, while the entropy-conserving method evolves the energy adiabatically to machine precision. We also show that, in simulations of an isolated L⋆ galaxy, switching between the schemes results in ≈20%–30% changes of the total star formation rate and a significant difference in morphology, particularly near the galaxy center. We also outline and test a simple method that can be used in conjunction with the entropy-conserving scheme to model the injection of nonthermal energies on shocks. Finally, we discuss how the entropy-conserving scheme can be used to capture the kinetic energy dissipated by numerical viscosity into the subgrid turbulent energy implicitly, without explicit source terms that require calibration and can be rather uncertain. Our results indicate that the entropy-conserving scheme is the preferred choice for modeling nonthermal energy components, a conclusion that is equally relevant for Eulerian and moving-mesh fluid dynamics codes.
                            Born-Infeld (BI) for AI: Energy-Conserving Descent (ECD) for Optimization
                        
                    
    
            We introduce a novel framework for optimization based on energy-conserving Hamiltonian dynamics in a strongly mixing (chaotic) regime and establish its key properties analytically and numerically. The prototype is a discretization of Born-Infeld dynamics, with a squared relativistic speed limit depending on the objective function. This class of frictionless, energy-conserving optimizers proceeds unobstructed until slowing naturally near the minimal loss, which dominates the phase space volume of the system. Building from studies of chaotic systems such as dynamical billiards, we formulate a specific algorithm with good performance on machine learning and PDE-solving tasks, including generalization. It cannot stop at a high local minimum, an advantage in non-convex loss functions, and proceeds faster than GD+momentum in shallow valleys. 
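A minimal sketch of frictionless, energy-conserving dynamics with an objective-dependent speed limit, under stated assumptions. It assumes the Lagrangian L = -V sqrt(1 - |dq/dt|^2 / V), which is one form consistent with the description above but is not taken from this page; the corresponding Hamiltonian is H = sqrt(V (V + |p|^2)) and the squared speed satisfies |dq/dt|^2 = V (1 - V^2/E^2) <= V. The objective `loss`, the function name `ecd_toy`, the Euler-type update, and the step rejection rule are all hypothetical illustration choices, not the authors' algorithm.

```python
import numpy as np

def loss(q):
    """Hypothetical toy objective V(q) > 0: a quadratic bowl with a small offset."""
    return 0.5 * float(np.dot(q, q)) + 0.01

def grad(q):
    """Gradient of the toy objective."""
    return q

def ecd_toy(q0, p_dir, energy, dt=1e-2, steps=2000):
    """Toy energy-conserving dynamics for the assumed H = sqrt(V (V + p^2))."""
    q = np.asarray(q0, dtype=float)
    V = loss(q)
    assert 0.0 < V < energy, "need 0 < V(q0) < E"
    d = np.asarray(p_dir, dtype=float)
    # On the energy shell H = E the momentum norm is fixed by V: |p|^2 = E^2/V - V.
    p = d / np.linalg.norm(d) * np.sqrt(energy**2 / V - V)
    best, energies = V, []
    for _ in range(steps):
        # Hamilton's equations for the assumed H, evaluated with H = E.
        p_new = p - dt * (2.0 * V + np.dot(p, p)) * grad(q) / (2.0 * energy)
        q_new = q + dt * V * p_new / energy
        V_new = loss(q_new)
        norm = np.linalg.norm(p_new)
        if V_new >= energy or norm == 0.0:
            p = -p  # step would leave the allowed region: reject it and reverse
            continue
        # Project the momentum back onto the energy shell (removes integrator drift).
        p_new *= np.sqrt(energy**2 / V_new - V_new) / norm
        q, p, V = q_new, p_new, V_new
        best = min(best, V)
        energies.append(np.sqrt(V * (V + np.dot(p, p))))
    return q, best, np.array(energies)
```

For example, `ecd_toy([1.0, 1.0], [-1.0, 0.3], energy=1.5)` returns the final position, the lowest loss visited, and the recorded energies. Because there is no friction, the trajectory never settles; in this sketch the lowest loss seen is tracked separately rather than read off the final position.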
                            - Award ID(s):
- 2014215
- PAR ID:
- 10352506
- Editor(s):
- Chaudhuri, Kamalika; Jegelka, Stefanie; Song, Le; Szepesvari, Csaba; Niu, Gang; Sabato, Sivan
- Date Published:
- Journal Name:
- Proceedings of Machine Learning Research
- Volume:
- 162
- ISSN:
- 2640-3498
- Page Range / eLocation ID:
- 4918-4936
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- 
            We compare the performance of energy-based and entropy-conserving schemes for modeling nonthermal energy components, such as unresolved turbulence and cosmic rays, using idealized fluid dynamics tests and isolated galaxy simulations. While both methods aim to model advection and adiabatic compression or expansion of different energy components, the energy-based scheme numerically solves the nonconservative equation for the energy density evolution, while the entropy-conserving scheme uses a conservative equation for modified entropy. Using the standard shock tube and Zel'dovich pancake tests, we show that the energy-based scheme results in a spurious generation of nonthermal energy on shocks, while the entropy-conserving method evolves the energy adiabatically to machine precision. We also show that, in simulations of an isolated L⋆ galaxy, switching between the schemes results in ~20-30% changes of the total star formation rate and a significant difference in morphology, particularly near the galaxy center. We also outline and test a simple method that can be used in conjunction with the entropy-conserving scheme to model the injection of nonthermal energies on shocks. Finally, we discuss how the entropy-conserving scheme can be used to capture the kinetic energy dissipated by numerical viscosity into the subgrid turbulent energy implicitly, without explicit source terms that require calibration and can be rather uncertain. Our results indicate that the entropy-conserving scheme is the preferred choice for modeling nonthermal energy components, a conclusion that is equally relevant for Eulerian and moving-mesh fluid dynamics codes.
- 
            The understanding of chaotic systems is challenging not only for theoretical research but also for many important applications. Chaotic behavior is found in many nonlinear dynamical systems, such as those found in climate dynamics, weather, the stock market, and the space-time dynamics of virus spread. A reliable solution for these systems must handle their complex space-time dynamics and sensitive dependence on initial conditions. We develop a deep learning framework to push the time horizon at which reliable predictions can be made further into the future by better evaluating the consequences of local errors when modeling nonlinear systems. Our approach observes the future trajectories of initial errors at a time horizon to model the evolution of the loss to that point with two major components: 1) a recurrent architecture, Error Trajectory Tracing, that is designed to trace the trajectories of predictive errors through phase space, and 2) a training regime, Horizon Forcing, that pushes the model’s focus out to a predetermined time horizon. We validate our method on classic chaotic systems and real-world time series prediction tasks with chaotic characteristics, and show that our approach outperforms the current state-of-the-art methods.
- 
            Chaotic dynamics are ubiquitous in many real-world systems, ranging from biological and industrial processes to climate dynamics and the spread of viruses. These systems are characterized by high sensitivity to initial conditions, making it challenging to predict their future behavior confidently. In this study, we propose a novel deep-learning framework that addresses this challenge by directly exploiting the long-term compounding of local prediction errors during model training, aiming to extend the time horizon for reliable predictions of chaotic systems. Our approach observes the future trajectories of initial errors at a time horizon, modeling the evolution of the loss to that point through the use of two major components: 1) a recurrent architecture (Error Trajectory Tracing) designed to trace the trajectories of predictive errors through phase space, and 2) a training regime, Horizon Forcing, that pushes the model’s focus out to a predetermined time horizon. We validate our method on three classic chaotic systems and six real-world time series prediction tasks with chaotic characteristics. The results show that our approach outperforms the state-of-the-art methods.
- 
            In suitably initialized wide networks, small learning rates transform deep neural networks (DNNs) into neural tangent kernel (NTK) machines, whose training dynamics is well-approximated by a linear weight expansion of the network at initialization. Standard training, however, diverges from its linearization in ways that are poorly understood. We study the relationship between the training dynamics of nonlinear deep networks, the geometry of the loss landscape, and the time evolution of a data-dependent NTK. We do so through a large-scale phenomenological analysis of training, synthesizing diverse measures characterizing loss landscape geometry and NTK dynamics. In multiple neural architectures and datasets, we find these diverse measures evolve in a highly correlated manner, revealing a universal picture of the deep learning process. In this picture, deep network training exhibits a highly chaotic rapid initial transient that within 2 to 3 epochs determines the final linearly connected basin of low loss containing the end point of training. During this chaotic transient, the NTK changes rapidly, learning useful features from the training data that enables it to outperform the standard initial NTK by a factor of 3 in less than 3 to 4 epochs. After this rapid chaotic transient, the NTK changes at constant velocity, and its performance matches that of full network training in 15% to 45% of training time. Overall, our analysis reveals a striking correlation between a diverse set of metrics over training time, governed by a rapid chaotic to stable transition in the first few epochs, that together poses challenges and opportunities for the development of more accurate theories of deep learning.
 An official website of the United States government