NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Improving Flow Matching by Aligning Flow Divergence

Huang, Yuhao; Transue, Taos; Wang, Shih-Hsin; Feldman, William M; Zhang, Hong; Wang, Bao (May 2025, ICML, https://openreview.net/forum?id=FeZimuj6SG)

Free, publicly-accessible full text available May 1, 2026
A Theoretically-Principled Sparse, Connected, and Rigid Graph Representation of Molecules

Wang, Shih-Hsin; Huang, Yuhao; Baker, Justin M; Sun, Yuan-En; Tang, Qi; Wang, Bao (January 2025, ICLR, https://openreview.net/forum?id=OIvg3MqWX2¬eId=UlvPDZvECD)

Free, publicly-accessible full text available January 22, 2026
Implicit Graph Neural Networks: A Monotone Operator Viewpoint

Justin Baker, Qingsong Wang (July 2023, International Conference on Machine Learning)

Full Text Available
Learning Proper Orthogonal Decomposition of Complex Dynamics Using Heavy-ball Neural ODEs

https://doi.org/10.1007/s10915-023-02176-8

Baker, Justin; Cherkaev, Elena; Narayan, Akil; Wang, Bao (May 2023, Journal of Scientific Computing)

Full Text Available
Accelerated Sparse Recovery via Gradient Descent with Nonlinear Conjugate Gradient Momentum

https://doi.org/10.1007/s10915-023-02148-y

Hu, Mengqi; Lou, Yifei; Wang, Bao; Yan, Ming; Yang, Xiu; Ye, Qiang (April 2023, Journal of Scientific Computing)

Full Text Available
Improving Deep Neural Networks’ Training for Image Classification With Nonlinear Conjugate Gradient-Style Adaptive Momentum

https://doi.org/10.1109/TNNLS.2023.3255783

Wang, Bao; Ye, Qiang (March 2023, IEEE Transactions on Neural Networks and Learning Systems)

Full Text Available
A deterministic gradient-based approach to avoid saddle points

https://doi.org/10.1017/S0956792522000316

Kreusser, L. M.; Osher, S. J.; Wang, B. (December 2022, European Journal of Applied Mathematics)

Abstract Loss functions with a large number of saddle points are one of the major obstacles for training modern machine learning (ML) models efficiently. First-order methods such as gradient descent (GD) are usually the methods of choice for training ML models. However, these methods converge to saddle points for certain choices of initial guesses. In this paper, we propose a modification of the recently proposed Laplacian smoothing gradient descent (LSGD) [Osher et al., arXiv:1806.06317 ], called modified LSGD (mLSGD), and demonstrate its potential to avoid saddle points without sacrificing the convergence rate. Our analysis is based on the attraction region, formed by all starting points for which the considered numerical scheme converges to a saddle point. We investigate the attraction region’s dimension both analytically and numerically. For a canonical class of quadratic functions, we show that the dimension of the attraction region for mLSGD is $$\lfloor (n-1)/2\rfloor$$ , and hence it is significantly smaller than that of GD whose dimension is $n-1$ .
more » « less
Full Text Available
How does momentum benefit deep neural networks architecture design? A few case studies

https://doi.org/10.1007/s40687-022-00352-0

Wang, Bao; Xia, Hedi; Nguyen, Tan; Osher, Stanley (September 2022, Research in the Mathematical Sciences)

Full Text Available
Efficient and Reliable Overlay Networks for Decentralized Federated Learning

https://doi.org/10.1137/21M1465081

Hua, Yifan; Miller, Kevin; Bertozzi, Andrea L.; Qian, Chen; Wang, Bao (August 2022, SIAM Journal on Applied Mathematics)

Full Text Available

Search for: All records