NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Recent and Upcoming Developments in Randomized Numerical Linear Algebra for Machine Learning

https://doi.org/10.1145/3637528.3671461

Dereziński, Michał; Mahoney, Michael W (August 2024, 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD ’24))

Large matrices arise in many machine learning and data analysis applications, including as representations of datasets, graphs, model weights, and first and second-order derivatives. Randomized Numerical Linear Algebra (RandNLA) is an area which uses randomness to develop improved algorithms for ubiquitous matrix problems. The area has reached a certain level of maturity; but recent hardware trends, efforts to incorporate RandNLA algorithms into core numerical libraries, and advances in machine learning, statistics, and random matrix theory, have lead to new theoretical and practical challenges. This article provides a self-contained overview of RandNLA, in light of these developments.
more » « less
Full Text Available
Equation Discovery with Bayesian Spike-and-Slab Priors and Efficient Kernels

Long, Da; Xing, Wei; Krishnapriyan, Aditi S; Kirby, Robert M; Zhe, Shandian; Mahoney, Michael W (March 2024, Proceedings of The 27th International Conference on Artificial Intelligence and Statistics (AISTATS))

Full Text Available
Learning continuous models for continuous physics

https://doi.org/10.1038/s42005-023-01433-4

Krishnapriyan, Aditi S; Queiruga, Alejandro F; Erichson, N Benjamin; Mahoney, Michael W (December 2023, Communications Physics)

Dynamical systems that evolve continuously over time are ubiquitous throughout science and engineering. Machine learning (ML) provides data-driven approaches to model and predict the dynamics of such systems. A core issue with this approach is that ML models are typically trained on discrete data, using ML methodologies that are not aware of underlying continuity properties. This results in models that often do not capture any underlying continuous dynamics—either of the system of interest, or indeed of any related system. To address this challenge, we develop a convergence test based on numerical analysis theory. Our test verifies whether a model has learned a function that accurately approximates an underlying continuous dynamics. Models that fail this test fail to capture relevant dynamics, rendering them of limited utility for many scientific prediction tasks; while models that pass this test enable both better interpolation and better extrapolation in multiple ways. Our results illustrate how principled numerical analysis methods can be coupled with existing ML training/testing methodologies to validate models for science and engineering applications.
more » « less
Full Text Available
Reliable edge machine learning hardware for scientific applications

https://doi.org/10.1109/VTS60656.2024.10538639

Baldi, Tommaso; Campos, Javier; Hawks, Ben; Ngadiuba, Jennifer; Tran, Nhan; Diaz, Daniel; Duarte, Javier; Kastner, Ryan; Meza, Andres; Quinnan, Melissa; et al (April 2024, IEEE)

Full Text Available
Test Accuracy vs. Generalization Gap: Model Selection in NLP without Accessing Training or Testing Data

https://doi.org/10.1145/3580305.3599518

Yang, Yaoqing; Theisen, Ryan; Hodgkinson, Liam; Gonzalez, Joseph E; Ramchandran, Kannan; Martin, Charles H; Mahoney, Michael W (August 2023, ACM)

Selecting suitable architecture parameters and training hyperparameters is essential for enhancing machine learning (ML) model performance. Several recent empirical studies conduct large-scale correlational analysis on neural networks (NNs) to search for effective generalization metrics that can guide this type of model selection. Effective metrics are typically expected to correlate strongly with test performance. In this paper, we expand on prior analyses by examining generalization-metric-based model selection with the following objectives: (i) focusing on natural language processing (NLP) tasks, as prior work primarily concentrates on computer vision (CV) tasks; (ii) considering metrics that directly predict test error instead of the generalization gap; (iii) exploring metrics that do not need access to data to compute. From these objectives, we are able to provide the first model selection results on large pretrained Transformers from Huggingface using generalization metrics. Our analyses consider (I) hundreds of Transformers trained in different settings, in which we systematically vary the amount of data, the model size and the optimization hyperparameters, (II) a total of 51 pretrained Transformers from eight families of Huggingface NLP models, including GPT2, BERT, etc., and (III) a total of 28 existing and novel generalization metrics. Despite their niche status, we find that metrics derived from the heavy-tail (HT) perspective are particularly useful in NLP tasks, exhibiting stronger correlations than other, more popular metrics. To further examine these metrics, we extend prior formulations relying on power law (PL) spectral distributions to exponential (EXP) and exponentially-truncated power law (E-TPL) families.
more » « less
Full Text Available
Rapid Fitting of Band-Excitation Piezoresponse Force Microscopy Using Physics Constrained Unsupervised Neural Networks

Kaliyev, Alibek T; Forelli, Ryan F; Qin, Shuyu; Guo, Yichen; Memik, Seda; Mahoney, Michael W; Gholami, Amir; Tran, Nhan; Harris, Philip; Takáč, Martin; et al (October 2023, NeurIPS)

Full Text Available
Bootstrapping the operator norm in high dimensions: Error estimation for covariance matrices and sketching

https://doi.org/10.3150/22-BEJ1463

Lopes, Miles E.; Erichson, N. Benjamin; Mahoney, Michael W. (February 2023, Bernoulli)

Full Text Available
Flow-Based Algorithms for Improving Clusters: A Unifying Framework, Software, and Performance

https://doi.org/10.1137/20m1333055

Fountoulakis, Kimon; Liu, Meng; Gleich, David F.; Mahoney, Michael W. (February 2023, SIAM Review)

Full Text Available
Inexact Newton-CG algorithms with complexity guarantees

https://doi.org/10.1093/imanum/drac043

Yao, Zhewei; Xu, Peng; Roosta, Fred; Wright, Stephen J; Mahoney, Michael W (August 2022, IMA Journal of Numerical Analysis)

Abstract We consider variants of a recently developed Newton-CG algorithm for nonconvex problems (Royer, C. W. & Wright, S. J. (2018) Complexity analysis of second-order line-search algorithms for smooth nonconvex optimization. SIAM J. Optim., 28, 1448–1477) in which inexact estimates of the gradient and the Hessian information are used for various steps. Under certain conditions on the inexactness measures, we derive iteration complexity bounds for achieving $$\epsilon $$-approximate second-order optimality that match best-known lower bounds. Our inexactness condition on the gradient is adaptive, allowing for crude accuracy in regions with large gradients. We describe two variants of our approach, one in which the step size along the computed search direction is chosen adaptively, and another in which the step size is pre-defined. To obtain second-order optimality, our algorithms will make use of a negative curvature direction on some steps. These directions can be obtained, with high probability, using the randomized Lanczos algorithm. In this sense, all of our results hold with high probability over the run of the algorithm. We evaluate the performance of our proposed algorithms empirically on several machine learning models. Our approach is a first attempt to introduce inexact Hessian and/or gradient information into the Newton-CG algorithm of Royer & Wright (2018, Complexity analysis of second-order line-search algorithms for smooth nonconvex optimization. SIAM J. Optim., 28, 1448–1477).
more » « less
Full Text Available
Asymptotic Analysis of Sampling Estimators for Randomized Numerical Linear Algebra Algorithms

Ma, Ping; Chen, Yongkai; Zhang, Xinlian; Xing, Xin; Ma, Jingyi; Mahoney, Michael W (June 2022, Journal of machine learning research)

Full Text Available

« Prev Next »

Search for: All records