Metric magnitude is a measure of the “size” of point clouds with many desirable geometric properties. It has been adapted to various mathematical contexts and recent work suggests that it can enhance machine learning and optimization algorithms. But its usability is limited due to the computational cost when the dataset is large or when the computation must be carried out repeatedly (e.g. in model training). In this paper, we study the magnitude computation problem, and show efficient ways of approximating it. We show that it can be cast as a convex optimization problem, but not as a submodular optimization. The paper describes two new algorithms – an iterative approximation algorithm that converges fast and is accurate, and a subset selection method that makes the computation even faster. It has been previously proposed that magnitude of model sequences generated during stochastic gradient descent is correlated to generalization gap. Extension of this result using our more scalable algorithms shows that longer sequences in fact bear higher correlations. We also describe new applications of magnitude in machine learning – as an effective regularizer for neural network training, and as a novel clustering criterion. 
                        more » 
                        « less   
                    This content will become publicly available on April 11, 2026
                            
                            Approximating Metric Magnitude of Point Sets
                        
                    
    
            Metric magnitude of a point cloud is a measure of its ``size. It has been adapted to various mathematical contexts and recent work suggests that it can enhance machine learning and optimization algorithms. But its usability is limited due to the computational cost when the dataset is large or when the computation must be carried out repeatedly (e.g. in model training). In this paper, we study the magnitude computation problem, and show efficient ways of approximating it. We show that it can be cast as a convex optimization problem, but not as a submodular optimization. The paper describes two new algorithms -- an iterative approximation algorithm that converges fast and is accurate in practice, and a subset selection method that makes the computation even faster. It has previously been proposed that the magnitude of model sequences generated during stochastic gradient descent is correlated to the generalization gap. Extension of this result using our more scalable algorithms shows that longer sequences bear higher correlations. We also describe new applications of magnitude in machine learning -- as an effective regularizer for neural network training, and as a novel clustering criterion. 
        more » 
        « less   
        
    
    
                            - PAR ID:
- 10608498
- Publisher / Repository:
- AAAI Press
- Date Published:
- Journal Name:
- Proceedings of the AAAI Conference on Artificial Intelligence
- Volume:
- 39
- Issue:
- 15
- ISSN:
- 2159-5399
- Page Range / eLocation ID:
- 15374 to 15381
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- 
            
- 
            The ubiquitous use of machine learning algorithms brings new challenges to traditional database problems such as incremental view update. Much effort is being put in better understanding and debugging machine learning models, as well as in identifying and repairing errors in training datasets. Our focus is on how to assist these activities when they have to retrain the machine learning model after removing problematic training samples in cleaning or selecting different subsets of training data for interpretability. This paper presents an efficient provenance-based approach, PrIU, and its optimized version, PrIU-opt, for incrementally updating model parameters without sacrificing prediction accuracy. We prove the correctness and convergence of the incrementally updated model parameters, and validate it experimentally. Experimental results show that up to two orders of magnitude speed-ups can be achieved by PrIU-opt compared to simply retraining the model from scratch, yet obtaining highly similar models.more » « less
- 
            Binary classification is a fundamental machine learning task defined as correctly assigning new objects to one of two groups based on a set of training objects. Driven by the practical importance of binary classification, numerous machine learning techniques have been developed and refined over the last three decades. Among the most popular techniques are artificial neural networks, decision trees, ensemble methods, logistic regression, and support vector machines. We present here machine learning and pattern recognition algorithms that, unlike the commonly used techniques, are based on combinatorial optimization and make use of information on pairwise relations between the objects of the data set, whether training objects or not. These algorithms solve the respective problems optimally and efficiently, in contrast to the primarily heuristic approaches currently used for intractable problem models in pattern recognition and machine learning. The algorithms described solve efficiently the classification problem as a network flow problem on a graph. The technical tools used in the algorithm are the parametric cut procedure and a process called sparse computation that computes only the pairwise similarities that are “relevant.” Sparse computation enables the scalability of any algorithm that uses pairwise similarities. We present evidence on the effectiveness of the approaches, measured in terms of accuracy and running time, in pattern recognition, image segmentation, and general data mining.more » « less
- 
            Sparse decision tree optimization has been one of the most fundamental problems in AI since its inception and is a challenge at the core of interpretable machine learning. Sparse decision tree optimization is computationally hard, and despite steady effort since the 1960's, breakthroughs have been made on the problem only within the past few years, primarily on the problem of finding optimal sparse decision trees. However, current state-of-the-art algorithms often require impractical amounts of computation time and memory to find optimal or near-optimal trees for some real-world datasets, particularly those having several continuous-valued features. Given that the search spaces of these decision tree optimization problems are massive, can we practically hope to find a sparse decision tree that competes in accuracy with a black box machine learning model? We address this problem via smart guessing strategies that can be applied to any optimal branch-and-bound-based decision tree algorithm. The guesses come from knowledge gleaned from black box models. We show that by using these guesses, we can reduce the run time by multiple orders of magnitude while providing bounds on how far the resulting trees can deviate from the black box's accuracy and expressive power. Our approach enables guesses about how to bin continuous features, the size of the tree, and lower bounds on the error for the optimal decision tree. Our experiments show that in many cases we can rapidly construct sparse decision trees that match the accuracy of black box models. To summarize: when you are having trouble optimizing, just guess.more » « less
- 
            Hybrid electric vehicles can achieve better fuel economy than conventional vehicles by utilizing multiple power sources. While these power sources have been controlled by rule-based or optimization-based control algorithms, recent studies have shown that machine learning-based control algorithms such as online Deep Reinforcement Learning (DRL) can effectively control the power sources as well. However, the optimization and training processes for the online DRL-based powertrain control strategy can be very time and resource intensive. In this paper, a new offline–online hybrid DRL strategy is presented where offline vehicle data are exploited to build an initial model and an online learning algorithm explores a new control policy to further improve the fuel economy. In this manner, it is expected that the agent can learn an environment consisting of the vehicle dynamics in a given driving condition more quickly compared to the online algorithms, which learn the optimal control policy by interacting with the vehicle model from zero initial knowledge. By incorporating a priori offline knowledge, the simulation results show that the proposed approach not only accelerates the learning process and makes the learning process more stable, but also leads to a better fuel economy compared to online only learning algorithms.more » « less
 An official website of the United States government
An official website of the United States government 
				
			 
					 
					
