Title: A Parallel Method for Earth Mover’s Distance
We propose a new algorithm to approximate the Earth Mover's Distance (EMD). Our main idea is motivated by the theory of optimal transport, in which the EMD can be reformulated as a familiar L1-type minimization. We use a regularization that gives a unique solution to this L1-type problem. The regularized minimization closely resembles problems that have been solved in the fields of compressed sensing and image processing, where several fast methods are available. In this paper, we adopt a primal-dual algorithm developed in that context, which uses very simple updates at each iteration and is shown to converge very rapidly. Several numerical examples are provided.
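As a rough illustration of the approach, here is a minimal one-dimensional numpy sketch of a primal-dual (PDHG-style) iteration for the L1-type problem above: minimize the total flux ||m||_1 subject to the mass-balance constraint div(m) + rho1 - rho0 = 0. The function name, grid setup, and step sizes are assumptions for illustration, not the paper's exact implementation.

```python
import numpy as np

def emd_1d_primal_dual(rho0, rho1, h=1.0, mu=0.1, tau=0.1, iters=5000):
    """Sketch of a PDHG iteration for: min ||m||_1 s.t. div(m) + rho1 - rho0 = 0.
    rho0, rho1: nonnegative histograms with equal total mass on n grid cells.
    Convergence requires roughly mu * tau * ||div||^2 < 1 (||div|| <= 2/h in 1D)."""
    n = rho0.size
    m = np.zeros(n - 1)       # flux, living on the n-1 cell interfaces
    phi = np.zeros(n)         # dual variable (Kantorovich potential)
    m_bar = m.copy()
    for _ in range(iters):
        # Dual ascent on the mass-balance constraint (zero-flux boundaries).
        div_m = np.diff(np.concatenate(([0.0], m_bar, [0.0]))) / h
        phi = phi + tau * (div_m + rho1 - rho0)
        # Primal step: move along grad(phi), then apply the L1 shrink operator.
        m_old = m
        v = m + mu * np.diff(phi) / h
        m = np.sign(v) * np.maximum(np.abs(v) - mu, 0.0)
        m_bar = 2.0 * m - m_old   # over-relaxation (theta = 1)
    return h * np.abs(m).sum()    # approximate EMD
```

For unit grid spacing, the exact one-dimensional EMD has the closed form `np.abs(np.cumsum(rho0 - rho1)).sum()`, which provides a convenient correctness check for the sketch.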
Award ID(s):
1700202
NSF-PAR ID:
10090356
Author(s) / Creator(s):
Date Published:
Journal Name:
Journal of Scientific Computing
Volume:
75
Issue:
1
ISSN:
0885-7474
Page Range / eLocation ID:
182-192
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. SUMMARY

    Within the iron metallogenic province of southeast Missouri, USA, there are several mines that contain not only economic iron resources (magnetite and/or hematite) but also rare earth elements, copper and gold. An area including three major deposits, Pea Ridge, Bourbon and Kratz Spring, was selected for detailed modelling of the upper-crustal magnetic susceptibility and density structures. For the study area, ground gravity and high-resolution airborne magnetic and gravity gradiometry data sets are available. An efficient and novel joint inversion algorithm for the simultaneous inversion of these multiple data sets is presented. The Gramian coupling constraint is used to correlate the reconstructed density and magnetic susceptibility models. The implementation relies on the structures of the sensitivity matrices and an efficient minimization algorithm to achieve significant reductions in memory requirements and computational costs. Consequently, it is feasible to use a laptop computer to invert multiple data sets, each containing thousands of data points, and to recover models of the study area, each including approximately one million model parameters. This is the first time that these multiple data sets have been simultaneously inverted for this area. The L1-norm stabilizer is used to provide compact and focused images of the ore deposits. For comparison, independent inversions of each data set are also discussed. In general, our results provide new insights into the concealed ore deposits in the Mesoproterozoic basement rocks of southeast Missouri. Both short- and long-wavelength anomalies exist in the recovered models; these provide a high-resolution image of the subsurface. The geometry and physical properties of the known deposits are recovered well. Additionally, some unknown concealed deposits are revealed; these could be economically valuable and should be considered in future geophysical and geological investigations.
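    As a small illustration, the Gramian coupling term used to correlate the two inverted models can be written as the determinant of the 2x2 Gram matrix of the model vectors. The sketch below is a hedged numpy rendering of that term; the function name and the flattened-vector discretization are assumptions, not the authors' code.

```python
import numpy as np

def gramian_coupling(m1, m2):
    """Determinant of the Gram matrix of two model vectors, e.g. the density
    and magnetic-susceptibility models stacked as 1-D arrays. The term is
    zero exactly when m1 and m2 are linearly dependent, so adding it to the
    joint objective drives the recovered models toward correlation."""
    g11 = m1 @ m1
    g12 = m1 @ m2
    g22 = m2 @ m2
    return g11 * g22 - g12 ** 2   # det [[g11, g12], [g12, g22]] >= 0
```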

     
  2. Unlike traditional clustering methods that deal with a single type of data, High-Order Co-Clustering (HOCC) aims to cluster multiple types of data simultaneously by utilizing the inter- and/or intra-type relationships across different data types. In existing HOCC methods, data points routinely enter the objective functions with squared residual errors; as a result, outlying data samples can dominate the objective functions, which may lead to incorrect clustering results. Moreover, existing methods usually suffer from soft clustering, where the probabilities of belonging to different groups can be very close. In this paper, we propose an L1-norm symmetric nonnegative matrix tri-factorization method to solve the HOCC problem. Due to the orthogonal constraints and the symmetric L1-norm formulation in our new objective, the conventional auxiliary function approach no longer works. We therefore derive the solution algorithm using the alternating direction method of multipliers (ADMM). Extensive experiments have been conducted on a real-world data set, in which promising empirical results, including less time consumption, a strictly orthogonal membership matrix, and lower local minima, demonstrate the effectiveness of our proposed method.
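    The derivation above is specific to the symmetric tri-factorization objective, but the ADMM machinery it invokes follows a standard template. Below is a hedged sketch of that template on a generic L1-regularized least-squares problem (not the paper's algorithm), showing the characteristic quadratic step, soft-thresholding step, and dual update.

```python
import numpy as np

def admm_l1(A, b, lam=0.1, rho=1.0, iters=200):
    """Standard ADMM template for: min_x 0.5 * ||A x - b||^2 + lam * ||x||_1,
    split into an x-update (quadratic), a z-update (L1 prox), and a dual update."""
    n = A.shape[1]
    x = np.zeros(n); z = np.zeros(n); u = np.zeros(n)
    AtA_rhoI = A.T @ A + rho * np.eye(n)
    Atb = A.T @ b
    for _ in range(iters):
        x = np.linalg.solve(AtA_rhoI, Atb + rho * (z - u))       # quadratic step
        w = x + u
        z = np.sign(w) * np.maximum(np.abs(w) - lam / rho, 0.0)  # soft threshold
        u = u + x - z                                            # dual ascent
    return z
```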

     
  3. There are many important high-dimensional function classes that have fast agnostic learning algorithms when strong assumptions on the distribution of examples can be made, such as Gaussianity or uniformity over the domain. But how can one be sufficiently confident that the data indeed satisfies the distributional assumption, so that one can trust the output quality of the agnostic learning algorithm? We propose a model by which to systematically study the design of tester-learner pairs (A, T), such that if the distribution on examples in the data passes the tester T, then one can safely trust the output of the agnostic learner A on the data. To demonstrate the power of the model, we apply it to the classical problem of agnostically learning halfspaces under the standard Gaussian distribution and present a tester-learner pair with a combined run-time of n^Õ(1/ε^4). This qualitatively matches that of the best known ordinary agnostic learning algorithms for this task. In contrast, finite-sample Gaussian distribution testers do not exist for the L1 and EMD distance measures. Previously it was known that halfspaces are well-approximated by low-degree polynomials relative to the Gaussian distribution. A key step in our analysis is showing that this is the case even relative to distributions whose low-degree moments approximately match those of a Gaussian. We also go beyond spherically symmetric distributions and give a tester-learner pair for halfspaces under the uniform distribution on {0,1}^n with a combined run-time of n^Õ(1/ε^4). This is achieved using polynomial approximation theory and the critical-index machinery of [Diakonikolas, Gopalan, Jaiswal, Servedio, and Viola 2009]. Can one design agnostic learning algorithms under distributional assumptions and count on future technical work to produce, as a matter of course, tester-learner pairs with similar run-time? Our answer is a resounding no: we show there exist well-studied settings that admit agnostic learning algorithms with run-time 2^Õ(√n), yet the combined run-time of any tester-learner pair must be as high as 2^Ω(n). On that account, the design of tester-learner pairs is a research direction in its own right, independent of standard agnostic learning. Specifically, our lower bounds apply to the problems of agnostically learning convex sets under the Gaussian distribution and monotone Boolean functions under the uniform distribution over {0,1}^n.
  4. The identification of interesting substructures within jets is an important tool for searching for new physics and probing the Standard Model at colliders. Many of these substructure tools have previously been shown to take the form of optimal transport problems, in particular the Energy Mover’s Distance (EMD). In this work, we show that the EMD is in fact the natural structure for comparing collider events, which accounts for its recent success in understanding event and jet substructure. We then present a Shape Hunting Algorithm using Parameterized Energy Reconstruction (SHAPER), which is a general framework for defining and computing shape-based observables. SHAPER generalizes N-jettiness from point clusters to any extended, parametrizable shape. This is accomplished by efficiently minimizing the EMD between events and parameterized manifolds of energy flows representing idealized shapes, implemented using the dual-potential Sinkhorn approximation of the Wasserstein metric. We show how the geometric language of observables as manifolds can be used to define novel observables with built-in infrared and collinear safety. We demonstrate the efficacy of the SHAPER framework by performing empirical jet substructure studies using several examples of new shape-based observables.
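    To make the Sinkhorn step concrete, here is a minimal numpy sketch of the entropy-regularized (Sinkhorn) approximation to the Wasserstein/EMD cost between two discrete distributions. The variable names and the fixed-iteration loop are assumptions for illustration; SHAPER itself uses a dual-potential GPU implementation rather than this plain matrix-scaling form.

```python
import numpy as np

def sinkhorn_cost(a, b, C, eps=0.05, iters=500):
    """Entropy-regularized approximation of the EMD between weight vectors
    a and b (each summing to 1) with ground-cost matrix C of shape (len(a), len(b)).
    Smaller eps tightens the approximation but slows convergence."""
    K = np.exp(-C / eps)                 # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)                # scale columns to match marginal b
        u = a / (K @ v)                  # scale rows to match marginal a
    P = u[:, None] * K * v[None, :]      # approximate transport plan
    return float((P * C).sum())          # approximate EMD
```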
  5. Recovering sparse conditional independence graphs from data is a fundamental problem in machine learning with wide applications. A popular formulation of the problem is L1-regularized maximum likelihood estimation, and many convex optimization algorithms have been designed to solve this formulation and recover the graph structure. Recently there has been a surge of interest in learning algorithms directly from data, which in this case means learning to map the empirical covariance to the sparse precision matrix. This is a challenging task, since the symmetric positive definiteness (SPD) and sparsity of the matrix are not easy to enforce in learned algorithms, and a direct mapping from data to the precision matrix may require many parameters. We propose a deep learning architecture, GLAD, which uses an Alternating Minimization (AM) algorithm as its model inductive bias and learns the model parameters via supervised learning. We show that GLAD learns a very compact and effective model for recovering sparse graphs from data.
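    For reference, the classical convex baseline that GLAD learns to improve upon, L1-regularized maximum likelihood (the graphical lasso), is available off the shelf. The snippet below sketches that baseline using scikit-learn; it is not GLAD itself, and the synthetic data is purely illustrative.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

# Purely illustrative synthetic data: 200 samples of 10 variables.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))

# L1-regularized maximum likelihood estimate of the precision matrix;
# alpha controls the sparsity of the recovered graph.
model = GraphicalLasso(alpha=0.1).fit(X)
Theta = model.precision_   # sparse precision (inverse covariance) estimate
print(np.round(Theta, 2))
```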