skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Learning Robust Distance Metric with Side Information via Ratio Minimization of Orthogonally Constrained L21-Norm Distances
Metric Learning, which aims at learning a distance metric for a given data set, plays an important role in measuring the distance or similarity between data objects. Due to its broad usefulness, it has attracted a lot of interest in machine learning and related areas in the past few decades. This paper proposes to learn the distance metric from the side information in the forms of must-links and cannot-links. Given the pairwise constraints, our goal is to learn a Mahalanobis distance that minimizes the ratio of the distances of the data pairs in the must-links to those in the cannot-links. Different from many existing papers that use the traditional squared L2-norm distance, we develop a robust model that is less sensitive to data noise or outliers by using the not-squared L2-norm distance. In our objective, the orthonormal constraint is enforced to avoid degenerate solutions. To solve our objective, we have derived an efficient iterative solution algorithm. We have conducted extensive experiments, which demonstrated the superiority of our method over state-of-the-art.  more » « less
Award ID(s):
1652943 1849359
PAR ID:
10129597
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
Page Range / eLocation ID:
3008 to 3014
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Linear discriminant analysis (LDA) is widely used for dimensionality reduction under supervised learning settings. Traditional LDA objective aims to minimize the ratio of squared Euclidean distances that may not perform optimally on noisy data sets. Multiple robust LDA objectives have been proposed to address this problem, but their implementations have two major limitations. One is that their mean calculations use the squared l2-norm distance to center the data, which is not valid when the objective does not use the Euclidean distance. The second problem is that there is no generalized optimization algorithm to solve different robust LDA objectives. In addition, most existing algorithms can only guarantee the solution to be locally optimal, rather than globally optimal. In this paper, we review multiple robust loss functions and propose a new and generalized robust objective for LDA. Besides, to better remove the mean value within data, our objective uses an optimal way to center the data through learning. As one important algorithmic contribution, we derive an efficient iterative algorithm to optimize the resulting non-smooth and non-convex objective function. We theoretically prove that our solution algorithm guarantees that both the objective and the solution sequences converge to globally optimal solutions at a sub-linear convergence rate. The experimental results demonstrate the effectiveness of our new method, achieving significant improvements compared to the other competing methods. 
    more » « less
  2. Registering functions (curves) using time warpings (re-parameterizations) is central to many computer vision and shape analysis solutions. While traditional registration methods minimize penalized-L2 norm, the elastic Riemannian metric and square-root velocity functions (SRVFs) have resulted in significant improvements in terms of theory and practical performance. This solution uses the dynamic programming algorithm to minimize the L2 norm between SRVFs of given functions. However, the computational cost of this elastic dynamic programming framework – O(nT 2 k) – where T is the number of time samples along curves, n is the number of curves, and k < T is a parameter – limits its use in applications involving big data. This paper introduces a deep-learning approach, named SRVF Registration Net or SrvfRegNet to overcome these limitations. SrvfRegNet architecture trains by optimizing the elastic metric-based objective function on the training data and then applies this trained network to the test data to perform fast registration. In case the training and the test data are from different classes, it generalizes to the test data using transfer learning, i.e., retraining of only the last few layers of the network. It achieves the state-of-the-art alignment performance albeit at much reduced computational cost. We demonstrate the efficiency and efficacy of this framework using several standard curve datasets. 
    more » « less
  3. Learning distances that operate directly on multidimensional sequences is challenging because such distances are structural by nature and the vectors in sequences are not independent. Generally, distances for sequences heavily depend on the ground metric between the vectors in sequences. We propose to learn the distance for sequences through learning a ground Mahalanobis metric for the vectors in sequences. The learning samples are sequences of vectors for which how the ground metric between vectors induces the overall distance is given, and the objective is that the distance induced by the learned ground metric produces large values for sequences from different classes and small values for those from the same class. We formulate the metric as a parameter of the distance, bring closer each sequence to an associated virtual sequence w.r.t. the distance to reduce the number of constraints, and develop a general iterative solution for any ground-metric-based sequence distance. Experiments on several sequence datasets demonstrate the effectiveness and efficiency of our method. 
    more » « less
  4. Given a set of facilities and clients, and costs to open facilities, the classic facility location problem seeks to open a set of facilities and assign each client to one open facility to minimize the cost of opening the chosen facilities and the total distance of the clients to their assigned open facilities. Such an objective may induce an unequal cost over certain socioeconomic groups of clients (i.e., total distance traveled by clients in such a group). This is important when planning the location of socially relevant facilities such as emergency rooms. In this work, we consider a fair version of the problem where we are given 𝑟 clients groups that partition the set of clients, and the distance of a given group is defined as the average distance of clients in the group to their respective open facilities. The objective is to minimize the Minkowski 𝑝-norm of vector of group distances, to penalize high access costs to open facilities across 𝑟 groups of clients. This generalizes classic facility location (𝑝 = 1) and the minimization of the maximum group distance (𝑝 = ∞). However, in practice, fairness criteria may not be explicit or even known to a decision maker, and it is often unclear how to select a specific "𝑝" to model the cost of unfairness. To get around this, we study the notion of solution portfolios where for a fixed problem instance, we seek a small portfolio of solutions such that for any Minkowski norm 𝑝, one of these solutions is an 𝑂(1)-approximation. Using the geometric relationship between various 𝑝-norms, we show the existence of a portfolio of cardinality 𝑂(log 𝑟), and a lower bound of (\sqrt{log r}). There may not be common structure across different solutions in this portfolio, which can make planning difficult if the notion of fairness changes over time or if the budget to open facilities is disbursed over time. For example, small changes in 𝑝 could lead to a completely different set of open facilities in the portfolio. Inspired by this, we introduce the notion of refinement, which is a family of solutions for each 𝑝-norm satisfying a combinatorial property. This property requires that (1) the set of facilities open for a higher 𝑝-norm must be a subset of the facilities open for a lower 𝑝-norm, and (2) all clients assigned to an open facility for a lower 𝑝-norm must be assigned to the same open facility for any higher 𝑝-norm. A refinement is 𝛼-approximate if the solution for each 𝑝-norm problem is an 𝛼-approximation for it. We show that it is sufficient to consider only 𝑂(log 𝑟) norms instead of all 𝑝-norms, 𝑝 ∈ [1, ∞] to construct refinements. A natural greedy algorithm for the problem gives a poly(𝑟)-approximate refinement, which we improve to poly(r^1/\sqrt{log 𝑟})-approximate using a recursive algorithm. We improve this ratio to 𝑂(log 𝑟) for the special case of tree metric for uniform facility open cost. Our recursive algorithm extends to other settings, including to a hierarchical facility location problem that models facility location problems at several levels, such as public works departments and schools. A full version of this paper can be found at https://arxiv.org/abs/2211.14873. 
    more » « less
  5. null (Ed.)
    The 2-Wasserstein distance (or RMS distance) is a useful measure of similarity between probability distributions with exciting applications in machine learning. For discrete distributions, the problem of computing this distance can be expressed in terms of finding a minimum-cost perfect matching on a complete bipartite graph given by two multisets of points A, B ⊂ ℝ2, with |A| = |B| = n, where the ground distance between any two points is the squared Euclidean distance between them. Although there is a near-linear time relative ∊-approximation algorithm for the case where the ground distance is Euclidean (Sharathkumar and Agarwal, JACM 2020), all existing relative ∊-approximation algorithms for the RMS distance take Ω(n3/2) time. This is primarily because, unlike Euclidean distance, squared Euclidean distance is not a metric. In this paper, for the RMS distance, we present a new ∊-approximation algorithm that runs in O(n^5/4 poly{log n, 1/∊}) time. Our algorithm is inspired by a recent approach for finding a minimum-cost perfect matching in bipartite planar graphs (Asathulla et al, TALG 2020). Their algorithm depends heavily on the existence of sublinear sized vertex separators as well as shortest path data structures that require planarity. Surprisingly, we are able to design a similar algorithm for a complete geometric graph that is far from planar and does not have any vertex separators. Central components of our algorithm include a quadtree-based distance that approximates the squared Euclidean distance and a data structure that supports both Hungarian search and augmentation in sublinear time. 
    more » « less