Title: Convex clustering with metric learning
The convex clustering formulation of Chi and Lange (2015) is revisited. While this formulation can be precisely and efficiently solved, it uses the standard Euclidean metric to measure the distance between the data points and their corresponding cluster centers and hence its performance deteriorates significantly in the presence of outlier features. To address this issue, this paper considers a formulation that combines convex clustering with metric learning. It is shown that: (1) for any given positive definite Mahalanobis distance metric, the problem of convex clustering can be precisely and efficiently solved using the Alternating Direction Method of Multipliers; (2) the problem of learning a positive definite Mahalanobis distance metric admits a closed-form solution; (3) an algorithm that alternates between convex clustering and metric learning can provide a significant performance boost over not only the original convex clustering formulation but also the recently proposed robust convex clustering formulation of Wang et al. (2017).
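For orientation, the two objectives the abstract contrasts can be written out as follows. The first line is the standard convex clustering objective of Chi and Lange (2015). The second is only a sketch of how a Mahalanobis metric could enter the data-fidelity term. The symbols $x_i$ (data points), $u_i$ (cluster centers), $w_{ij}$ (nonnegative weights), $\gamma$ (regularization parameter), and $M$ (positive definite metric matrix) are notation assumed here, and the paper's exact objective, including how it treats the fusion penalty, may differ.

```latex
% Convex clustering with the Euclidean metric (Chi and Lange, 2015)
\min_{u_1,\dots,u_n}\;
  \frac{1}{2}\sum_{i=1}^{n} \lVert x_i - u_i \rVert_2^2
  \;+\; \gamma \sum_{i<j} w_{ij}\, \lVert u_i - u_j \rVert

% Sketch (assumed form): Mahalanobis data-fidelity term with M positive definite
\min_{u_1,\dots,u_n}\;
  \frac{1}{2}\sum_{i=1}^{n} (x_i - u_i)^{\top} M \,(x_i - u_i)
  \;+\; \gamma \sum_{i<j} w_{ij}\, \lVert u_i - u_j \rVert
```

Setting $M = I$ in the second expression recovers the first, which is why a fixed Euclidean metric weights every feature equally, outlier features included.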
Award ID(s): 1719017
NSF-PAR ID: 10061871
Journal Name: Pattern recognition
Volume: 81
ISSN: 0031-3203
Page Range / eLocation ID: 575-584
Sponsoring Org: National Science Foundation
More Like this
  1. Many metric learning tasks, such as triplet learning, nearest neighbor retrieval, and visualization, are treated primarily as embedding tasks where the ultimate metric is some variant of the Euclidean distance (e.g., cosine or Mahalanobis), and the algorithm must learn to embed points into the pre-chosen space. Non-Euclidean geometries are often left unexplored, which we believe is due to a lack of tools for learning non-Euclidean measures of distance. Recent work has shown that Bregman divergences can be learned from data, opening a promising approach to learning asymmetric distances. We propose a new approach to learning arbitrary Bregman divergences in a differentiable manner via input convex neural networks and show that it overcomes significant limitations of previous works. We also demonstrate that our method more faithfully learns divergences over a set of both new and previously studied tasks, including asymmetric regression, ranking, and clustering. Our tests further extend to known asymmetric, but non-Bregman tasks, where our method still performs competitively despite misspecification, showing the general utility of our approach for asymmetric learning.
  2. The classical Monge–Kantorovich (MK) problem as originally posed is concerned with how best to move a pile of soil or rubble to an excavation or fill with the least amount of work relative to some cost function. When the cost is given by the square of the Euclidean distance, one can define a metric on densities called the Wasserstein distance. In this note, we formulate a natural matrix counterpart of the MK problem for positive-definite density matrices. We prove a number of results about this metric including showing that it can be formulated as a convex optimisation problem, strong duality, an analogue of the Poincaré–Wirtinger inequality and a Lax–Hopf–Oleinik–type result.
  3. Network alignment is a critical steppingstone behind a variety of multi-network mining tasks. Most of the existing methods essentially optimize a Frobenius-like distance or ranking-based loss, ignoring the underlying geometry of graph data. Optimal transport (OT), together with Wasserstein distance, has emerged as a powerful approach that accounts for the underlying geometry explicitly. Promising as they are, the state-of-the-art OT-based alignment methods suffer from two fundamental limitations: (1) limited effectiveness, due to the insufficient use of topology and consistency information, and (2) limited scalability, due to the non-convex formulation and repeated computationally costly loss calculation. In this paper, we propose a position-aware regularized optimal transport framework for network alignment named PARROT. To tackle the effectiveness issue, the proposed PARROT captures topology information by random walk with restart, with three carefully designed consistency regularization terms. To tackle the scalability issue, the regularized OT problem is decomposed into a series of convex subproblems and can be efficiently solved by the proposed constrained proximal point method with guaranteed convergence. Extensive experiments show that our algorithm achieves significant improvements in both effectiveness and scalability, outperforming the state-of-the-art network alignment methods and speeding up existing OT-based methods by up to 100 times.
  4. We propose a unified data-driven framework based on inverse optimal transport that can learn an adaptive, nonlinear interaction cost function from a noisy and incomplete empirical matching matrix and predict new matchings in various matching contexts. We emphasize that discrete optimal transport plays the role of a variational principle, which gives rise to an optimization-based framework for modeling the observed empirical matching data. Our formulation leads to a non-convex optimization problem which can be solved efficiently by an alternating optimization method. A key novel aspect of our formulation is the incorporation of marginal relaxation via regularized Wasserstein distance, significantly improving the robustness of the method in the face of noisy or missing empirical matching data. Our model falls into the category of prescriptive models, which not only predict potential future matchings but also explain what leads to empirical matchings and quantify the impact of changes in matching factors. The proposed approach has wide applicability, including predicting matchings in online dating, labor markets, college applications, and crowdsourcing. We back up our claims with numerical experiments on both synthetic data and real-world data sets.
  5. Agents with aberrant behavior are commonplace in today’s networks. There are fake profiles in social media, malicious websites on the internet, and fake news sources that are prolific in spreading misinformation. The distinguishing characteristic of networks with aberrant agents is that normal agents rarely link to aberrant ones. Based on this manifested behavior, we propose a directed Markov Random Field (MRF) formulation for detecting aberrant agents. The formulation balances two objectives: to have as few links as possible from normal to aberrant agents, as well as to deviate minimally from prior information (if given). The MRF formulation is solved optimally and efficiently. We compare the optimal solution for the MRF formulation to existing algorithms, including PageRank, TrustRank, and AntiTrustRank. To assess the performance of these algorithms, we present a variant of the modularity clustering metric that overcomes the known shortcomings of modularity in directed graphs. We show that this new metric has desirable properties and prove that optimizing it is NP-hard. In an empirical experiment with twenty-three different datasets, we demonstrate that the MRF method outperforms the other detection algorithms. 