skip to main content


This content will become publicly available on April 30, 2024

Title: PARROT: Position-Aware Regularized Optimal Transport for Network Alignment
Network alignment is a critical steppingstone behind a variety of multi-network mining tasks. Most of the existing methods essentially optimize a Frobenius-like distance or ranking-based loss, ignoring the underlying geometry of graph data. Optimal transport (OT), together with Wasserstein distance, has emerged to be a powerful approach accounting for the underlying geometry explicitly. Promising as it might be, the state-of-the-art OT-based alignment methods suffer from two fundamental limitations, including (1) effectiveness due to the insufficient use of topology and consistency information and (2) scalability due to the non-convex formulation and repeated computationally costly loss calculation. In this paper, we propose a position-aware regularized optimal transport framework for network alignment named PARROT. To tackle the effectiveness issue, the proposed PARROT captures topology information by random walk with restart, with three carefully designed consistency regularization terms. To tackle the scalability issue, the regularized OT problem is decomposed into a series of convex subproblems and can be efficiently solved by the proposed constrained proximal point method with guaranteed convergence. Extensive experiments show that our algorithm achieves significant improvements in both effectiveness and scalability, outperforming the state-of-the-art network alignment methods and speeding up existing OT-based methods by up to 100 times.  more » « less
Award ID(s):
2134079 1939725 1947135
NSF-PAR ID:
10428938
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
WWW '23: Proceedings of the ACM Web Conference 2023
Page Range / eLocation ID:
372 to 382
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    We consider network topology identification subject to a signal smoothness prior on the nodal observations. A fast dual-based proximal gradient algorithm is developed to efficiently tackle a strongly convex, smoothness-regularized network inverse problem known to yield high-quality graph solutions. Unlike existing solvers, the novel iterations come with global convergence rate guarantees and do not require additional step-size tuning. Reproducible simulated tests demonstrate the effectiveness of the proposed method in accurately recovering random and real-world graphs, markedly faster than state-of-the-art alternatives and without incurring an extra computational burden. 
    more » « less
  2. Optimal transport (OT) measures distances between distributions in a way that depends on the geometry of the sample space. In light of recent advances in computational OT, OT distances are widely used as loss functions in machine learning. Despite their prevalence and advantages, OT loss functions can be extremely sensitive to outliers. In fact, a single adversarially-picked outlier can increase the standard W2-distance arbitrarily. To address this issue, we propose an outlier-robust formulation of OT. Our formulation is convex but challenging to scale at a first glance. Our main contribution is deriving an \emph{equivalent} formulation based on cost truncation that is easy to incorporate into modern algorithms for computational OT. We demonstrate the benefits of our formulation in mean estimation problems under the Huber contamination model in simulations and outlier detection tasks on real data. 
    more » « less
  3. null (Ed.)
    Network alignment finds node correspondences across multiple networks, where the alignment accuracy is of crucial importance because of its profound impact on downstream applications. The vast majority of existing works focus on how to best utilize the topology and attribute information of the input networks as well as the anchor links when available. Nonetheless, it has not been well studied on how to boost the alignment performance through actively obtaining high-quality and informative anchor links, with a few exceptions. The sparse literature on active network alignment introduces the human in the loop to label some seed node correspondence (i.e., anchor links), which are informative from the perspective of querying the most uncertain node given few potential matchings. However, the direct influence of the intrinsic network attribute information on the alignment results has largely remained unknown. In this paper, we tackle this challenge and propose an active network alignment method (Attent) to identify the best nodes to query. The key idea of the proposed method is to leverage effective and efficient influence functions defined over the alignment solution to evaluate the goodness of the candidate nodes for query. Our proposed query strategy bears three distinct advantages, including (1) effectiveness, being able to accurately quantify the influence of the candidate nodes on the alignment results; (2) efficiency, scaling linearly with 15 − 17× speed-up over the straightforward implementation without any quality loss; (3) generality, consistently improving alignment performance of a variety of network alignment algorithms. 
    more » « less
  4. Despite the recent popularity of neural network-based solvers for optimal transport (OT), there is no standard quantitative way to evaluate their performance. In this paper, we address this issue for quadratic-cost transport---specifically, computation of the Wasserstein-2 distance, a commonly-used formulation of optimal transport in machine learning. To overcome the challenge of computing ground truth transport maps between continuous measures needed to assess these solvers, we use input-convex neural networks (ICNN) to construct pairs of measures whose ground truth OT maps can be obtained analytically. This strategy yields pairs of continuous benchmark measures in high-dimensional spaces such as spaces of images. We thoroughly evaluate existing optimal transport solvers using these benchmark measures. Even though these solvers perform well in downstream tasks, many do not faithfully recover optimal transport maps. To investigate the cause of this discrepancy, we further test the solvers in a setting of image generation. Our study reveals crucial limitations of existing solvers and shows that increased OT accuracy does not necessarily correlate to better results downstream. 
    more » « less
  5. Wasserstein Barycenter is a principled approach to represent the weighted mean of a given set of probability distributions, utilizing the geometry induced by optimal transport. In this work, we present a novel scalable algorithm to approximate the Wasserstein Barycenters aiming at highdimensional applications in machine learning. Our proposed algorithm is based on the Kantorovich dual formulation of the Wasserstein-2 distance as well as a recent neural network architecture, input convex neural network, that is known to parametrize convex functions. The distinguishing features of our method are: i) it only requires samples from the marginal distributions; ii) unlike the existing approaches, it represents the Barycenter with a generative model and can thus generate infinite samples from the barycenter without querying the marginal distributions; iii) it works similar to Generative Adversarial Model in one marginal case. We demonstrate the efficacy of our algorithm by comparing it with the state-of-art methods in multiple experiments. 
    more » « less