skip to main content


This content will become publicly available on April 30, 2024

Title: PARROT: Position-Aware Regularized Optimal Transport for Network Alignment
Network alignment is a critical steppingstone behind a variety of multi-network mining tasks. Most of the existing methods essentially optimize a Frobenius-like distance or ranking-based loss, ignoring the underlying geometry of graph data. Optimal transport (OT), together with Wasserstein distance, has emerged to be a powerful approach accounting for the underlying geometry explicitly. Promising as it might be, the state-of-the-art OT-based alignment methods suffer from two fundamental limitations, including (1) effectiveness due to the insufficient use of topology and consistency information and (2) scalability due to the non-convex formulation and repeated computationally costly loss calculation. In this paper, we propose a position-aware regularized optimal transport framework for network alignment named PARROT. To tackle the effectiveness issue, the proposed PARROT captures topology information by random walk with restart, with three carefully designed consistency regularization terms. To tackle the scalability issue, the regularized OT problem is decomposed into a series of convex subproblems and can be efficiently solved by the proposed constrained proximal point method with guaranteed convergence. Extensive experiments show that our algorithm achieves significant improvements in both effectiveness and scalability, outperforming the state-of-the-art network alignment methods and speeding up existing OT-based methods by up to 100 times.  more » « less
Award ID(s):
2134079 1939725 1947135
NSF-PAR ID:
10428938
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
WWW '23: Proceedings of the ACM Web Conference 2023
Page Range / eLocation ID:
372 to 382
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Optimal transport (OT) is a versatile framework for comparing probability measures, with many applications to statistics, machine learning and applied mathematics. However, OT distances suffer from computational and statistical scalability issues to high dimensions, which motivated the study of regularized OT methods like slicing, smoothing and entropic penalty. This work establishes a unified framework for deriving limit distributions of empirical regularized OT distances, semiparametric efficiency of the plug-in empirical estimator and bootstrap consistency. We apply the unified framework to provide a comprehensive statistical treatment of (i) average- and max-sliced $p$-Wasserstein distances, for which several gaps in existing literature are closed; (ii) smooth distances with compactly supported kernels, the analysis of which is motivated by computational considerations; and (iii) entropic OT, for which our method generalizes existing limit distribution results and establishes, for the first time, efficiency and bootstrap consistency. While our focus is on these three regularized OT distances as applications, the flexibility of the proposed framework renders it applicable to broad classes of functionals beyond these examples.

     
    more » « less
  2. null (Ed.)
    We consider network topology identification subject to a signal smoothness prior on the nodal observations. A fast dual-based proximal gradient algorithm is developed to efficiently tackle a strongly convex, smoothness-regularized network inverse problem known to yield high-quality graph solutions. Unlike existing solvers, the novel iterations come with global convergence rate guarantees and do not require additional step-size tuning. Reproducible simulated tests demonstrate the effectiveness of the proposed method in accurately recovering random and real-world graphs, markedly faster than state-of-the-art alternatives and without incurring an extra computational burden. 
    more » « less
  3. Optimal transport (OT) measures distances between distributions in a way that depends on the geometry of the sample space. In light of recent advances in computational OT, OT distances are widely used as loss functions in machine learning. Despite their prevalence and advantages, OT loss functions can be extremely sensitive to outliers. In fact, a single adversarially-picked outlier can increase the standard W2-distance arbitrarily. To address this issue, we propose an outlier-robust formulation of OT. Our formulation is convex but challenging to scale at a first glance. Our main contribution is deriving an \emph{equivalent} formulation based on cost truncation that is easy to incorporate into modern algorithms for computational OT. We demonstrate the benefits of our formulation in mean estimation problems under the Huber contamination model in simulations and outlier detection tasks on real data. 
    more » « less
  4. null (Ed.)
    Network alignment finds node correspondences across multiple networks, where the alignment accuracy is of crucial importance because of its profound impact on downstream applications. The vast majority of existing works focus on how to best utilize the topology and attribute information of the input networks as well as the anchor links when available. Nonetheless, it has not been well studied on how to boost the alignment performance through actively obtaining high-quality and informative anchor links, with a few exceptions. The sparse literature on active network alignment introduces the human in the loop to label some seed node correspondence (i.e., anchor links), which are informative from the perspective of querying the most uncertain node given few potential matchings. However, the direct influence of the intrinsic network attribute information on the alignment results has largely remained unknown. In this paper, we tackle this challenge and propose an active network alignment method (Attent) to identify the best nodes to query. The key idea of the proposed method is to leverage effective and efficient influence functions defined over the alignment solution to evaluate the goodness of the candidate nodes for query. Our proposed query strategy bears three distinct advantages, including (1) effectiveness, being able to accurately quantify the influence of the candidate nodes on the alignment results; (2) efficiency, scaling linearly with 15 − 17× speed-up over the straightforward implementation without any quality loss; (3) generality, consistently improving alignment performance of a variety of network alignment algorithms. 
    more » « less
  5. Despite the recent popularity of neural network-based solvers for optimal transport (OT), there is no standard quantitative way to evaluate their performance. In this paper, we address this issue for quadratic-cost transport---specifically, computation of the Wasserstein-2 distance, a commonly-used formulation of optimal transport in machine learning. To overcome the challenge of computing ground truth transport maps between continuous measures needed to assess these solvers, we use input-convex neural networks (ICNN) to construct pairs of measures whose ground truth OT maps can be obtained analytically. This strategy yields pairs of continuous benchmark measures in high-dimensional spaces such as spaces of images. We thoroughly evaluate existing optimal transport solvers using these benchmark measures. Even though these solvers perform well in downstream tasks, many do not faithfully recover optimal transport maps. To investigate the cause of this discrepancy, we further test the solvers in a setting of image generation. Our study reveals crucial limitations of existing solvers and shows that increased OT accuracy does not necessarily correlate to better results downstream. 
    more » « less