skip to main content

Title: Learning to Match via Inverse Optimal Transport
We propose a unified data-driven framework based on inverse optimal transport that can learn adaptive, nonlinear interaction cost function from noisy and incomplete empirical matching matrix and predict new matching in various matching contexts. We emphasize that the discrete optimal transport plays the role of a variational principle which gives rise to an optimization based framework for modeling the observed empirical matching data. Our formulation leads to a non-convex optimization problem which can be solved efficiently by an alternating optimization method. A key novel aspect of our formulation is the incorporation of marginal relaxation via regularized Wasserstein distance, significantly improving the robustness of the method in the face of noisy or missing empirical matching data. Our model falls into the category of prescriptive models, which not only predict potential future matching, but is also able to explain what leads to empirical matching and quantifies the impact of changes in matching factors. The proposed approach has wide applicability including predicting matching in online dating, labor market, college application and crowdsourcing. We back up our claims with numerical experiments on both synthetic data and real world data sets.  more » « less
Award ID(s):
1745382 1620342 1818886
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Journal of machine learning research
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We consider the problem of learning the underlying structure of a general discrete pairwise Markov network. Existing approaches that rely on empirical risk minimization may perform poorly in settings with noisy or scarce data. To overcome these limitations, we propose a computationally efficient and robust learning method for this problem with near-optimal sample complexities. Our approach builds upon distributionally robust optimization (DRO) and maximum conditional log-likelihood. The proposed DRO estimator minimizes the worst-case risk over an ambiguity set of adversarial distributions within bounded transport cost or f-divergence of the empirical data distribution. We show that the primal minimax learning problem can be efficiently solved by leveraging sufficient statistics and greedy maximization in the ostensibly intractable dual formulation. Based on DRO’s approximation to Lipschitz and variance regularization, we derive near-optimal sample complexities matching existing results. Extensive empirical evidence with different corruption models corroborates the effectiveness of the proposed methods. 
    more » « less
  2. null (Ed.)
    In many machine learning applications, it is necessary to meaningfully aggregate, through alignment, different but related datasets. Optimal transport (OT)-based approaches pose alignment as a divergence minimization problem: the aim is to transform a source dataset to match a target dataset using the Wasserstein distance as a divergence measure. We introduce a hierarchical formulation of OT which leverages clustered structure in data to improve alignment in noisy, ambiguous, or multimodal settings. To solve this numerically, we propose a distributed ADMM algorithm that also exploits the Sinkhorn distance, thus it has an efficient computational complexity that scales quadratically with the size of the largest cluster. When the transformation between two datasets is unitary, we provide performance guarantees that describe when and how well aligned cluster correspondences can be recovered with our formulation, as well as provide worst-case dataset geometry for such a strategy. We apply this method to synthetic datasets that model data as mixtures of low-rank Gaussians and study the impact that different geometric properties of the data have on alignment. Next, we applied our approach to a neural decoding application where the goal is to predict movement directions and instantaneous velocities from populations of neurons in the macaque primary motor cortex. Our results demonstrate that when clustered structure exists in datasets, and is consistent across trials or time points, a hierarchical alignment strategy that leverages such structure can provide significant improvements in cross-domain alignment. 
    more » « less
  3. null (Ed.)
    Optimal Transport (OT) distances such as Wasserstein have been used in several areas such as GANs and domain adaptation. OT, however, is very sensitive to outliers (samples with large noise) in the data since in its objective function, every sample, including outliers, is weighed similarly due to the marginal constraints. To remedy this issue, robust formulations of OT with unbalanced marginal constraints have previously been proposed. However, employing these methods in deep learning problems such as GANs and domain adaptation is challenging due to the instability of their dual optimization solvers. In this paper, we resolve these issues by deriving a computationally-efficient dual form of the robust OT optimization that is amenable to modern deep learning applications. We demonstrate the effectiveness of our formulation in two applications of GANs and domain adaptation. Our approach can train state-of-the-art GAN models on noisy datasets corrupted with outlier distributions. In particular, our optimization computes weights for training samples reflecting how difficult it is for those samples to be generated in the model. In domain adaptation, our robust OT formulation leads to improved accuracy compared to the standard adversarial adaptation methods. 
    more » « less
  4. Identifying cause-effect relations among variables is a key step in the decision-making process. Whereas causal inference requires randomized experiments, researchers and policy makers are increasingly using observational studies to test causal hypotheses due to the wide availability of data and the infeasibility of experiments. The matching method is the most used technique to make causal inference from observational data. However, the pair assignment process in one-to-one matching creates uncertainty in the inference because of different choices made by the experimenter. Recently, discrete optimization models have been proposed to tackle such uncertainty; however, they produce 0-1 nonlinear problems and lack scalability. In this work, we investigate this emerging data science problem and develop a unique computational framework to solve the robust causal inference test instances from observational data with continuous outcomes. In the proposed framework, we first reformulate the nonlinear binary optimization problems as feasibility problems. By leveraging the structure of the feasibility formulation, we develop greedy schemes that are efficient in solving robust test problems. In many cases, the proposed algorithms achieve a globally optimal solution. We perform experiments on real-world data sets to demonstrate the effectiveness of the proposed algorithms and compare our results with the state-of-the-art solver. Our experiments show that the proposed algorithms significantly outperform the exact method in terms of computation time while achieving the same conclusion for causal tests. Both numerical experiments and complexity analysis demonstrate that the proposed algorithms ensure the scalability required for harnessing the power of big data in the decision-making process. Finally, the proposed framework not only facilitates robust decision making through big-data causal inference, but it can also be utilized in developing efficient algorithms for other nonlinear optimization problems such as quadratic assignment problems. History: Accepted by Ram Ramesh, Area Editor for Data Science and Machine Learning. Funding: This work was supported by the Division of Civil, Mechanical and Manufacturing Innovation of the National Science Foundation [Grant 2047094]. Supplemental Material: The online supplements are available at . 
    more » « less
  5. Rare earth elements (REE) are critical elements found in monazite, xenotime, and hydrated REE phosphates which typically form in hydrothermal mineral deposits. Accurate predictions of the solubility of these REE phosphates and the speciation of REE in aqueous fluids are both key to understanding the controls on the transport, fractionation, and deposition of REE in natural systems. Previous monazite and xenotime solubility experiments indicate the presence of large discrepancies between experimentally derived solubility constants versus calculated solubilities by combining different data sources for the thermodynamic properties of minerals and aqueous species at hydrothermal conditions. In this study, these discrepancies were resolved by using the program GEMSFITS to optimize the standard partial molal Gibbs energy of formation (ΔfG°298) of REE aqueous species (REE3+ and REE hydroxyl complexes) at 298.15 K and 1 bar while keeping the thermodynamic properties fixed for the REE phosphates. A comprehensive experimental database was compiled using solubility data available between 25 and 300 °C. The latter permits conducting thermodynamic parameter optimization of ΔfG°298 for REE aqueous species. Optimal matching of the rhabdophane solubility data between 25 and 100 °C requires modifying the ΔfG°298 values of REE3+ by 1–6 kJ/mol, whereas matching of the monazite solubility data between 100 and 300 °C requires modifying the ΔfG°298 values of both REE3+ and REEOH2+ by ∼ 2–10 kJ/mol and ∼ 15–31 kJ/mol, respectively. For xenotime, adjustments of ΔfG°298 values by 1–26 kJ/mol are only necessary for the REE3+ species. The optimizations indicate that the solubility of monazite in acidic solutions is controlled by the light (L)REE3+ species at <150 °C and the LREEOH2+ species at >150 °C, whereas the solubility of xenotime is controlled by the heavy (H)REE3+ species between 25 and 300 °C. Based on the optimization results, we conclude that the revised Helgeson-Kirkham-Flowers equation of state does not reliably predict the thermodynamic properties of REE3+, REEOH2+, and likely other REE hydroxyl species at hydrothermal conditions. We therefore provide an experimental database (ThermoExp_REE) as a basic framework for future updates, extensions with other ligands, and optimizations as new experimental REE data become available. The optimized thermodynamic properties of aqueous species and minerals are available open access to accurately predict the solubility of REE phosphates in fluid-rock systems. 
    more » « less