skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Balancing Relevance and Diversity in Online Bipartite Matching via Submodularity
In bipartite matching problems, vertices on one side of a bipartite graph are paired with those on the other. In its online variant, one side of the graph is available offline, while the vertices on the other side arrive online. When a vertex arrives, an irrevocable and immediate decision should be made by the algorithm; either match it to an available vertex or drop it. Examples of such problems include matching workers to firms, advertisers to keywords, organs to patients, and so on. Much of the literature focuses on maximizing the total relevance—modeled via total weight—of the matching. However, in many real-world problems, it is also important to consider contributions of diversity: hiring a diverse pool of candidates, displaying a relevant but diverse set of ads, and so on. In this paper, we propose the Online Submodular Bipartite Matching (OSBM) problem, where the goal is to maximize a submodular function f over the set of matched edges. This objective is general enough to capture the notion of both diversity (e.g., a weighted coverage function) and relevance (e.g., the traditional linear function)—as well as many other natural objective functions occurring in practice (e.g., limited total budget in advertising settings). We propose novel algorithms that have provable guarantees and are essentially optimal when restricted to various special cases. We also run experiments on real-world and synthetic datasets to validate our algorithms.  more » « less
Award ID(s):
1749864
PAR ID:
10111461
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
AAAI Conference on Artificial Intelligence (AAAI)
Page Range / eLocation ID:
1877-1884
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Bipartite b-matching, where agents on one side of a market are matched to one or more agents or items on the other, is a classical model that is used in myriad application areas such as healthcare, advertising, education, and general resource allocation. Traditionally, the primary goal of such models is to maximize a linear function of the constituent matches (e.g., linear social welfare maximization) subject to some constraints. Recent work has studied a new goal of balancing whole-match diversity and economic efficiency, where the objective is instead a monotone submodular function over the matching. Basic versions of this problem are solvable in polynomial time. In this work, we prove that the problem of simultaneously maximizing diversity along several features (e.g., country of citizenship, gender, skills) is NP-hard. To address this problem, we develop the first combinatorial algorithm that constructs provably-optimal diverse b-matchings in pseudo-polynomial time. We also provide a Mixed-Integer Quadratic formulation for the same problem and show that our method guarantees optimal solutions and takes less computation time for a reviewer assignment application. The source code is made available at https://github.com/faezahmed/diverse_matching. 
    more » « less
  2. Butterflies are the smallest non-trivial subgraph in bipartite graphs, and therefore having efficient computations for analyzing them is crucial to improving the quality of certain applications on bipartite graphs. In this paper, we design a framework called ParButterfly that contains new parallel algorithms for the following problems on processing butterflies: global counting, per-vertex counting, per-edge counting, tip decomposition (vertex peeling), and wing decomposition (edge peeling). The main component of these algorithms is aggregating wedges incident on subsets of vertices, and our framework supports different methods for wedge aggregation, including sorting, hashing, histogramming, and batching. In addition, ParButterfly supports different ways of ranking the vertices to speed up counting, including side ordering, approximate and exact degree ordering, and approximate and exact complement coreness ordering. For counting, ParButterfly also supports both exact computation as well as approximate computation via graph sparsification. We prove strong theoretical guarantees on the work and span of the algorithms in ParButterfly. We perform a comprehensive evaluation of all of the algorithms in ParButterfly on a collection of real-world bipartite graphs using a 48-core machine. Our counting algorithms obtain significant parallel speedup, outperforming the fastest sequential algorithms by up to 13.6× with a self-relative speedup of up to 38.5×. Compared to general subgraph counting solutions, we are orders of magnitude faster. Our peeling algorithms achieve self-relative speedups of up to 10.7× and outperform the fastest sequential baseline by up to several orders of magnitude. 
    more » « less
  3. We propose a model for online graph problems where algorithms are given access to an oracle that predicts (e.g., based on modeling assumptions or on past data) the degrees of nodes in the graph. Within this model, we study the classic problem of online bipartite matching, and a natural greedy matching algorithm called MinPredictedDegree, which uses predictions of the degrees of offline nodes. For the bipartite version of a stochastic graph model due to Chung, Lu, and Vu where the expected values of the offline degrees are known and used as predictions, we show that MinPredictedDegree stochastically dominates any other online algorithm, i.e., it is optimal for graphs drawn from this model. Since the “symmetric” version of the model, where all online nodes are identical, is a special case of the well-studied “known i.i.d. model”, it follows that the competitive ratio of MinPredictedDegree on such inputs is at least 0.7299. For the special case of graphs with power law degree distributions, we show that MinPredictedDegree frequently produces matchings almost as large as the true maximum matching on such graphs. We complement these results with an extensive empirical evaluation showing that MinPredictedDegree compares favorably to state-of-the-art online algorithms for online matching. 
    more » « less
  4. Online bipartite matching and allocation models are widely used to analyze and design markets such as Internet advertising, online labor, and crowdsourcing. Traditionally, vertices on one side of the market are fixed and known a priori, while vertices on the other side arrive online and are matched by a central agent to the offline side. The issue of possible conflicts among offline agents emerges in various real scenarios when we need to match each online agent with a set of offline agents. For example, in event-based social networks (e.g., Meetup), offline events conflict for some users since they will be unable to attend mutually-distant events at proximate times; in advertising markets, two competing firms may prefer not to be shown to one user simultaneously; and in online recommendation systems (e.g., Amazon Books), books of the same type “conflict” with each other in some sense due to the diversity requirement for each online buyer. The conflict nature inherent among certain offline agents raises significant challenges in both modeling and online algorithm design. In this paper, we propose a unifying model, generalizing the conflict models proposed in (She et al., TKDE 2016) and (Chen et al., TKDE 16). Our model can capture not only a broad class of conflict constraints on the offline side (which is even allowed to be sensitive to each online agent), but also allows a general arrival pattern for the online side (which is allowed to change over the online phase). We propose an efficient linear programming (LP) based online algorithm and prove theoretically that it has nearly-optimal online performance. Additionally, we propose two LP-based heuristics and test them against two natural baselines on both real and synthetic datasets. Our LP-based heuristics experimentally dominate the baseline algorithms, aligning with our theoretical predictions and supporting our unified approach. 
    more » « less
  5. The maximum bipartite matching problem is among the most fundamental and well-studied problems in combinatorial optimization. A beautiful and celebrated combinatorial algorithm of Hopcroft and Karp [26] shows that maximum bipartite matching can be solved in O(m√n) time on a graph with n vertices and m edges. For the case of very dense graphs, a different approach based on fast matrix multiplication was subsequently developed [27, 39], that achieves a running time of O(n2.371). For the next several decades, these results represented the fastest known algorithms for the problem until in 2013, a ground-breaking work of Madry [36] gave a significantly faster algorithm for sparse graphs. Subsequently, a sequence of works developed increasingly faster algorithms for solving maximum bipartite matching, and more generally directed maximum flow, culminating in a spectacular recent breakthrough [9] that gives an m1+o(1) time algorithm for maximum bipartite matching (and more generally, for min cost flows). These more recent developments collectively represented a departure from earlier combinatorial approaches: they all utilized continuous techniques based on interior-point methods for solving linear programs. This raises a natural question: are continuous techniques essential to obtaining fast algorithms for the bipartite matching problem? Our work makes progress on this question by presenting a new, purely combinatorial algorithm for bipartite matching, that, on moderately dense graphs outperforms both Hopcroft- Karp and the fast matrix multiplication based algorithms. Similar to the classical algorithms for bipartite matching, our approach is based on iteratively augmenting a current matching using augmenting paths in the (directed) residual flow network. A common method for designing fast algorithms for directed flow problems is via the multiplicative weights update (MWU) framework, that effectively reduces the flow problem to decremental single-source shortest paths (SSSP) in directed graphs. Our main observation is that a slight modification of this reduction results in a special case of SSSP that appears significantly easier than general decremental directed SSSP. Our main technical contribution is an efficient algorithm for this special case of SSSP, that outperforms the current state of the art algorithms for general decremental SSSP with adaptive adversary, leading to a deterministic algorithm for bipartite matching, whose running time is Õ(m1/3n5/3). This new algorithm thus starts to outperform the Hopcroft-Karp algorithm in graphs with m = ω(n7/4), and it also outperforms the fast matrix multiplication based algorithms on dense graphs. We believe that this framework for obtaining faster combinatorial algorithms for bipartite matching by exploiting the special properties of the resulting decremental SSSP instances is one of the main conceptual contributions of our work that we hope paves the way for even faster combinatorial algorithms for bipartite matching. Finally, using a standard reduction from the maximum vertex-capacitated s-t flow problem in directed graphs to maximum bipartite matching, we also obtain an O(m1/3n5/3) time deterministic algorithm for maximum vertex-capacitated s-t flow when all vertex capacities are identical. 
    more » « less