In bipartite matching problems, vertices on one side of a bipartite
graph are paired with those on the other. In its online
variant, one side of the graph is available offline, while the
vertices on the other side arrive online. When a vertex arrives,
an irrevocable and immediate decision should be made
by the algorithm; either match it to an available vertex or
drop it. Examples of such problems include matching workers
to firms, advertisers to keywords, organs to patients, and
so on. Much of the literature focuses on maximizing the total
relevance—modeled via total weight—of the matching.
However, in many realworld problems, it is also important
to consider contributions of diversity: hiring a diverse pool of
candidates, displaying a relevant but diverse set of ads, and so
on. In this paper, we propose the Online Submodular Bipartite
Matching (OSBM) problem, where the goal is to maximize
a submodular function f over the set of matched edges.
This objective is general enough to capture the notion of both
diversity (e.g., a weighted coverage function) and relevance
(e.g., the traditional linear function)—as well as many other
natural objective functions occurring in practice (e.g., limited
total budget in advertising settings). We propose novel algorithms
that have provable guarantees and are essentially optimal
when restricted to various special cases. We also run
experiments on realworld and synthetic datasets to validate
our algorithms.
more »
« less
(Optimal) Online Bipartite Matching with Degree Information
We propose a model for online graph problems where algorithms are given access
to an oracle that predicts (e.g., based on modeling assumptions or on past data) the
degrees of nodes in the graph. Within this model, we study the classic problem
of online bipartite matching, and a natural greedy matching algorithm called
MinPredictedDegree, which uses predictions of the degrees of offline nodes. For
the bipartite version of a stochastic graph model due to Chung, Lu, and Vu where the
expected values of the offline degrees are known and used as predictions, we show
that MinPredictedDegree stochastically dominates any other online algorithm, i.e.,
it is optimal for graphs drawn from this model. Since the “symmetric” version of
the model, where all online nodes are identical, is a special case of the wellstudied
“known i.i.d. model”, it follows that the competitive ratio of MinPredictedDegree on
such inputs is at least 0.7299.
For the special case of graphs with power law degree distributions, we show that
MinPredictedDegree frequently produces matchings almost as large as the true
maximum matching on such graphs. We complement these results with an
extensive empirical evaluation showing that MinPredictedDegree compares favorably
to stateoftheart online algorithms for online matching.
more »
« less
 Award ID(s):
 2022448
 NSFPAR ID:
 10430250
 Date Published:
 Journal Name:
 Conference on Neural Information Processing Systems
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this


We consider the problem of designing sublinear time algorithms for estimating the cost of minimum] metric traveling salesman (TSP) tour. Specifically, given access to a n × n distance matrix D that specifies pairwise distances between n points, the goal is to estimate the TSP cost by performing only sublinear (in the size of D) queries. For the closely related problem of estimating the weight of a metric minimum spanning tree (MST), it is known that for any epsilon > 0, there exists an O^~(n/epsilon^O(1))time algorithm that returns a (1+epsilon)approximate estimate of the MST cost. This result immediately implies an O^~(n/epsilon^O(1)) time algorithm to estimate the TSP cost to within a (2 + epsilon) factor for any epsilon > 0. However, no o(n^2)time algorithms are known to approximate metric TSP to a factor that is strictly better than 2. On the other hand, there were also no known barriers that rule out existence of (1 + epsilon)approximate estimation algorithms for metric TSP with O^~ (n) time for any fixed epsilon > 0. In this paper, we make progress on both algorithms and lower bounds for estimating metric TSP cost. On the algorithmic side, we first consider the graphic TSP problem where the metric D corresponds to shortest path distances in a connected unweighted undirected graph. We show that there exists an O^~(n) time algorithm that estimates the cost of graphic TSP to within a factor of (2 − epsilon_0) for some epsilon_0 > 0. This is the first sublinear cost estimation algorithm for graphic TSP that achieves an approximation factor less than 2. We also consider another wellstudied special case of metric TSP, namely, (1, 2)TSP where all distances are either 1 or 2, and give an O^~(n ^ 1.5) time algorithm to estimate optimal cost to within a factor of 1.625. Our estimation algorithms for graphic TSP as well as for (1, 2)TSP naturally lend themselves to O^~(n) space streaming algorithms that give an 11/6approximation for graphic TSP and a 1.625approximation for (1, 2)TSP. These results motivate the natural question if analogously to metric MST, for any epsilon > 0, (1 + epsilon)approximate estimates can be obtained for graphic TSP and (1, 2)TSP using O^~ (n) queries. We answer this question in the negative – there exists an epsilon_0 > 0, such that any algorithm that estimates the cost of graphic TSP ((1, 2)TSP) to within a (1 + epsilon_0)factor, necessarily requires (n^2) queries. This lower bound result highlights a sharp separation between the metric MST and metric TSP problems. Similarly to many classical approximation algorithms for TSP, our sublinear time estimation algorithms utilize subroutines for estimating the size of a maximum matching in the underlying graph. We show that this is not merely an artifact of our approach, and that for any epsilon > 0, any algorithm that estimates the cost of graphic TSP or (1, 2)TSP to within a (1 + epsilon)factor, can also be used to estimate the size of a maximum matching in a bipartite graph to within an epsilon n additive error. This connection allows us to translate known lower bounds for matching size estimation in various models to similar lower bounds for metric TSP cost estimation.more » « less

Efficient allocation of tasks to workers is a central problem in crowdsourcing. In this paper, we consider a special setting inspired by spatial crowdsourcing platforms where both workers and tasks arrive dynamically. Additionally, we assume all tasks are heterogeneous and each workertask assignment brings a known reward. The natural challenge lies in how to incorporate the uncertainty in the arrivals from both workers and tasks into our online allocation policy such that the total expected reward is maximized. To formulate this, we assume the arrival patterns of worker “types” and task “types” can be predicted from historical data. Specifically, we consider a finite time horizon T and assume that in each timestep, a single worker and task are sampled (i.e., “arrive”) from two respective distributions independently, and that this sampling process repeats identically and independently for the entire T online timesteps. Our model, called Online Task Assignment with TwoSided Arrival (OTATSA), is a significant generalization of the classical online task assignment where the set of tasks is assumed to be available offline. For the general version of OTATSA, we present an optimal nonadaptive algorithm which achieves an online competitive ratio of 0.295. For the special case of OTATSA where the reward is a function of just the worker type, we present an improved algorithm (which is adaptive) and achieves a competitive ratio of at least 0.343. On the hardness side, along with showing that the ratio obtained by our nonadaptive algorithm is the best possible among all nonadaptive algorithms, we further show that no (adaptive) algorithm can achieve a ratio better than 0.581 (unconditionally), even for the special case of OTATSA with homogenous tasks (i.e., all rewards are the same). At the heart of our analysis lies a new technical tool (which is a refined notion of the birthdeath process), called the twostage birthdeath process, which may be of independent interest. Finally, we perform numerical experiments on two realworld datasets obtained from crowdsourcing platforms to complement our theoretical results.more » « less

Online bipartite matching and allocation models are widely used to analyze and design markets such as Internet advertising, online labor, and crowdsourcing. Traditionally, vertices on one side of the market are fixed and known a priori, while vertices on the other side arrive online and are matched by a central agent to the offline side. The issue of possible conflicts among offline agents emerges in various real scenarios when we need to match each online agent with a set of offline agents. For example, in eventbased social networks (e.g., Meetup), offline events conflict for some users since they will be unable to attend mutuallydistant events at proximate times; in advertising markets, two competing firms may prefer not to be shown to one user simultaneously; and in online recommendation systems (e.g., Amazon Books), books of the same type “conflict” with each other in some sense due to the diversity requirement for each online buyer. The conflict nature inherent among certain offline agents raises significant challenges in both modeling and online algorithm design. In this paper, we propose a unifying model, generalizing the conflict models proposed in (She et al., TKDE 2016) and (Chen et al., TKDE 16). Our model can capture not only a broad class of conflict constraints on the offline side (which is even allowed to be sensitive to each online agent), but also allows a general arrival pattern for the online side (which is allowed to change over the online phase). We propose an efficient linear programming (LP) based online algorithm and prove theoretically that it has nearlyoptimal online performance. Additionally, we propose two LPbased heuristics and test them against two natural baselines on both real and synthetic datasets. Our LPbased heuristics experimentally dominate the baseline algorithms, aligning with our theoretical predictions and supporting our unified approach.more » « less

Bipartite matching markets pair agents on one side of a market with agents, items, or contracts on the opposing side. Prior work addresses online bipartite matching markets, where agents arrive over time and are dynamically matched to a known set of disposable resources. In this paper, we propose a new model, Online Matching with (offline) Reusable Resources under Known Adversarial Distributions (OMRRKAD), in which resources on the offline side are reusable instead of disposable; that is, once matched, resources become available again at some point in the future. We show that our model is tractable by presenting an LPbased adaptive algorithm that achieves an online competitive ratio of 1/2  epsilon for any given epsilon > 0. We also show that no nonadaptive algorithm can achieve a ratio of 1/2 + o(1) based on the same benchmark LP. Through a datadriven analysis on a massive openlyavailable dataset, we show our model is robust enough to capture the application of taxi dispatching services and ridesharing systems. We also present heuristics that perform well in practice.more » « less