skip to main content

Title: Generalized Stochastic Matching
In this paper, we generalize the recently studied stochastic matching problem to more accurately model a significant medical process, kidney exchange, and several other applications. Up until now the stochastic matching problem that has been studied was as follows: given a graph G= (V,E), each edge is included in the realized sub-graph of G independently with probability pe, and the goal is to find a degree-bounded sub-graph Q of G that has an expected maximum matching that approximates the expected maximum matching of G. This model does not account for possibilities of vertex dropouts, which can be found in several applications, e.g. in kidney exchange when donors or patients opt out of the exchange process as well as in online freelancing and online dating when online profiles are found to be faked. Thus, we will study a more generalized model of stochastic matching in which vertices and edges are both realized independently with some probabilities pv, pe, respectively, which more accurately fits important applications than the previously studied model. We will discuss the first algorithms and analysis for this generalization of the stochastic matching model and prove that they achieve good approximation ratios. In particular, we show that the approximation factor of a natural algorithm for this problem is at least 0.6568 in unweighted graphs, and 1/2+ε in weighted graphs for some constant ε >0. We further improve our result for unweighted graphs to 2/3 using edge degree constrained sub-graphs (EDCS).  more » « less
Award ID(s):
2218678 2114269
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Proceedings of the AAAI Conference on Artificial Intelligence
Page Range / eLocation ID:
10008 to 10015
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We study the stochastic vertex cover problem. In this problem, G = (V, E) is an arbitrary known graph, and G⋆ is an unknown random subgraph of G where each edge e is realized independently with probability p. Edges of G⋆ can only be verified using edge queries. The goal in this problem is to find a minimum vertex cover of G⋆ using a small number of queries. Our main result is designing an algorithm that returns a vertex cover of G⋆ with size at most (3/2+є) times the expected size of the minimum vertex cover, using only O(n/є p) non-adaptive queries. This improves over the best-known 2-approximation algorithm by Behnezhad, Blum and Derakhshan [SODA’22] who also show that Ω(n/p) queries are necessary to achieve any constant approximation. Our guarantees also extend to instances where edge realizations are not fully independent. We complement this upperbound with a tight 3/2-approximation lower bound for stochastic graphs whose edges realizations demonstrate mild correlations. 
    more » « less
  2. We propose a model for online graph problems where algorithms are given access to an oracle that predicts (e.g., based on modeling assumptions or on past data) the degrees of nodes in the graph. Within this model, we study the classic problem of online bipartite matching, and a natural greedy matching algorithm called MinPredictedDegree, which uses predictions of the degrees of offline nodes. For the bipartite version of a stochastic graph model due to Chung, Lu, and Vu where the expected values of the offline degrees are known and used as predictions, we show that MinPredictedDegree stochastically dominates any other online algorithm, i.e., it is optimal for graphs drawn from this model. Since the “symmetric” version of the model, where all online nodes are identical, is a special case of the well-studied “known i.i.d. model”, it follows that the competitive ratio of MinPredictedDegree on such inputs is at least 0.7299. For the special case of graphs with power law degree distributions, we show that MinPredictedDegree frequently produces matchings almost as large as the true maximum matching on such graphs. We complement these results with an extensive empirical evaluation showing that MinPredictedDegree compares favorably to state-of-the-art online algorithms for online matching. 
    more » « less
  3. null (Ed.)
    Consider an algorithm performing a computation on a huge random object (for example a random graph or a "long" random walk). Is it necessary to generate the entire object prior to the computation, or is it possible to provide query access to the object and sample it incrementally "on-the-fly" (as requested by the algorithm)? Such an implementation should emulate the random object by answering queries in a manner consistent with an instance of the random object sampled from the true distribution (or close to it). This paradigm is useful when the algorithm is sub-linear and thus, sampling the entire object up front would ruin its efficiency. Our first set of results focus on undirected graphs with independent edge probabilities, i.e. each edge is chosen as an independent Bernoulli random variable. We provide a general implementation for this model under certain assumptions. Then, we use this to obtain the first efficient local implementations for the Erdös-Rényi G(n,p) model for all values of p, and the Stochastic Block model. As in previous local-access implementations for random graphs, we support Vertex-Pair and Next-Neighbor queries. In addition, we introduce a new Random-Neighbor query. Next, we give the first local-access implementation for All-Neighbors queries in the (sparse and directed) Kleinberg’s Small-World model. Our implementations require no pre-processing time, and answer each query using O(poly(log n)) time, random bits, and additional space. Next, we show how to implement random Catalan objects, specifically focusing on Dyck paths (balanced random walks on the integer line that are always non-negative). Here, we support Height queries to find the location of the walk, and First-Return queries to find the time when the walk returns to a specified location. This in turn can be used to implement Next-Neighbor queries on random rooted ordered trees, and Matching-Bracket queries on random well bracketed expressions (the Dyck language). Finally, we introduce two features to define a new model that: (1) allows multiple independent (and even simultaneous) instantiations of the same implementation, to be consistent with each other without the need for communication, (2) allows us to generate a richer class of random objects that do not have a succinct description. Specifically, we study uniformly random valid q-colorings of an input graph G with maximum degree Δ. This is in contrast to prior work in the area, where the relevant random objects are defined as a distribution with O(1) parameters (for example, n and p in the G(n,p) model). The distribution over valid colorings is instead specified via a "huge" input (the underlying graph G), that is far too large to be read by a sub-linear time algorithm. Instead, our implementation accesses G through local neighborhood probes, and is able to answer queries to the color of any given vertex in sub-linear time for q ≥ 9Δ, in a manner that is consistent with a specific random valid coloring of G. Furthermore, the implementation is memory-less, and can maintain consistency with non-communicating copies of itself. 
    more » « less
  4. In the Maximum Independent Set problem we are asked to find a set of pairwise nonadjacent vertices in a given graph with the maximum possible cardinality. In general graphs, this classical problem is known to be NP-hard and hard to approximate within a factor of n1−ε for any ε > 0. Due to this, investigating the complexity of Maximum Independent Set in various graph classes in hope of finding better tractability results is an active research direction. In H-free graphs, that is, graphs not containing a fixed graph H as an induced subgraph, the problem is known to remain NP-hard and APX-hard whenever H contains a cycle, a vertex of degree at least four, or two vertices of degree at least three in one connected component. For the remaining cases, where every component of H is a path or a subdivided claw, the complexity of Maximum Independent Set remains widely open, with only a handful of polynomial-time solvability results for small graphs H such as P5, P6, the claw, or the fork. We prove that for every such “possibly tractable” graph H there exists an algorithm that, given an H-free graph G and an accuracy parameter ε > 0, finds an independent set in G of cardinality within a factor of (1 – ε) of the optimum in time exponential in a polynomial of log | V(G) | and ε−1. That is, we show that for every graph H for which Maximum Independent Set is not known to be APX-hard in H-free graphs, the problem admits a quasi-polynomial time approximation scheme in this graph class. Our algorithm works also in the more general weighted setting, where the input graph is supplied with a weight function on vertices and we are maximizing the total weight of an independent set. 
    more » « less
  5. We consider the maximum matching problem in the semi-streaming model formalized by Feigenbaum, Kannan, McGregor, Suri, and Zhang that is inspired by giant graphs of today. As our main result, we give a two-pass (1/2 + 1/16)-approximation algorithm for triangle-free graphs and a two-pass (1/2 + 1/32)-approximation algorithm for general graphs; these improve the approximation ratios of 1/2 + 1/52 for bipartite graphs and 1/2 + 1/140 for general graphs by Konrad, Magniez, and Mathieu. In three passes, we are able to achieve approximation ratios of 1/2 + 1/10 for triangle-free graphs and 1/2 + 1/19.753 for general graphs. We also give a multi-pass algorithm where we bound the number of passes precisely—we give a (2/3 − ε)- approximation algorithm that uses 2/(3ε) passes for triangle-free graphs and 4/(3ε) passes for general graphs. Our algorithms are simple and combinatorial, use O(n log n) space, and (can be implemented to) have O(1) update time per edge. For general graphs, our multi-pass algorithm improves the best known deterministic algorithms in terms of the number of passes: * Ahn and Guha give a (2/3−ε)-approximation algorithm that uses O(log(1/ε)/ε2) passes, whereas our (2/3 − ε)-approximation algorithm uses 4/(3ε) passes; * they also give a (1 − ε)-approximation algorithm that uses O(log n · poly(1/ε)) passes, where n is the number of vertices of the input graph; although our algorithm is (2/3−ε)-approximation, our number of passes do not depend on n. Earlier multi-pass algorithms either have a large constant inside big-O notation for the number of passes or the constant cannot be determined due to the involved analysis, so our multi-pass algorithm should use much fewer passes for approximation ratios bounded slightly below 2/3. 
    more » « less