skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: On Instance-Optimal Algorithms for a Generalization of Nuts and Bolts and Generalized Sorting
We generalize the classical nuts and bolts problem to a setting where the input is a collection of n nuts and m bolts, and there is no promise of any matching pairs. It is not allowed to compare a nut directly with a nut or a bolt directly with a bolt, and the goal is to perform the fewest nut-bolt comparisons to discover the partial order between the nuts and bolts. We term this problem bipartite sorting. We show that instances of bipartite sorting of the same size exhibit a wide range of complexity, and propose to perform a fine-grained analysis for this problem. We rule out straightforward notions of instance-optimality as being too stringent, and adopt a neighborhood-based definition. Our definition may be of independent interest as a unifying lens for instance-optimal algorithms for other static problems existing in literature. This includes problems like sorting (Estivill-Castro and Woods, ACM Comput. Surv. 1992), convex hull (Afshani, Barbay and Chan, JACM 2017), adaptive joins (Demaine, López-Ortiz and Munro, SODA 2000), and the recent concept of universal optimality for graphs (Haeupler, Hladík, Rozhoň, Tarjan and Tětek, 2023). As our main result on bipartite sorting, we give a randomized algorithm that is within a factor of O(log³(n+m)) of being instance-optimal w.h.p., with respect to the neighborhood-based definition. As our second contribution, we generalize bipartite sorting to DAG sorting, when the underlying DAG is not necessarily bipartite. As an unexpected consequence of a simple algorithm for DAG sorting, we rule out a potential lower bound on the widely-studied problem of sorting with priced information, posed by (Charikar, Fagin, Guruswami, Kleinberg, Raghavan and Sahai, STOC 2000). In this problem, comparing keys i and j has a known cost c_{ij} ∈ ℝ^+ ∪ {∞}, and the goal is to sort the keys in an instance-optimal way, by keeping the total cost of an algorithm as close as possible to ∑_{i=1}^{n-1} c_{x(i)x(i+1)}. Here x(1) < ⋯ < x(n) is the sorted order. While several special cases of cost functions have received a lot of attention in the community, no progress on the general version with arbitrary costs has been reported so far. One reason for this lack of progress seems to be a widely-cited Ω(n) lower bound on the competitive ratio for finding the maximum. This Ω(n) lower bound by (Gupta and Kumar, FOCS 2000) uses costs in {0,1,n, ∞}, and although not extended to sorting, this barrier seems to have stalled any progress on the general cost case. We rule out such a potential lower bound by showing the existence of an algorithm with a Õ(n^{3/4}) competitive ratio for the {0,1,n,∞} cost version. This generalizes the setting of generalized sorting proposed by (Huang, Kannan and Khanna, FOCS 2011), where the costs are either 1 or infinity, and the cost of the cheapest proof is always n-1.  more » « less
Award ID(s):
1910873
PAR ID:
10568355
Author(s) / Creator(s):
;
Editor(s):
Kumar, Amit; Ron-Zewi, Noga
Publisher / Repository:
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Date Published:
Volume:
317
ISSN:
1868-8969
ISBN:
978-3-95977-348-5
Page Range / eLocation ID:
317-317
Subject(s) / Keyword(s):
Sorting Priced Information Instance Optimality Nuts and Bolts Theory of computation → Abstract machines Theory of computation → Sorting and searching
Format(s):
Medium: X Size: 23 pages; 954206 bytes Other: application/pdf
Size(s):
23 pages 954206 bytes
Right(s):
Creative Commons Attribution 4.0 International license; info:eu-repo/semantics/openAccess
Sponsoring Org:
National Science Foundation
More Like this
  1. Guruswami, Venkatesan (Ed.)
    The problem of sorting with priced information was introduced by [Charikar, Fagin, Guruswami, Kleinberg, Raghavan, Sahai (CFGKRS), STOC 2000]. In this setting, different comparisons have different (potentially infinite) costs. The goal is to find a sorting algorithm with small competitive ratio, defined as the (worst-case) ratio of the algorithm’s cost to the cost of the cheapest proof of the sorted order. The simple case of bichromatic sorting posed by [CFGKRS] remains open: We are given two sets A and B of total size N, and the cost of an A-A comparison or a B-B comparison is higher than an A-B comparison. The goal is to sort A ∪ B. An Ω(log N) lower bound on competitive ratio follows from unit-cost sorting. Note that this is a generalization of the famous nuts and bolts problem, where A-A and B-B comparisons have infinite cost, and elements of A and B are guaranteed to alternate in the final sorted order. In this paper we give a randomized algorithm InversionSort with an almost-optimal w.h.p. competitive ratio of O(log³ N). This is the first algorithm for bichromatic sorting with a o(N) competitive ratio. 
    more » « less
  2. In the unit-cost comparison model, a black box takes an input two items and outputs the result of the comparison. Problems like sorting and searching have been studied in this model, and it has been general- ized to include the concept of priced information, where different pairs of items (say database records) have different comparison costs. These comparison costs can be arbitrary (in which case no algorithm can be close to optimal (Charikar et al. STOC 2000)), structured (for exam- ple, the comparison cost may depend on the length of the databases (Gupta et al. FOCS 2001)), or stochastic (Angelov et al. LATIN 2008). Motivated by the database setting where the cost depends on the sizes of the items, we consider the problems of sorting and batched predecessor where two non-uniform sets of items A and B are given as input. (1) In the RAM setting, we consider the scenario where both sets have n keys each. The cost to compare two items in A is a, to compare an item of A to an item of B is b, and to compare two items in B is c. We give upper and lower bounds for the case a ≤ b ≤ c, the case that serves as a warmup for the generalization to the external-memory model. Notice that the case b = 1,a = c = ∞ is the famous “nuts and bolts” problem. ) In the Disk-Access Model (DAM), where transferring elements between disk and internal memory is the main bottleneck, we con- sider the scenario where elements in B are larger than elements in A. The larger items take more I/Os to be brought into memory, consume more space in internal memory, and are required in their entirety for comparisons. A key observation is that the complexity of sorting depends heavily on the interleaving of the small and large items in the final sorted order. If all large elements come after all small elements in the final sorted order, sorting each type separately and concatenating is optimal. However, if the set of predecessors of B in A has size k ≪ n, one must solve an associated batched predecessor problem in order to achieve optimality. We first give output-sensitive lower and upper bounds on the batched predecessor problem, and use these to derive bounds on the complexity of sorting in the two models. Our bounds are tight in most cases, and require novel generalizations of the classical lower bound techniques in external memory to accommodate the non-uniformity of keys. 
    more » « less
  3. In several emerging technologies for computer memory (main memory), the cost of reading is significantly cheaper than the cost of writing. Such asymmetry in memory costs poses a fundamentally different model from the RAM for algorithm design. In this paper we study lower and upper bounds for various problems under such asymmetric read and write costs. We consider both the case in which all but O(1) memory has asymmetric cost, and the case of a small cache of symmetric memory. We model both cases using the (M,w)-ARAM, in which there is a small (symmetric) memory of size M and a large unbounded (asymmetric) memory, both random access, and where reading from the large memory has unit cost, but writing has cost w >> 1. For FFT and sorting networks we show a lower bound cost of Omega(w*n*log_{w*M}(n)), which indicates that it is not possible to achieve asymptotic improvements with cheaper reads when w is bounded by a polynomial in M. Moreover, there is an asymptotic gap (of min(w,log(n)/log(w*M)) between the cost of sorting networks and comparison sorting in the model. This contrasts with the RAM, and most other models, in which the asymptotic costs are the same. We also show a lower bound for computations on an n*n diamond DAG of Omega(w*n^2/M) cost, which indicates no asymptotic improvement is achievable with fast reads. However, we show that for the minimum edit distance problem (and related problems), which would seem to be a diamond DAG, we can beat this lower bound with an algorithm with only O(w*n^2/(M*min(w^{1/3},M^{1/2}))) cost. To achieve this we make use of a "path sketch" technique that is forbidden in a strict DAG computation. Finally, we show several interesting upper bounds for shortest path problems, minimum spanning trees, and other problems. A common theme in many of the upper bounds is that they require redundant computation and a tradeoff between reads and writes. 
    more » « less
  4. The fundamental problems of sorting and searching, traditionally studied in the unit-cost comparison model, have been generalized to include priced information, where different pairs of items have different comparison costs. These costs can be arbitrary (Charikar et al. STOC 2000), structured (Gupta et al. FOCS 2001), or stochastic (Angelov et al. LATIN 2008). Motivated by the database setting where the comparison cost depends on the sizes of the records, we consider the problems of sorting and batched predecessor where two non-uniform sets of items A and B are given as input. In the RAM model, pairwise comparisons (A-A, A-B and B-B) have respective comparison costs a, b and c. We give upper and lower bounds for the case a<= b <= c, which serves as a warmup for the generalization to the external-memory model. In the Disk-Access Model (DAM), where transferring elements between disk and RAM is the main bottleneck, we consider the scenario where elements in B are larger than elements in A. All items are required in their entirety for comparisons in RAM. A key observation is that the complexity of sorting depends on the interleaving of the small and large items in the final sorted order, and with a high degree of interleaving, the lower bound is dominated by an associated batched predecessor problem. We give output-sensitive bounds on the batched predecessor and sorting; our bounds are tight in most cases. Our lower bounds require novel generalizations of lower bound techniques in external memory to accommodate non-uniform keys. 
    more » « less
  5. null (Ed.)
    Abstract Let $$K$$ be an algebraically closed field of prime characteristic $$p$$ , let $$X$$ be a semiabelian variety defined over a finite subfield of $$K$$ , let $$\unicode[STIX]{x1D6F7}:X\longrightarrow X$$ be a regular self-map defined over $$K$$ , let $$V\subset X$$ be a subvariety defined over $$K$$ , and let $$\unicode[STIX]{x1D6FC}\in X(K)$$ . The dynamical Mordell–Lang conjecture in characteristic $$p$$ predicts that the set $$S=\{n\in \mathbb{N}:\unicode[STIX]{x1D6F7}^{n}(\unicode[STIX]{x1D6FC})\in V\}$$ is a union of finitely many arithmetic progressions, along with finitely many $$p$$ -sets, which are sets of the form $$\{\sum _{i=1}^{m}c_{i}p^{k_{i}n_{i}}:n_{i}\in \mathbb{N}\}$$ for some $$m\in \mathbb{N}$$ , some rational numbers $$c_{i}$$ and some non-negative integers $$k_{i}$$ . We prove that this conjecture is equivalent with some difficult diophantine problem in characteristic 0. In the case $$X$$ is an algebraic torus, we can prove the conjecture in two cases: either when $$\dim (V)\leqslant 2$$ , or when no iterate of $$\unicode[STIX]{x1D6F7}$$ is a group endomorphism which induces the action of a power of the Frobenius on a positive dimensional algebraic subgroup of $$X$$ . We end by proving that Vojta’s conjecture implies the dynamical Mordell–Lang conjecture for tori with no restriction. 
    more » « less