Title: Fast Algorithms for Geometric Consensuses
Let P be a set of n points in ℝ^d in general position. A median hyperplane (roughly) splits the point set P in half. The yolk of P is the ball of smallest radius intersecting all median hyperplanes of P. The egg of P is the ball of smallest radius intersecting all hyperplanes which contain exactly d points of P. We present exact algorithms for computing the yolk and the egg of a point set, both running in expected time O(n^(d-1) log n); this improves over existing algorithms by a polynomial factor. We also present algorithms for several related problems, such as computing the Tukey and center balls of a point set, among others.
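As a point of reference, a naive planar (d = 2) baseline can be spelled out directly from the definitions above. The sketch below enumerates candidate median lines through pairs of points (assuming a median line leaves at most ⌊n/2⌋ points strictly on each open side) and then numerically minimizes the largest center-to-line distance with SciPy's Nelder-Mead. It is an illustrative brute-force sketch, not the O(n^(d-1) log n) algorithm of the paper.

```python
# Naive planar yolk sketch (assumes d = 2 and that a "median line" is a line
# through two input points with at most floor(n/2) points strictly on each side).
# Illustrative baseline only, not the paper's algorithm.
import itertools
import numpy as np
from scipy.optimize import minimize

def median_lines(P):
    n = len(P)
    lines = []
    for i, j in itertools.combinations(range(n), 2):
        a, d = P[i], P[j] - P[i]
        # signed side of every point relative to the line through P[i] and P[j]
        side = d[0] * (P[:, 1] - a[1]) - d[1] * (P[:, 0] - a[0])
        if (side > 0).sum() <= n // 2 and (side < 0).sum() <= n // 2:
            lines.append((a, d / np.linalg.norm(d)))
    return lines

def yolk(P):
    lines = median_lines(P)
    def worst_dist(c):
        # largest distance from candidate center c to any median line
        return max(abs(u[0] * (c[1] - a[1]) - u[1] * (c[0] - a[0]))
                   for a, u in lines)
    res = minimize(worst_dist, P.mean(axis=0), method="Nelder-Mead")
    return res.x, res.fun  # (center, radius)

P = np.random.default_rng(0).random((25, 2))
center, radius = yolk(P)
```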
Award ID(s): 1907400
PAR ID: 10226334
Editor(s): Cabello, Sergio; Chen, Danny
Journal Name: 36th International Symposium on Computational Geometry, SoCG 2020
Volume: 164
Page Range / eLocation ID: 50:1--50:16
Sponsoring Org: National Science Foundation
More Like this
  1. Bodlaender, Hans L (Ed.)
    Given a set P ⊂ ℝ^d of n points, with diameter Δ, and a parameter δ ∈ (0,1), it is known that there is a partition of P into sets P_1, …, P_t, each of size O(1/δ²), such that their convex hulls all intersect a common ball of radius δΔ. We prove that a random partition, with a simple alteration step, yields the desired partition, resulting in a (randomized) linear time algorithm (i.e., O(dn)). We also provide a deterministic algorithm with running time O(dn log n). Previous proofs were either existential (i.e., at least exponential time), or required much bigger sets. In addition, the algorithm and its proof of correctness are significantly simpler than previous work, and the constants are slightly better. We also include a number of applications and extensions using the same central ideas. For example, we provide a linear time algorithm for computing a "fuzzy" centerpoint, and prove a no-dimensional weak ε-net theorem with an improved constant. 
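A minimal sketch of the randomized construction's skeleton: shuffle P and cut it into consecutive groups of size ≈ c/δ². The constant c = 4 is a placeholder, and the paper's alteration step (which repairs the few bad groups) is omitted here.

```python
# Random-partition skeleton only; the alteration step from the paper is omitted,
# and the constant c is a placeholder.
import numpy as np

def random_partition(P, delta, c=4, rng=None):
    rng = np.random.default_rng(rng)
    n = len(P)
    group_size = max(1, int(np.ceil(c / delta**2)))
    order = rng.permutation(n)
    return [P[order[i:i + group_size]] for i in range(0, n, group_size)]

P = np.random.default_rng(1).random((1000, 3))
parts = random_partition(P, delta=0.2)
```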
  2. Morin, Pat; Oh, Eunjin (Ed.)
    Let S be a set of n points in ℝ^d, where d ≥ 2 is a constant, and let H₁,H₂,…,H_{m+1} be a sequence of vertical hyperplanes that are sorted by their first coordinates, such that exactly n/m points of S are between any two successive hyperplanes. Let |A(S,m)| be the number of different closest pairs in the {(m+1) choose 2} vertical slabs that are bounded by H_i and H_j, over all 1 ≤ i < j ≤ m+1. We prove tight bounds for the largest possible value of |A(S,m)|, over all point sets of size n, and for all values of 1 ≤ m ≤ n. As a result of these bounds, we obtain, for any constant ε > 0, a data structure of size O(n), such that for any vertical query slab Q, the closest pair in the set Q ∩ S can be reported in O(n^{1/2+ε}) time. Prior to this work, no linear space data structure with sublinear query time was known. 
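For concreteness, a brute-force baseline with no data structure at all answers such a slab query as follows (names and the example slab are illustrative); the point of the paper is to replace this scan with an O(n)-space structure answering queries in O(n^{1/2+ε}) time.

```python
# Brute-force slab query: filter S to the vertical slab by first coordinate,
# then scan all pairs for the closest one. Baseline only.
import itertools
import numpy as np

def closest_pair_in_slab(S, x_left, x_right):
    Q = S[(S[:, 0] >= x_left) & (S[:, 0] <= x_right)]
    best = (np.inf, None, None)
    for p, q in itertools.combinations(Q, 2):
        d = np.linalg.norm(p - q)
        if d < best[0]:
            best = (d, p, q)
    return best  # (distance, point, point)

S = np.random.default_rng(2).random((200, 2))
dist, p, q = closest_pair_in_slab(S, 0.25, 0.75)
```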
  3. Given a set $P$ of $n$ points in the plane, we consider the problem of computing the number of points of $P$ in a query unit disk (i.e., all query disks have the same radius). We show that the main techniques for simplex range searching in the plane can be adapted to this problem. For example, by adapting Matoušek's results, we can build a data structure of $O(n)$ space in $O(n^{1+\delta})$ time (for any $\delta>0$) so that each query can be answered in $O(\sqrt{n})$ time; alternatively, we can build a data structure of $O(n^2/\log^2 n)$ space with $O(n^{1+\delta})$ preprocessing time (for any $\delta>0$) and $O(\log n)$ query time. Our techniques lead to improvements for several other classical problems in computational geometry.
     1. Given a set of $n$ unit disks and a set of $n$ points in the plane, the batched unit-disk range counting problem is to compute for each disk the number of points in it. Previous work [Katz and Sharir, 1997] solved the problem in $O(n^{4/3}\log n)$ time. We give a new algorithm of $O(n^{4/3})$ time, which is optimal as it matches an $\Omega(n^{4/3})$-time lower bound. For small $\chi$, where $\chi$ is the number of pairs of unit disks that intersect, we further improve the algorithm to $O(n^{2/3}\chi^{1/3}+n^{1+\delta})$ time, for any $\delta>0$.
     2. The above result immediately leads to an $O(n^{4/3})$ time optimal algorithm for counting the intersecting pairs of circles for a set of $n$ unit circles in the plane. The previous best algorithms solve the problem in $O(n^{4/3}\log n)$ deterministic time [Katz and Sharir, 1997] or in $O(n^{4/3}\log^{2/3} n)$ expected time by a randomized algorithm [Agarwal, Pellegrini, and Sharir, 1993].
     3. Given a set $P$ of $n$ points in the plane and an integer $k$, the distance selection problem is to find the $k$-th smallest distance among all pairwise distances of $P$. The problem can be solved in $O(n^{4/3}\log^2 n)$ deterministic time [Katz and Sharir, 1997] or in $O(n\log n+n^{2/3}k^{1/3}\log^{5/3}n)$ expected time by a randomized algorithm [Chan, 2001]. Our new randomized algorithm runs in $O(n\log n+n^{2/3}k^{1/3}\log n)$ expected time.
     4. Given a set $P$ of $n$ points in the plane, the discrete $2$-center problem is to compute two smallest congruent disks whose centers are in $P$ and whose union covers $P$. An $O(n^{4/3}\log^5 n)$-time algorithm was known [Agarwal, Sharir, and Welzl, 1998]. Our techniques yield a deterministic algorithm of $O(n^{4/3}\log^{10/3} n\cdot(\log\log n)^{O(1)})$ time and a randomized algorithm of $O(n^{4/3}\log^3 n\cdot(\log\log n)^{1/3})$ expected time.
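For intuition about the batched unit-disk range counting problem in item 1, a standard grid-bucketing heuristic (not the $O(n^{4/3})$ algorithm above) hashes the points into unit cells so that each disk only inspects the 3×3 block of cells around its center:

```python
# Grid-bucketing sketch for batched unit-disk range counting.
# With cell side equal to the radius, every point within distance r of a
# disk center lies in one of the 3x3 cells around the center's cell.
from collections import defaultdict
import numpy as np

def batched_unit_disk_count(points, centers, r=1.0):
    cells = defaultdict(list)
    for p in points:
        cells[(int(np.floor(p[0] / r)), int(np.floor(p[1] / r)))].append(p)
    counts = []
    for c in centers:
        cx, cy = int(np.floor(c[0] / r)), int(np.floor(c[1] / r))
        k = 0
        for i in range(cx - 1, cx + 2):
            for j in range(cy - 1, cy + 2):
                for p in cells.get((i, j), []):
                    if (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 <= r * r:
                        k += 1
        counts.append(k)
    return counts

rng = np.random.default_rng(3)
pts = rng.random((500, 2)) * 10
ctrs = rng.random((500, 2)) * 10
counts = batched_unit_disk_count(pts, ctrs)
```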
  4. Keil, Mark; Mondal, Debajyoti (Ed.)
    We adapt and generalize a heuristic for k-center clustering to the permutation case, where every prefix of the ordering is a guaranteed approximate solution. The one-hop greedy permutations work by choosing at each step the farthest unchosen point and then looking in its local neighborhood for a point that covers the most points at a certain scale. This balances the competing demands of reducing the coverage radius and covering as many points as possible. This idea first appeared in the work of Garcia-Diaz et al., whose algorithm required O(n² log n) time for a fixed k (i.e., not the whole permutation). We show how to use geometric data structures to approximate the entire permutation in O(n log D) time for metric spaces with spread D. Notably, this running time is asymptotically the same as the running time for computing the ordinary greedy permutation.
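A quadratic-time sketch of the one-hop idea as described in the abstract: pick the farthest uncovered point, then look among nearby unchosen points for one whose ball at the current scale contains the most points. The neighborhood radius alpha·scale is a guessed notion of "local", and the linear scans stand in for the geometric data structures of the paper.

```python
# One-hop greedy permutation sketch (quadratic-time, Euclidean points).
import numpy as np

def one_hop_greedy(P, alpha=0.5):
    n = len(P)
    chosen = np.zeros(n, dtype=bool)
    order = [0]                      # start from an arbitrary point
    chosen[0] = True
    dist_to_chosen = np.linalg.norm(P - P[0], axis=1)
    for _ in range(1, n):
        far = int(np.argmax(dist_to_chosen))   # classic greedy candidate
        scale = dist_to_chosen[far]
        # one-hop step: among unchosen points near the candidate, pick the one
        # whose ball of radius `scale` contains the most points
        near = np.where((np.linalg.norm(P - P[far], axis=1) <= alpha * scale)
                        & ~chosen)[0]
        def coverage(i):
            return int((np.linalg.norm(P - P[i], axis=1) <= scale).sum())
        best = int(max(near, key=coverage))
        order.append(best)
        chosen[best] = True
        dist_to_chosen = np.minimum(dist_to_chosen,
                                    np.linalg.norm(P - P[best], axis=1))
    return order

P = np.random.default_rng(4).random((100, 2))
perm = one_hop_greedy(P)
```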
  5. Clustering plays a crucial role in computer science, facilitating data analysis and problem-solving across numerous fields. By partitioning large datasets into meaningful groups, clustering reveals hidden structures and relationships within the data, aiding tasks such as unsupervised learning, classification, anomaly detection, and recommendation systems. Particularly in relational databases, where data is distributed across multiple tables, efficient clustering is essential yet challenging due to the computational complexity of joining tables. This paper addresses this challenge by introducing efficient algorithms for k-median and k-means clustering on relational data without the need to pre-compute the join query results. For relational k-median clustering, we propose the first efficient relative approximation algorithm. For relational k-means clustering, our algorithm significantly improves both the approximation factor and the running time of the known relational k-means clustering algorithms, which suffer either from large constant approximation factors or from expensive running times. Given a join query q and a database instance D of O(N) tuples, for both k-median and k-means clustering on the results of q on D, we propose randomized (1+ε)γ-approximation algorithms that run in roughly O(k²·N^fhw) + T_γ(k²) time, where ε ∈ (0,1) is a constant parameter chosen by the user, fhw is the fractional hypertree width of q, and γ and T_γ(x) denote the approximation factor and the running time, respectively, of a traditional clustering algorithm in the standard computational setting over x points.
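The abstract's T_γ refers to a traditional clustering routine run in the standard setting on a small number of points; a plain Lloyd's k-means such as the sketch below is the kind of black box meant. The relational machinery that avoids materializing the join is not reproduced here.

```python
# Plain Lloyd's k-means: an example of the "traditional" clustering black box.
import numpy as np

def lloyd_kmeans(X, k, iters=50, rng=None):
    rng = np.random.default_rng(rng)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign each point to its nearest center
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # recompute centers; keep the old one if a cluster goes empty
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return centers, labels

X = np.random.default_rng(5).random((300, 2))
centers, labels = lloyd_kmeans(X, k=4)
```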