skip to main content


Title: Discrete Planar Map Matching
Route reconstruction is an important application for Geographic Information Systems (GIS) that rely heavily upon GPS data and other location data from IoT devices. Many of these techniques rely on geometric methods involving the \frechet\ distance to compare curve similarity. The goal of reconstruction, or map matching, is to find the most similar path within a given graph to a given input curve, which is often approximate location data. This process can be approximated by sampling the curves and using the \dfd. Due to power and coverage constraints, the GPS data itself may be sparse causing improper constraints along the edges during the reconstruction if only the continuous \frechet\ distance is used. Here, we look at two variations of discrete map matching: one constraining the walk length and the other limiting the number of vertices visited in the graph. %, and the constraint that the walk may not self-intersect. We give an efficient algorithm to solve the question based on walk length showing it is in \textbf{P}. We prove the other problem is \npc\ and the minimization variant is \apx\ while also giving a parameterized algorithm to solve the problem.  more » « less
Award ID(s):
1817602
NSF-PAR ID:
10179147
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Proceedings of the 31st Canadian Conference in Computational Geometry (CCCG 2019)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Let f be a drawing in the Euclidean plane of a graph G, which is understood to be a 1-dimensional simplicial complex. We assume that every edge of G is drawn by f as a curve of constant algebraic complexity, and the ratio of the length of the longest simple path to the the length of the shortest edge is poly(n). In the drawing f, a path P of G, or its image in the drawing π = f(P), is β-stretch if π is a simple (non-self-intersecting) curve, and for every pair of distinct points p ∈ P and q ∈ P , the length of the sub-curve of π connecting f(p) with f(q) is at most β∥f(p) − f(q)∥, where ∥.∥ denotes the Euclidean distance. We introduce and study the β-stretch Path Problem (βSP for short), in which we are given a pair of vertices s and t of G, and we are to decide whether in the given drawing of G there exists a β-stretch path P connecting s and t. We also output P if it exists. The βSP quantifies a notion of “near straightness” for paths in a graph G, motivated by gerrymandering regions in a map, where edges of G represent natural geographical/political boundaries that may be chosen to bound election districts. The notion of a β-stretch path naturally extends to cycles, and the extension gives a measure of how gerrymandered a district is. Furthermore, we show that the extension is closely related to several studied measures of local fatness of geometric shapes. We prove that βSP is strongly NP-complete. We complement this result by giving a quasi-polynomial time algorithm, that for a given ε > 0, β ∈ O(poly(log |V (G)|)), and s, t ∈ V (G), outputs a β-stretch path between s and t, if a (1 − ε)β-stretch path between s and t exists in the drawing. 
    more » « less
  2. Given only a finite collection of points sampled from a Riemannian manifold embedded in a Euclidean space, in this paper we propose a new method to numerically solve elliptic and parabolic partial differential equations (PDEs) supplemented with boundary conditions. Since the construction of triangulations on unknown manifolds can be both difficult and expensive, both in terms of computational and data requirements, our goal is to solve these problems without a triangulation. Instead, we rely only on using the sample points to define quadrature formulas on the unknown manifold. Our main tool is the diffusion maps algorithm. We re-analyze this well-known method in a variational sense for manifolds with boundary. Our main result is that the variational diffusion maps graph Laplacian is a consistent estimator of the Dirichlet energy on the manifold. This improves upon previous results and provides a rigorous justification of the well-known relationship between diffusion maps and the Neumann eigenvalue problem. Moreover, using semigeodesic coordinates we derive the first uniform asymptotic expansion of the diffusion maps kernel integral operator for manifolds with boundary. This expansion relies on a novel lemma which relates the extrinsic Euclidean distance to the coordinate norm in a normal collar of the boundary. We then use a recently developed method of estimating the distance to boundary function (notice that the boundary location is assumed to be unknown) to construct a consistent estimator for boundary integrals. Finally, by combining these various estimators, we illustrate how to impose Dirichlet and Neumann conditions for some common PDEs based on the Laplacian. Several numerical examples illustrate our theoretical findings. 
    more » « less
  3. We consider several variants of the map-matching problem, which seeks to find a path Q in graph G that has the smallest distance to a given trajectory P (which is likely not to be exactly on the graph). In a typical application setting, P models a noisy GPS trajectory from a person traveling on a road network, and the desired path Q should ideally correspond to the actual path in G that the person has traveled. Existing map-matching algorithms in the literature consider all possible paths in G as potential candidates for Q. We find solutions to the map-matching problem under different settings. In particular, we restrict the set of paths to shortest paths, or concatenations of shortest paths, in G. As a distance measure, we use the Fréchet distance, which is a suitable distance measure for curves since it takes the continuity of the curves into account. 
    more » « less
  4. null (Ed.)
    Consider an algorithm performing a computation on a huge random object (for example a random graph or a "long" random walk). Is it necessary to generate the entire object prior to the computation, or is it possible to provide query access to the object and sample it incrementally "on-the-fly" (as requested by the algorithm)? Such an implementation should emulate the random object by answering queries in a manner consistent with an instance of the random object sampled from the true distribution (or close to it). This paradigm is useful when the algorithm is sub-linear and thus, sampling the entire object up front would ruin its efficiency. Our first set of results focus on undirected graphs with independent edge probabilities, i.e. each edge is chosen as an independent Bernoulli random variable. We provide a general implementation for this model under certain assumptions. Then, we use this to obtain the first efficient local implementations for the Erdös-Rényi G(n,p) model for all values of p, and the Stochastic Block model. As in previous local-access implementations for random graphs, we support Vertex-Pair and Next-Neighbor queries. In addition, we introduce a new Random-Neighbor query. Next, we give the first local-access implementation for All-Neighbors queries in the (sparse and directed) Kleinberg’s Small-World model. Our implementations require no pre-processing time, and answer each query using O(poly(log n)) time, random bits, and additional space. Next, we show how to implement random Catalan objects, specifically focusing on Dyck paths (balanced random walks on the integer line that are always non-negative). Here, we support Height queries to find the location of the walk, and First-Return queries to find the time when the walk returns to a specified location. This in turn can be used to implement Next-Neighbor queries on random rooted ordered trees, and Matching-Bracket queries on random well bracketed expressions (the Dyck language). Finally, we introduce two features to define a new model that: (1) allows multiple independent (and even simultaneous) instantiations of the same implementation, to be consistent with each other without the need for communication, (2) allows us to generate a richer class of random objects that do not have a succinct description. Specifically, we study uniformly random valid q-colorings of an input graph G with maximum degree Δ. This is in contrast to prior work in the area, where the relevant random objects are defined as a distribution with O(1) parameters (for example, n and p in the G(n,p) model). The distribution over valid colorings is instead specified via a "huge" input (the underlying graph G), that is far too large to be read by a sub-linear time algorithm. Instead, our implementation accesses G through local neighborhood probes, and is able to answer queries to the color of any given vertex in sub-linear time for q ≥ 9Δ, in a manner that is consistent with a specific random valid coloring of G. Furthermore, the implementation is memory-less, and can maintain consistency with non-communicating copies of itself. 
    more » « less
  5. Graph based non-linear reference structures such as variation graphs and colored de Bruijn graphs enable incorporation of full genomic diversity within a population. However, transitioning from a simple string-based reference to graphs requires addressing many computational challenges, one of which concerns accurately mapping sequencing read sets to graphs. Paired-end Illumina sequencing is a commonly used sequencing platform in genomics, where the paired-end distance constraints allow disambiguation of repeats. Many recent works have explored provably good index-based and alignment-based strategies for mapping individual reads to graphs. However, validating distance constraints efficiently over graphs is not trivial, and existing sequence to graph mappers rely on heuristics. We introduce a mathematical formulation of the problem, and provide a new algorithm to solve it exactly. We take advantage of the high sparsity of reference graphs, and use sparse matrix-matrix multiplications (SpGEMM) to build an index which can be queried efficiently by a mapping algorithm for validating the distance constraints. Effectiveness of the algorithm is demonstrated using real reference graphs, including a human MHC variation graph, and a pan-genome de-Bruijn graph built using genomes of 20 B. anthracis strains. While the one-time indexing time can vary from a few minutes to a few hours using our algorithm, answering a million distance queries takes less than a second. 
    more » « less