Graphs in metric spaces appear in a wide range of data sets, and there is a large body of work focused on comparing, matching, or analyzing collections of graphs in different ambient spaces. In this survey, we provide an overview of a diverse collection of distance measures that can be defined on the set of finite graphs immersed (and in some cases, embedded) in a metric space. For each of the distance measures, we recall their definitions and investigate which of the properties of a metric they satisfy. Furthermore we compare the distance measures based on these properties and discuss their computational complexity.
Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
Abstract -
Abstract Background 3D imaging, such as X-ray CT and MRI, has been widely deployed to study plant root structures. Many computational tools exist to extract coarse-grained features from 3D root images, such as total volume, root number and total root length. However, methods that can accurately and efficiently compute fine-grained root traits, such as root number and geometry at each hierarchy level, are still lacking. These traits would allow biologists to gain deeper insights into the root system architecture.
Results We present TopoRoot, a high-throughput computational method that computes fine-grained architectural traits from 3D images of maize root crowns or root systems. These traits include the number, length, thickness, angle, tortuosity, and number of children for the roots at each level of the hierarchy. TopoRoot combines state-of-the-art algorithms in computer graphics, such as topological simplification and geometric skeletonization, with customized heuristics for robustly obtaining the branching structure and hierarchical information. TopoRoot is validated on both CT scans of excavated field-grown root crowns and simulated images of root systems, and in both cases, it was shown to improve the accuracy of traits over existing methods. TopoRoot runs within a few minutes on a desktop workstation for images at the resolution range of 400^3, with minimal need for human intervention in the form of setting three intensity thresholds per image.
Conclusions TopoRoot improves the state-of-the-art methods in obtaining more accurate and comprehensive fine-grained traits of maize roots from 3D imaging. The automation and efficiency make TopoRoot suitable for batch processing on large numbers of root images. Our method is thus useful for phenomic studies aimed at finding the genetic basis behind root system architecture and the subsequent development of more productive crops.
-
In this paper we prove that the problem of deciding contractibility of an arbitrary closed curve on the boundary of a 3-manifold is in NP. We emphasize that the manifold and the curve are both inputs to the problem. Moreover, our algorithm also works if the curve is given as a compressed word. Previously, such an algorithm was known for simple (non-compressed) curves, and, in very limited cases, for curves with self-intersections. Furthermore, our algorithm is fixed-parameter tractable in the size of the input 3-manifold. As part of our proof, we obtain new polynomial-time algorithms for compressed curves on surfaces, which we believe are of independent interest. We provide a polynomial-time algorithm which, given an orientable surface and a compressed loop on the surface, computes a canonical form for the loop as a compressed word. In particular, contractibility of compressed curves on surfaces can be decided in polynomial time; prior published work considered only constant genus surfaces. More generally, we solve the following normal subgroup membership problem in polynomial time: given an arbitrary orientable surface, a compressed closed curve, and a collection of disjoint normal curves, there is a polynomial-time algorithm to decide if the curve lies in the normal subgroup generated by components of the normal curves in the fundamental group of the surface after attaching the curves to a basepoint.more » « less
-
The Frechet distance is often used to measure distances between paths, with applications in areas ranging from map matching to GPS trajectory analysis to hand- writing recognition. More recently, the Frechet distance has been generalized to a distance between two copies of the same graph embedded or immersed in a metric space; this more general setting opens up a wide range of more complex applications in graph analysis. In this paper, we initiate a study of some of the fundamental topological properties of spaces of paths and of graphs mapped to R^n under the Frechet distance, in an effort to lay the theoretical groundwork for understanding how these distances can be used in practice. In particular, we prove whether or not these spaces, and the metric balls therein, are path-connected.more » « less
-
We study subtrajectory clustering under the Fréchet distance. Given one or more trajectories, the task is to split the trajectories into several parts, such that the parts have a good clustering structure. We approach this problem via a new set cover formulation, which we think provides a natural formalization of the problem as it is studied in many applications. Given a polygonal curve P with n vertices in fixed dimension, integers k, ℓ ≥ 1, and a real value Δ > 0, the goal is to find k center curves of complexity at most ℓ such that every point on P is covered by a subtrajectory that has small Fréchet distance to one of the k center curves (≤ Δ). In many application scenarios, one is interested in finding clusters of small complexity, which is controlled by the parameter ℓ. Our main result is a bicriterial approximation algorithm: if there exists a solution for given parameters k, ℓ, and Δ, then our algorithm finds a set of k' center curves of complexity at most ℓ with covering radius Δ' with k' in O(kℓ2 log (kℓ)), and Δ' ≤ 19Δ. Moreover, within these approximation bounds, we can minimize k while keeping the other parameters fixed. If ℓ is a constant independent of n, then, the approximation factor for the number of clusters k is O(log k) and the approximation factor for the radius Δ is constant. In this case, the algorithm has expected running time in Õ(km2 + mn) and uses space in O(n + m), where m=⌈L/Δ⌉ and L is the total arclength of the curve P.more » « less
-
This paper is motivated by a practical problem: many U.S. states have public hearings on "communities of interest" as part of their redistricting process, but no state has as yet adopted a concrete method of spatializing and aggregating community maps in order to take them into account in the drawing of new boundaries for electoral districts. Below, we describe a year-long project that collected and synthesized thousands of community maps through partnerships with grassroots organizations and/or government offices. The submissions were then aggregated by geographical clustering with a modified Hausdorff distance; then, the text from the narrative submissions was classified with semantic labels so that short runs of a Markov chain could be used to form semantic sub-clusters. The resulting dataset is publicly available, including the raw data of submitted community maps as well as post-processed community clusters and a scoring system for measuring how well districting plans respect the clusters. We provide a discussion of the strengths and weaknesses of this methodology and conclude with proposed directions for future work.more » « less