Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
Ultrametric matrices appear in many domains of mathematics and science; nevertheless, they can be large and dense, making them difficult to store and manipulate, unlike large but sparse matrices. In this manuscript, we exploit that ultrametric matrices can be represented as binary trees to sparsify them via an orthonormal base change based on Haar-like wavelets. We show that, with overwhelmingly high probability, only an asymptotically negligible fraction of the off-diagonal entries in random but large ultrametric matrices remain non-zero after the base change; and develop an algorithm to sparsify such matrices directly from their tree representation. We also identify the subclass of matrices diagonalized by the Haar-like wavelets and supply a sufficient condition to approximate the spectrum of ultrametric matrices outside this subclass. Our methods give computational access to a covariance matrix model of the microbiologists’ Tree of Life, which was previously inaccessible due to its size, and motivate introducing a new wavelet-based (beta-diversity) metric to compare microbial environments. Unlike the established metrics, the new metric may be used to identify internal nodes (i.e. splits) in the Tree that link microbial composition and environmental factors in a statistically significant manner.more » « less
-
The metric dimension of a graph is the smallest number of nodes required to identify allother nodes uniquely based on shortest path distances. Applications of metric dimensioninclude discovering the source of a spread in a network, canonically labeling graphs, andembedding symbolic data in low-dimensional Euclidean spaces. This survey gives a self-contained introduction to metric dimension and an overview of the quintessential resultsand applications. We discuss methods for approximating the metric dimension of generalgraphs, and specific bounds and asymptotic behavior for deterministic and random familiesof graphs. We conclude with related concepts and directions for future work.more » « less
-
We introduce the notion of Levenshtein graphs, an analog to Hamming graphs but using the edit distance instead of the Hamming distance; in particular, vertices in Levenshtein graphs may be strings (i.e., words or sequences of characters in a reference alphabet) of possibly different lengths. We study various properties of these graphs, including a necessary and sufficient condition for their shortest path distance to be identical to the edit distance, and characterize their automorphism group and determining number. We also bound the metric dimension (i.e. minimum resolving set size) of Levenshtein graphs. Regarding the latter, recall that a run is a string composed of identical characters. We construct a resolving set of two-run strings and an algorithm that computes the edit distance between a string of length k and any single-run or two-run string in operations.more » « less
An official website of the United States government
