Topological Data Analysis is a growing area of data science that aims to compute and characterize the geometry and topology of data sets, in order to produce useful descriptors for subsequent statistical and machine learning tasks. Its main computational tool is persistent homology, which amounts to tracking the topological changes in growing families of subsets of the data set itself, called filtrations, and encoding them in an algebraic object called a persistence module. Even though algorithms and theoretical properties of modules are now well known in the single-parameter case, that is, when there is only one filtration to study, much less is known in the multi-parameter case, where several filtrations are given at once. Though more complicated, the resulting persistence modules are usually richer and encode more information, making them better descriptors for data science. In this article, we present the first approximation scheme for computing and decomposing general multi-parameter persistence modules; it is based on fibered barcodes and exact matchings, two constructions that stem from the theory of single-parameter persistence. Our algorithm has controlled complexity and running time, and works in arbitrary dimension, i.e., with an arbitrary number of filtrations. Moreover, when restricting to specific classes of multi-parameter persistence modules, namely the ones that can be decomposed into intervals, we establish theoretical results about the approximation error between our estimate and the true module in terms of the interleaving distance. Finally, we present empirical evidence validating output quality and speed-up on several data sets.
Fourier Dimension Estimates for Sets of Exact Approximation Order: The Well-Approximable Case
Abstract: We obtain a Fourier dimension estimate for sets of exact approximation order introduced by Bugeaud for certain approximation functions $\psi$. This Fourier dimension estimate implies that these sets of exact approximation order contain normal numbers.
- Award ID(s): 1803086
- PAR ID: 10382107
- Date Published:
- Journal Name: International Mathematics Research Notices
- ISSN: 1073-7928
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Optimization-based k-space sampling pattern design often involves the Jacobian matrix of non-uniform fast Fourier transform (NUFFT) operations. Previous works relying on auto-differentiation can be time-consuming and less accurate. This work proposes an approximation method using the relationship between exact non-uniform DFT (NDFT) and NUFFT, demonstrating improved results for the sampling pattern optimization problem.
- Most existing statistical theories on deep neural networks have sample complexities cursed by the data dimension and therefore cannot adequately explain the empirical success of deep learning on high-dimensional data. To bridge this gap, we propose to exploit the low-dimensional structures of real-world datasets and establish theoretical guarantees for convolutional residual networks (ConvResNets) in terms of function approximation and statistical recovery for binary classification problems. Specifically, given data lying on a $d$-dimensional manifold isometrically embedded in $\mathbb{R}^D$, we prove that if the network architecture is properly chosen, ConvResNets can (1) approximate Besov functions on manifolds with arbitrary accuracy, and (2) learn a classifier by minimizing the empirical logistic risk, which gives an excess risk of order $n^{-2s/(2s+d)}$, where $s$ is a smoothness parameter. This implies that the sample complexity depends on the intrinsic dimension $d$ instead of the data dimension $D$. Our results demonstrate that ConvResNets are adaptive to low-dimensional structures of data sets.
- We establish a version of the fractal uncertainty principle, obtained by Bourgain and Dyatlov in 2016, in higher dimensions. The Fourier support is limited to sets $Y \subset \mathbb{R}^d$ which can be covered by finitely many products of $\delta$-regular sets in one dimension, but relative to arbitrary axes. Our results remain true if $Y$ is distorted by diffeomorphisms. Our method combines the original approach by Bourgain and Dyatlov, in the more quantitative 2017 rendition by Jin and Zhang, with Cartan set techniques.
- Mulzer, Wolfgang; Phillips, Jeff M (Eds.) A fundamental question is whether one can maintain a maximum independent set (MIS) in polylogarithmic update time for a dynamic collection of geometric objects in Euclidean space. For a set of intervals, it is known that no dynamic algorithm can maintain an exact MIS in sublinear update time. Therefore, the typical objective is to explore the trade-off between update time and solution size. Substantial efforts have been made in recent years to understand this question for various families of geometric objects, such as intervals, hypercubes, hyperrectangles, and fat objects. We present the first fully dynamic approximation algorithm for disks of arbitrary radii in the plane that maintains a constant-factor approximate MIS in polylogarithmic expected amortized update time. Moreover, for a fully dynamic set of n unit disks in the plane, we show that a 12-approximate MIS can be maintained with worst-case update time O(log n), and optimal output-sensitive reporting. This result generalizes to fat objects of comparable sizes in any fixed dimension d, where the approximation ratio depends on the dimension and the fatness parameter. Further, we note that, even for a dynamic set of disks of unit radius in the plane, it is impossible to maintain O(1+ε)-approximate MIS in truly sublinear update time, under standard complexity assumptions. Our results build on two recent technical tools: (i) The MIX algorithm by Cardinal et al. (ESA 2021) that can smoothly transition from one independent set to another; hence it suffices to maintain a family of independent sets where the largest one is an O(1)-approximate MIS. (ii) A dynamic nearest/farthest neighbor data structure for disks by Kaplan et al. (DCG 2020) and Liu (SICOMP 2022), which generalizes the dynamic convex hull data structure by Chan (JACM 2010), and quickly yields a "replacement" disk (if any) when a disk in one of our independent sets is deleted.

