This content will become publicly available on June 4, 2024

Title: Study of Manifold Geometry Using Multiscale Non-Negative Kernel Graphs
Modern machine learning systems are increasingly trained on large amounts of data embedded in high-dimensional spaces. Often this is done without analyzing the structure of the dataset. In this work, we propose a framework to study the geometric structure of the data. We make use of our recently introduced non-negative kernel (NNK) regression graphs to estimate the point density, intrinsic dimension, and linearity of the data manifold (curvature). We further generalize the graph construction and geometric estimation to multiple scales by iteratively merging neighborhoods in the input data. Our experiments demonstrate the effectiveness of our proposed approach over other baselines in estimating the local geometry of the data manifolds on synthetic and real datasets.
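As an illustration of the NNK regression graphs mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes the commonly stated NNK neighborhood objective, minimizing ½ θᵀ K_SS θ − K_Siᵀ θ subject to θ ≥ 0 over an initial k-nearest-neighbor set S, with a Gaussian kernel. The function names, the `sigma`, `k`, and `tol` parameters, and the small diagonal jitter are all assumptions made for this example.

```python
import numpy as np
from scipy.optimize import nnls


def gaussian_kernel(A, B, sigma=1.0):
    # Pairwise Gaussian kernel between rows of A and rows of B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def nnk_neighbors(X, i, k=5, sigma=1.0, tol=1e-8):
    """Hypothetical helper: NNK-style neighbors and weights of point i."""
    # Step 1: initial candidate set S = k nearest neighbors of X[i].
    d2 = ((X - X[i]) ** 2).sum(1)
    d2[i] = np.inf  # exclude the query point itself
    S = np.argsort(d2)[:k]

    # Step 2: kernel among candidates and between candidates and the query.
    K_SS = gaussian_kernel(X[S], X[S], sigma)
    K_Si = gaussian_kernel(X[S], X[i:i + 1], sigma)[:, 0]

    # Step 3: rewrite the quadratic objective as non-negative least squares.
    # With K_SS = L L^T and b = L^{-1} K_Si, ||L^T theta - b||^2 has the same
    # minimizer as 1/2 theta^T K_SS theta - K_Si^T theta (up to a constant).
    L = np.linalg.cholesky(K_SS + 1e-10 * np.eye(len(S)))  # jitter is an assumption
    b = np.linalg.solve(L, K_Si)
    theta, _ = nnls(L.T, b)

    # Step 4: keep only candidates with a non-zero weight.
    keep = theta > tol
    return S[keep], theta[keep]
```

For three collinear points at 0, 1, and 2, calling `nnk_neighbors` on the point at 0 with `k=2` keeps only the point at 1, since the farther point adds nothing to the non-negative kernel reconstruction of the query.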
Award ID(s):
2009032
NSF-PAR ID:
10433762
Author(s) / Creator(s):
Date Published:
Journal Name:
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Page Range / eLocation ID:
1 to 5
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    Let T[1,n] be a string of length n and T[i,j] be the substring of T starting at position i and ending at position j. A substring T[i,j] of T is a repeat if it occurs more than once in T; otherwise, it is a unique substring of T. Repeats and unique substrings are of great interest in computational biology and information retrieval. Given string T as input, the Shortest Unique Substring problem is to find a shortest substring of T that does not occur elsewhere in T. In this paper, we introduce the range variant of this problem, which we call the Range Shortest Unique Substring problem. The task is to construct a data structure over T that efficiently answers online queries of the following type: given a range [α,β], return a shortest substring T[i,j] of T with exactly one occurrence in [α,β]. We present an O(n log n)-word data structure with O(log_w n) query time, where w = Ω(log n) is the word size. Our construction is based on a non-trivial reduction that allows us to apply a recently introduced optimal geometric data structure [Chan et al., ICALP 2018]. Additionally, we present an O(n)-word data structure with O(n log^ϵ n) query time, where ϵ > 0 is an arbitrarily small constant. The latter data structure relies heavily on another geometric data structure [Nekrich and Navarro, SWAT 2012].
  2. Abstract

    The Event Horizon Telescope (EHT) is a millimeter very long baseline interferometry (VLBI) array that has imaged the apparent shadows of the supermassive black holes M87* and Sagittarius A*. Polarimetric data from these observations contain a wealth of information on the black hole and accretion flow properties. In this work, we develop polarimetric geometric modeling methods for mm-VLBI data, focusing on approaches that fit data products with differing degrees of invariance to broad classes of calibration errors. We establish a fitting procedure using a polarimetric “m-ring” model to approximate the image structure near a black hole. By fitting this model to synthetic EHT data from general relativistic magnetohydrodynamic models, we show that the linear and circular polarization structure can be successfully approximated with relatively few model parameters. We then fit this model to EHT observations of M87* taken in 2017. In total intensity and linear polarization, the m-ring fits are consistent with previous results from imaging methods. In circular polarization, the m-ring fits indicate the presence of event-horizon-scale circular polarization structure, with a persistent dipolar asymmetry and orientation across several days. The same structure was recovered independently of observing band, used data products, and model assumptions. Despite this broad agreement, imaging methods do not produce similarly consistent results. Our circular polarization results, which imposed additional assumptions on the source structure, should thus be interpreted with some caution. Polarimetric geometric modeling provides a useful and powerful method to constrain the properties of horizon-scale polarized emission, particularly for sparse arrays like the EHT.

  3. Abstract

    In this study, we discuss the characterization and quantification of composite microstructures formed by the external field manipulation of high aspect ratio magnetic particles in an elastomeric matrix. In our prior work, we have demonstrated that the simultaneous application of electric and magnetic fields on hard magnetic particles with geometric anisotropy can create a hierarchy of structures at different length scales, which can be used to achieve a wide range of properties. We aim to characterize these hierarchical structures and relate them to final composite properties so we can achieve our ultimate goal of designing a material for a prescribed performance. The complex particle structures are formed during processing by using electric and magnetic fields, and they are then locked-in by curing the polymer matrix around the particles. The model materials used in the study are barium hexaferrite (BHF) particles and polydimethylsiloxane (PDMS) elastomer. BHF was selected for its hard magnetic properties and high geometric anisotropy. PDMS was selected for its good mechanical properties and its tunable curing kinetics. The resulting BHF-PDMS composites are magnetoactive, i.e., they will deform and actuate in response to magnetic fields. In order to investigate the resulting particle orientation, distribution and alignment and to predict the composite’s macro scale properties, we developed techniques to quantify the particle structures.

    The general framework we developed allows us to quantify and directly compare the microstructures created within the composites. To identify structures at the different length scales, images of the composite are taken using both optical microscopy and scanning electron microscopy. We then use ImageJ to analyze them and gather data on particle size, location, and orientation angle. The data is then exported to MATLAB, and is used to run a Minimum Spanning Tree Algorithm to classify the particle structures, and von Mises Distributions to quantify the alignment of said structures.

    Important findings show 1) the ability to control structure using a combination of external electric, magnetic and thermal fields; 2) that electric fields promote long-range order while magnetic fields promote short-range order; and 3) that the resulting hierarchical structures greatly influence bulk material properties. Manipulating particles in a composite material is technologically important because changes in microstructure can alter the properties of the bulk material. The multifield processing we investigate here can form the basis for next-generation additive manufacturing platforms where desired properties are tailored locally through in-situ hierarchical control of particle arrangements.

  4. Several data analysis techniques employ similarity relationships between data points to uncover the intrinsic dimension and geometric structure of the underlying data-generating mechanism. In this paper we work under the model assumption that the data is made of random perturbations of feature vectors lying on a low-dimensional manifold. We study two questions: how to define the similarity relationships over noisy data points, and what is the resulting impact of the choice of similarity in the extraction of global geometric information from the underlying manifold. We provide concrete mathematical evidence that using a local regularization of the noisy data to define the similarity improves the approximation of the hidden Euclidean distance between unperturbed points. Furthermore, graph-based objects constructed with the locally regularized similarity function satisfy better error bounds in their recovery of global geometric ones. Our theory is supported by numerical experiments that demonstrate that the gain in geometric understanding facilitated by local regularization translates into a gain in classification accuracy in simulated and real data. 
  5. Several data analysis techniques employ similarity relationships between data points to uncover the intrinsic dimension and geometric structure of the underlying data-generating mechanism. In this paper we work under the model assumption that the data is made of random perturbations of feature vectors lying on a low-dimensional manifold. We study two questions: how to define the similarity relationships over noisy data points, and what is the resulting impact of the choice of similarity in the extraction of global geometric information from the underlying manifold. We provide concrete mathematical evidence that using a local regularization of the noisy data to define the similarity improves the approximation of the hidden Euclidean distance between unperturbed points. Furthermore, graph-based objects constructed with the locally regularized similarity function satisfy better error bounds in their recovery of global geometric ones. Our theory is supported by numerical experiments that demonstrate that the gain in geometric understanding facilitated by local regularization translates into a gain in classification accuracy in simulated and real data.
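The range query defined in item 1 of the list above can be made concrete with a brute-force sketch. This is only an illustration of the problem statement, not the data structure from that paper: it takes roughly cubic time per query and builds no index. It assumes 1-indexed inclusive positions, as in the abstract, and assumes that "occurrence in [α,β]" means an occurrence lying entirely inside the range. The function name `range_sus` is hypothetical.

```python
def range_sus(T, alpha, beta):
    """Brute-force Range Shortest Unique Substring query (illustrative only).

    Returns (i, j), 1-indexed and inclusive, such that T[i,j] is a shortest
    substring with exactly one occurrence fully inside [alpha, beta].
    Ties are broken by the leftmost starting position.
    """
    # Try candidate lengths from shortest to longest.
    for length in range(1, beta - alpha + 2):
        # All start positions whose substring fits inside the range.
        starts = range(alpha, beta - length + 2)
        for i in starts:
            sub = T[i - 1:i - 1 + length]
            # Count occurrences of sub that lie entirely inside the range.
            count = sum(1 for s in starts if T[s - 1:s - 1 + length] == sub)
            if count == 1:
                return (i, i + length - 1)
    # Unreachable for a valid range: T[alpha,beta] itself occurs exactly once.
    return None
```

For T = "abab" and the range [1,4], every single character repeats and "ab" occurs twice, so the query returns (2, 3), the substring "ba". Restricting the range to [1,2] makes "a" unique, and the query returns (1, 1).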