Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
                                            Some full text articles may not yet be available without a charge during the embargo (administrative interval).
                                        
                                        
                                        
                                            
                                                
                                             What is a DOI Number?
                                        
                                    
                                
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
- 
            Reid, Nancy (Ed.)Manifold learning (ML), also known as nonlinear dimension reduction, is a set of methods to find the low-dimensional structure of data. Dimension reduction for large, high-dimensional data is not merely a way to reduce the data; the new representations and descriptors obtained by ML reveal the geometric shape of high-dimensional point clouds and allow one to visualize, denoise, and interpret them. This review presents the underlying principles of ML, its representative methods, and their statistical foundations, all from a practicing statistician's perspective. It describes the trade-offs and what theory tells us about the parameter and algorithmic choices we make in order to obtain reliable conclusions. Expected final online publication date for the Annual Review of Statistics and Its Application, Volume 11 is March 2024. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.more » « less
- 
            Manifold embedding algorithms map high-dimensional data down to coordinates in a much lower-dimensional space. One of the aims of dimension reduction is to find intrinsic coordinates that describe the data manifold. The coordinates returned by the embedding algorithm are abstract, and finding their physical or domain-related meaning is not formalized and often left to domain experts. This paper studies the problem of recovering the meaning of the new low-dimensional representation in an automatic, principled fashion. We propose a method to explain embedding coordinates of a manifold as non-linear compositions of functions from a user-defined dictionary. We show that this problem can be set up as a sparse linear Group Lasso recovery problem, find sufficient recovery conditions, and demonstrate its effectiveness on datamore » « less
- 
            M. Ranzato; A. Beygelzimer; Y. Dauphin; P.S. Liang; J. Wortman Vaughan (Ed.)The null space of the k-th order Laplacian Lk, known as the {\em k-th homology vector space}, encodes the non-trivial topology of a manifold or a network. Understanding the structure of the homology embedding can thus disclose geometric or topological information from the data. The study of the null space embedding of the graph Laplacian L0 has spurred new research and applications, such as spectral clustering algorithms with theoretical guarantees and estimators of the Stochastic Block Model. In this work, we investigate the geometry of the k-th homology embedding and focus on cases reminiscent of spectral clustering. Namely, we analyze the {\em connected sum} of manifolds as a perturbation to the direct sum of their homology embeddings. We propose an algorithm to factorize the homology embedding into subspaces corresponding to a manifold's simplest topological components. The proposed framework is applied to the {\em shortest homologous loop detection} problem, a problem known to be NP-hard in general. Our spectral loop detection algorithm scales better than existing methods and is effective on diverse data such as point clouds and images.more » « less
 An official website of the United States government
An official website of the United States government 
				
			 
					 
					
 
                                     Full Text Available
                                                Full Text Available