Summary The fused lasso, also known as totalvariation denoising, is a locally adaptive function estimator over a regular grid of design points. In this article, we extend the fused lasso to settings in which the points do not occur on a regular grid, leading to a method for nonparametric regression. This approach, which we call the $K$nearestneighbours fused lasso, involves computing the $K$nearestneighbours graph of the design points and then performing the fused lasso over this graph. We show that this procedure has a number of theoretical advantages over competing methods: specifically, it inherits local adaptivity from its connection to the fused lasso, and it inherits manifold adaptivity from its connection to the $K$nearestneighbours approach. In a simulation study and an application to flu data, we show that excellent results are obtained. For completeness, we also study an estimator that makes use of an $\epsilon$graph rather than a $K$nearestneighbours graph and contrast it with the $K$nearestneighbours fused lasso.
more »
« less
Learning the Helix Topology of Musical Pitch
To explain the consonance of octaves, music psychologists represent pitch as a helix where azimuth and axial coordinate correspond to pitch class and pitch height respectively. This article addresses the problem of discovering this helical structure from unlabeled audio data. We measure Pearson correlations in the constantQ transform (CQT) domain to build a Knearest neighbor graph between frequency subbands. Then, we run the Isomap manifold learning algorithm to represent this graph in a threedimensional space in which straight lines approximate graph geodesics. Experiments on isolated musical notes demonstrate that the resulting manifold resembles a helix which makes a full turn at every octave. A circular shape is also found in English speech, but not in urban noise. We discuss the impact of various design choices on the visualization: instrumentarium, loudness mapping function, and number of neighbors K.
more »
« less
 Award ID(s):
 1633206
 NSFPAR ID:
 10301445
 Date Published:
 Journal Name:
 ICASSP 2020  2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
 Page Range / eLocation ID:
 11 to 15
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this


Image retrieval relies heavily on the quality of the data modeling and the distance measurement in the feature space. Building on the concept of image manifold, we first propose to represent the feature space of images, learned via neural networks, as a graph. Neighborhoods in the feature space are now defined by the geodesic distance between images, represented as graph vertices or manifold samples. When limited images are available, this manifold is sparsely sampled, making the geodesic computation and the corresponding retrieval harder. To address this, we augment the manifold samples with geometrically aligned text, thereby using a plethora of sentences to teach us about images. In addition to extensive results on standard datasets illustrating the power of text to help in image retrieval, a new public dataset based on CLEVR is introduced to quantify the semantic similarity between visual data and text data. The experimental results show that the joint embedding manifold is a robust representation, allowing it to be a better basis to perform image retrieval given only an image and a textual instruction on the desired modifications over the image.more » « less

Abstract. Similaritysearchisafundamentalbuildingblockforinformation retrieval on a variety of datasets. The notion of a neighbor is often based on binary considerations, such as the k nearest neighbors. However, considering that data is often organized as a manifold with low intrinsic dimension, the notion of a neighbor must recognize higherorder relationship, to capture neighbors in all directions. Proximity graphs such as the Relative Neighbor Graphs (RNG), use trinary relationships which capture the notion of direc tion and have been successfully used in a number of applications. However, the current algorithms for computing the RNG, despite widespread use, are approximate and not scalable. This paper proposes a novel type of graph, the Generalized Relative Neighborhood Graph (GRNG) for use in a pivot layer that then guides the efficient and exact construction of the RNG of a set of exemplars. It also shows how to extend this to a multilayer hier archy which significantly improves over the stateoftheart methods which can only construct an approximate RNG.more » « less

Abstract Kernelized Gram matrix $W$ constructed from data points $\{x_i\}_{i=1}^N$ as $W_{ij}= k_0( \frac{ \ x_i  x_j \^2} {\sigma ^2} ) $ is widely used in graphbased geometric data analysis and unsupervised learning. An important question is how to choose the kernel bandwidth $\sigma $, and a common practice called selftuned kernel adaptively sets a $\sigma _i$ at each point $x_i$ by the $k$nearest neighbor (kNN) distance. When $x_i$s are sampled from a $d$dimensional manifold embedded in a possibly highdimensional space, unlike with fixedbandwidth kernels, theoretical results of graph Laplacian convergence with selftuned kernels have been incomplete. This paper proves the convergence of graph Laplacian operator $L_N$ to manifold (weighted)Laplacian for a new family of kNN selftuned kernels $W^{(\alpha )}_{ij} = k_0( \frac{ \ x_i  x_j \^2}{ \epsilon \hat{\rho }(x_i) \hat{\rho }(x_j)})/\hat{\rho }(x_i)^\alpha \hat{\rho }(x_j)^\alpha $, where $\hat{\rho }$ is the estimated bandwidth function by kNN and the limiting operator is also parametrized by $\alpha $. When $\alpha = 1$, the limiting operator is the weighted manifold Laplacian $\varDelta _p$. Specifically, we prove the pointwise convergence of $L_N f $ and convergence of the graph Dirichlet form with rates. Our analysis is based on first establishing a $C^0$ consistency for $\hat{\rho }$ which bounds the relative estimation error $\hat{\rho }  \bar{\rho }/\bar{\rho }$ uniformly with high probability, where $\bar{\rho } = p^{1/d}$ and $p$ is the data density function. Our theoretical results reveal the advantage of the selftuned kernel over the fixedbandwidth kernel via smaller variance error in lowdensity regions. In the algorithm, no prior knowledge of $d$ or data density is needed. The theoretical results are supported by numerical experiments on simulated data and handwritten digit image data.more » « less

Convolutional neural networks (CNNs) are revolutionizing imaging science for two and threedimensional images over Euclidean domains. However, many data sets are intrinsically nonEuclidean and are better modeled through other mathematical structures, such as graphs or manifolds. This state of affairs has led to the development of geometric deep learning, which refers to a body of research that aims to translate the principles of CNNs to these nonEuclidean structures. In the process, various challenges have arisen, including how to define such geometric networks, how to compute and train them efficiently, and what are their mathematical properties. In this letter we describe the geometric wavelet scattering transform, which is a type of geometric CNN for graphs and manifolds consisting of alternating multiscale geometric wavelet transforms and nonlinear activation functions. As the name suggests, the geometric wavelet scattering transform is an adaptation of the Euclidean wavelet scattering transform, first introduced by S. Mallat, to graph and manifold data. Like its Euclidean counterpart, the geometric wavelet scattering transform has several desirable properties. In the manifold setting these properties include isometric invariance up to a user specified scale and stability to small diffeomorphisms. Numerical results on manifold and graph data sets, including graph and manifold classification tasks as well as others, illustrate the practical utility of the approach.more » « less