This content will become publicly available on May 5, 2025

Title: A Poincaré Inequality and Consistency Results for Signal Sampling on Large Graphs
Large-scale graph machine learning is challenging as the complexity of learning models scales with the graph size. Subsampling the graph is a viable alternative, but sampling on graphs is nontrivial as graphs are non-Euclidean. Existing graph sampling techniques require not only computing the spectra of large matrices but also repeating these computations when the graph changes, e.g., grows. In this paper, we introduce a signal sampling theory for a type of graph limit, the graphon. We prove a Poincaré inequality for graphon signals and show that complements of node subsets satisfying this inequality are unique sampling sets for Paley-Wiener spaces of graphon signals. Exploiting connections with spectral clustering and Gaussian elimination, we prove that such sampling sets are consistent in the sense that unique sampling sets on a convergent graph sequence converge to unique sampling sets on the graphon. We then propose a related graphon signal sampling algorithm for large graphs, and demonstrate its good empirical performance on graph machine learning tasks.
Award ID(s):
2134108
PAR ID:
10568532
Author(s) / Creator(s):
; ;
Publisher / Repository:
International Conference on Learning Representations (ICLR)
Date Published:
Format(s):
Medium: X
Location:
Vienna
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper studies stochastic games on large graphs and their graphon limits. We propose a new formulation of graphon games based on a single typical player’s label-state distribution. In contrast, other recently proposed models of graphon games work directly with a continuum of players, which involves serious measure-theoretic technicalities. In fact, by viewing the label as a component of the state process, we show in our formulation that graphon games are a special case of mean field games, albeit with certain inevitable degeneracies and discontinuities that make most existing results on mean field games inapplicable. Nonetheless, we prove the existence of Markovian graphon equilibria under fairly general assumptions as well as uniqueness under a monotonicity condition. Most importantly, we show how our notion of graphon equilibrium can be used to construct approximate equilibria for large finite games set on any (weighted, directed) graph that converges in cut norm. The lack of players’ exchangeability necessitates a careful definition of approximate equilibrium, allowing heterogeneity among the players’ approximation errors, and we show how various regularity properties of the model inputs and underlying graphon lead naturally to different strengths of approximation. Funding: D. Lacker was partially supported by the Air Force Office of Scientific Research [Grant FA9550-19-1-0291] and the National Science Foundation [Award DMS-2045328]. 
  2. Graphs are nowadays ubiquitous in the fields of signal processing and machine learning. As a tool used to express relationships between objects, graphs can be deployed to various ends: (i) clustering of vertices, (ii) semi-supervised classification of vertices, (iii) supervised classification of graph signals, and (iv) denoising of graph signals. However, in many practical cases graphs are not explicitly available and must therefore be inferred from data. Validation is a challenging endeavor that naturally depends on the downstream task for which the graph is learnt. Accordingly, it has often been difficult to compare the efficacy of different algorithms. In this work, we introduce several easy-to-use and publicly released benchmarks specifically designed to reveal the relative merits and limitations of graph inference methods. We also contrast some of the most prominent techniques in the literature.
  3. In modern relational machine learning it is common to encounter large graphs that arise via interactions or similarities between observations in many domains. Further, in many cases the target entities for analysis are actually signals on such graphs. We propose to compare and organize such datasets of graph signals by using an earth mover’s distance (EMD) with a geodesic cost over the underlying graph. Typically, EMD is computed by optimizing over the cost of transporting one probability distribution to another over an underlying metric space. However, this is inefficient when computing the EMD between many signals. Here, we propose an unbalanced graph EMD that efficiently embeds the unbalanced EMD on an underlying graph into an L1 space, whose metric we call unbalanced diffusion earth mover’s distance (UDEMD). Next, we show how this gives distances between graph signals that are robust to noise. Finally, we apply this to organizing patients based on clinical notes, embedding cells modeled as signals on a gene graph, and organizing genes modeled as signals over a large cell graph. In each case, we show that UDEMD-based embeddings find accurate distances that are highly efficient compared to other methods. 
  4. The scattering transform is a multilayered, wavelet-based transform initially introduced as a mathematical model of convolutional neural networks (CNNs) that has played a foundational role in our understanding of these networks’ stability and invariance properties. In subsequent years, there has been widespread interest in extending the success of CNNs to data sets with non-Euclidean structure, such as graphs and manifolds, leading to the emerging field of geometric deep learning. In order to improve our understanding of the architectures used in this new field, several papers have proposed generalizations of the scattering transform for non-Euclidean data structures such as undirected graphs and compact Riemannian manifolds without boundary. Analogous to the original scattering transform, these works prove that these variants of the scattering transform have desirable stability and invariance properties and aim to improve our understanding of the neural networks used in geometric deep learning. In this paper, we introduce a general, unified model for geometric scattering on measure spaces. Our proposed framework includes previous work on compact Riemannian manifolds without boundary and undirected graphs as special cases but also applies to more general settings such as directed graphs, signed graphs, and manifolds with boundary. We propose a new criterion that identifies to which groups a useful representation should be invariant and show that this criterion is sufficient to guarantee that the scattering transform has desirable stability and invariance properties. Additionally, we consider finite measure spaces that are obtained from randomly sampling an unknown manifold. We propose two methods for constructing a data-driven graph on which the associated graph scattering transform approximates the scattering transform on the underlying manifold. Moreover, we use a diffusion-maps based approach to prove quantitative estimates on the rate of convergence of one of these approximations as the number of sample points tends to infinity. Lastly, we showcase the utility of our method on spherical images, a directed graph stochastic block model, and on high-dimensional single-cell data.
  5. Graph signal processing (GSP) techniques are powerful tools that model complex relationships within large datasets, and are now used in a myriad of applications in different areas including data science, communication networks, epidemiology, and sociology. Simple graphs can only model pairwise relationships among data, which prevents their application in modeling networks with higher-order relationships. For this reason, some efforts have been made to generalize well-known graph signal processing techniques to more complex graphs such as hypergraphs, which allow capturing higher-order relationships among data. In this article, we provide a new hypergraph signal processing framework (t-HGSP) based on a novel tensor-tensor product algebra that has emerged as a powerful tool for preserving the intrinsic structures of tensors. The proposed framework allows the generalization of traditional GSP techniques while keeping the dimensionality characteristic of the complex systems represented by hypergraphs. To this end, the core elements of the t-HGSP framework are introduced, including the shifting operators and the hypergraph signal. The hypergraph Fourier space is also defined, followed by the concept of bandlimited signals and sampling. In our experiments, we demonstrate the benefits of our approach in applications such as clustering and denoising.