NSF PAR Search | NSF Public Access Repository


Learning low-rank latent mesoscale structures in networks

https://doi.org/10.1038/s41467-023-42859-2

Lyu, Hanbaek; Kureh, Yacoub H.; Vendrow, Joshua; Porter, Mason A. (January 2024, Nature Communications)

Abstract Researchers in many fields use networks to represent interactions between entities in complex systems. To study the large-scale behavior of complex systems, it is useful to examine mesoscale structures in networks as building blocks that influence such behavior. In this paper, we present an approach to describe low-rank mesoscale structures in networks. We find that many real-world networks possess a small set of latent motifs that effectively approximate most subgraphs at a fixed mesoscale. Such low-rank mesoscale structures allow one to reconstruct networks by approximating subgraphs of a network using combinations of latent motifs. Employing subgraph sampling and nonnegative matrix factorization enables the discovery of these latent motifs. The ability to encode and reconstruct networks using a small set of latent motifs has many applications in network analysis, including network comparison, network denoising, and edge inference.
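The pipeline named in this abstract (sample small subgraphs, then factor them with nonnegative matrix factorization) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: plain random walks stand in for the paper's subgraph-sampling scheme, the standard multiplicative-update rule stands in for whatever NMF solver the paper uses, and the names `sample_walk_patches` and `nmf` are hypothetical.

```python
import numpy as np


def sample_walk_patches(A, k, n_samples, rng):
    """Stack flattened k-by-k adjacency patches seen along random k-node walks.

    Assumption of this sketch: A is a symmetric 0/1 adjacency matrix, and a
    plain random walk replaces the paper's subgraph-sampling scheme.
    """
    if k < 2 or not A.any():
        raise ValueError("need k >= 2 and a graph with at least one edge")
    n = A.shape[0]
    patches = []
    while len(patches) < n_samples:
        walk = [int(rng.integers(n))]
        for _ in range(k - 1):
            nbrs = np.flatnonzero(A[walk[-1]])
            if nbrs.size == 0:  # isolated start node: discard this walk
                break
            walk.append(int(rng.choice(nbrs)))
        if len(walk) == k:
            # Adjacency pattern induced on the walk's nodes, as a vector.
            patches.append(A[np.ix_(walk, walk)].ravel())
    return np.array(patches, dtype=float).T  # shape (k*k, n_samples)


def nmf(X, r, n_iter, rng, eps=1e-10):
    """Approximate X by W @ H with W, H >= 0 using multiplicative updates."""
    m, n = X.shape
    W = rng.random((m, r)) + eps
    H = rng.random((r, n)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 30
    A = np.zeros((n, n), dtype=int)
    for i in range(n):  # toy input: a cycle graph on 30 nodes
        A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1
    X = sample_walk_patches(A, k=5, n_samples=200, rng=rng)
    W, H = nmf(X, r=4, n_iter=200, rng=rng)
    print(W.shape, H.shape)
```

Each column of `W`, reshaped to k-by-k, plays the role of one latent motif in this sketch, and the columns of `H` give the nonnegative weights that combine motifs to approximate each sampled patch.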
Phase transition in one-dimensional excitable media with variable interaction range

Aguirre, Ander; Lyu, Hanbaek; Sivakoff, David (August 2024, arXiv)

We investigate two discrete models of excitable media on a one-dimensional integer lattice ℤ: the κ-color Cyclic Cellular Automaton (CCA) and the κ-color Firefly Cellular Automaton (FCA). In both models, sites are assigned uniformly random colors from ℤ/κℤ. Neighboring sites with colors within a specified interaction range r tend to synchronize their colors upon a particular local event of 'excitation'. We establish that there are three phases of CCA/FCA on ℤ as we vary the interaction range r. First, if r is too small (undercoupled), there are too many non-interacting pairs of colors, and the whole graph ℤ will be partitioned into non-interacting intervals of sites with no excitation within each interval. If r is within a sweet spot (critical), then we show the system clusters into ever-growing monochromatic intervals. For the critical interaction range r=⌊κ/2⌋, we show the density of edges of differing colors at time t is Θ(t^{-1/2}) and each site excites Θ(t^{1/2}) times up to time t. Lastly, if r is too large (overcoupled), then neighboring sites can excite each other and such 'defects' will generate waves of excitation at a constant rate so that each site will get excited at least at a linear rate. For the special case of FCA with r=⌊κ/2⌋+1, we show that every site will become (κ+1)-periodic eventually.
Full Text Available
Supervised Matrix Factorization: Local Landscape Analysis and Applications

Lee, Joowon; Lyu, Hanbaek; Yao, Weixin (July 2024, Proceedings of Machine Learning Research)

Supervised matrix factorization (SMF) is a classical machine learning method that seeks low-dimensional feature extraction and classification tasks at the same time. Training an SMF model involves solving a non-convex and factor-wise constrained optimization problem with at least three blocks of parameters. Due to the high non-convexity and constraints, theoretical understanding of the optimization landscape of SMF has been limited. In this paper, we provide an extensive local landscape analysis for SMF and derive several theoretical and practical applications. Analyzing diagonal blocks of the Hessian naturally leads to a block coordinate descent (BCD) algorithm with adaptive step sizes. We provide global convergence and iteration complexity guarantees for this algorithm. Full Hessian analysis gives minimum L2-regularization to guarantee local strong convexity and robustness of parameters. We establish a local estimation guarantee under a statistical SMF model. We also propose a novel GPU-friendly neural implementation of the BCD algorithm and validate our theoretical findings through numerical experiments. Our work contributes to a deeper understanding of SMF optimization, offering insights into the optimization landscape and providing practical solutions to enhance its performance.
Full Text Available
Sparseness-constrained nonnegative tensor factorization for detecting topics at different time scales

https://doi.org/10.3389/fams.2024.1287074

Kassab, Lara; Kryshchenko, Alona; Lyu, Hanbaek; Molitor, Denali; Needell, Deanna; Rebrova, Elizaveta; Yuan, Jiahong (July 2024, Frontiers in Applied Mathematics and Statistics)

Temporal text data, such as news articles or Twitter feeds, often comprises a mixture of long-lasting trends and transient topics. Effective topic modeling strategies should detect both types and clearly locate them in time. We first demonstrate that nonnegative CANDECOMP/PARAFAC decomposition (NCPD) can automatically identify topics of variable persistence. We then introduce sparseness-constrained NCPD (S-NCPD) and its online variant to control the duration of the detected topics more effectively and efficiently, along with theoretical analysis of the proposed algorithms. Through an extensive study on both semi-synthetic and real-world datasets, we find that our S-NCPD and its online variant can identify both short- and long-lasting temporal topics in a quantifiable and controlled manner, which traditional topic modeling methods are unable to achieve. Additionally, the online variant of S-NCPD shows a faster reduction in reconstruction error and results in more coherent topics compared to S-NCPD, thus achieving both computational efficiency and quality of the resulting topics. Our findings indicate that S-NCPD and its online variant are effective tools for detecting and controlling the duration of topics in temporal text data, providing valuable insights into both persistent and transient trends.
Full Text Available
Four-Parameter Coalescing Ballistic Annihilation

https://doi.org/10.1007/s10955-024-03305-9

Affeld, Kimberly; Dean, Christian; Junge, Matthew; Lyu, Hanbaek; Panish, Connor; Reeves, Lily (July 2024, Journal of Statistical Physics)

In coalescing ballistic annihilation, infinitely many particles move with fixed velocities across the real line and, upon colliding, either mutually annihilate or generate a new particle. We compute the critical density in symmetric three-velocity systems with four-parameter reaction equations.
Full Text Available
SCALING LIMIT OF SOLITON LENGTHS IN A MULTICOLOR BOX-BALL SYSTEM

Lewis, Joel; Lyu, Hanbaek; Pylyavskyy, Pablo; Sen, Arnab (June 2024, Forum of Mathematics, Sigma)

The box-ball systems are integrable cellular automata whose long-time behavior is characterized by soliton solutions, with rich connections to other integrable systems such as the Korteweg-de Vries equation. In this paper, we consider a multicolor box-ball system with two types of random initial configurations and obtain sharp scaling limits of the soliton lengths as the system size tends to infinity. We obtain a sharp scaling limit of soliton lengths that turns out to be different from the single color case as established in [LLP20]. A large part of our analysis is devoted to studying the associated carrier process, which is a multi-dimensional Markov chain on the orthant, whose excursions and running maxima are closely related to soliton lengths. We establish the sharp scaling of its ruin probabilities, Skorokhod decomposition, strong law of large numbers, and weak diffusive scaling limit to a semimartingale reflecting Brownian motion with explicit parameters. We also establish and utilize complementary descriptions of the soliton lengths and numbers in terms of the modified Greene-Kleitman invariants for the box-ball systems and associated circular exclusion processes.
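For orientation, one time step of the classical single-color box-ball system with an unbounded carrier can be sketched as follows. This is only the single-color baseline, assumed here for illustration; the multicolor update rule and the multi-dimensional carrier process studied in the paper are not reproduced, and `box_ball_step` is a hypothetical name.

```python
def box_ball_step(boxes):
    """Apply one update of the single-color box-ball system.

    boxes is a list of 0/1 values (1 = ball). A carrier sweeps left to right,
    picks up every ball it meets, and drops one ball into each empty box it
    meets while it is loaded. Returns the new configuration and the carrier
    load recorded after each box of the input.
    """
    carrier = 0
    new_boxes, loads = [], []
    for ball in boxes:
        if ball:
            carrier += 1          # pick up the ball
            new_boxes.append(0)
        elif carrier > 0:
            carrier -= 1          # drop one ball into the empty box
            new_boxes.append(1)
        else:
            new_boxes.append(0)
        loads.append(carrier)
    # Balls still carried at the right end land in the next empty boxes.
    new_boxes.extend([1] * carrier)
    return new_boxes, loads


if __name__ == "__main__":
    state = [1, 1, 1, 0, 0, 0, 0]
    state, loads = box_ball_step(state)
    print(state, loads)
```

In this toy run a block of three balls moves three boxes to the right in one step, and the recorded loads form one excursion of the carrier, which is the single-color analogue of the carrier process mentioned in the abstract.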
Full Text Available
A latent linear model for nonlinear coupled oscillators on graphs

Goyal, Agam; Wu, Zhaoxing; Yim, Richard P; Chen, Binhao; Xu, Zihong; Lyu, Hanbaek (November 2023, arXiv)

A system of coupled oscillators on an arbitrary graph is locally driven by the tendency to mutual synchronization between nearby oscillators, but can and often does exhibit nonlinear behavior on the whole graph. Understanding such nonlinear behavior has been a key challenge in predicting whether all oscillators in such a system will eventually synchronize. In this paper, we demonstrate that, surprisingly, such nonlinear behavior of coupled oscillators can be effectively linearized in certain latent dynamic spaces. The key insight is that there is a small number of ‘latent dynamics filters’, each with a specific association with synchronizing and non-synchronizing dynamics on subgraphs so that any observed dynamics on subgraphs can be approximated by a suitable linear combination of such elementary dynamic patterns. Taking an ensemble of subgraph-level predictions provides an interpretable predictor for whether the system on the whole graph reaches global synchronization. We propose algorithms based on supervised matrix factorization to learn such latent dynamics filters. We demonstrate that our method performs competitively in synchronization prediction tasks against baselines and black-box classification algorithms, despite its simple and interpretable architecture.
Full Text Available
Exponentially Convergent Algorithms for Supervised Matrix Factorization

Lee, Joowon; Lyu, Hanbaek; Yao, Weixin (August 2023, Advances in Neural Information Processing Systems)

Supervised matrix factorization (SMF) is a classical machine learning method that simultaneously seeks feature extraction and classification tasks, which are not necessarily a priori aligned objectives. Our goal is to use SMF to learn low-rank latent factors that offer interpretable, data-reconstructive, and class-discriminative features, addressing challenges posed by high-dimensional data. Training an SMF model involves solving a nonconvex and possibly constrained optimization problem with at least three blocks of parameters. Known algorithms are either heuristic or provide weak convergence guarantees for special cases. In this paper, we provide a novel framework that ‘lifts’ SMF as a low-rank matrix estimation problem in a combined factor space and propose an efficient algorithm that provably converges exponentially fast to a global minimizer of the objective with arbitrary initialization under mild assumptions. Our framework applies to a wide range of SMF-type problems for multi-class classification with auxiliary features. To showcase an application, we demonstrate that our algorithm successfully identified well-known cancer-associated gene groups for various cancers.
Full Text Available
Three-velocity coalescing ballistic annihilation

https://doi.org/10.1214/23-EJP948

Benitez, Luis; Junge, Matthew; Lyu, Hanbaek; Redman, Maximus; Reeves, Lily (January 2023, Electronic Journal of Probability)

Three-velocity ballistic annihilation is an interacting system in which stationary, left-, and right-moving particles are placed at random throughout the real line and mutually annihilate upon colliding. We introduce a coalescing variant in which collisions may generate new particles. For a symmetric three-parameter family of such systems, we compute the survival probability of stationary particles at a given initial density. This allows us to describe a phase-transition for stationary particle survival.
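The non-coalescing baseline described in the first sentence of this abstract can be simulated on a finite window with a short event-driven sketch. The assumptions here are mine: unit-rate exponential spacings for the random placement, a stationary particle with probability `p` and otherwise a left- or right-mover with equal probability, and a finite number of particles in place of the whole real line (so particles near the ends miss collisions they would have on the line). The coalescing reactions studied in the paper are not modeled, and both function names are hypothetical.

```python
import random


def random_configuration(n, p, rng):
    """Place n particles with exponential gaps and velocities in {-1, 0, +1}."""
    positions, x = [], 0.0
    for _ in range(n):
        x += rng.expovariate(1.0)
        positions.append(x)
    velocities = [0 if rng.random() < p else rng.choice((-1, 1)) for _ in range(n)]
    return positions, velocities


def ballistic_annihilation(positions, velocities):
    """Return the indices of particles that are never annihilated.

    positions must be strictly increasing. Particles keep their velocity until
    they collide, and both particles in a collision are removed. The earliest
    remaining collision is always between neighbours, so it is enough to scan
    adjacent surviving pairs and remove the pair that meets first.
    """
    alive = list(range(len(positions)))
    while True:
        best_time, best_slot = None, None
        for slot in range(len(alive) - 1):
            i, j = alive[slot], alive[slot + 1]
            closing_speed = velocities[i] - velocities[j]
            if closing_speed > 0:  # the pair is approaching
                t = (positions[j] - positions[i]) / closing_speed
                if best_time is None or t < best_time:
                    best_time, best_slot = t, slot
        if best_slot is None:      # no approaching neighbours remain
            return alive
        del alive[best_slot:best_slot + 2]


if __name__ == "__main__":
    rng = random.Random(0)
    pos, vel = random_configuration(500, p=0.3, rng=rng)
    survivors = ballistic_annihilation(pos, vel)
    stationary_left = sum(1 for i in survivors if vel[i] == 0)
    print(len(survivors), stationary_left)
```

Sweeping `p` and recording the fraction of stationary particles among the survivors gives a rough finite-window proxy for the survival probability discussed in the abstract.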
Full Text Available
Learning to predict synchronization of coupled oscillators on randomly generated graphs

https://doi.org/10.1038/s41598-022-18953-8

Bassi, Hardeep; Yim, Richard P.; Vendrow, Joshua; Koduluka, Rohith; Zhu, Cherlin; Lyu, Hanbaek (December 2022, Scientific Reports)

Abstract Suppose we are given a system of coupled oscillators on an unknown graph along with the trajectory of the system during some period. Can we predict whether the system will eventually synchronize? Even with a known underlying graph structure, this is an important yet analytically intractable question in general. In this work, we take an alternative approach to the synchronization prediction problem by viewing it as a classification problem based on the fact that any given system will eventually synchronize or converge to a non-synchronizing limit cycle. By only using some basic statistics of the underlying graphs such as edge density and diameter, our method can achieve perfect accuracy when there is a significant difference in the topology of the underlying graphs between the synchronizing and the non-synchronizing examples. However, in the problem setting where these graph statistics cannot distinguish the two classes very well (e.g., when the graphs are generated from the same random graph model), we find that pairing a few iterations of the initial dynamics along with the graph statistics as the input to our classification algorithms can lead to significant improvement in accuracy, far exceeding what is known by the classical oscillator theory. More surprisingly, we find that in almost all such settings, dropping out the basic graph statistics and training our algorithms with only initial dynamics achieves nearly the same accuracy. We demonstrate our method on three models of continuous and discrete coupled oscillators: the Kuramoto model, Firefly Cellular Automata, and the Greenberg-Hastings model. Finally, we also propose an “ensemble prediction” algorithm that successfully scales our method to large graphs by training on dynamics observed from multiple random subgraphs.
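To make the prediction task concrete, the sketch below generates the kind of data the abstract refers to for one of the three models, the Kuramoto model: a short trajectory on a graph plus a synchronization label. It assumes identical natural frequencies (zero in a rotating frame), forward-Euler time stepping, and a threshold on the standard order parameter as the label; none of these choices is taken from the paper, and the function names are hypothetical.

```python
import cmath
import math


def kuramoto_step(theta, adj, coupling, dt):
    """One forward-Euler step of d(theta_i)/dt = K * sum_j sin(theta_j - theta_i).

    adj[i] lists the neighbours of node i. Identical natural frequencies are
    assumed, so no frequency term appears.
    """
    return [
        theta[i] + dt * coupling * sum(math.sin(theta[j] - theta[i]) for j in adj[i])
        for i in range(len(theta))
    ]


def order_parameter(theta):
    """Return |mean of exp(i*theta)|, which equals 1 when all phases coincide."""
    return abs(sum(cmath.exp(1j * t) for t in theta)) / len(theta)


def simulate(theta0, adj, coupling=1.0, dt=0.05, n_steps=2000):
    """Iterate the Euler step and return the final phases."""
    theta = list(theta0)
    for _ in range(n_steps):
        theta = kuramoto_step(theta, adj, coupling, dt)
    return theta


if __name__ == "__main__":
    n = 5
    adj = [[j for j in range(n) if j != i] for i in range(n)]  # complete graph
    theta0 = [0.0, 0.3, 0.6, 0.9, 1.2]
    initial_dynamics = simulate(theta0, adj, n_steps=20)  # early part of the trajectory
    final = simulate(theta0, adj)
    label = order_parameter(final) > 0.99                 # assumed labelling rule
    print(order_parameter(initial_dynamics), label)
```

A classifier in the spirit of the abstract would take the early part of the trajectory (and, optionally, graph statistics) as input features and the label as the target.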