This content will become publicly available on August 12, 2026

Title: Exploring the energy landscape of RBMs: reciprocal space insights into bosons, hierarchical learning and symmetry breaking
Abstract Deep generative models have become ubiquitous due to their ability to learn and sample from complex distributions. Despite the proliferation of various frameworks, the relationships among these models remain largely unexplored, a gap that hinders the development of a unified theory of AI learning. In this work, we address two central challenges: clarifying the connections between different deep generative models and deepening our understanding of their learning mechanisms. We focus on Restricted Boltzmann Machines (RBMs), a class of generative models known for their universal approximation capabilities for discrete distributions. By introducing a reciprocal space formulation for RBMs, we reveal a connection between these models, diffusion processes, and systems of coupled bosons. Our analysis shows that at initialization, the RBM operates at a saddle point, where the local curvature is determined by the singular values of the weight matrix, whose distribution follows the Marčenko-Pastur law and exhibits rotational symmetry. During training, this rotational symmetry is broken due to hierarchical learning, where different degrees of freedom progressively capture features at multiple levels of abstraction. This leads to a symmetry breaking in the energy landscape, reminiscent of Landau’s theory, that is characterized by the singular values and the eigenvector matrix of the weight matrix. We derive the corresponding free energy in a mean-field approximation. We show that in the limit of an infinite-size RBM, the reciprocal variables are Gaussian distributed. Our findings indicate that in this regime, there will be some modes for which the diffusion process will not converge to the Boltzmann distribution. To illustrate our results, we trained replicas of RBMs with different hidden layer sizes using the MNIST dataset. Our findings not only bridge the gap between disparate generative frameworks but also shed light on the fundamental processes underpinning learning in deep generative models.
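The abstract's claim about initialization can be checked numerically with a small sketch. The example below is not taken from the paper: it assumes a weight matrix with i.i.d. Gaussian entries, and the layer sizes (784 visible, 196 hidden), the standard deviation `sigma`, and the helper name `marchenko_pastur_edges` are all illustrative choices. It compares the squared singular values of such a matrix with the support edges predicted by the Marčenko-Pastur law.

```python
import numpy as np

def marchenko_pastur_edges(ratio, sigma=1.0):
    """Support edges of the Marchenko-Pastur law for the eigenvalues of
    W.T @ W / n_rows, where W is n_rows x n_cols with i.i.d. zero-mean
    entries of variance sigma**2 and ratio = n_cols / n_rows <= 1."""
    lower = sigma**2 * (1.0 - np.sqrt(ratio)) ** 2
    upper = sigma**2 * (1.0 + np.sqrt(ratio)) ** 2
    return lower, upper

# Assumed sizes: 784 visible units (MNIST pixels) and 196 hidden units.
# Assumed initialization: i.i.d. Gaussian weights with standard deviation sigma.
n_visible, n_hidden, sigma = 784, 196, 0.01
rng = np.random.default_rng(0)
W = rng.normal(0.0, sigma, size=(n_visible, n_hidden))

# Singular values of W; their squares divided by n_visible are the
# eigenvalues of W.T @ W / n_visible.
singular_values = np.linalg.svd(W, compute_uv=False)
eigenvalues = singular_values**2 / n_visible

lower, upper = marchenko_pastur_edges(n_hidden / n_visible, sigma)
print(f"empirical range: [{eigenvalues.min():.3e}, {eigenvalues.max():.3e}]")
print(f"predicted edges: [{lower:.3e}, {upper:.3e}]")
```

At finite size the empirical range only approximately matches the predicted edges; the agreement tightens as both layer sizes grow at a fixed ratio. A trained weight matrix would be expected to deviate from this spectrum, which is the symmetry breaking the abstract describes.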
Award ID(s):
2212550
PAR ID:
10651647
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
IOP Publishing
Date Published:
Journal Name:
Machine Learning: Science and Technology
Volume:
6
Issue:
3
ISSN:
2632-2153
Page Range / eLocation ID:
035030
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Particle collisions at accelerators like the Large Hadron Collider (LHC), recorded by experiments such as ATLAS and CMS, enable precise standard model measurements and searches for new phenomena. Simulating these collisions significantly influences experiment design and analysis but incurs immense computational costs, projected at millions of CPU-years annually during the high luminosity LHC (HL-LHC) phase. Currently, simulating a single event with Geant4 consumes around 1000 CPU seconds, with calorimeter simulations especially demanding. To address this, we propose a conditioned quantum-assisted generative model, integrating a conditioned variational autoencoder (VAE) and a conditioned restricted Boltzmann machine (RBM). Our RBM architecture is tailored for D-Wave’s Pegasus-structured Advantage quantum annealer for sampling, leveraging the flux bias for conditioning. This approach combines classical RBMs as universal approximators for discrete distributions with quantum annealing’s speed and scalability. We also introduce an adaptive method for efficiently estimating effective inverse temperature, and validate our framework on Dataset 2 of CaloChallenge.
  2. Abstract Deep generative learning can be used not only for generating new data with statistical characteristics derived from input data but also for anomaly detection, by separating nominal and anomalous instances based on their reconstruction quality. In this paper, we explore the performance of three unsupervised deep generative models, namely variational autoencoders (VAEs) with Gaussian, Bernoulli, and Boltzmann priors, in detecting anomalies in multivariate time series of commercial-flight operations. We created two VAE models with discrete latent variables (DVAEs), one with a factorized Bernoulli prior and one with a restricted Boltzmann machine (RBM) with novel positive-phase architecture as prior, because of the demand for discrete-variable models in machine-learning applications and because the integration of quantum devices based on two-level quantum systems requires such models. To the best of our knowledge, our work is the first that applies DVAE models to anomaly-detection tasks in the aerospace field. The DVAE with RBM prior, using a relatively simple (and classically or quantum-mechanically enhanceable) sampling technique for the evolution of the RBM’s negative phase, performed better in detecting anomalies than the Bernoulli DVAE and on par with the Gaussian model, which has a continuous latent space. The transfer of a model to an unseen dataset with the same anomaly but without re-tuning of hyperparameters or re-training noticeably impaired anomaly-detection performance, but performance could be improved by post-training on the new dataset. The RBM model was robust to change of anomaly type and phase of flight during which the anomaly occurred. Our studies demonstrate the competitiveness of a discrete deep generative model with its Gaussian counterpart on anomaly-detection problems. Moreover, the DVAE model with RBM prior can be easily integrated with quantum sampling by outsourcing its generative process to measurements of quantum states obtained from a quantum annealer or gate-model device.
  3. Magnetic Random-Access Memory (MRAM) based p-bit neuromorphic computing devices are garnering increasing interest as a means to compactly and efficiently realize machine learning operations in Restricted Boltzmann Machines (RBMs). When embedded within an RBM resistive crossbar array, the p-bit based neuron realizes a tunable sigmoidal activation function. Since the stochasticity of activation is dependent on the energy barrier of the MRAM device, it is essential to assess the impact of process variation on the voltage-dependent behavior of the sigmoid function. Other influential performance factors arise from the effect of varying energy barriers on power consumption, requiring a simulation environment to facilitate the multi-objective optimization of device and network parameters. Herein, transportable Python scripts are developed to analyze how output variation under changes in device dimensions affects the accuracy of machine learning applications. Evaluation with RBM circuits using the MNIST dataset reveals impacts and limits for processing variation of device fabrication in terms of the resulting energy vs. accuracy tradeoffs, and the resulting simulation framework is available via a Creative Commons license.
  4. As we approach the High Luminosity Large Hadron Collider (HL-LHC), set to begin collisions by the end of this decade, it is clear that the computational demands of traditional collision simulations have become untenably high. Current methods, relying heavily on first-principles Monte Carlo simulations for event showers in calorimeters, are estimated to require millions of CPU-years annually, a demand that far exceeds current capabilities. This bottleneck presents a unique opportunity for breakthroughs in computational physics through the integration of generative AI with quantum computing technologies. We propose a Quantum-Assisted deep generative model. In particular, we combine a variational autoencoder (VAE) with a Restricted Boltzmann Machine (RBM) embedded in its latent space as a prior. The RBM in latent space provides further expressiveness compared to a legacy VAE, where the prior is a fixed Gaussian distribution. By crafting the RBM couplings, we leverage D-Wave’s Quantum Annealer to significantly speed up the shower sampling time. By combining classical and quantum computing, this framework sets a path towards utilizing large-scale quantum simulations as priors in deep generative models and demonstrates their ability to generate high-quality synthetic data for the HL-LHC experiments.
  5. Li, Jinyan (Ed.)
    Selection protocols such as SELEX, where molecules are selected over multiple rounds for their ability to bind to a target of interest, are popular methods for obtaining binders for diagnostic and therapeutic purposes. We show that Restricted Boltzmann Machines (RBMs), an unsupervised two-layer neural network architecture, can successfully be trained on sequence ensembles from single rounds of SELEX experiments for thrombin aptamers. RBMs assign scores to sequences that can be directly related to their fitnesses estimated through experimental enrichment ratios. Hence, RBMs trained from sequence data at a given round can be used to predict the effects of selection at later rounds. Moreover, the parameters of the trained RBMs are interpretable and identify functional features contributing most to sequence fitness. To exploit the generative capabilities of RBMs, we introduce two different training protocols: one taking into account sequence counts, capable of identifying the few best binders, and another based on unique sequences only, generating more diverse binders. We then use the RBM models to generate novel aptamers with putative disruptive mutations or good binding properties, and validate the generated sequences with gel shift assay experiments. Finally, we compare the RBM’s performance with different supervised learning approaches that include random forests and several deep neural network architectures.