
Title: Correlator convolutional neural networks as an interpretable architecture for image-like quantum matter data
Abstract

Image-like data from quantum systems promises to offer greater insight into the physics of correlated quantum matter. However, the traditional framework of condensed matter physics lacks principled approaches for analyzing such data. Machine learning models are a powerful theoretical tool for analyzing image-like data, including many-body snapshots from quantum simulators. Recently, they have successfully distinguished between simulated snapshots that are indistinguishable on the basis of conventional one- and two-point correlation functions. Thus far, the complexity of these models has inhibited new physical insights from such approaches. Here, we develop a set of nonlinearities for use in a neural network architecture that discovers features in the data which are directly interpretable in terms of physical observables. Applied to simulated snapshots produced by two candidate theories approximating the doped Fermi-Hubbard model, we uncover that the key distinguishing features are fourth-order spin-charge correlators. Our approach lends itself well to the construction of simple, versatile, end-to-end interpretable architectures, thus paving the way for new physical insights from machine learning studies of experimental and numerical data.
Award ID(s): 1934714, 1934598
NSF-PAR ID: 10294954
Journal Name: Nature Communications
Volume: 12
Issue: 1
ISSN: 2041-1723
Format(s): Medium: X
Sponsoring Org: National Science Foundation
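To make the abstract's correlator nonlinearity concrete, here is a minimal Python sketch, an illustration under stated assumptions rather than the authors' full architecture: for a single filter f, it builds first- and second-order correlator feature maps from products of snapshot values at distinct sites inside the filter window, instead of applying a standard pointwise activation. The snapshot, filter, and sizes below are made up, and the real model uses multiple measurement channels, learnable filters, and higher-order terms.

```python
# Minimal sketch of a "correlator" nonlinearity for a single filter f: feature
# maps built from products of snapshot values at distinct sites in the filter
# window, rather than a pointwise activation. Illustration only; the sizes and
# random filter below are made up.
import numpy as np
from scipy.signal import correlate2d

def correlator_features(snapshot, f):
    """First- and second-order correlator maps for one channel and one filter."""
    c1 = correlate2d(snapshot, f, mode="valid")                # sum_a f_a s_{x+a}
    self_terms = correlate2d(snapshot ** 2, f ** 2, mode="valid")
    # c1**2 counts all ordered pairs (a, b); subtracting the a == b terms leaves
    # only genuine two-site products: sum_{a != b} f_a f_b s_{x+a} s_{x+b}
    c2 = c1 ** 2 - self_terms
    return c1, c2

# toy usage on a random +/-1 "spin snapshot" with a 3x3 filter
rng = np.random.default_rng(0)
snap = rng.choice([-1.0, 1.0], size=(16, 16))
filt = rng.normal(size=(3, 3))
c1, c2 = correlator_features(snap, filt)
print(c1.shape, c2.shape)    # (14, 14) (14, 14)
```

In the same spirit, higher orders can be built by subtracting lower-order self-terms, which is what keeps every learned feature expressible as an explicit multi-site correlator rather than an opaque activation.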
More Like this
  1. INTRODUCTION Solving quantum many-body problems, such as finding ground states of quantum systems, has far-reaching consequences for physics, materials science, and chemistry. Classical computers have facilitated many profound advances in science and technology, but they often struggle to solve such problems. Scalable, fault-tolerant quantum computers will be able to solve a broad array of quantum problems but are unlikely to be available for years to come. Meanwhile, how can we best exploit our powerful classical computers to advance our understanding of complex quantum systems? Recently, classical machine learning (ML) techniques have been adapted to investigate problems in quantum many-body physics. So far, these approaches are mostly heuristic, reflecting the general paucity of rigorous theory in ML. Although they have been shown to be effective in some intermediate-size experiments, these methods are generally not backed by convincing theoretical arguments to ensure good performance.

RATIONALE A central question is whether classical ML algorithms can provably outperform non-ML algorithms in challenging quantum many-body problems. We provide a concrete answer by devising and analyzing classical ML algorithms for predicting the properties of ground states of quantum systems. We prove that these ML algorithms can efficiently and accurately predict ground-state properties of gapped local Hamiltonians, after learning from data obtained by measuring other ground states in the same quantum phase of matter. Furthermore, under a widely accepted complexity-theoretic conjecture, we prove that no efficient classical algorithm that does not learn from data can achieve the same prediction guarantee. By generalizing from experimental data, ML algorithms can solve quantum many-body problems that could not be solved efficiently without access to experimental data.

RESULTS We consider a family of gapped local quantum Hamiltonians, where the Hamiltonian H(x) depends smoothly on m parameters (denoted by x). The ML algorithm learns from a set of training data consisting of sampled values of x, each accompanied by a classical representation of the ground state of H(x). These training data could be obtained from either classical simulations or quantum experiments. During the prediction phase, the ML algorithm predicts a classical representation of ground states for Hamiltonians different from those in the training data; ground-state properties can then be estimated using the predicted classical representation. Specifically, our classical ML algorithm predicts expectation values of products of local observables in the ground state, with a small error when averaged over the value of x. The run time of the algorithm and the amount of training data required both scale polynomially in m and linearly in the size of the quantum system. Our proof of this result builds on recent developments in quantum information theory, computational learning theory, and condensed matter theory. Furthermore, under the widely accepted conjecture that nondeterministic polynomial-time (NP)-complete problems cannot be solved in randomized polynomial time, we prove that no polynomial-time classical algorithm that does not learn from data can match the prediction performance achieved by the ML algorithm. In a related contribution using similar proof techniques, we show that classical ML algorithms can efficiently learn how to classify quantum phases of matter. In this scenario, the training data consist of classical representations of quantum states, where each state carries a label indicating whether it belongs to phase A or phase B. The ML algorithm then predicts the phase label for quantum states that were not encountered during training. The classical ML algorithm not only classifies phases accurately, but also constructs an explicit classifying function. Numerical experiments verify that our proposed ML algorithms work well in a variety of scenarios, including Rydberg atom systems, two-dimensional random Heisenberg models, symmetry-protected topological phases, and topologically ordered phases.

CONCLUSION We have rigorously established that classical ML algorithms, informed by data collected in physical experiments, can effectively address some quantum many-body problems. These rigorous results boost our hopes that classical ML trained on experimental data can solve practical problems in chemistry and materials science that would be too hard to solve using classical processing alone. Our arguments build on the concept of a succinct classical representation of quantum states derived from randomized Pauli measurements. Although some quantum devices lack the local control needed to perform such measurements, we expect that other classical representations could be exploited by classical ML with similarly powerful results. How can we make use of accessible measurement data to predict properties reliably? Answering such questions will expand the reach of near-term quantum platforms.

Classical algorithms for quantum many-body problems (figure caption). Classical ML algorithms learn from training data, obtained from either classical simulations or quantum experiments. Then, the ML algorithm produces a classical representation for the ground state of a physical system that was not encountered during training. Classical algorithms that do not learn from data may require substantially longer computation time to achieve the same task.
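As a rough illustration of the learning setting described above (and only that; this is not the paper's algorithm, its classical-shadow formalism, or its guarantees), the sketch below fits a standard kernel ridge regression that maps Hamiltonian parameters x to a noisily estimated ground-state expectation value and then predicts that observable for unseen x. The target function, noise level, and sample sizes are hypothetical stand-ins.

```python
# Hypothetical stand-in for the "predict ground-state properties from data" task:
# x are Hamiltonian parameters, y are noisy estimates of Tr(O rho(x)) such as one
# might obtain from measurement data, and a kernel method generalizes to new x.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(1)
m = 3                                          # number of Hamiltonian parameters
x_train = rng.uniform(-1, 1, size=(500, m))

def true_obs(x):                               # made-up smooth ground-state property
    return np.cos(np.pi * x[:, 0]) * np.sin(np.pi * x[:, 1])

y_train = true_obs(x_train) + 0.05 * rng.normal(size=len(x_train))  # "shot noise"

model = KernelRidge(kernel="rbf", gamma=1.0, alpha=1e-2).fit(x_train, y_train)

x_test = rng.uniform(-1, 1, size=(200, m))
err = np.abs(model.predict(x_test) - true_obs(x_test)).mean()
print(f"mean absolute prediction error: {err:.3f}")
```

The point of the toy is only the data flow: training pairs of parameters and measured properties go in, and predictions for previously unseen Hamiltonians come out.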
  2. BACKGROUND Optical sensing devices measure the rich physical properties of an incident light beam, such as its power, polarization state, spectrum, and intensity distribution. Most conventional sensors, such as power meters, polarimeters, spectrometers, and cameras, are monofunctional and bulky. For example, classical Fourier-transform infrared spectrometers and polarimeters, which characterize the optical spectrum in the infrared and the polarization state of light, respectively, can occupy a considerable portion of an optical table. Over the past decade, the development of integrated sensing solutions by using miniaturized devices together with advanced machine-learning algorithms has accelerated rapidly, and optical sensing research has evolved into a highly interdisciplinary field that encompasses devices and materials engineering, condensed matter physics, and machine learning. To this end, future optical sensing technologies will benefit from innovations in device architecture, discoveries of new quantum materials, demonstrations of previously uncharacterized optical and optoelectronic phenomena, and rapid advances in the development of tailored machine-learning algorithms.

ADVANCES Recently, a number of sensing and imaging demonstrations have emerged that differ substantially from conventional sensing schemes in the way that optical information is detected. A typical example is computational spectroscopy. In this new paradigm, a compact spectrometer first collectively captures the comprehensive spectral information of an incident light beam using multiple elements or a single element under different operational states and generates a high-dimensional photoresponse vector. An advanced algorithm then interprets the vector to achieve reconstruction of the spectrum. This scheme shifts the physical complexity of conventional grating- or interference-based spectrometers to computation. Moreover, many of the recent developments go well beyond optical spectroscopy, and we discuss them within a common framework, dubbed “geometric deep optical sensing.” The term “geometric” is intended to emphasize that in this sensing scheme, the physical properties of an unknown light beam and the corresponding photoresponses can be regarded as points in two respective high-dimensional vector spaces and that the sensing process can be considered to be a mapping from one vector space to the other. The mapping can be linear, nonlinear, or highly entangled; for the latter two cases, deep artificial neural networks represent a natural choice for the encoding and/or decoding processes, from which the term “deep” is derived. In addition to this classical geometric view, the quantum geometry of Bloch electrons in Hilbert space, such as Berry curvature and quantum metrics, is essential for the determination of the polarization-dependent photoresponses in some optical sensors. In this Review, we first present a general perspective of this sensing scheme from the viewpoint of information theory, in which the photoresponse measurement and the extraction of light properties are deemed as information-encoding and -decoding processes, respectively. We then discuss demonstrations in which a reconfigurable sensor (or an array thereof), enabled by device reconfigurability and the implementation of neural networks, can detect the power, polarization state, wavelength, and spatial features of an incident light beam.

OUTLOOK As increasingly more computing resources become available, optical sensing is becoming more computational, with device reconfigurability playing a key role. On the one hand, advanced algorithms, including deep neural networks, will enable effective decoding of high-dimensional photoresponse vectors, which reduces the physical complexity of sensors. Therefore, it will be important to integrate memory cells near or within sensors to enable efficient processing and interpretation of a large amount of photoresponse data. On the other hand, analog computation based on neural networks can be performed with an array of reconfigurable devices, which enables direct multiplexing of sensing and computing functions. We anticipate that these two directions will become the engineering frontier of future deep sensing research. On the scientific frontier, exploring quantum geometric and topological properties of new quantum materials in both linear and nonlinear light-matter interactions will enrich the information-encoding pathways for deep optical sensing. In addition, deep sensing schemes will continue to benefit from the latest developments in machine learning. Future highly compact, multifunctional, reconfigurable, and intelligent sensors and imagers will find applications in medical imaging, environmental monitoring, infrared astronomy, and many other areas of our daily lives, especially in the mobile domain and the internet of things.

Schematic of deep optical sensing (figure caption). The n-dimensional unknown information (w) is encoded into an m-dimensional photoresponse vector (x) by a reconfigurable sensor (or an array thereof), from which w′ is reconstructed by a trained neural network (n′ = n and w′ ≈ w). Alternatively, x may be directly deciphered to capture certain properties of w. Here, w, x, and w′ can be regarded as points in their respective high-dimensional vector spaces ℛ^n, ℛ^m, and ℛ^n′.
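A toy numerical sketch of the encode/decode mapping described above, under made-up assumptions: an n-dimensional spectrum w is encoded into an m-dimensional photoresponse vector x by a fixed set of responsivity curves (the linear case), and a small neural network is trained to decode x back into an estimate of w. The responsivities, spectra, and network size are synthetic placeholders, not taken from any device discussed in the Review.

```python
# Synthetic encode/decode example: x = R w (+ noise), then a neural network
# approximates the inverse map from photoresponses back to spectra.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(2)
n, m, n_samples = 64, 16, 5000                    # hypothetical dimensions
wavelengths = np.linspace(0.0, 1.0, n)

# m smooth, random responsivity curves (rows of the encoding matrix R)
R = np.array([np.exp(-((wavelengths - c) / 0.15) ** 2)
              for c in rng.uniform(0, 1, size=m)])

def random_spectra(k):                            # sums of a few Gaussian peaks
    centers = rng.uniform(0, 1, size=(k, 3, 1))
    widths = rng.uniform(0.02, 0.1, size=(k, 3, 1))
    amps = rng.uniform(0.2, 1.0, size=(k, 3, 1))
    return (amps * np.exp(-((wavelengths - centers) / widths) ** 2)).sum(axis=1)

W = random_spectra(n_samples)                     # unknown light properties w
X = W @ R.T + 0.01 * rng.normal(size=(n_samples, m))   # measured photoresponses x

decoder = MLPRegressor(hidden_layer_sizes=(128, 128), max_iter=500).fit(X, W)

W_test = random_spectra(200)
mae = np.abs(decoder.predict(W_test @ R.T) - W_test).mean()
print(f"reconstruction MAE: {mae:.3f}")
```

Here the encoding is deliberately linear so the geometry is easy to see; the nonlinear or entangled mappings discussed in the Review are where a learned decoder earns its keep.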
  3. Abstract

    Topological data analysis (TDA) is a tool from data science and mathematics that is beginning to make waves in environmental science. In this work, we seek to provide an intuitive and understandable introduction to a tool from TDA that is particularly useful for the analysis of imagery, namely, persistent homology. We briefly discuss the theoretical background but focus primarily on understanding the output of this tool and discussing what information can be gleaned from it. To this end, we frame our discussion around a guiding example of classifying satellite images from the sugar, fish, flower, and gravel dataset produced for the study of mesoscale organization of clouds by Rasp et al. We demonstrate how persistent homology and its vectorization, persistence landscapes, can be used in a workflow with a simple machine learning algorithm to obtain good results, and we explore in detail how we can explain this behavior in terms of image-level features. One of the core strengths of persistent homology is how interpretable it can be, so throughout this paper we discuss not just the patterns we find but why those results are to be expected given what we know about the theory of persistent homology. Our goal is that readers of this paper will leave with a better understanding of TDA and persistent homology, will be able to identify problems and datasets of their own for which persistent homology could be helpful, and will gain an understanding of the results they obtain from applying the included GitHub example code.

    Significance Statement

    Information such as the geometric structure and texture of image data can greatly support the inference of the physical state of an observed Earth system, for example, in remote sensing to determine whether wildfires are active or to identify local climate zones. Persistent homology is a branch of topological data analysis that allows one to extract such information in an interpretable way—unlike black-box methods like deep neural networks. The purpose of this paper is to explain in an intuitive manner what persistent homology is and how researchers in environmental science can use it to create interpretable models. We demonstrate the approach to identify certain cloud patterns from satellite imagery and find that the resulting model is indeed interpretable.
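To ground the discussion, here is a self-contained sketch (not the authors' GitHub example code) of the simplest piece of persistent homology used on imagery: 0-dimensional persistence of the sublevel-set filtration of a grayscale image, computed with a union-find structure and the elder rule. The resulting (birth, death) pairs form the kind of persistence diagram that persistence landscapes then vectorize for a downstream classifier.

```python
# 0-dimensional sublevel-set persistence of a 2D image (4-connectivity).
# Pixels enter the filtration from darkest to brightest; when two connected
# components merge, the younger one dies (elder rule).
import numpy as np

def sublevel_persistence_0d(img):
    """Return a list of (birth, death) pairs for connected components."""
    h, w = img.shape
    flat = img.ravel()
    order = np.argsort(flat)                  # pixel indices, ascending by value
    parent = np.full(h * w, -1, dtype=int)    # -1 means "not yet in the filtration"
    birth = np.full(h * w, np.inf)

    def find(i):                              # union-find with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    for p in order:
        parent[p] = p
        birth[p] = flat[p]
        y, x = divmod(p, w)
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if not (0 <= ny < h and 0 <= nx < w):
                continue
            q = ny * w + nx
            if parent[q] == -1:               # neighbor not yet born
                continue
            rp, rq = find(p), find(q)
            if rp == rq:
                continue
            if birth[rp] <= birth[rq]:        # rq's component is younger: it dies now
                pairs.append((birth[rq], flat[p]))
                parent[rq] = rp
            else:
                pairs.append((birth[rp], flat[p]))
                parent[rp] = rq
    pairs.append((flat[order[0]], np.inf))    # the oldest component never dies
    return pairs

# toy usage on a random image
rng = np.random.default_rng(0)
print(len(sublevel_persistence_0d(rng.random((32, 32)))))
```

Zero-persistence pairs (birth equal to death) are usually discarded before vectorization, and the same idea extends to 1-dimensional features (loops), which libraries compute directly from a cubical filtration.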

     
  4. The goal of generative models is to learn the intricate relations between the data to create new simulated data, but current approaches fail in very high dimensions. When the true data-generating process is governed by physical laws, these impose symmetries and constraints, and the generative model can be created by learning an effective description of the underlying physics, which enables the generative model to scale to very high dimensions. In this work, we propose Lagrangian deep learning (LDL) for this purpose, applying it to learn outputs of cosmological hydrodynamical simulations. The model uses layers of Lagrangian displacements of particles describing the observables to learn the effective physical laws. The displacements are modeled as the gradient of an effective potential, which explicitly satisfies translational and rotational invariance. The total number of learned parameters is only of order 10, and they can be viewed as effective theory parameters. We combine the fast particle mesh (FastPM) N-body solver with LDL and apply it to a wide range of cosmological outputs, from dark matter to stellar maps, gas density, and temperature. The computational cost of LDL is nearly four orders of magnitude lower than that of the full hydrodynamical simulations, yet it outperforms them at the same resolution. We achieve this with only of order 10 layers from the initial conditions to the final output, in contrast to typical cosmological simulations with thousands of time steps. This opens up the possibility of analyzing cosmological observations entirely within this framework, without the need for large dark-matter simulations.
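For intuition, here is a toy two-dimensional version of a single Lagrangian-displacement layer of this kind. It is a sketch only: the actual LDL model is three-dimensional, stacks several layers, and uses a specific filter parameterization that is only loosely imitated here, with made-up parameter values. Particles are painted onto a mesh, the density is filtered in Fourier space to form an effective potential, and each particle is moved along the potential gradient.

```python
# Toy single Lagrangian-displacement layer in 2D (illustration only; parameter
# values and the filter form are made up rather than taken from the LDL paper).
import numpy as np

def ldl_displacement_2d(pos, boxsize, ngrid, alpha=0.1, gamma=1.0, kh=8.0, kl=1.0, n=1.0):
    """Move particles along the gradient of a filtered density field.

    pos : (N, 2) particle positions in [0, boxsize); returns displaced positions.
    The potential is phi_k = O(k) * FFT(delta**gamma) with an isotropic filter
    O(k), so the layer is translationally and rotationally invariant by design.
    """
    cell = boxsize / ngrid
    idx = np.floor(pos / cell).astype(int) % ngrid          # nearest-grid-point paint
    density = np.zeros((ngrid, ngrid))
    np.add.at(density, (idx[:, 0], idx[:, 1]), 1.0)
    density /= density.mean()

    src_k = np.fft.fftn(density ** gamma)                   # source term in k-space

    k = 2 * np.pi * np.fft.fftfreq(ngrid, d=cell)
    kx, ky = np.meshgrid(k, k, indexing="ij")
    kk = np.sqrt(kx ** 2 + ky ** 2)
    kk[0, 0] = np.inf                                       # removes the k = 0 mode
    filt = np.exp(-(kk / kh) ** 2) * np.exp(-(kl / kk) ** 2) * kk ** (-n)
    phi_k = filt * src_k

    # gradient of the effective potential, evaluated on the grid
    grad = np.stack([np.real(np.fft.ifftn(1j * kx * phi_k)),
                     np.real(np.fft.ifftn(1j * ky * phi_k))], axis=-1)

    disp = alpha * grad[idx[:, 0], idx[:, 1]]               # read back at particles
    return (pos + disp) % boxsize

# toy usage: displace 1000 random particles once
rng = np.random.default_rng(3)
print(ldl_displacement_2d(rng.random((1000, 2)), boxsize=1.0, ngrid=64).shape)
```

The handful of numbers controlling the filter and step size play the role of the "of order 10" effective parameters: everything else is fixed by the symmetry-respecting structure of the layer.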

     
  5. Abstract

    We introduce a new R package “MrIML” (“Mister iml”; Multi‐response Interpretable Machine Learning). MrIML provides a powerful and interpretable framework that enables users to harness recent advances in machine learning to quantify multilocus genomic relationships, to identify loci of interest for future landscape genetics studies, and to gain new insights into adaptation across environmental gradients. Relationships between genetic variation and environment are often nonlinear and interactive; these characteristics have been challenging to address using traditional landscape genetic approaches. Our package helps capture this complexity and offers functions that fit and interpret a wide range of highly flexible models that are routinely used for single‐locus landscape genetics studies but are rarely extended to estimate response functions for multiple loci. To demonstrate the package's broad functionality, we test its ability to recover landscape relationships from simulated genomic data. We also apply the package to two empirical case studies. In the first, we model genetic variation of North American balsam poplar (Populus balsamifera, Salicaceae) populations across environmental gradients. In the second case study, we recover the landscape and host drivers of feline immunodeficiency virus genetic variation in bobcats (Lynx rufus). The ability to model thousands of loci collectively and compare models from linear regression to extreme gradient boosting, within the same analytical framework, has the potential to be transformative. The MrIML framework is also extendable and not limited to modelling genetic variation; for example, it can quantify the environmental drivers of microbiomes and coinfection dynamics.
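MrIML is an R package, and the snippet below does not use its API; it is only a minimal Python illustration of the multi-response idea described above, fitting one flexible model per locus against shared environmental predictors and collecting per-locus importances, with entirely simulated data.

```python
# Hypothetical multi-response illustration (not MrIML's API): one flexible model
# per locus, shared environmental predictors, importances collected per locus.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n_sites, n_env, n_loci = 200, 4, 25          # made-up sizes
env = rng.normal(size=(n_sites, n_env))      # environmental predictors per site

# toy genotypes: locus j responds (nonlinearly) to environmental axis j % n_env
resp = 1 / (1 + np.exp(-np.sin(3 * env[:, [j % n_env for j in range(n_loci)]])))
geno = (rng.random((n_sites, n_loci)) < resp).astype(int)

importances = np.zeros((n_loci, n_env))
for j in range(n_loci):
    model = GradientBoostingClassifier().fit(env, geno[:, j])
    importances[j] = model.feature_importances_

# for each locus, the environmental axis its model relies on most; with the toy
# data above this should roughly recover the planted pattern j % n_env
print(importances.argmax(axis=1))
```

Swapping the per-locus learner (linear models, boosting, and so on) while keeping the shared framework is the comparison the package is designed to make easy.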

     