Non-perturbative renormalization for the neural network-QFT correspondence
Abstract

In a recent work (Halverson et al 2021 Mach. Learn.: Sci. Technol. 2 035002), Halverson, Maiti and Stoner proposed a description of neural networks (NNs) in terms of a Wilsonian effective field theory. The infinite-width limit is mapped to a free field theory, while finite N corrections are taken into account by interactions (non-Gaussian terms in the action). In this paper, we study two related aspects of this correspondence. First, we comment on the concepts of locality and power-counting in this context. Indeed, these usual space-time notions may not hold for NNs (since inputs can be arbitrary); however, the renormalization group (RG) provides natural notions of locality and scaling. Moreover, we comment on several subtleties, for example, that data components may not have a permutation symmetry: in that case, we argue that random tensor field theories could provide a natural generalization. Second, we improve the perturbative Wilsonian renormalization from Halverson et al (2021 Mach. Learn.: Sci. Technol. 2 035002) by providing an analysis in terms of the non-perturbative RG using the Wetterich-Morris equation. An important difference with usual non-perturbative RG analysis is that only the effective infrared 2-point function is known, which requires setting the problem with care. Our aim is to provide a useful formalism to investigate NN behavior beyond …
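
The statement that the infinite-width limit is Gaussian (free) while finite N brings non-Gaussian corrections can be checked numerically. The following is a minimal sketch, not the construction used in the paper: it assumes a one-hidden-layer network with tanh activation, standard normal weights and biases, and a single scalar input, and it uses the excess kurtosis of the output over random initializations as a crude proxy for the strength of the non-Gaussian terms. The function names `sample_outputs` and `excess_kurtosis` are hypothetical.

```python
import math
import random


def sample_outputs(width, n_networks, x=1.0, seed=0):
    """Outputs f(x) of n_networks random one-hidden-layer networks of given width.

    Assumed architecture (illustrative only):
        f(x) = width**-0.5 * sum_i v_i * tanh(w_i * x + b_i),
    with w_i, b_i, v_i drawn independently from a standard normal.
    """
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_networks):
        total = 0.0
        for _ in range(width):
            w = rng.gauss(0.0, 1.0)
            b = rng.gauss(0.0, 1.0)
            v = rng.gauss(0.0, 1.0)
            total += v * math.tanh(w * x + b)
        outputs.append(total / math.sqrt(width))
    return outputs


def excess_kurtosis(values):
    """Sample excess kurtosis m4 / m2**2 - 3, which vanishes for a Gaussian."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0


if __name__ == "__main__":
    # The output is a normalized sum of `width` i.i.d. terms, so its excess
    # kurtosis is expected to decay roughly like 1/width.
    for width in (1, 4, 16, 64):
        k = excess_kurtosis(sample_outputs(width, n_networks=5000))
        print(f"width={width:3d}  excess kurtosis={k:+.3f}")
```

In this toy setup the output is a normalized sum of N independent terms, so its connected 4-point statistic is suppressed as 1/N, which is the sense in which interactions encode finite-width corrections. With only 5000 sampled networks, the estimate at large width is dominated by sampling noise.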

NSF-PAR ID:
10363314
Journal Name:
Machine Learning: Science and Technology
Volume:
3
Issue:
1
Page Range or eLocation-ID:
Article No. 015027
ISSN:
2632-2153
Publisher:
IOP Publishing
National Science Foundation
##### More Like this
1. Abstract

The Dicke model, a paradigmatic example of superradiance in quantum optics, describes an ensemble of atoms which are collectively coupled to a leaky cavity mode. As a result of the cooperative nature of these interactions, the system’s dynamics is captured by the behavior of a single mean-field, collective spin. In this mean-field limit, it has recently been shown that the interplay between photon losses and periodic driving of light–matter coupling can lead to time-crystalline-like behavior of the collective spin (Gong et al 2018 Phys. Rev. Lett. 120 040404). In this work, we investigate whether such a Dicke time crystal (TC) is stable to perturbations that explicitly break the mean-field solvability of the conventional Dicke model. In particular, we consider the addition of short-range interactions between the atoms, which break the collective coupling and lead to complex many-body dynamics. In this context, the interplay between periodic driving, dissipation and interactions yields a rich set of dynamical responses, including long-lived and metastable Dicke-TCs, where losses can cool down the many-body heating resulting from the continuous pump of energy from the periodic drive. Specifically, when the additional short-range interactions are ferromagnetic, we observe time-crystalline behavior at non-perturbative values of the coupling strength, suggesting the possible existence of stable …

2. Abstract

The renormalization group (RG) is a class of theoretical techniques used to explain the collective physics of interacting, many-body systems. It has been suggested that the RG formalism may be useful in finding and interpreting emergent low-dimensional structure in complex systems outside of the traditional physics context, such as in biology or computer science. In such contexts, one common dimensionality-reduction framework already in use is information bottleneck (IB), in which the goal is to compress an ‘input’ signal X while maximizing its mutual information with some stochastic ‘relevance’ variable Y. IB has been applied in the vertebrate and invertebrate processing systems to characterize optimal encoding of the future motion of the external world. Other recent work has shown that the RG scheme for the dimer model could be ‘discovered’ by a neural network attempting to solve an IB-like problem. This manuscript explores whether IB and any existing formulation of RG are formally equivalent. A class of soft-cutoff non-perturbative RG techniques is defined by families of non-deterministic coarsening maps, and hence can be formally mapped onto IB, and vice versa. For concreteness, this discussion is limited entirely to Gaussian statistics (GIB), for which IB has exact, closed-form solutions. Under this constraint, GIB has …

3. Abstract

Physical systems with non-trivial topological order find direct applications in metrology (Klitzing et al 1980 Phys. Rev. Lett. 45 494–7) and promise future applications in quantum computing (Freedman 2001 Found. Comput. Math. 1 183–204; Kitaev 2003 Ann. Phys. 303 2–30). The quantum Hall effect derives from transverse conductance, quantized to unprecedented precision in accordance with the system’s topology (Laughlin 1981 Phys. Rev. B 23 5632–33). At magnetic fields beyond the reach of current condensed matter experiment, around $10^4$ T, this conductance remains precisely quantized with values based on the topological order (Thouless et al 1982 Phys. Rev. Lett. 49 405–8). Hitherto, quantized conductance has only been measured in extended 2D systems. Here, we experimentally studied narrow 2D ribbons, just 3 or 5 sites wide along one direction, using ultracold neutral atoms where such large magnetic fields can be engineered (Jaksch and Zoller 2003 New J. Phys. 5 56; Miyake et al 2013 Phys. Rev. Lett. 111 185302; Aidelsburger et al 2013 Phys. Rev. Lett. 111 185301; Celi et al 2014 Phys. Rev. Lett. 112 043001; Stuhl et al 2015 Science 349 1514; Mancini et al 2015 Science 349 1510; An et al 2017 Sci. Adv. 3). We microscopically imaged the transverse spatial motion underlying the quantized Hall effect. Our measurements identify the topological Chern numbers with typical uncertainty of $5\%$, and show that although band topology is only properly defined in infinite systems, its signatures are striking even in nearly vanishingly thin systems.

4. Abstract

Due to its construction, the nonperturbative renormalization group (RG) evolution of the constant, field-independent term (which is constant with respect to field variations but depends on the RG scale $k$) requires special care within the Functional Renormalization Group (FRG) approach. In several instances, the constant term of the potential has no physical meaning. However, there are special cases where it receives important applications. In low dimensions ($d = 1$), in a quantum mechanical model, this term is associated with the ground-state energy of the anharmonic oscillator. In higher dimensions ($d = 4$), it is identical to the Λ term of the Einstein equations and it plays a role in cosmic inflation. Thus, in statistical field theory, in flat space, the constant term could be associated with the free energy, while in curved space, it could be naturally associated with the cosmological constant. It is known that one has to use a subtraction method for the quantum anharmonic oscillator in $d = 1$ to remove the $k^2$ term that appears in the RG flow in its high-energy (UV) limit in order to recover the correct results for the ground-state energy. The subtraction is needed because the Gaussian fixed point is missing in the RG flow once the constant term is included. However, if the …

5. (Ed.)
Electroencephalography (EEG) is a popular clinical monitoring tool used for diagnosing brain-related disorders such as epilepsy [1]. As monitoring EEGs in a critical-care setting is an expensive and tedious task, there is great interest in developing real-time EEG monitoring tools to improve patient care quality and efficiency [2]. However, clinicians require automatic seizure detection tools that provide decisions with at least 75% sensitivity and less than 1 false alarm (FA) per 24 hours [3]. Some commercial tools have recently claimed to reach such performance levels, including the Olympic Brainz Monitor [4] and Persyst 14 [5]. In this abstract, we describe our efforts to transform a high-performance offline seizure detection system [3] into a low-latency real-time or online seizure detection system. An overview of the system is shown in Figure 1. The main difference between an online and an offline system is that an online system should always be causal and have minimal latency, which is often defined by domain experts. The offline system, shown in Figure 2, uses two phases of deep learning models with postprocessing [3]. The channel-based long short-term memory (LSTM) model (Phase 1 or P1) processes linear frequency cepstral coefficients (LFCC) [6] features from each EEG …
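
The clinical requirement quoted above (at least 75% sensitivity and fewer than 1 FA per 24 hours) can be written as a simple check. This is a sketch under stated assumptions: the abstract does not say how detections are scored, so the code assumes event-level counts of detected seizures, missed seizures and false alarms are already available, and the function name `meets_clinical_target` is hypothetical.

```python
def meets_clinical_target(true_pos, false_neg, false_pos, recording_hours,
                          min_sensitivity=0.75, max_fa_per_24h=1.0):
    """Return (sensitivity, false alarms per 24 h, target met?).

    Assumes event-level counts: true_pos = detected seizures,
    false_neg = missed seizures, false_pos = false alarms.
    The thresholds default to the figures quoted in the abstract.
    """
    if true_pos + false_neg == 0:
        raise ValueError("sensitivity is undefined without any seizure events")
    if recording_hours <= 0:
        raise ValueError("recording_hours must be positive")

    sensitivity = true_pos / (true_pos + false_neg)
    # Normalize the false-alarm count to a 24-hour window.
    fa_per_24h = false_pos * 24.0 / recording_hours

    ok = sensitivity >= min_sensitivity and fa_per_24h < max_fa_per_24h
    return sensitivity, fa_per_24h, ok
```

For instance, 80 detected and 20 missed seizures with 3 false alarms over 96 hours of recording gives a sensitivity of 0.8 and 0.75 FA per 24 hours, which satisfies both thresholds.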