Physics-constrained or physics-informed machine learning represents a paradigm shift, addressing the challenges of data scarcity and enhancing model interpretability. This approach incorporates the fundamental laws of physics as constraints that guide the training of machine learning models. In this work, the physics-constrained convolutional recurrent neural network is further extended to solve spatiotemporal partial differential equations (PDEs) with arbitrary boundary conditions. Two notable advancements are introduced: the implementation of boundary conditions as soft constraints through finite difference-based differentiation, and an adaptive weighting mechanism that optimally allocates weights to the various loss terms. These enhancements significantly improve the network's ability to handle intricate boundary conditions and expedite training. The efficacy of the proposed model is validated on two-dimensional phase transition, fluid dynamics, and reaction-diffusion problems, which are pivotal in materials modeling. Compared to traditional physics-constrained neural networks, the physics-constrained convolutional recurrent neural network demonstrates a tenfold increase in prediction accuracy within a similar computational budget. Moreover, the model's strong performance in extrapolating solutions of the Burgers' equation underscores its utility. This research therefore establishes the physics-constrained convolutional recurrent neural network as a viable surrogate model for sophisticated spatiotemporal PDE systems, particularly in scenarios with sparse and noisy datasets.
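To make the two ingredients concrete, the sketch below shows one way a composite physics-constrained loss can combine a finite-difference PDE residual, a soft boundary-condition penalty, and learnable adaptive loss weights. It is a minimal PyTorch illustration under our own assumptions (a heat-type equation and an uncertainty-style weighting scheme); it is not the paper's exact formulation, and all names are hypothetical.

```python
import torch

def laplacian_fd(u, dx):
    """Five-point finite-difference Laplacian of a 2D field (interior points)."""
    return (u[2:, 1:-1] + u[:-2, 1:-1] + u[1:-1, 2:] + u[1:-1, :-2]
            - 4.0 * u[1:-1, 1:-1]) / dx ** 2

def physics_constrained_loss(u_pred, u_prev, dt, dx, alpha, g_bc, log_sigma):
    """Composite loss: PDE residual + soft boundary penalty, adaptively weighted."""
    # Residual of a heat-type equation u_t = alpha * Laplacian(u), built purely
    # from finite differences -- no labeled solution data is required.
    residual = ((u_pred[1:-1, 1:-1] - u_prev[1:-1, 1:-1]) / dt
                - alpha * laplacian_fd(u_pred, dx))
    loss_pde = residual.pow(2).mean()

    # Boundary condition imposed as a soft constraint on the border values.
    border = torch.cat([u_pred[0, :], u_pred[-1, :], u_pred[:, 0], u_pred[:, -1]])
    loss_bc = (border - g_bc).pow(2).mean()

    # Uncertainty-style adaptive weighting (one common scheme): log_sigma is a
    # learnable 2-vector, so the optimizer balances the two terms automatically.
    w = torch.exp(-log_sigma)
    return w[0] * loss_pde + w[1] * loss_bc + log_sigma.sum()
```

In this kind of setup, `log_sigma` would be registered as a trainable parameter alongside the network weights, so the relative weighting of the PDE and boundary losses evolves during training rather than being hand-tuned.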
High-speed CMOS-free purely spintronic asynchronous recurrent neural network
The exceptional capabilities of the human brain provide inspiration for artificially intelligent hardware that mimics both the function and the structure of neurobiology. In particular, the recent development of nanodevices with biomimetic characteristics promises to enable neuromorphic architectures with exceptional computational efficiency. In this work, we propose biomimetic neurons composed of domain-wall magnetic tunnel junctions that can be integrated into the first trainable CMOS-free recurrent neural network with biomimetic components. This paper demonstrates the computational effectiveness of this system on benchmark tasks and its superior computational efficiency relative to alternative approaches for recurrent neural networks.
- Award ID(s): 1910800
- PAR ID: 10588144
- Publisher / Repository: American Institute of Physics
- Date Published:
- Journal Name: APL Machine Learning
- Volume: 1
- Issue: 1
- ISSN: 2770-9019
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
With the prosperous development of Deep Neural Networks (DNNs), numerous Processing-In-Memory (PIM) designs have emerged to accelerate DNN models with exceptional throughput and energy efficiency. PIM accelerators based on Non-Volatile Memory (NVM) or volatile memory offer distinct advantages for computational efficiency and performance. NVM-based PIM accelerators, despite demonstrated success in DNN inference, face limitations in on-device learning due to high write energy, latency, and instability. Conversely, fast volatile memories such as SRAM offer rapid read/write operations for DNN training but suffer from significant leakage currents and large memory footprints. In this paper, for the first time, we present a fully digital sparse-processing hybrid NVM-SRAM design that synergistically combines the strengths of NVM and SRAM, tailored for on-device continual learning. Our NVM- and SRAM-based PIM circuit macros support both storage and processing of N:M structured sparsity patterns, significantly improving storage and computing efficiency. Exhaustive experiments demonstrate that our hybrid system effectively reduces area and power consumption while maintaining high accuracy, offering a scalable and versatile solution for on-device continual learning.
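For readers unfamiliar with the pattern itself: in an N:M structured-sparse layer, every group of M consecutive weights retains at most N nonzero entries, which is what lets a memory macro store and process the weights compactly. The sketch below is our own software illustration of magnitude-based 2:4 pruning, not the paper's circuit design.

```python
import numpy as np

def nm_sparsify(w, n=2, m=4):
    """Enforce N:M structured sparsity: within each group of m consecutive
    weights, keep only the n largest-magnitude entries and zero the rest."""
    groups = np.asarray(w, dtype=float).reshape(-1, m)
    # Indices of the (m - n) smallest-magnitude weights in each group.
    drop = np.argsort(np.abs(groups), axis=1)[:, : m - n]
    mask = np.ones_like(groups, dtype=bool)
    np.put_along_axis(mask, drop, False, axis=1)
    return (groups * mask).reshape(np.shape(w))

# A 2:4 pattern keeps exactly 2 of every 4 consecutive weights.
w = np.array([0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.8, 0.01])
print(nm_sparsify(w))  # [ 0.9  0.   0.  -0.7  0.   0.3 -0.8  0. ]
```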
Channel state information (CSI) plays a vital role in scheduling and capacity-approaching transmission optimization of massive MIMO communication systems. In frequency division duplex (FDD) MIMO systems, forward-link CSI reconstruction at the transmitter relies on CSI feedback from receiving nodes and must carefully weigh the tradeoff between reconstruction accuracy and feedback bandwidth. Recent applications of recurrent neural networks (RNNs) have demonstrated promising results for massive MIMO CSI feedback compression. However, the computation and memory costs associated with RNN deep learning remain high. In this work, we exploit channel temporal coherence to improve learning accuracy and feedback efficiency. Leveraging a Markovian model, we develop a deep convolutional neural network (CNN)-based framework called MarkovNet to efficiently encode CSI feedback and improve accuracy and efficiency. We explore important physical insights, including spherical normalization of input data and deep learning network optimizations, in feedback compression. We demonstrate that MarkovNet provides a substantial performance improvement and computational complexity reduction over the RNN-based work. We demonstrate MarkovNet's performance under different MIMO configurations and for a range of feedback intervals and rates. CSI recovery with MarkovNet outperforms RNN-based CSI estimation at only a fraction of the computational cost.
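The abstract does not spell out MarkovNet's normalization, but spherical normalization in CSI feedback work typically means scaling each sample to unit norm and feeding the scale back separately. The following is a minimal sketch under that assumption, with illustrative shapes and names.

```python
import numpy as np

def spherical_normalize(H, eps=1e-12):
    """Scale each complex CSI sample to unit Frobenius norm ("project onto
    the sphere"), returning the per-sample power so it can be restored
    after decoding."""
    power = np.linalg.norm(H, axis=(-2, -1), keepdims=True)
    return H / np.maximum(power, eps), power

# Illustrative batch of complex CSI matrices: (batch, delay taps, antennas).
rng = np.random.default_rng(0)
H = (rng.standard_normal((8, 32, 32))
     + 1j * rng.standard_normal((8, 32, 32))) / np.sqrt(2)
H_norm, power = spherical_normalize(H)
print(np.linalg.norm(H_norm[0]))  # ~1.0 for every sample
```

The design intuition is that removing per-sample power variation lets the encoder spend its limited feedback bits on channel structure rather than scale.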
Recurrent neural networks (RNNs) are often used to model circuits in the brain and can solve a variety of difficult computational problems requiring memory, error correction, or selection (Hopfield, 1982; Maass et al., 2002; Maass, 2011). However, fully connected RNNs contrast structurally with their biological counterparts, which are extremely sparse (about 0.1%). Motivated by the neocortex, where neural connectivity is constrained by physical distance along cortical sheets and other synaptic wiring costs, we introduce locality-masked RNNs (LM-RNNs) that use task-agnostic predetermined graphs with sparsity as low as 4%. We study LM-RNNs in a multitask learning setting relevant to cognitive systems neuroscience with a commonly used set of tasks, the 20-Cog-tasks (Yang et al., 2019). We show through reductio ad absurdum that the 20-Cog-tasks can be solved by a small pool of separated autapses that we can mechanistically analyze and understand; thus, these tasks fall short of the goal of inducing complex recurrent dynamics and modular structure in RNNs. We next contribute a new cognitive multitask battery, Mod-Cog, consisting of up to 132 tasks, which expands the number of tasks and task complexity of the 20-Cog-tasks about sevenfold. Importantly, while autapses can solve the simple 20-Cog-tasks, the expanded task set requires richer neural architectures and continuous attractor dynamics. On these tasks, we show that LM-RNNs with an optimal sparsity achieve faster training and better data efficiency than fully connected networks.
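The masking mechanism itself is simple to state in code. The sketch below uses a 1D distance rule as a stand-in for the paper's task-agnostic predetermined graphs (our own simplification); the point is only that sparsity is imposed as a fixed binary mask on the recurrent weights before training.

```python
import numpy as np

def locality_mask(n_units, radius):
    """Binary mask permitting recurrent connections only between units whose
    indices lie within `radius` of each other on a 1D sheet."""
    idx = np.arange(n_units)
    return (np.abs(idx[:, None] - idx[None, :]) <= radius).astype(float)

rng = np.random.default_rng(0)
n_units, radius = 256, 5
mask = locality_mask(n_units, radius)
W = mask * rng.standard_normal((n_units, n_units)) / np.sqrt(2 * radius + 1)
print(f"connection density: {mask.mean():.1%}")  # about 4% at these settings

def step(h, x, W, W_in, tau=0.1):
    """One Euler step of the masked recurrent dynamics."""
    return h + tau * (-h + np.tanh(W @ h + W_in @ x))
```

During training the same mask would be re-applied after each gradient update (or the masked entries excluded from optimization), so the predetermined graph is preserved throughout.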
In computational neuroscience, recurrent neural networks are widely used to model neural activity and learning. In many studies, fixed points of recurrent neural networks are used to model neural responses to static or slowly changing stimuli, such as visual cortical responses to static visual stimuli. These applications raise the question of how to train the weights in a recurrent neural network to minimize a loss function evaluated on fixed points. In parallel, training fixed points is a central topic in the study of deep equilibrium models in machine learning. A natural approach is to use gradient descent on the Euclidean space of weights. We show that this approach can lead to poor learning performance due in part to singularities that arise in the loss surface. We use a reparameterization of the recurrent network model to derive two alternative learning rules that produce more robust learning dynamics. We demonstrate that these learning rules avoid singularities and learn more effectively than standard gradient descent. The new learning rules can be interpreted as steepest descent and gradient descent, respectively, under a non-Euclidean metric on the space of recurrent weights. Our results question the common, implicit assumption that learning in the brain should be expected to follow the negative Euclidean gradient of synaptic weights.
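For readers new to this setup, the naive baseline the abstract critiques looks roughly like the following: run the recurrent dynamics to a fixed point, evaluate a loss on that fixed point, and follow the Euclidean gradient of the weights. This is our own minimal PyTorch sketch of that baseline (illustrative names, a tanh network, and unrolled differentiation), not the paper's reparameterized learning rules.

```python
import torch

def fixed_point(W, b, x, n_iter=200):
    """Iterate h <- tanh(W h + x + b) toward a fixed point (assumes convergence)."""
    h = torch.zeros_like(x)
    for _ in range(n_iter):
        h = torch.tanh(h @ W.T + x + b)
    return h

n = 32
W = (0.1 * torch.randn(n, n)).requires_grad_()
b = torch.zeros(n, requires_grad=True)
x = torch.randn(8, n)        # static inputs
target = torch.randn(8, n)   # desired steady-state responses

# Naive baseline: Euclidean gradient descent on W and b, differentiating
# through the unrolled fixed-point iteration.
opt = torch.optim.SGD([W, b], lr=1e-2)
for _ in range(100):
    opt.zero_grad()
    loss = (fixed_point(W, b, x) - target).pow(2).mean()
    loss.backward()
    opt.step()
```

The paper's alternative rules replace the plain Euclidean update with descent under a non-Euclidean metric on the recurrent weights, which is what lets them avoid the singularities in the loss surface.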
