Multi-robot cooperative control has been extensively studied using model-based distributed control methods. However, such methods rely on sensing and perception modules arranged in a sequential pipeline, and this separation of perception from control can introduce processing latencies and compounding errors that degrade control performance. End-to-end learning overcomes this limitation by mapping onboard sensing data directly to control commands for the robots. End-to-end learning for multi-robot cooperative control remains challenging, however, and previous results do not scale. We propose in this article a novel decentralized cooperative control method for multi-robot formations using deep neural networks, in which inter-robot communication is modeled by a graph neural network (GNN). Our method takes LiDAR sensor data as input, and the control policy is learned from demonstrations provided by an expert controller for decentralized formation control. Although it is trained with a fixed number of robots, the learned control policy is scalable. Evaluation in a robot simulator demonstrates the triangular formation behavior of multi-robot teams of different sizes under the learned control policy.
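To make the aggregation step concrete, the NumPy sketch below shows one message-passing round of a GNN control layer of the kind described above. The dimensions, weight names, and mean-aggregation rule are illustrative assumptions, not the authors' architecture.

```python
import numpy as np

def gnn_control_step(lidar_feats, adjacency, W_enc, W_msg, W_out):
    """One message-passing step of a decentralized GNN control policy.

    lidar_feats: (n_robots, d_in) local LiDAR feature vectors
    adjacency:   (n_robots, n_robots) 0/1 communication graph
    Returns (n_robots, 2) velocity commands, one per robot.
    """
    h = np.tanh(lidar_feats @ W_enc)                 # encode local observations
    deg = np.maximum(adjacency.sum(axis=1, keepdims=True), 1)
    msgs = (adjacency @ (h @ W_msg)) / deg           # mean-aggregate neighbor messages
    return np.tanh(np.concatenate([h, msgs], axis=1) @ W_out)  # local readout

rng = np.random.default_rng(0)
n, d_in, d_h = 5, 32, 16
A = (rng.random((n, n)) < 0.4).astype(float)
np.fill_diagonal(A, 0)                               # no self-communication
cmds = gnn_control_step(rng.standard_normal((n, d_in)), A,
                        rng.standard_normal((d_in, d_h)) * 0.1,
                        rng.standard_normal((d_h, d_h)) * 0.1,
                        rng.standard_normal((2 * d_h, 2)) * 0.1)
print(cmds.shape)  # (5, 2)
```

Because each robot's command depends only on its own encoding and a permutation-invariant aggregate of its neighbors' messages, the same trained weights apply unchanged to teams of any size, which is what makes the learned policy scalable.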
End-to-end differentiability and tensor processing unit computing to accelerate materials’ inverse design
Numerical simulations have revolutionized material design. However, although simulations excel at mapping an input material to its output property, their direct application to inverse design has traditionally been limited by their high computing cost and lack of differentiability. Here, taking the example of the inverse design of a porous matrix featuring a targeted sorption isotherm, we introduce a computational inverse design framework that addresses these challenges by programming a differentiable simulation on the TensorFlow platform, which leverages automated end-to-end differentiation. Thanks to its differentiability, the simulation is used to directly train a deep generative model, which outputs an optimal porous matrix for an arbitrary input sorption isotherm curve. Importantly, this inverse design pipeline leverages the power of tensor processing units (TPUs), an emerging family of dedicated chips that, although specialized in deep learning, are flexible enough for intensive scientific simulations. This approach holds promise to accelerate inverse materials design.
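A minimal TensorFlow sketch of the training pattern the abstract describes follows. The `simulate_isotherm` function is a toy placeholder for the actual pore-scale physics, and all shapes are hypothetical; the point is that gradients flow through the simulation into the generative model.

```python
import tensorflow as tf

# Hypothetical differentiable "simulation": maps a porous-matrix tensor to a
# sorption isotherm. A stand-in for the physics; only differentiability matters.
def simulate_isotherm(matrix, pressures):
    porosity = tf.reduce_mean(matrix, axis=[1, 2])                  # (batch,)
    return porosity[:, None] * pressures[None, :] / (1.0 + pressures[None, :])

generator = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(16 * 16, activation="sigmoid"),
    tf.keras.layers.Reshape((16, 16)),                              # porous matrix
])

pressures = tf.linspace(0.1, 10.0, 20)
target = tf.random.uniform((8, 20))                                 # target isotherms

opt = tf.keras.optimizers.Adam(1e-3)
with tf.GradientTape() as tape:
    matrices = generator(target)                 # isotherm -> candidate matrix
    predicted = simulate_isotherm(matrices, pressures)
    loss = tf.reduce_mean((predicted - target) ** 2)
# Gradient of the loss flows back through the simulation into the generator.
grads = tape.gradient(loss, generator.trainable_variables)
opt.apply_gradients(zip(grads, generator.trainable_variables))
```

Repeating this step over many target isotherms trains the generator end-to-end; on TPUs, both the simulation and the model run on the same accelerator, which is the source of the reported speedup.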
- Award ID(s): 1922167
- PAR ID: 10467396
- Publisher / Repository: Nature
- Date Published:
- Journal Name: npj Computational Materials
- Volume: 9
- Issue: 1
- ISSN: 2057-3960
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
The incorporation of high-performance optoelectronic devices into photonic neuromorphic processors can substantially accelerate the computationally intensive matrix multiplication operations in machine learning (ML) algorithms. However, the conventional designs of individual devices and systems are largely disconnected, and system optimization is limited to manual exploration of a small design space. Here, a device-system end-to-end design methodology is reported that optimizes a free-space optical general matrix multiplication (GEMM) hardware accelerator by engineering a spatially reconfigurable array made from chalcogenide phase-change materials. With a highly parallelized integrated hardware emulator carrying experimental information, the unit device is designed to directly optimize GEMM calculation accuracy by exploring a large parameter space with reinforcement learning algorithms, including a deep Q-learning neural network, Bayesian optimization, and their cascaded approach. The algorithm-generated physical quantities show a clear correlation between system performance metrics and device specifications. Furthermore, physics-aware training approaches are employed to deploy the optimized hardware to the tasks of image classification, materials discovery, and a closed-loop design of optical ML accelerators. The demonstrated framework offers insights into the end-to-end co-design of optoelectronic devices and systems with reduced human supervision and domain-knowledge barriers.
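As a loose illustration of the optimization loop, the sketch below replaces the deep Q-learning and Bayesian optimization machinery with a minimal epsilon-greedy bandit over a hypothetical one-parameter device space; the reward function is a stand-in for the emulator-measured GEMM accuracy, not the paper's actual hardware emulator.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical discrete design space: candidate crystallinity levels of the
# phase-change unit cell.
levels = np.linspace(0.0, 1.0, 11)

def gemm_accuracy(level):
    # Toy noisy reward standing in for emulator-measured GEMM accuracy.
    return np.exp(-8.0 * (level - 0.6) ** 2) + 0.02 * rng.standard_normal()

# Epsilon-greedy bandit: a stripped-down stand-in for the RL search loop.
q, counts, eps = np.zeros_like(levels), np.zeros_like(levels), 0.2
for step in range(500):
    a = rng.integers(len(levels)) if rng.random() < eps else int(np.argmax(q))
    r = gemm_accuracy(levels[a])
    counts[a] += 1
    q[a] += (r - q[a]) / counts[a]          # incremental mean update

print(f"best crystallinity level found: {levels[np.argmax(q)]:.2f}")
```

The paper's agents explore a far larger multi-parameter space, but the structure is the same: propose device settings, score them on calculation accuracy, and update the search policy.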
Simulating the time evolution of Partial Differential Equations (PDEs) for large-scale systems is crucial in many scientific and engineering domains such as fluid dynamics, weather forecasting, and their inverse optimization problems. However, both classical solvers and recent deep learning-based surrogate models are typically extremely computationally intensive because of their local evolution: they must update the state of every discretized cell at every time step during inference. Here we develop Latent Evolution of PDEs (LE-PDE), a simple, fast and scalable method to accelerate the simulation and inverse optimization of PDEs. LE-PDE learns a compact, global representation of the system and efficiently evolves it fully in the latent space with learned latent evolution models. LE-PDE achieves speedup by updating a much smaller latent dimension during long rollouts, compared to updating in the input space. We introduce new learning objectives that effectively learn such latent dynamics and ensure long-term stability. We further introduce techniques for speeding up inverse optimization of boundary conditions for PDEs via backpropagation through time in latent space, and an annealing technique to address the non-differentiability and sparse interaction of boundary conditions. We test our method on a 1D benchmark of nonlinear PDEs, 2D Navier-Stokes flows entering the turbulent phase, and an inverse optimization of boundary conditions in 2D Navier-Stokes flow. Compared to state-of-the-art deep learning-based surrogate models and other strong baselines, we demonstrate up to a 128× reduction in the dimensions to update and up to a 15× improvement in speed, while achieving competitive accuracy.
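The sketch below illustrates the latent-rollout idea with toy linear maps standing in for the learned encoder, latent evolution model, and decoder; dimensions are illustrative. The speedup comes from each step touching only the small latent vector rather than every grid cell.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy linear stand-ins for the learned encoder, latent evolution model, and
# decoder (4096-cell state, 32-dim latent; both sizes are illustrative).
n_cells, d_latent = 4096, 32
enc = rng.standard_normal((n_cells, d_latent)) / np.sqrt(n_cells)
evolve = np.eye(d_latent) + 0.01 * rng.standard_normal((d_latent, d_latent))
dec = rng.standard_normal((d_latent, n_cells)) / np.sqrt(d_latent)

u0 = rng.standard_normal(n_cells)       # initial PDE state on the grid
z = u0 @ enc                            # encode once

states = []
for t in range(100):                    # long rollout: each update touches
    z = z @ evolve                      # only the 32-dim latent, not 4096 cells
    states.append(z @ dec)              # decode only when output is needed

print(len(states), states[-1].shape)    # 100 (4096,)
```

In LE-PDE the three maps are neural networks trained jointly with the stability objectives described above, but the per-step cost scaling with the latent dimension rather than the grid size is exactly what this sketch shows.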
The reactive force field (ReaxFF) interatomic potential is a powerful tool for simulating the behavior of molecules in a wide range of chemical and physical systems at the atomic level. Unlike traditional classical force fields, ReaxFF employs dynamic bonding and polarizability to enable the study of reactive systems. Over the past couple of decades, highly optimized parallel implementations have been developed for ReaxFF to efficiently utilize modern hardware such as multi-core processors and graphics processing units (GPUs). However, the complexity of the ReaxFF potential poses challenges for portability to new architectures (AMD and Intel GPUs, RISC-V processors, etc.) and limits the ability of computational scientists to tailor its functional form to their target systems. In this regard, the convergence of cyber-infrastructure for high performance computing (HPC) and machine learning (ML) presents new opportunities for customization, programmer productivity, and performance portability. In this paper, we explore the benefits and limitations of JAX, a modern ML library in Python and a prime example of the convergence of HPC and ML software, for implementing ReaxFF. We demonstrate that by leveraging the auto-differentiation, just-in-time compilation, and vectorization capabilities of JAX, one can attain portable, performant, and easy-to-maintain ReaxFF software. Beyond enabling MD simulations, the end-to-end differentiability of trajectories produced by a JAX implementation of ReaxFF makes it possible to perform related tasks such as force field parameter optimization and meta-analysis without significant additional software development. We also discuss scalability limitations of the current version of JAX for ReaxFF simulations.
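The sketch below shows the pattern the paper exploits, with a simple Morse pair potential standing in for the far more complex ReaxFF functional form: the energy is written in `jax.numpy`, forces come from `jax.grad` with no hand-derived gradients, and `jax.jit` compiles the result for CPU/GPU/TPU backends.

```python
import jax
import jax.numpy as jnp

# A Morse pair potential as a stand-in for ReaxFF; parameters are illustrative.
def morse(r, D=1.0, a=1.5, r0=1.2):
    return D * (1.0 - jnp.exp(-a * (r - r0))) ** 2

def total_energy(positions):
    diffs = positions[:, None, :] - positions[None, :, :]
    # Add the identity on the diagonal so sqrt stays differentiable at r=0;
    # the mask below removes those self-interaction terms from the sum.
    r = jnp.sqrt(jnp.sum(diffs**2, axis=-1) + jnp.eye(len(positions)))
    mask = 1.0 - jnp.eye(len(positions))
    return 0.5 * jnp.sum(mask * morse(r))   # 0.5 corrects pair double-counting

# Forces are the negative gradient of the energy, obtained by autodiff.
forces = jax.jit(jax.grad(lambda p: -total_energy(p)))
# jax.vmap(forces) would vectorize this over a batch of configurations.

positions = jnp.array([[0.0, 0.0, 0.0], [1.2, 0.0, 0.0], [0.0, 1.3, 0.0]])
print(forces(positions).shape)  # (3, 3)
```

Swapping in the ReaxFF energy expression gives forces, and gradients with respect to force field parameters, from the same machinery, which is what enables the parameter optimization described above.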
One of the key challenges arising when compilers vectorize loops for today's SIMD-compatible architectures is deciding whether vectorization or interleaving is beneficial. The compiler then has to determine the number of instructions to pack together and the interleaving level (stride). Today's compilers use fixed-cost models based on heuristics to make vectorization decisions on loops; however, these models cannot capture the data dependencies, the computation graph, or the organization of instructions. Alternatively, software engineers often hand-write the vectorization factors of every loop. This, however, places a huge burden on them, since it requires prior experience and significantly increases development time. In this work, we explore a novel approach to handling loop vectorization and propose an end-to-end solution using deep reinforcement learning (RL). We conjecture that deep RL can capture different instructions, dependencies, and data structures, enabling a sophisticated model that better predicts the actual performance cost and determines the optimal vectorization factors. We develop an end-to-end framework, from code to vectorization, that integrates deep RL into the LLVM compiler. Our framework takes benchmark codes as input and extracts the loop codes. These loop codes are then fed to a loop embedding generator that learns an embedding for each loop. Finally, the learned embeddings are used as input to a deep RL agent, which dynamically determines the vectorization factors for all the loops. We further extend our framework to support random search, decision trees, supervised neural networks, and nearest-neighbor search. We evaluate our approaches against the current LLVM vectorizer and loop polyhedral optimization techniques. Our experiments show a 1.29×–4.73× performance speedup over the baseline, and only 3% worse than brute-force search, on a wide range of benchmarks.
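As a small illustration of the pipeline's final stage, the sketch below implements the nearest-neighbor variant mentioned above: a loop embedding (here random, standing in for the learned embedding generator) is matched against previously tuned loops to predict the vectorization and interleaving factors. All data and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy stand-in for the pipeline: loop code -> embedding -> predicted
# (vectorization factor, interleave factor).
train_embeddings = rng.standard_normal((200, 32))            # embeddings of tuned loops
train_factors = rng.choice([1, 2, 4, 8, 16], size=(200, 2))  # their best (VF, IF)

def predict_factors(loop_embedding):
    # Return the (VF, IF) of the most similar previously-tuned loop.
    dists = np.linalg.norm(train_embeddings - loop_embedding, axis=1)
    return tuple(train_factors[np.argmin(dists)])

new_loop = rng.standard_normal(32)                           # embedding of an unseen loop
vf, interleave = predict_factors(new_loop)
print(f"predicted vectorization factor = {vf}, interleaving = {interleave}")
```

The deep RL agent replaces this lookup with a learned policy trained on measured performance, but the interface is the same: an embedding in, a (vectorization factor, interleave factor) pair out, injected into LLVM's vectorizer.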