Designing mechanically tough graphene oxide materials using deep reinforcement learning
Abstract Graphene oxide (GO) is playing an increasing role in many technologies. However, it remains an open question how to strategically distribute the functional groups to further enhance performance. We utilize deep reinforcement learning (RL) to design mechanically tough GOs. The design task is formulated as a sequential decision process, and policy-gradient RL models are employed to maximize the toughness of GO. Results show that our approach can stably generate functional group distributions with a toughness value over two standard deviations above the mean of random GOs. In addition, our RL approach reaches optimized functional group distributions within only 5000 rollouts, while the simplest design task has 2 × 10¹¹ possibilities. Finally, we show that our approach is scalable in terms of the functional group density and the GO size. The present research showcases the impact of functional group distribution on GO properties, and illustrates the effectiveness and data efficiency of the deep RL approach.
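As a rough illustration of the setup described above, the sketch below frames functional-group placement as a sequential decision process and trains a policy with REINFORCE, a basic policy-gradient method. Everything here is a hypothetical stand-in: the lattice size, group count, network, and especially `toughness_of`, which in the actual work would be replaced by a molecular simulation that scores the finished layout.

```python
# Minimal REINFORCE sketch: sequentially choose lattice sites to functionalize,
# then reward the finished layout with a toughness score.
import torch
import torch.nn as nn

N_SITES = 40    # hypothetical number of candidate lattice sites
N_GROUPS = 8    # hypothetical number of functional groups to place

def toughness_of(layout):
    """Stub reward: stands in for the molecular simulation used in the paper."""
    return -layout.var()  # placeholder score, not a real toughness model

policy = nn.Sequential(nn.Linear(N_SITES, 64), nn.Tanh(), nn.Linear(64, N_SITES))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

for rollout in range(5000):                      # the paper reports ~5000 rollouts
    state = torch.zeros(N_SITES)                 # 1.0 = site already functionalized
    log_probs = []
    for _ in range(N_GROUPS):                    # sequential decision process
        logits = policy(state)
        logits = logits.masked_fill(state.bool(), float("-inf"))  # no repeats
        dist = torch.distributions.Categorical(logits=logits)
        site = dist.sample()
        log_probs.append(dist.log_prob(site))
        state = state.clone()                    # avoid in-place edit of a saved input
        state[site] = 1.0
    reward = toughness_of(state)
    loss = -reward.detach() * torch.stack(log_probs).sum()  # REINFORCE objective
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Masking already-occupied sites keeps every rollout a valid layout, and the reward is applied only once the full distribution is placed, matching the sequential-decision formulation in the abstract.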
- Award ID(s): 2119276
- PAR ID: 10378979
- Publisher / Repository: Nature Publishing Group
- Date Published:
- Journal Name: npj Computational Materials
- Volume: 8
- Issue: 1
- ISSN: 2057-3960
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Dealing with high variance is a significant challenge in model-free reinforcement learning (RL). Existing methods are unreliable, exhibiting high variance in performance from run to run using different initializations/seeds. Focusing on problems arising in continuous control, we propose a functional regularization approach to augmenting model-free RL. In particular, we regularize the behavior of the deep policy to be similar to a control prior, i.e., we regularize in function space. We show that functional regularization yields a bias-variance trade-off, and propose an adaptive tuning strategy to optimize this trade-off. When the prior policy has control-theoretic stability guarantees, we further show that this regularization approximately preserves those stability guarantees throughout learning. We validate our approach empirically on a wide range of settings, and demonstrate significantly reduced variance, guaranteed dynamic stability, and more efficient learning than deep RL alone. (A minimal sketch of this idea appears after this list.)
-
Abstract The development of high-performance elastomers for additive manufacturing requires overcoming complex property trade-offs that challenge conventional material discovery pipelines. Here, a human-in-the-loop reinforcement learning (RL) approach is used to discover polyurethane elastomers that overcome pervasive stress–strain property trade-offs. Starting with a diverse training set of 92 formulations, a coupled multi-component reward system was identified that guides RL agents toward materials with both high strength and extensibility. Through three rounds of iterative optimization combining RL predictions with human chemical intuition, we identified elastomers with more than double the average toughness compared to the initial training set. The final exploitation round, aided by solubility prescreening, predicted twelve materials exhibiting both high strength (>10 MPa) and high strain at break (>200%). Analysis of the high-performing materials revealed structure–property insights, including the benefits of high molar mass urethane oligomers, a high density of urethane functional groups, and incorporation of rigid low molecular weight diols and unsymmetric diisocyanates. These findings demonstrate that machine-guided, human-augmented design is a powerful strategy for accelerating polymer discovery in applications where data is scarce and expensive to acquire, with broad applicability to multi-objective materials optimization. (A reward sketch for this record appears after this list.)
-
Semantic communication marks a new paradigm shift from bit-wise data transmission to semantic information delivery for the purpose of bandwidth reduction. To more effectively carry out specialized downstream tasks at the receiver end, it is crucial to define the most critical semantic message in the data based on the task or goal-oriented features. In this work, we propose a novel goal-oriented communication (GO-COM) framework, namely Goal-Oriented Semantic Variational Autoencoder (GOS-VAE), by focusing on the extraction of the semantics vital to the downstream tasks. Specifically, we adopt a Vector Quantized Variational Autoencoder (VQ-VAE) to compress media data at the transmitter side. Instead of targeting the pixel-wise image data reconstruction, we measure the quality-of-service at the receiver end based on a pre-defined task-incentivized model. Moreover, to capture the relevant semantic features in the data reconstruction, imitation learning is adopted to measure the data regeneration quality in terms of goal-oriented semantics. Our experimental results demonstrate the power of imitation learning in characterizing goal-oriented semantics and bandwidth efficiency of our proposed GOS-VAE. (A sketch of the underlying vector-quantization step appears after this list.)
-
Abstract Partially Observable Markov Decision Processes (POMDPs) can model complex sequential decision-making problems under stochastic and uncertain environments. A main reason hindering their broad adoption in real-world applications is the unavailability of a suitable POMDP model or a simulator thereof. Available solution algorithms, such as Reinforcement Learning (RL), typically benefit from the knowledge of the transition dynamics and the observation generating process, which are often unknown and non-trivial to infer. In this work, we propose a combined framework for inference and robust solution of POMDPs via deep RL. First, all transition and observation model parameters are jointly inferred via Markov Chain Monte Carlo sampling of a hidden Markov model, which is conditioned on actions, in order to recover full posterior distributions from the available data. The POMDP with uncertain parameters is then solved via deep RL techniques with the parameter distributions incorporated into the solution via domain randomization, in order to develop solutions that are robust to model uncertainty. As a further contribution, we compare the use of Transformers and long short-term memory networks, which constitute model-free RL solutions and work directly on the observation space, with an approach termed the belief-input method, which works on the belief space by exploiting the learned POMDP model for belief inference. We apply these methods to the real-world problem of optimal maintenance planning for railway assets and compare the results with the current real-life policy. We show that the RL policy learned by the belief-input method is able to outperform the real-life policy by yielding significantly reduced life-cycle costs. (A minimal belief-update sketch appears after this list.)
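The functional-regularization approach in the first related record can be sketched minimally: the executed action blends the deep policy's action with a control prior's action in function space, with a weight lam that sets the bias-variance trade-off. The blending rule, names, and the linear prior below are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def prior_controller(x, K):
    # A hypothetical control prior, e.g. stabilizing linear feedback u = -K x.
    return -K @ x

def regularized_action(policy_action, x, K, lam):
    # Blend policy and prior in function (action) space.
    # lam -> 0 recovers pure deep RL; large lam pulls behavior toward the
    # prior, trading variance for bias (and inheriting the prior's stability
    # properties when the prior is a stabilizing controller).
    u_prior = prior_controller(x, K)
    return (policy_action + lam * u_prior) / (1.0 + lam)
```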
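For the second related record, the coupled multi-component reward can be imagined as below. The >10 MPa and >200% targets come from the abstract; the functional form, which only grows when both properties improve together, is an assumption for illustration.

```python
def elastomer_reward(strength_mpa: float, strain_pct: float) -> float:
    # Hypothetical coupled reward: high only when BOTH properties are high.
    s = strength_mpa / 10.0   # normalized against the >10 MPa strength target
    e = strain_pct / 200.0    # normalized against the >200% strain target
    # min() couples the objectives so the agent cannot trade one for the
    # other; the small additive term breaks ties toward overall improvement.
    return min(s, e) + 0.1 * (s + e)
```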
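The third related record is built around a VQ-VAE codec; the sketch below shows the generic vector-quantization step with a straight-through gradient. This is the standard mechanism, not the paper's full GOS-VAE pipeline.

```python
import torch

def vector_quantize(z, codebook):
    # z: (n, d) encoder outputs; codebook: (K, d) learned code vectors.
    dists = torch.cdist(z, codebook)   # (n, K) pairwise distances
    idx = dists.argmin(dim=1)          # discrete symbols: all that is transmitted
    z_q = codebook[idx]                # receiver-side reconstruction
    z_q = z + (z_q - z).detach()       # straight-through estimator for training
    return z_q, idx
```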
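Finally, the belief-input method in the fourth related record feeds the policy a filtered belief rather than raw observations. A minimal discrete Bayes update over a transition model `T` and observation model `O` is shown below; note that the paper does not use point estimates but MCMC posterior samples combined with domain randomization.

```python
import numpy as np

def belief_update(b, a, o, T, O):
    # b: (S,) belief; T: (A, S, S) transitions; O: (A, S, Obs) observation model.
    pred = b @ T[a]              # predict: sum_s b(s) * P(s' | s, a)
    post = pred * O[a][:, o]     # correct: weight by P(o | s', a)
    return post / post.sum()     # normalize back to a probability vector

# The policy then conditions on the belief, e.g. action = policy_net(belief).
```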