Abstract As climate change causes the environment to shift away from the local optimum that populations have adapted to, fitness declines are predicted to occur. Recently, methods known as genomic offsets (GOs) have become a popular tool to predict population responses to climate change from landscape genomic data. Populations with a high GO have been interpreted as having a high “genomic vulnerability” to climate change. GOs are often implicitly interpreted as a fitness offset, that is, a change in fitness of an individual or population in a new environment compared to a reference. However, several different types of fitness offset can be calculated, and the appropriate choice depends on the management goals. This study uses hypothetical and empirical data to explore situations in which different types of fitness offsets may or may not be correlated with each other or with a GO. The examples reveal that even when GOs predict fitness offsets in a common garden experiment, this does not necessarily validate their ability to predict fitness offsets under environmental change. Conceptual examples are also used to show how a large GO can arise under a positive fitness offset, and thus cannot be interpreted as a measure of population vulnerability. These issues can be resolved with robust validation experiments that evaluate which fitness offsets are correlated with GOs.
Designing mechanically tough graphene oxide materials using deep reinforcement learning
Abstract Graphene oxide (GO) is playing an increasing role in many technologies. However, it remains an open question how to strategically distribute the functional groups to further enhance performance. We utilize deep reinforcement learning (RL) to design mechanically tough GOs. The design task is formulated as a sequential decision process, and policy-gradient RL models are employed to maximize the toughness of GO. Results show that our approach can stably generate functional group distributions with a toughness value over two standard deviations above the mean of random GOs. In addition, our RL approach reaches optimized functional group distributions within only 5000 rollouts, while the simplest design task has 2 × 10^11 possibilities. Finally, we show that our approach is scalable in terms of the functional group density and the GO size. The present research showcases the impact of functional group distribution on GO properties, and illustrates the effectiveness and data efficiency of the deep RL approach.
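To make the formulation concrete, here is a minimal sketch of how such a sequential design task can be optimized with a policy-gradient (REINFORCE-style) loop: functional groups are placed one site at a time on a toy GO patch, and the placement policy is updated from the episodic toughness reward. The `toughness` surrogate, lattice size, and hyperparameters are hypothetical stand-ins, not the paper's molecular-dynamics setup.

```python
# Minimal REINFORCE sketch for sequential functional-group placement (illustrative only).
# The toughness() surrogate and the 4x4 lattice are hypothetical stand-ins for an
# MD-based toughness evaluation of a real GO sheet.
import numpy as np

N_SITES = 16          # candidate carbon sites on a toy 4x4 GO patch
N_GROUPS = 4          # number of functional groups to place per design
LR, ROLLOUTS = 0.1, 2000

def toughness(sites):
    # Hypothetical surrogate: reward spreading groups apart (placeholder for MD toughness).
    xy = np.array([(s // 4, s % 4) for s in sites], dtype=float)
    d = np.linalg.norm(xy[:, None] - xy[None, :], axis=-1)
    return d[np.triu_indices(len(sites), 1)].mean()

theta = np.zeros(N_SITES)          # logits of a simple softmax placement policy
rng = np.random.default_rng(0)
baseline = 0.0

for it in range(ROLLOUTS):
    chosen, grads = [], []
    mask = np.zeros(N_SITES, dtype=bool)
    for _ in range(N_GROUPS):      # sequential decisions: one site per step
        logits = np.where(mask, -np.inf, theta)
        p = np.exp(logits - logits.max()); p /= p.sum()
        a = rng.choice(N_SITES, p=p)
        grad = -p; grad[a] += 1.0  # d log pi(a) / d theta for a softmax policy
        chosen.append(a); grads.append(grad); mask[a] = True
    R = toughness(chosen)          # episodic reward = toughness of the finished design
    baseline += 0.05 * (R - baseline)          # running baseline reduces gradient variance
    theta += LR * (R - baseline) * np.sum(grads, axis=0)

best = np.argsort(-theta)[:N_GROUPS]
print("greedy design:", best, "surrogate toughness:", toughness(best))
```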
- Award ID(s): 2119276
- PAR ID: 10378979
- Publisher / Repository: Nature Publishing Group
- Date Published:
- Journal Name: npj Computational Materials
- Volume: 8
- Issue: 1
- ISSN: 2057-3960
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Dealing with high variance is a significant challenge in model-free reinforcement learning (RL). Existing methods are unreliable, exhibiting high variance in performance from run to run using different initializations/seeds. Focusing on problems arising in continuous control, we propose a functional regularization approach to augmenting model-free RL. In particular, we regularize the behavior of the deep policy to be similar to a control prior, i.e., we regularize in function space. We show that functional regularization yields a bias-variance trade-off, and propose an adaptive tuning strategy to optimize this trade-off. When the prior policy has control-theoretic stability guarantees, we further show that this regularization approximately preserves those stability guarantees throughout learning. We validate our approach empirically on a wide range of settings, and demonstrate significantly reduced variance, guaranteed dynamic stability, and more efficient learning than deep RL alone. (A minimal function-space regularization sketch appears after this list.)
- Abstract Hydrogels containing thermosensitive polymers such as poly(N‐isopropylacrylamide) (P(NIPAm)) may contract during heating and show great promise in fields ranging from soft robotics to thermosensitive biosensors. However, these gels often exhibit low stiffness, tensile strength, and mechanical toughness, limiting their applicability. Through copolymerization of P(NIPAm) with poly(acrylic acid) (P(AAc)) and introduction of ferric ions (Fe3+) that coordinate with functional groups along the P(AAc) chains, here a thermoresponsive hydrogel with enhanced mechanical extensibility, strength, and toughness is introduced. Using both experimentation and constitutive modeling, it is found that increasing the ratio of m(AAc):m(NIPAm) in the prepolymer decreases strength and toughness but improves extensibility. In contrast, increasing Fe3+ concentration generally improves strength and toughness with little decrease in extensibility. Due to reversible coordination of the Fe3+ bonds, these gels display excellent recovery of mechanical strength during cyclic loading and self‐healing ability. While thermosensitive contraction imbued by the underlying P(NIPAm) decreases slightly with increased Fe3+ concentration, the temperature transition range is widened and shifted upward toward that of human body temperature (between 30 and 40 °C), perhaps rendering these gels suitable as in vivo biosensors. Finally, these gels display excellent adsorptive properties with a variety of materials, rendering them possible candidates in adhesive applications.
- Abstract Partially Observable Markov Decision Processes (POMDPs) can model complex sequential decision-making problems under stochastic and uncertain environments. A main reason hindering their broad adoption in real-world applications is the unavailability of a suitable POMDP model or a simulator thereof. Available solution algorithms, such as Reinforcement Learning (RL), typically benefit from the knowledge of the transition dynamics and the observation generating process, which are often unknown and non-trivial to infer. In this work, we propose a combined framework for inference and robust solution of POMDPs via deep RL. First, all transition and observation model parameters are jointly inferred via Markov Chain Monte Carlo sampling of a hidden Markov model, which is conditioned on actions, in order to recover full posterior distributions from the available data. The POMDP with uncertain parameters is then solved via deep RL techniques with the parameter distributions incorporated into the solution via domain randomization, in order to develop solutions that are robust to model uncertainty. As a further contribution, we compare the use of Transformers and long short-term memory networks, which constitute model-free RL solutions and work directly on the observation space, with an approach termed the belief-input method, which works on the belief space by exploiting the learned POMDP model for belief inference. We apply these methods to the real-world problem of optimal maintenance planning for railway assets and compare the results with the current real-life policy. We show that the RL policy learned by the belief-input method is able to outperform the real-life policy by yielding significantly reduced life-cycle costs. (An illustrative domain-randomization sketch appears after this list.)
- Abstract The ability to reuse trained models in Reinforcement Learning (RL) holds substantial practical value, in particular for complex tasks. While model reusability is widely studied for supervised models in data management, to the best of our knowledge, this is the first principled study of model reuse for RL. To capture trained policies, we develop a framework based on an expressive and lossless graph data model that accommodates both Temporal Difference Learning and deep-RL-based algorithms. Our framework is able to capture arbitrary reward functions that can be composed at inference time. The framework comes with theoretical guarantees showing that it yields the same results as policies trained from scratch. We design a parameterized algorithm that strikes a balance between efficiency and quality w.r.t. cumulative reward. Our experiments with two common RL tasks (query refinement and robot movement) corroborate our theory and show the effectiveness and efficiency of our algorithms. (An illustrative reward-composition sketch appears after this list.)
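For the functional-regularization item above, a minimal sketch of the core idea, assuming a small PyTorch actor and a hand-written `prior_controller` (both hypothetical names): the policy objective is augmented with a function-space penalty that keeps the learned actions close to the control prior, with the weight `lam` governing the bias-variance trade-off.

```python
# Sketch of functional regularization toward a control prior (illustrative, PyTorch).
# Actor, prior_controller, and the surrogate policy loss are assumed placeholders.
import torch
import torch.nn as nn

class Actor(nn.Module):
    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, act_dim))
    def forward(self, obs):
        return self.net(obs)

def prior_controller(obs):
    # Hypothetical control prior, e.g. a stabilizing linear (LQR-like) feedback law.
    K = torch.tensor([[0.5, -1.0, 0.2, 0.1]])
    return obs @ K.T

def regularized_loss(actor, obs, rl_surrogate_loss, lam=0.1):
    # Regularize in function space: penalize the distance between the learned policy's
    # actions and the prior's actions on the visited states.
    penalty = ((actor(obs) - prior_controller(obs)) ** 2).mean()
    return rl_surrogate_loss + lam * penalty

# Usage with a dummy batch; the surrogate loss would normally come from PPO/SAC etc.
actor = Actor(obs_dim=4, act_dim=1)
obs = torch.randn(32, 4)
loss = regularized_loss(actor, obs, rl_surrogate_loss=torch.tensor(0.0), lam=0.1)
loss.backward()
```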
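For the POMDP item above, a rough sketch of the domain-randomization idea: each episode instantiates the environment from a different draw of the (here fabricated) posterior parameter samples, so a policy is exercised across the inferred model uncertainty. `ToyDeteriorationEnv`, the threshold policy, and the posterior draws are hypothetical placeholders, not the paper's railway-maintenance model or its belief-input solution.

```python
# Sketch: domain randomization over posterior parameter samples (illustrative only).
# posterior_samples stands in for MCMC draws of transition/observation parameters.
import numpy as np

rng = np.random.default_rng(1)
# Pretend these are posterior draws of a degradation rate and an observation noise level.
posterior_samples = [{"degrade_p": rng.beta(2, 8), "obs_noise": rng.uniform(0.05, 0.3)}
                     for _ in range(200)]

class ToyDeteriorationEnv:
    def __init__(self, degrade_p, obs_noise):
        self.degrade_p, self.obs_noise = degrade_p, obs_noise
        self.state = 0                              # 0 = good, 1 = degraded, 2 = failed
    def step(self, action):
        if action == 1:                             # maintain: repair, pay a cost
            self.state = 0
            return self._observe(), -1.0, False
        if self.state < 2 and rng.random() < self.degrade_p:
            self.state += 1                         # stochastic deterioration
        done = self.state == 2
        return self._observe(), (-10.0 if done else 0.0), done
    def _observe(self):
        return self.state + rng.normal(0.0, self.obs_noise)  # noisy condition indicator

def run_episode(env, policy, horizon=50):
    obs, total = env._observe(), 0.0
    for _ in range(horizon):
        obs, r, done = env.step(policy(obs))
        total += r
        if done:
            break
    return total

# Domain randomization: every episode uses a different posterior draw,
# so the policy is exercised against model uncertainty.
threshold_policy = lambda obs: int(obs > 0.7)       # stand-in for a deep-RL policy
returns = [run_episode(ToyDeteriorationEnv(**posterior_samples[rng.integers(len(posterior_samples))]),
                       threshold_policy) for _ in range(100)]
print("mean return under model uncertainty:", np.mean(returns))
```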
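For the model-reuse item above, and purely to illustrate what composing reward functions at inference time can look like (this is not the paper's graph-based framework), the sketch below learns tabular Q-functions for two component rewards on a toy chain MDP and then acts greedily on a weighted sum of the stored Q-tables; summing Q-values is only an approximation to retraining on the combined reward.

```python
# Illustrative sketch of reusing stored value functions and composing rewards at inference time.
# Toy chain MDP only; summing Q-tables approximates, but does not equal, retraining on the sum.
import numpy as np

N, ACTIONS = 8, [-1, 1]                    # states 0..7, move left/right
GAMMA, ALPHA, EPISODES = 0.9, 0.2, 3000
rng = np.random.default_rng(0)

def q_learning(reward_fn):
    Q = np.zeros((N, len(ACTIONS)))
    for _ in range(EPISODES):
        s = rng.integers(N)
        for _ in range(30):
            a = rng.integers(2) if rng.random() < 0.1 else int(Q[s].argmax())
            s2 = int(np.clip(s + ACTIONS[a], 0, N - 1))
            Q[s, a] += ALPHA * (reward_fn(s2) + GAMMA * Q[s2].max() - Q[s, a])
            s = s2
    return Q

Q_goal  = q_learning(lambda s: 1.0 if s == N - 1 else 0.0)  # component reward 1: reach the right end
Q_avoid = q_learning(lambda s: -1.0 if s == 3 else 0.0)     # component reward 2: avoid state 3

def composed_policy(s, w=(1.0, 1.0)):
    # Inference-time composition: act greedily w.r.t. a weighted sum of the stored Q-tables.
    return int((w[0] * Q_goal[s] + w[1] * Q_avoid[s]).argmax())

s = 0
for _ in range(10):
    s = int(np.clip(s + ACTIONS[composed_policy(s)], 0, N - 1))
print("trajectory endpoint under composed rewards:", s)
```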