Conventional computational models of climate adaptation inadequately consider decision-makers' capacity to learn, update, and improve decisions. Here, we investigate the potential of reinforcement learning (RL), a machine learning technique that efficiently acquires knowledge from the environment and systematically optimizes dynamic decisions, in modeling and informing adaptive climate decision-making. We consider coastal flood risk mitigation for Manhattan, New York City, USA (NYC), illustrating the benefit of continuously incorporating observations of sea-level rise into the systematic design of adaptive strategies. We find that when designing adaptive seawalls to protect NYC, the RL-derived strategy significantly reduces the expected net cost by 6 to 36% under the moderate emissions scenario SSP2-4.5 (9 to 77% under the high emissions scenario SSP5-8.5), compared to conventional methods. When considering multiple adaptation policies, including accommodation and retreat as well as protection, the RL approach leads to a further 5% (15%) cost reduction, showing RL's flexibility in jointly addressing complex policy design problems. RL also outperforms conventional methods in controlling tail risk (i.e., low-probability, high-impact outcomes) and in avoiding losses induced by misinformation about the climate state (e.g., deep uncertainty), demonstrating the importance of systematic learning and updating in addressing extremes and uncertainties related to climate adaptation.
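To illustrate the kind of sequential decision problem described here, the following is a minimal, hypothetical sketch (not the study's actual model or data) of tabular Q-learning for adaptive seawall heightening: the state pairs a discretized observed sea level with the current wall height, and the learned policy trades off construction cost against flood damage as sea-level observations arrive. All costs, probabilities, and grid sizes are invented for illustration.

```python
# Hypothetical sketch only (not the study's actual model or data): tabular
# Q-learning for adaptive seawall heightening. The state pairs a discretized
# observed sea level with the current wall height; actions wait or raise the
# wall; reward is the negative of construction plus flood costs.
import random
from collections import defaultdict

LEVELS, HEIGHTS = 10, 10            # illustrative discretization
ACTIONS = (0, 1)                    # 0 = do nothing, 1 = raise wall one step
BUILD_COST, FLOOD_COST = 1.0, 20.0  # invented costs
HORIZON, EPISODES = 50, 20_000
ALPHA, GAMMA, EPS = 0.1, 0.97, 0.1

def step(level, height, action):
    """One decision period: pay costs, then observe stochastic sea-level rise."""
    cost = BUILD_COST * action
    height = min(height + action, HEIGHTS - 1)
    if level > height:               # overtopping -> flood damage this period
        cost += FLOOD_COST
    if random.random() < 0.3:        # assumed probability of a rise increment
        level = min(level + 1, LEVELS - 1)
    return level, height, -cost      # reward = negative cost

Q = defaultdict(float)
for _ in range(EPISODES):
    level, height = 0, 0
    for _ in range(HORIZON):
        a = (random.choice(ACTIONS) if random.random() < EPS
             else max(ACTIONS, key=lambda x: Q[(level, height, x)]))
        nlevel, nheight, r = step(level, height, a)
        target = r + GAMMA * max(Q[(nlevel, nheight, x)] for x in ACTIONS)
        Q[(level, height, a)] += ALPHA * (target - Q[(level, height, a)])
        level, height = nlevel, nheight

# The learned policy raises the wall only as the observed level nears the
# crest, i.e., it adapts the protection decision to incoming observations.
```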
Bridging adaptive management and reinforcement learning for more robust decisions
From out-competing grandmasters in chess to informing high-stakes healthcare decisions, emerging methods from artificial intelligence are increasingly capable of making complex and strategic decisions in diverse, high-dimensional and uncertain situations. But can these methods help us devise robust strategies for managing environmental systems under great uncertainty? Here we explore how reinforcement learning (RL), a subfield of artificial intelligence, approaches decision problems through a lens similar to adaptive environmental management: learning through experience to gradually improve decisions with updated knowledge. We review where RL holds promise for improving evidence-informed adaptive management decisions even when classical optimization methods are intractable and discuss technical and social issues that arise when applying RL to adaptive management problems in the environmental domain. Our synthesis suggests that environmental management and computer science can learn from one another about the practices, promises and perils of experience-based decision-making. This article is part of the theme issue 'Detecting and attributing the causes of biodiversity change: needs, gaps and solutions'.
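As a concrete rendering of the parallel drawn here, the sketch below casts the adaptive-management cycle (monitor, decide, intervene, update) as the basic RL interaction loop. The two-action "managed system" and the incremental-averaging update are placeholder assumptions standing in for any monitored ecosystem and any learning rule.

```python
# Illustrative toy: the adaptive-management cycle rendered as the basic RL
# interaction loop. The two-action "managed system" and incremental-averaging
# update are placeholder assumptions, not a real management model.
import random

class ManagedSystem:
    """Stand-in ecosystem: intervention 1 succeeds 70% of the time, 0 only 30%."""
    def step(self, action):
        return 1.0 if random.random() < (0.7 if action == 1 else 0.3) else 0.0

values, counts = [0.0, 0.0], [0, 0]    # current knowledge of each intervention
env = ManagedSystem()
for t in range(1000):
    # decide: mostly exploit current knowledge, occasionally experiment
    a = random.randrange(2) if random.random() < 0.1 else values.index(max(values))
    outcome = env.step(a)              # intervene, then monitor the outcome
    counts[a] += 1                     # update knowledge with the observation
    values[a] += (outcome - values[a]) / counts[a]

print("estimated success rates:", values)  # should come to favor action 1
```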
- Award ID(s): 1942280
- PAR ID: 10501740
- Publisher / Repository: Philosophical Transactions of the Royal Society B
- Date Published:
- Journal Name: Philosophical Transactions of the Royal Society B: Biological Sciences
- Volume: 378
- Issue: 1881
- ISSN: 0962-8436
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Reasoning with declarative knowledge (RDK) and sequential decision-making (SDM) are two key research areas in artificial intelligence. RDK methods reason with declarative domain knowledge, including commonsense knowledge, that is either provided a priori or acquired over time, while SDM methods (probabilistic planning [PP] and reinforcement learning [RL]) seek to compute action policies that maximize the expected cumulative utility over a time horizon; both classes of methods reason in the presence of uncertainty. Despite the rich literature in these two areas, researchers have not fully explored their complementary strengths. In this paper, we survey algorithms that leverage RDK methods while making sequential decisions under uncertainty. We discuss significant developments, open problems, and directions for future work. (A value-iteration sketch of this SDM objective appears after this list.)
- Machine learning (ML) methods already permeate environmental decision-making, from processing high-dimensional data on earth systems to monitoring compliance with environmental regulations. Of the ML techniques available to address pressing environmental problems (e.g., climate change, biodiversity loss), Reinforcement Learning (RL) may both hold the greatest promise and present the most pressing perils. This paper explores how RL-driven policy refracts existing power relations in the environmental domain while also creating unique challenges to ensuring equitable and accountable environmental decision processes. We leverage examples from RL applications to climate change mitigation and fisheries management to explore how RL technologies shift the distribution of power between resource users, governing bodies, and private industry.
- Deep reinforcement learning (RL) has recently been successfully applied to networking contexts including routing, flow scheduling, congestion control, packet classification, cloud resource management, and video streaming. Deep-RL-driven systems automate decision-making, and have been shown to outperform state-of-the-art handcrafted systems in important domains. However, the (typical) non-explainability of decisions induced by the deep learning machinery employed by these systems renders reasoning about crucial system properties, including correctness and security, extremely difficult. We show that despite the obscurity of decision-making in these contexts, verifying that deep-RL-driven systems adhere to desired, designer-specified behavior is achievable. To this end, we initiate the study of formal verification of deep RL and present Verily, a system for verifying deep-RL-based systems that leverages recent advances in verification of deep neural networks. We employ Verily to verify recently introduced deep-RL-driven systems for adaptive video streaming, cloud resource management, and Internet congestion control. Our results expose scenarios in which deep-RL-driven decision-making yields undesirable behavior. We discuss guidelines for building deep-RL-driven systems that are both safer and easier to verify. (An illustrative verification sketch appears after this list.)
- Decision-making under uncertainty (DMU) is present in many important problems. An open challenge is DMU in non-stationary environments, where the dynamics of the environment can change over time. Reinforcement Learning (RL), a popular approach for DMU problems, learns a policy by interacting with a model of the environment offline. Unfortunately, if the environment changes, the policy can become stale and take sub-optimal actions, and relearning the policy for the updated environment takes time and computational effort. An alternative is online planning approaches such as Monte Carlo Tree Search (MCTS), which perform their computation at decision time. Given the current environment, MCTS plans using high-fidelity models to determine promising action trajectories. These models can be updated as soon as environmental changes are detected and immediately incorporated into decision-making. However, MCTS's convergence can be slow for domains with large state-action spaces. In this paper, we present a novel hybrid decision-making approach that combines the strengths of RL and planning while mitigating their weaknesses. Our approach, called Policy Augmented MCTS (PA-MCTS), integrates a policy's action-value estimates into MCTS, using the estimates to seed the action trajectories favored by the search (a minimal sketch of this selection rule follows below). We hypothesize that PA-MCTS will converge more quickly than standard MCTS while making better decisions than the policy can make on its own when faced with non-stationary environments. We test our hypothesis by comparing PA-MCTS with pure MCTS and an RL agent applied to the classical CartPole environment. We find that PA-MCTS can achieve higher cumulative rewards than the policy in isolation under several environmental shifts while converging in significantly fewer iterations than pure MCTS.
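To make concrete the SDM objective named in the RDK/SDM survey above (action policies that maximize expected cumulative utility over a time horizon), here is a minimal value-iteration sketch on a toy MDP; the two-state, two-action transition table, rewards, and discount are all illustrative assumptions.

```python
# Toy rendering of the SDM objective: value iteration on a hand-written
# two-state MDP, yielding a policy that maximizes expected discounted
# cumulative utility. Transitions, rewards, and discount are illustrative.
GAMMA = 0.9
# P[s][a] = list of (probability, next_state, reward) triples
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(0.8, 1, 5.0), (0.2, 0, -1.0)]},
    1: {0: [(1.0, 1, 1.0)], 1: [(0.5, 0, 2.0), (0.5, 1, 1.0)]},
}

V = {s: 0.0 for s in P}
for _ in range(200):  # repeated Bellman backups; 200 sweeps converge here
    V = {s: max(sum(p * (r + GAMMA * V[s2]) for p, s2, r in P[s][a])
                for a in P[s])
         for s in P}

policy = {s: max(P[s], key=lambda a: sum(p * (r + GAMMA * V[s2])
                                         for p, s2, r in P[s][a]))
          for s in P}
print("values:", V, "policy:", policy)
```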
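The Verily abstract above concerns verifying properties of deep-RL policies. As a rough, simplified stand-in for what a neural-network verification engine does (Verily builds on dedicated verifiers, not the naive method here), the sketch below uses interval bound propagation to soundly check whether a tiny, invented ReLU network keeps its output nonnegative over the input box [0, 1]^2; both the weights and the property are assumptions for illustration.

```python
# Simplified stand-in for a neural-network verification query (Verily relies
# on dedicated verification engines; this toy uses naive interval bound
# propagation, which is sound but loose). The weights and the property
# (output stays >= 0 for all inputs in [0, 1]^2) are invented.
import numpy as np

W1 = np.array([[1.0, -1.0], [0.5, 0.5]]); b1 = np.array([0.0, -0.2])
W2 = np.array([[1.0, 1.0]]);              b2 = np.array([0.1])

def interval_affine(lo, hi, W, b):
    """Propagate the box [lo, hi] through x -> Wx + b exactly."""
    Wp, Wn = np.maximum(W, 0), np.minimum(W, 0)
    return Wp @ lo + Wn @ hi + b, Wp @ hi + Wn @ lo + b

lo, hi = np.zeros(2), np.ones(2)                 # input region to certify over
lo, hi = interval_affine(lo, hi, W1, b1)
lo, hi = np.maximum(lo, 0), np.maximum(hi, 0)    # ReLU is monotone
lo, hi = interval_affine(lo, hi, W2, b2)

# A nonnegative certified lower bound proves the property on the whole region;
# otherwise the check is inconclusive (interval bounds are not tight).
print("certified" if lo[0] >= 0 else "inconclusive", (lo[0], hi[0]))
```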
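Finally, the core PA-MCTS idea from the last item, blending an offline policy's action-value estimates into the tree search's selection rule, can be sketched in a few lines. The blending weight alpha, the UCT-style exploration bonus, and the node layout are assumptions for illustration rather than the paper's exact formulation.

```python
# Hedged sketch of the PA-MCTS selection idea: blend an offline policy's
# action-value estimates with the tree's own statistics so the search is
# seeded toward actions the policy favors. The blend weight `alpha`, the
# UCT-style bonus, and the Node layout are assumptions for illustration.
import math
from dataclasses import dataclass

@dataclass
class Node:
    value_sum: float = 0.0
    visits: int = 0

def pa_mcts_select(children, q_policy, alpha=0.5, c=1.4):
    """children: action -> Node; q_policy: action -> offline Q estimate."""
    total = sum(ch.visits for ch in children.values()) or 1
    def score(a):
        ch = children[a]
        tree_q = ch.value_sum / ch.visits if ch.visits else 0.0
        # Blend offline and in-tree value estimates, then add exploration.
        return (alpha * q_policy[a] + (1 - alpha) * tree_q
                + c * math.sqrt(math.log(total) / (ch.visits + 1)))
    return max(children, key=score)

# Usage: the policy's high estimate for the unvisited action 'right' steers
# the first selection toward it, which is the seeding behavior described.
children = {'left': Node(2.0, 5), 'right': Node()}
q_policy = {'left': 0.2, 'right': 0.9}
print(pa_mcts_select(children, q_policy))  # -> 'right'
```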