Title: Power and accountability in reinforcement learning applications to environmental policy
Machine learning (ML) methods already permeate environmental decision-making, from processing high-dimensional data on earth systems to monitoring compliance with environmental regulations. Of the ML techniques available to address pressing environmental problems (e.g., climate change, biodiversity loss), Reinforcement Learning (RL) may both hold the greatest promise and present the most pressing perils. This paper explores how RL-driven policy refracts existing power relations in the environmental domain while also creating unique challenges to ensuring equitable and accountable environmental decision processes. We leverage examples from RL applications to climate change mitigation and fisheries management to explore how RL technologies shift the distribution of power between resource users, governing bodies, and private industry.
Award ID(s):
1942280
PAR ID:
10337385
Journal Name:
Conference Proceedings on Neural Information Processing Systems, 2021
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. From out-competing grandmasters in chess to informing high-stakes healthcare decisions, emerging methods from artificial intelligence are increasingly capable of making complex and strategic decisions in diverse, high-dimensional and uncertain situations. But can these methods help us devise robust strategies for managing environmental systems under great uncertainty? Here we explore how reinforcement learning (RL), a subfield of artificial intelligence, approaches decision problems through a lens similar to adaptive environmental management: learning through experience to gradually improve decisions with updated knowledge. We review where RL holds promise for improving evidence-informed adaptive management decisions even when classical optimization methods are intractable and discuss technical and social issues that arise when applying RL to adaptive management problems in the environmental domain. Our synthesis suggests that environmental management and computer science can learn from one another about the practices, promises and perils of experience-based decision-making. This article is part of the theme issue ‘Detecting and attributing the causes of biodiversity change: needs, gaps and solutions’. 
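To make the adaptive-management analogy concrete, here is a minimal, purely illustrative sketch of the experience-based learning loop the review describes: tabular Q-learning on an invented toy fishery, where an agent repeatedly harvests, observes the outcome, and updates its policy. The stock dynamics, state discretization, and all parameters below are hypothetical, not drawn from the article.

```python
# Toy adaptive-management loop: act, observe, update. All numbers invented.
import random

N_STATES, N_ACTIONS = 10, 3          # discretized stock levels, harvest rates
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1   # learning rate, discount, exploration

def step(state, action):
    """Hypothetical stock dynamics: harvesting earns reward, stock regrows randomly."""
    harvest = min(state, action)
    regrowth = random.choice([0, 1, 2])
    next_state = min(N_STATES - 1, state - harvest + regrowth)
    return next_state, float(harvest)

Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
state = N_STATES // 2
for t in range(50_000):
    # epsilon-greedy: mostly exploit current knowledge, occasionally experiment,
    # much like deliberate experimentation in adaptive management
    if random.random() < EPS:
        action = random.randrange(N_ACTIONS)
    else:
        action = max(range(N_ACTIONS), key=lambda a: Q[state][a])
    next_state, reward = step(state, action)
    # Q-learning update: gradually improve decisions with updated knowledge
    Q[state][action] += ALPHA * (reward + GAMMA * max(Q[next_state]) - Q[state][action])
    state = next_state

print("Greedy harvest action per stock level:",
      [max(range(N_ACTIONS), key=lambda a: Q[s][a]) for s in range(N_STATES)])
```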
  2. Nitrous oxide (N2O) emissions from agriculture are rising due to increased fertilizer use and intensive farming, posing a major challenge for climate mitigation. This study introduces a novel reinforcement learning (RL) framework to optimize farm management strategies that balance crop productivity with environmental impact, particularly N2O emissions. By modeling agricultural decision-making as a partially observable Markov decision process (POMDP), the framework accounts for uncertainties in environmental conditions and observational data. The approach integrates deep Q-learning with recurrent neural networks (RNNs) to train adaptive agents within a simulated farming environment. A Probabilistic Deep Learning (PDL) model was developed to estimate N2O emissions, achieving a high Prediction Interval Coverage Probability (PICP) of 0.937 within a 95% confidence interval on the available dataset. While the PDL model's generalizability is currently constrained by the limited observational data, the RL framework itself is designed for broad applicability, capable of extending to diverse agricultural practices and environmental conditions. Results demonstrate that RL agents reduce N2O emissions without compromising yields, even under climatic variability. The framework's flexibility allows for future integration of expanded datasets or alternative emission models, ensuring scalability as more field data becomes available. This work highlights the potential of artificial intelligence to advance climate-smart agriculture by simultaneously addressing productivity and sustainability goals in dynamic real-world settings.
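As a rough illustration of the architecture class this abstract names (deep Q-learning with recurrent networks for a POMDP), the following PyTorch sketch shows a recurrent Q-network in which a GRU summarizes the history of partial observations. The dimensions, module names, and action semantics are illustrative assumptions, not the authors' implementation.

```python
# Sketch of a recurrent deep Q-network for a POMDP. Sizes are hypothetical.
import torch
import torch.nn as nn

class RecurrentQNet(nn.Module):
    def __init__(self, obs_dim, n_actions, hidden=64):
        super().__init__()
        self.gru = nn.GRU(obs_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, obs_seq, h0=None):
        # obs_seq: (batch, time, obs_dim) partial observations, e.g. weather
        # and soil readings; the GRU's hidden state stands in for the
        # unobserved environment state.
        out, h = self.gru(obs_seq, h0)
        return self.head(out[:, -1]), h  # Q-values at the last timestep

# Hypothetical usage: 8 observed variables, 4 management actions
# (e.g., fertilize / irrigate / both / neither).
net = RecurrentQNet(obs_dim=8, n_actions=4)
q_values, hidden = net(torch.randn(1, 30, 8))   # one 30-step episode
action = q_values.argmax(dim=-1)
```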
  3. Climate change is one of the greatest challenges facing humanity, and we, as machine learning (ML) experts, may wonder how we can help. Here we describe how ML can be a powerful tool in reducing greenhouse gas emissions and helping society adapt to a changing climate. From smart grids to disaster management, we identify high impact problems where existing gaps can be filled by ML, in collaboration with other fields. Our recommendations encompass exciting research questions as well as promising business opportunities. We call on the ML community to join the global effort against climate change. 
  4. Circuit linearity calibration can represent a set of high-dimensional search problems when observability is limited. For example, linearity calibration of digital-to-time converters (DTCs), an essential building block of modern digital phase-locked loops (DPLLs), is a high-dimensional search problem because the difficulty of measuring picosecond-scale delays hinders prior methods that calibrate stage by stage. Moreover, a calibrated DTC can become nonlinear again due to changes in temperature (T) and power supply voltage (V). Prior work reports a deep reinforcement learning framework capable of performing DTC linearity calibration with nonlinear calibration banks; however, that work does not address maintaining calibration in the face of temperature and supply voltage variations. In this paper, we present a meta-reinforcement learning (RL) method that enables the RL agent to quickly adapt to a new environment when the temperature and/or voltage change. Inspired by Style Generative Adversarial Networks (StyleGANs), we propose to treat temperature and voltage changes as the styles of the circuits. In contrast to traditional methods that employ circuit sensors to detect changes in T and V, we utilize a machine learning (ML) sensor to implicitly infer a wide range of environmental changes. The style information from the ML sensor is then injected into a small portion of the policy network, modulating its weights (sketched below). As a proof of concept, we first designed a 5-bit DTC at the normal-voltage (1 V), normal-temperature (27 ℃) corner (NVNT) as the environment. The RL agent begins its training in the NVNT environment; following this initial phase, the agent is tasked with adapting to environments with different temperatures and supply voltages. Our results show that the proposed technique can reduce the Integral Non-Linearity (INL) to less than 0.5 LSB within 10,000 search steps in a changed environment. Compared to starting from a randomly initialized policy and from a previously trained policy, the proposed meta-RL approach takes 63% and 47% fewer steps, respectively, to complete the linearity calibration. Our method is also applicable to the calibration of many other kinds of analog and RF circuits.
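A minimal sketch of the StyleGAN-inspired mechanism this abstract describes: a style vector (standing in for the ML sensor's summary of temperature/voltage conditions) modulates a small portion of the policy network via learned per-channel scales and shifts. All sizes, names, and the exact modulation form are assumptions for illustration, not the paper's implementation.

```python
# Style-modulated policy network: environmental "style" reshapes hidden
# features, FiLM-style. All dimensions and names are illustrative guesses.
import torch
import torch.nn as nn

class StyleModulatedPolicy(nn.Module):
    def __init__(self, obs_dim, n_actions, style_dim, hidden=32):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        # The ML-sensor output maps to per-channel scales and shifts that
        # modulate only this small portion of the network's weights.
        self.to_scale = nn.Linear(style_dim, hidden)
        self.to_shift = nn.Linear(style_dim, hidden)
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, obs, style):
        h = self.trunk(obs)
        h = h * (1 + self.to_scale(style)) + self.to_shift(style)
        return self.head(h)  # logits over calibration actions

# Hypothetical usage: the style vector would come from an ML sensor that
# implicitly infers the current temperature/voltage corner.
policy = StyleModulatedPolicy(obs_dim=16, n_actions=8, style_dim=4)
logits = policy(torch.randn(1, 16), style=torch.randn(1, 4))
```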
  5. Conventional computational models of climate adaptation inadequately consider decision-makers' capacity to learn, update, and improve decisions. Here, we investigate the potential of reinforcement learning (RL), a machine learning technique that effectively acquires knowledge from the environment and systematically optimizes dynamic decisions, for modeling and informing adaptive climate decision-making. We consider coastal flood risk mitigation for Manhattan, New York City, USA (NYC), illustrating the benefit of continuously incorporating observations of sea-level rise into the systematic design of adaptive strategies. We find that when designing adaptive seawalls to protect NYC, the RL-derived strategy significantly reduces the expected net cost, by 6 to 36% under the moderate emissions scenario SSP2-4.5 (9 to 77% under the high emissions scenario SSP5-8.5), compared to conventional methods. When considering multiple adaptive policies, including accommodation and retreat as well as protection, the RL approach yields a further 5% (15%) cost reduction, showing RL's flexibility in addressing complex policy design problems in a coordinated way. RL also outperforms conventional methods in controlling tail risk (i.e., low-probability, high-impact outcomes) and in avoiding losses induced by misinformation about the climate state (e.g., deep uncertainty), demonstrating the importance of systematic learning and updating in addressing extremes and uncertainties in climate adaptation.
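To convey the intuition behind the reported cost reductions, here is a toy Monte Carlo comparison (invented for illustration, not the study's model) of a static seawall built up front versus an adaptive rule that raises the wall as observed sea level approaches it. Under discounting and uncertain sea-level rise, deferring construction until observations warrant it tends to cost less in expectation.

```python
# Toy static-vs-adaptive seawall comparison. All dynamics, costs, and the
# discount rate are invented; this is not the study's model.
import random

BUILD_COST_PER_M, FLOOD_DAMAGE, DISCOUNT = 1.0, 10.0, 0.97
HORIZON = 80  # years

def simulate(adaptive, trials=20_000):
    total = 0.0
    for _ in range(trials):
        sea = 0.0
        wall = 0.0 if adaptive else 4.0            # static: full wall up front
        cost = 0.0 if adaptive else 4.0 * BUILD_COST_PER_M
        for year in range(HORIZON):
            sea += random.uniform(0.0, 0.1)        # uncertain annual rise
            if adaptive and wall < sea + 0.5:      # raise wall, 0.5 m buffer
                raise_by = (sea + 0.5) - wall
                cost += DISCOUNT**year * BUILD_COST_PER_M * raise_by
                wall = sea + 0.5
            if sea > wall:                         # flood year
                cost += DISCOUNT**year * FLOOD_DAMAGE
        total += cost
    return total / trials

print("static   expected discounted cost:", round(simulate(adaptive=False), 2))
print("adaptive expected discounted cost:", round(simulate(adaptive=True), 2))
```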