
Title: Synthesizing Action Sequences for Modifying Model Decisions
When a model makes a consequential decision, e.g., denying someone a loan, it should also generate actionable, realistic feedback on what the person can do to favorably change the decision. We cast this problem through the lens of program synthesis, in which our goal is to synthesize an optimal (realistically cheapest or simplest) sequence of actions that, if executed successfully by the person, changes their classification. We present a novel and general approach that combines search-based program synthesis and test-time adversarial attacks to construct action sequences over a domain-specific set of actions. We demonstrate the effectiveness of our approach on a number of deep neural networks.
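The core idea of synthesizing a cheapest action sequence can be sketched as a uniform-cost search over sequences of domain-specific actions, stopping when the classifier's decision flips. The sketch below is an illustrative simplification, not the paper's actual algorithm: the classifier, actions, and costs are all hypothetical stand-ins.

```python
import heapq

def synthesize_actions(x, classify, actions, max_depth=4):
    """Uniform-cost search for the cheapest action sequence that
    flips classify(x) from 0 (deny) to 1 (approve).

    actions: list of (name, cost, fn), where fn maps a feature
    tuple to a new feature tuple. All names here are hypothetical.
    """
    frontier = [(0.0, x, [])]          # (total cost, state, action names)
    seen = {x}
    while frontier:
        cost, state, seq = heapq.heappop(frontier)
        if classify(state) == 1:
            return cost, seq           # cheapest flipping sequence found
        if len(seq) >= max_depth:
            continue
        for name, c, fn in actions:
            nxt = fn(state)
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (cost + c, nxt, seq + [name]))
    return None                        # no sequence within max_depth

# Toy loan model: approve when income - debt >= 50; features (income, debt).
classify = lambda s: 1 if s[0] - s[1] >= 50 else 0
actions = [
    ("increase_income", 2.0, lambda s: (s[0] + 20, s[1])),
    ("pay_down_debt",   1.0, lambda s: (s[0], s[1] - 10)),
]
cost, seq = synthesize_actions((60, 40), classify, actions)
```

In this toy instance the search finds that paying down debt three times (total cost 3.0) is the cheapest way to reach approval. The paper's approach additionally uses test-time adversarial attacks to guide the search, which this sketch omits.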
Award ID(s):
1652140 1704117
Publication Date:
NSF-PAR ID:
10176329
Journal Name:
Proceedings of the AAAI Conference on Artificial Intelligence
Volume:
34
Issue:
04
Page Range or eLocation-ID:
5462 to 5469
ISSN:
2159-5399
Sponsoring Org:
National Science Foundation
More Like this
  1. Many transit agencies operating paratransit and microtransit services have to respond to trip requests that arrive in real-time, which entails solving hard combinatorial and sequential decision-making problems under uncertainty. To avoid decisions that lead to significant inefficiency in the long term, vehicles should be allocated to requests by optimizing a non-myopic utility function or by batching requests together and optimizing a myopic utility function. While the former approach is typically offline, the latter can be performed online. We point out two major issues with such approaches when applied to paratransit services in practice. First, it is difficult to batch paratransit requests together as they are temporally sparse. Second, the environment in which transit agencies operate changes dynamically (e.g., traffic conditions can change over time), causing the estimates that are learned offline to become stale. To address these challenges, we propose a fully online approach to solve the dynamic vehicle routing problem (DVRP) with time windows and stochastic trip requests that is robust to changing environmental dynamics by construction. We focus on scenarios where requests are relatively sparse; our problem is motivated by applications to paratransit services. We formulate DVRP as a Markov decision process and use Monte Carlo tree search to evaluate actions for any given state. Accounting for stochastic requests while optimizing a non-myopic utility function is computationally challenging; indeed, the action space for such a problem is intractably large in practice. To tackle the large action space, we leverage the structure of the problem to design heuristics that can sample promising actions for the tree search. Our experiments using real-world data from our partner agency show that the proposed approach outperforms existing state-of-the-art approaches both in terms of performance and robustness.
  2. As coastal landscapes change, management professionals are working hard to transition research results into actions that support scientifically informed decisions impacting coastal communities. This type of research faces many challenges due to competing priorities, but boundary-spanning organizations can help mediate these conflicts by forming transdisciplinary collaborations. The National Sea Grant College Program (Sea Grant), a National Oceanic and Atmospheric Administration-based agency, is a networked organization of 34 university-based state programs that uses a three-pronged approach of research, extension, and education to move academic research into the hands of stakeholders and decision makers. The objective of this study is to better understand strategies for successful research to application (R2A) projects that address complex environmental problems occurring in a socio-economic context. Specifically, this work examines R2A projects from the Sea Grant network to better understand the drivers for project development and common deliverables produced through the R2A process. We identify five common facilitating factors that enabled ‘successful’ R2A across all projects: platforms for partnerships, iterative communication, transparent planning, clear examples of R2A, and graduate student involvement. By providing examples of successful frameworks, we hope to encourage more organizations to engage in the R2A process.
  3. Decision-making under uncertainty (DMU) is present in many important problems. An open challenge is DMU in non-stationary environments, where the dynamics of the environment can change over time. Reinforcement Learning (RL), a popular approach for DMU problems, learns a policy by interacting with a model of the environment offline. Unfortunately, if the environment changes the policy can become stale and take sub-optimal actions, and relearning the policy for the updated environment takes time and computational effort. An alternative is online planning approaches such as Monte Carlo Tree Search (MCTS), which perform their computation at decision time. Given the current environment, MCTS plans using high-fidelity models to determine promising action trajectories. These models can be updated as soon as environmental changes are detected to immediately incorporate them into decision making. However, MCTS’s convergence can be slow for domains with large state-action spaces. In this paper, we present a novel hybrid decision-making approach that combines the strengths of RL and planning while mitigating their weaknesses. Our approach, called Policy Augmented MCTS (PA-MCTS), integrates a policy’s action-value estimates into MCTS, using the estimates to seed the action trajectories favored by the search. We hypothesize that PA-MCTS will converge more quickly than standard MCTS while making better decisions than the policy can make on its own when faced with non-stationary environments. We test our hypothesis by comparing PA-MCTS with pure MCTS and an RL agent applied to the classical CartPole environment. We find that PA-MCTS can achieve higher cumulative rewards than the policy in isolation under several environmental shifts while converging in significantly fewer iterations than pure MCTS.
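The idea of seeding a tree search with a policy's action-value estimates can be illustrated with a minimal root-level sketch. This is a hypothetical simplification of PA-MCTS, not the paper's implementation: the blending weight `alpha`, the UCB1 scoring, and the function names are all assumptions for illustration.

```python
import math

def pa_mcts_choose(state, q_policy, simulate, actions,
                   iters=500, alpha=0.5, c=1.4):
    """Root-level sketch of policy-augmented MCTS action selection.

    q_policy(state, a): action-value estimate from an offline RL policy.
    simulate(state, a): return of one rollout after taking action a.
    The policy's estimates bias exploration, and the final choice blends
    the search estimate with the (possibly stale) policy estimate.
    """
    n = {a: 0 for a in actions}       # visit counts per root action
    w = {a: 0.0 for a in actions}     # cumulative rollout returns
    for t in range(1, iters + 1):
        def score(a):
            if n[a] == 0:
                return float("inf")   # try each action at least once
            exploit = w[a] / n[a]
            return (exploit + alpha * q_policy(state, a)
                    + c * math.sqrt(math.log(t) / n[a]))
        a = max(actions, key=score)   # UCB1 seeded by the policy
        w[a] += simulate(state, a)
        n[a] += 1
    def blended(a):
        mcts_q = w[a] / max(n[a], 1)
        return (1 - alpha) * mcts_q + alpha * q_policy(state, a)
    return max(actions, key=blended)
```

Because the rollouts come from a current model of the environment, the search term corrects the policy term when the environment has shifted since the policy was trained.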
  4. Community and citizen science on climate change-influenced topics offers a way for participants to actively engage in understanding the changes and documenting the impacts. As in broader climate change education, a focus on the negative impacts can often leave participants feeling a sense of powerlessness. In large scale projects where participation is primarily limited to data collection, it is often difficult for volunteers to see how the data can inform decision making that can help create a positive future. In this paper, we propose and test a method of linking community and citizen science engagement to thinking about and planning for the future through scenarios story development using the data collected by the volunteers. We used a youth focused wild berry monitoring program that spanned urban and rural Alaska to test this method across diverse age levels and learning settings. Using qualitative analysis of educator interviews and youth work samples, we found that using a scenario stories development mini-workshop allowed the youth to use their own data and the data from other sites to imagine the future and possible actions to sustain berry resources for their communities. This process allowed youth to exercise key cognitive skills for sustainability, including systems thinking, futures thinking, and strategic thinking. The analysis suggested that youth would benefit from further practicing the skill of envisioning oneself as an agent of change in the environment. Educators valued working with lead scientists on the project and the experience for youth to participate in the interdisciplinary program. They also identified the combination of the berry data collection, analysis and scenarios stories activities as a teaching practice that allowed the youth to situate their citizen science participation in a personal, local and cultural context. The majority of the youth groups pursued some level of stewardship action following the activity. The most common actions included collecting additional years of berry data, communicating results to a broader community, and joining other community and citizen science projects. A few groups actually pursued solutions illustrated in the scenario stories. The pairing of community and citizen science with scenario stories development provides a promising method to connect data to action for a sustainable and resilient future.
  5. Robots acting in human-scale environments must plan under uncertainty in large state–action spaces and face constantly changing reward functions as requirements and goals change. Planning under uncertainty in large state–action spaces requires hierarchical abstraction for efficient computation. We introduce a new hierarchical planning framework called Abstract Markov Decision Processes (AMDPs) that can plan in a fraction of the time needed for complex decision making in ordinary MDPs. AMDPs provide abstract states, actions, and transition dynamics in multiple layers above a base-level “flat” MDP. AMDPs decompose problems into a series of subtasks with both local reward and local transition functions used to create policies for subtasks. The resulting hierarchical planning method is independently optimal at each level of abstraction, and is recursively optimal when the local reward and transition functions are correct. We present empirical results showing significantly improved planning speed, while maintaining solution quality, in the Taxi domain and in a mobile-manipulation robotics problem. Furthermore, our approach allows specification of a decision-making model for a mobile-manipulation problem on a Turtlebot, spanning from low-level control actions operating on continuous variables all the way up through high-level object manipulation tasks.
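The two-level decomposition described above (an abstract plan whose actions are each refined into low-level subtask plans) can be sketched as follows. This is a hypothetical illustration of the AMDP idea, not the authors' framework: the BFS stands in for solving the abstract MDP, and all names and the `ground_for` interface are assumptions.

```python
from collections import deque

def plan_hierarchically(abstract_mdp, ground_for, start):
    """Two-level AMDP-style planning sketch.

    abstract_mdp: dict mapping an abstract state to a list of
        (abstract_action, next_abstract_state) edges toward "goal".
    ground_for(abstract_action): low-level action sequence that
        implements the subtask locally.
    Each level is planned independently: the abstract plan is found
    first, then each subtask is refined in isolation, so planning cost
    stays local to each layer.
    """
    # Plan at the abstract level (BFS as a stand-in for solving the
    # abstract MDP; local transition models assumed correct).
    frontier, parent = deque([start]), {start: None}
    while frontier:
        s = frontier.popleft()
        if s == "goal":
            break
        for a, s2 in abstract_mdp.get(s, []):
            if s2 not in parent:
                parent[s2] = (s, a)
                frontier.append(s2)
    # Recover the abstract plan by walking back from the goal.
    plan, s = [], "goal"
    while parent.get(s):
        s, a = parent[s]
        plan.append(a)
    plan.reverse()
    # Refine each abstract action into its low-level action sequence.
    return [step for a in plan for step in ground_for(a)]
```

In a Taxi-like domain the abstract plan might be "drive to pickup, carry passenger, drop off," with each abstract action expanded into continuous control actions by its own local planner.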