Title: A hybrid search agent in Pommerman
In this paper, we explore the viability of search-based agents in games with resource-intensive forward models. We implemented a player agent in the Pommerman framework and pitted it against the baseline agent to measure its performance. The agent is heuristic-based, and we improved it by enabling depth-limited tree search at specific gameplay moments. We also compared different node selection methods for the depth-limited tree search. Our results show that depth-limited tree search remains viable even with an inefficient forward model, and that an exploitation-driven selection method is the most efficient in this specific domain.
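To make the idea of depth-limited tree search under a costly forward model concrete, here is a minimal Python sketch. It is not the authors' implementation and does not use the Pommerman API. The function `depth_limited_search`, its parameters (`actions`, `step`, `evaluate`, `max_depth`, `budget`) and the toy one-dimensional domain are all hypothetical names chosen for illustration. The sketch assumes one possible reading of "exploitation-driven selection": always expand the frontier node with the highest heuristic value, while capping the number of forward-model calls.

```python
import heapq
from itertools import count


def depth_limited_search(state, actions, step, evaluate, max_depth=3, budget=50):
    """Return the root action whose subtree reached the best evaluated state.

    Assumed selection rule (illustrative): always expand the frontier node
    with the highest heuristic value. `budget` caps forward-model calls,
    standing in for an expensive simulator.
    """
    tie = count()  # tie-breaker so heapq never has to compare states
    frontier = [(-evaluate(state), next(tie), state, 0, None)]
    best_value, best_action = float("-inf"), None
    calls = 0
    while frontier and calls < budget:
        _, _, node, depth, first = heapq.heappop(frontier)
        if depth == max_depth:
            continue  # depth limit reached: do not expand further
        for a in actions(node):
            if calls >= budget:
                break
            child = step(node, a)  # one (expensive) forward-model call
            calls += 1
            value = evaluate(child)
            root_action = a if first is None else first
            if value > best_value:
                best_value, best_action = value, root_action
            heapq.heappush(
                frontier, (-value, next(tie), child, depth + 1, root_action)
            )
    return best_action


# Hypothetical toy domain: walk along a line toward GOAL.
GOAL = 5


def toy_actions(pos):
    return (-1, 1)


def toy_step(pos, a):
    return pos + a


def toy_evaluate(pos):
    return -abs(GOAL - pos)


chosen = depth_limited_search(0, toy_actions, toy_step, toy_evaluate)
```

In this toy setting the search starting at position 0 selects the move toward the goal. Lowering `budget` shows how the decision degrades when the forward model can only be called a few times.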
Award ID(s):
1717324
NSF-PAR ID:
10132605
Date Published:
Journal Name:
FDG '18: Proceedings of the 13th International Conference on the Foundations of Digital Games
Page Range / eLocation ID:
1 to 4
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract. We detail a new prediction-oriented procedure aimed at volcanic hazard assessment based on geophysical mass flow models constrained with heterogeneous and poorly defined data. Our method relies on an itemized application of the empirical falsification principle over an arbitrarily wide envelope of possible input conditions. We thus provide a first step towards an objective and partially automated experimental design construction. In particular, instead of fully calibrating model inputs on past observations, we create and explore more general requirements of consistency, and then we separately use each piece of empirical data to remove those input values that are not compatible with it. Hence, partial solutions are defined to the inverse problem. This has several advantages compared to a traditionally posed inverse problem: (i) the potentially nonempty inverse images of partial solutions of multiple possible forward models characterize the solutions to the inverse problem; (ii) the partial solutions can provide hazard estimates under weaker constraints, potentially including extreme cases that are important for hazard analysis; (iii) if multiple models are applicable, specific performance scores against each piece of empirical information can be calculated. We apply our procedure to the case study of the Atenquique volcaniclastic debris flow, which occurred on the flanks of Nevado de Colima volcano (Mexico) in 1955. We adopt and compare three depth-averaged models currently implemented in the TITAN2D solver, available from https://vhub.org (Version 4.0.0 – last access: 23 June 2016). The associated inverse problem is not well-posed if approached in a traditional way. We show that our procedure can extract valuable information for hazard assessment, allowing the exploration of the impact of synthetic flows that are similar to those that occurred in the past but different in plausible ways. The implementation of multiple models is thus a crucial aspect of our approach, as they can allow the covering of other plausible flows. We also observe that model selection is inherently linked to the inversion problem.
  2. Abstract

    Supervised machine learning via artificial neural network (ANN) has gained significant popularity for many geomechanics applications that involve multi‐phase flow and poromechanics. For unsaturated poromechanics problems, the multi‐physics nature and the complexity of the hydraulic laws make it difficult to design the optimal setup, architecture, and hyper‐parameters of the deep neural networks. This paper presents a meta‐modeling approach that utilizes deep reinforcement learning (DRL) to automatically discover optimal neural network settings that maximize a pre‐defined performance metric for the machine learning constitutive laws. This meta‐modeling framework is cast as a Markov Decision Process (MDP) with well‐defined states (subsets of states representing the proposed neural network (NN) settings), actions, and rewards. Following the selection rules, the artificial intelligence (AI) agent, represented in DRL via NN, self‐learns from taking a sequence of actions and receiving feedback signals (rewards) within the selection environment. By utilizing the Monte Carlo Tree Search (MCTS) to update the policy/value networks, the AI agent replaces the human modeler to handle the otherwise time‐consuming trial‐and‐error process that leads to the optimized choices of setup from a high‐dimensional parametric space. This approach is applied to generate two key constitutive laws for the unsaturated poromechanics problems: (1) the path‐dependent retention curve with distinctive wetting and drying paths, and (2) the flow in the micropores, governed by an anisotropic permeability tensor. Numerical experiments have shown that the resultant ML‐generated material models can be integrated into a finite element (FE) solver to solve initial‐boundary‐value problems as replacements of the hand‐crafted constitutive laws.
  3. This paper proposes a new mixed-integer programming (MIP) formulation to optimize split rule selection in the decision tree induction process and develops an efficient search algorithm that is able to solve practical instances of the MIP model faster than commercial solvers. The formulation is novel in that it directly maximizes the Gini reduction, an effective split selection criterion that has never been modeled in a mathematical program because of its nonconvexity. The proposed approach differs from other optimal classification tree models in that it does not attempt to optimize the whole tree; therefore, the flexibility of the recursive partitioning scheme is retained, and the optimization model is more amenable. The approach is implemented in an open-source R package named bsnsing. Benchmarking experiments on 75 open data sets suggest that bsnsing trees are the most capable of discriminating new cases compared with trees trained by other decision tree codes including the rpart, C50, party, and tree packages in R. Compared with other optimal decision tree packages, including DL8.5, OSDT, GOSDT, and indirectly more, bsnsing stands out in its training speed, ease of use, and broader applicability without losing in prediction accuracy. History: Accepted by Ram Ramesh, Area Editor for Data Science & Machine Learning. Funding: This work was supported by the National Science Foundation Division of Civil, Mechanical and Manufacturing Innovation [Grant 1944068]. Supplemental Material: Data are available at https://doi.org/10.1287/ijoc.2022.1225 .
  4. Abstract

    We synthesize insights from current understanding of drought impacts at stand‐to‐biogeographic scales, including management options, and we identify challenges to be addressed with new research. Large stand‐level shifts underway in western forests already are showing the importance of interactions involving drought, insects, and fire. Diebacks, changes in composition and structure, and shifting range limits are widely observed. In the eastern US, the effects of increasing drought are becoming better understood at the level of individual trees, but this knowledge cannot yet be confidently translated to predictions of changing structure and diversity of forest stands. While eastern forests have not experienced the types of changes seen in western forests in recent decades, they too are vulnerable to drought and could experience significant changes with increased severity, frequency, or duration in drought. Throughout the continental United States, the combination of projected large climate‐induced shifts in suitable habitat from modeling studies and limited potential for the rapid migration of tree populations suggests that changing tree and forest biogeography could substantially lag habitat shifts already underway. Forest management practices can partially ameliorate drought impacts through reductions in stand density, selection of drought‐tolerant species and genotypes, artificial regeneration, and the development of multistructured stands. However, silvicultural treatments also could exacerbate drought impacts unless implemented with careful attention to site and stand characteristics. Gaps in our understanding should motivate new research on the effects of interactions involving climate and other species at the stand scale and how interactions and multiple responses are represented in models. This assessment indicates that, without a stronger empirical basis for drought impacts at the stand scale, more complex models may provide limited guidance.
  5. Sequential decision-making under uncertainty is present in many important problems. Two popular approaches for tackling such problems are reinforcement learning and online search (e.g., Monte Carlo tree search). While the former learns a policy by interacting with the environment (typically done before execution), the latter uses a generative model of the environment to sample promising action trajectories at decision time. Decision-making is particularly challenging in non-stationary environments, where the environment in which an agent operates can change over time. Both approaches have shortcomings in such settings: on the one hand, policies learned before execution become stale when the environment changes and relearning takes both time and computational effort. Online search, on the other hand, can return sub-optimal actions when there are limitations on allowed runtime. In this paper, we introduce Policy-Augmented Monte Carlo tree search (PA-MCTS), which combines action-value estimates from an out-of-date policy with an online search using an up-to-date model of the environment. We prove theoretical results showing conditions under which PA-MCTS selects the one-step optimal action and also bound the error accrued while following PA-MCTS as a policy. We compare and contrast our approach with AlphaZero, another hybrid planning approach, and Deep Q Learning on several OpenAI Gym environments. Through extensive experiments, we show that under non-stationary settings with limited time constraints, PA-MCTS outperforms these baselines.
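As a rough illustration of the last abstract's idea of combining action-value estimates from an out-of-date policy with estimates from an online search, the Python sketch below blends the two with a convex weight. The abstract does not give the combination rule, so the weighted sum, the parameter name `alpha`, and the function `combine_action_values` are assumptions made for illustration, not the published algorithm. The search returns are passed in as plain numbers rather than produced by an actual tree search.

```python
def combine_action_values(q_stale, search_returns, alpha=0.5):
    """Blend stale action values with online-search estimates.

    Assumption (not taken from the abstract): a convex combination
    alpha * q_stale[a] + (1 - alpha) * search_returns[a], followed by
    picking the action with the highest blended score.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if q_stale.keys() != search_returns.keys():
        raise ValueError("both estimates must cover the same actions")
    scores = {
        a: alpha * q_stale[a] + (1.0 - alpha) * search_returns[a]
        for a in q_stale
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical numbers: the stale policy prefers "left", while a search
# on the up-to-date model prefers "right".
q_stale = {"left": 1.0, "right": 0.0}
search_returns = {"left": 0.0, "right": 2.0}

action_balanced, scores_balanced = combine_action_values(q_stale, search_returns, alpha=0.5)
action_policy_only, _ = combine_action_values(q_stale, search_returns, alpha=1.0)
```

With `alpha=1.0` the choice follows the stale policy alone, and with `alpha=0.0` it follows the search alone. Intermediate values trade off trust in the learned policy against trust in the limited-budget search.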