Title: Evaluating Heuristics in Engineering Design: A Reinforcement Learning Approach
Heuristics are essential for addressing the complexities of engineering design processes. The goodness of heuristics is context-dependent: appropriately tailored heuristics can enable designers to find good solutions efficiently, while inappropriate heuristics can result in cognitive biases and inferior design outcomes. While there have been several efforts to understand which heuristics designers use, there is a lack of normative understanding about when different heuristics are suitable. Toward addressing this gap, this paper presents a reinforcement learning-based approach to evaluate the goodness of heuristics for three sub-problems commonly faced by designers carrying out design under resource constraints: (i) learning the mapping between the design space and the performance space, (ii) acquiring information sequentially, and (iii) deciding when to stop information acquisition. Using a multi-armed bandit formulation and simulation studies, we learn the heuristics that are suitable for these sub-problems under different resource constraints and problem complexities. The results of our simulation study indicate that the proposed reinforcement learning-based approach can be effective for determining the quality of heuristics for different sub-problems, and for characterizing how the effectiveness of the heuristics changes as a function of the designer's preference (e.g., performance versus cost), the complexity of the problem, and the resources available.
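The abstract names a multi-armed bandit formulation but does not spell out the algorithmic details, so the following is only a minimal sketch of how such an evaluation could look: each arm is a candidate heuristic, a pull runs a simulated design episode, and the reward trades off achieved performance against acquisition cost. The heuristic names, the reward model, and the UCB1 selection rule are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical candidate heuristics for one sub-problem (e.g., where to sample
# next in the design space); names and reward model are illustrative only.
HEURISTICS = ["sample_near_best", "sample_uniformly", "sample_max_uncertainty"]

def run_design_episode(heuristic, cost_weight=0.3):
    """Stand-in for a design simulation: returns a scalar reward that trades off
    achieved performance against the cost of the information acquired."""
    performance = {"sample_near_best": 0.7,
                   "sample_uniformly": 0.5,
                   "sample_max_uncertainty": 0.8}[heuristic]
    cost = {"sample_near_best": 0.2,
            "sample_uniformly": 0.1,
            "sample_max_uncertainty": 0.4}[heuristic]
    noise = rng.normal(0, 0.05)
    return performance - cost_weight * cost + noise

# UCB1 bandit: each arm is a heuristic; each pull is a simulated design episode.
counts = np.zeros(len(HEURISTICS))
values = np.zeros(len(HEURISTICS))

for t in range(1, 501):
    if 0 in counts:
        arm = int(np.argmin(counts))            # try each heuristic at least once
    else:
        ucb = values + np.sqrt(2 * np.log(t) / counts)
        arm = int(np.argmax(ucb))               # optimism in the face of uncertainty
    reward = run_design_episode(HEURISTICS[arm])
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]   # running mean reward

best = HEURISTICS[int(np.argmax(values))]
print("Estimated best heuristic:", best, "mean reward:", values.max().round(3))
```

Changing the hypothetical `cost_weight` parameter plays the role of the designer's preference between performance and cost, which is one of the factors the study varies.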
Award ID(s): 1728165
PAR ID: 10282918
Journal Name: ASME IDETC
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Heuristics are essential for addressing the complexities of engineering design processes. The goodness of heuristics is context-dependent: appropriately tailored heuristics can enable designers to find good solutions efficiently, while inappropriate heuristics can result in cognitive biases and inferior design outcomes. While there have been several efforts to understand which heuristics designers use, there is a lack of normative understanding about when different heuristics are suitable. Toward addressing this gap, this paper presents a reinforcement learning-based approach to evaluate the goodness of heuristics for three sub-problems commonly faced by designers: (1) learning the mapping between the design space and the performance space, (2) acquiring information sequentially, and (3) deciding when to stop the information acquisition process. Using a multi-armed bandit formulation and simulation studies, we learn the suitable heuristics for these individual sub-problems under different resource constraints and problem complexities. Additionally, we learn the optimal heuristics for the combined problem (i.e., the one comprising all three sub-problems) and compare them to the ones learned at the sub-problem level. The results of our simulation study indicate that the proposed reinforcement learning-based approach can be effective for determining the quality of heuristics for different problems, and for characterizing how the effectiveness of the heuristics changes as a function of the designer's preference (e.g., performance versus cost), the complexity of the problem, and the resources available.
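As a companion illustration of how the sequential-acquisition and stopping sub-problems could be composed in a simulation, the sketch below uses a "where to sample" heuristic that proposes the next design point near the current best and a "when to stop" heuristic that ends the search once improvement falls below a threshold. The test function, both heuristics, and the threshold are invented for the example and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative 1-D design problem: the true performance function is unknown to
# the designer, and every query (evaluation) consumes budget.
def true_performance(x):
    return np.sin(3 * x) * (1 - x) + 0.5 * x

def acquire(budget, stop_threshold=0.01):
    """Sequential information acquisition with two illustrative heuristics:
    WHERE to sample (near the current best, with a shrinking step) and
    WHEN to stop (when the last improvement falls below a threshold)."""
    xs = [rng.uniform(0, 1)]
    ys = [true_performance(xs[0])]
    cost = 1
    while cost < budget:
        step = 0.5 / cost                        # shrink the search radius over time
        x_new = np.clip(xs[int(np.argmax(ys))] + rng.normal(0, step), 0, 1)
        y_new = true_performance(x_new)
        improvement = y_new - max(ys)
        xs.append(x_new); ys.append(y_new); cost += 1
        if improvement < stop_threshold and cost > 3:   # stopping heuristic
            break
    return max(ys), cost

best, used = acquire(budget=20)
print(f"best performance {best:.3f} using {used} evaluations")
```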
  2. Design heuristics are traditionally used as qualitative principles to guide the design process, but they have also been used to improve the efficiency of design optimization. Using design heuristics as soft constraints or search operators has been shown, for some problems, to reduce the number of function evaluations needed to achieve a certain level of convergence. In other cases, however, enforcing heuristics can reduce diversity and slow down convergence. This paper studies when and how a given set of design heuristics, represented in different forms (soft constraints, repair operators, and biased sampling), can be utilized in an automated way to improve efficiency for a given design problem. An approach is presented for identifying promising heuristics for a given problem by estimating the overall impact of each heuristic based on an exploratory screening study. Two impact indices are formulated: a weighted influence index and a hypervolume difference index. Using this approach, the promising heuristics for four design problems are identified, and selectively enforcing only these promising heuristics is benchmarked against enforcing all available heuristics and against enforcing none. In all problems, enforcing only the promising heuristics as repair operators enables finding good designs faster than either enforcing all available heuristics or enforcing none. Enforcing heuristics as soft constraints or biased sampling functions improves efficiency for some of the problems. Based on these results, guidelines are presented for designers to leverage heuristics effectively in design optimization.
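To make the "heuristic as repair operator" form concrete, here is a small hypothetical example: a component-selection problem with a power budget, where the repair operator drops the least value-dense component from any infeasible design before it is evaluated. The problem data, the heuristic, and the comparison loop are illustrative only, not the benchmark problems or operators from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative component-selection problem: maximize total science value
# subject to a power budget; all numbers are made up for the sketch.
value = np.array([4.0, 3.0, 6.0, 2.0, 5.0, 1.5])
power = np.array([3.0, 2.5, 5.0, 1.0, 4.0, 0.5])
BUDGET = 8.0

def objective(x):
    if power @ x > BUDGET:            # infeasible designs score zero
        return 0.0
    return float(value @ x)

def repair(x):
    """Heuristic as a repair operator: while over the power budget,
    drop the selected component with the worst value-per-watt."""
    x = x.copy()
    while power @ x > BUDGET:
        selected = np.flatnonzero(x)
        worst = selected[np.argmin(value[selected] / power[selected])]
        x[worst] = 0
    return x

def random_search(n_evals, use_repair):
    best = 0.0
    for _ in range(n_evals):
        x = rng.integers(0, 2, size=len(value))   # random candidate design
        if use_repair:
            x = repair(x)                          # enforce the heuristic
        best = max(best, objective(x))
    return best

print("without repair:", random_search(50, False))
print("with repair heuristic:", random_search(50, True))
```

The impact indices described in the abstract would then compare runs like these with and without the heuristic enforced, rather than the single pair of runs shown here.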
  3. Many resource management problems require sequential decision-making under uncertainty, where the only uncertainty affecting the decision outcomes comes from exogenous variables outside the control of the decision-maker. We model these problems as Exo-MDPs (Markov Decision Processes with Exogenous Inputs) and design a class of data-efficient algorithms for them termed Hindsight Learning (HL). Our HL algorithms achieve data efficiency by leveraging a key insight: given samples of the exogenous variables, past decisions can be revisited in hindsight to infer counterfactual consequences that can accelerate policy improvements. We compare HL against classic baselines on the multi-secretary and airline revenue management problems. We also scale our algorithms to a business-critical cloud resource management problem, allocating Virtual Machines (VMs) to physical machines, and simulate their performance with real datasets from a large public cloud provider. We find that HL algorithms outperform both domain-specific heuristics and state-of-the-art reinforcement learning methods.
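The key insight described above is that logged exogenous traces let past decisions be re-evaluated counterfactually. The sketch below illustrates that idea on a toy multi-secretary-style problem: once the exogenous candidate values are in hand, the hindsight-optimal actions are known exactly and can serve as supervision targets or as a benchmark for an online policy. This is a simplified illustration of the insight under made-up data, not the HL algorithms themselves.

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative Exo-MDP: a multi-secretary-style problem in which candidate
# values arrive exogenously and we may accept at most K of T candidates.
T, K = 20, 5
logged_values = rng.uniform(0, 1, size=T)   # exogenous trace from past data

def hindsight_actions(values, k):
    """With the full exogenous trace in hand, the best decision at each step
    is known: accept exactly the k highest-valued candidates."""
    keep = np.argsort(values)[-k:]
    actions = np.zeros(len(values), dtype=int)
    actions[keep] = 1
    return actions

def threshold_policy(values, k, threshold):
    """Simple online policy: accept while capacity remains and value > threshold."""
    total, remaining = 0.0, k
    for v in values:
        if remaining > 0 and v > threshold:
            total += v
            remaining -= 1
    return total

# Hindsight targets can supervise policy improvement; here we simply compare
# an online threshold policy against the hindsight-optimal reward.
optimal = logged_values[hindsight_actions(logged_values, K) == 1].sum()
online = threshold_policy(logged_values, K, threshold=0.6)
print(f"hindsight-optimal reward {optimal:.2f}, online policy reward {online:.2f}")
```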
  4. As cloud adoption surges across industries, the limitations of the default scheduler, particularly at large scale or for jobs outside of its initial design scope, have become increasingly prominent. While the default schedulers in various cloud platforms were primarily engineered for simple and predictable tasks, reinforcement learning (RL)-based schedulers are attracting attention because they can handle larger and more diverse cloud environments. Nevertheless, there are practical constraints to the use of RL: retraining is necessary to adapt to each new environment, and the exploration taken during training may lead to unexpected performance degradation at runtime. To address these issues, this paper presents Dejavu, which combines reinforcement learning with neural networks to learn and solve scheduling problems more effectively. To tackle the extended training time and the performance degradation caused by unexpected exploration, we apply pretraining using demonstrations from existing heuristics, which guides the RL agent to explore in a safe and efficient manner. Furthermore, we design a robust reward function to push Dejavu to compete with, and eventually outperform, the exploited heuristics and other baselines. The experimental results demonstrate the efficacy of Dejavu, showing remarkable improvements in key metrics: compared to the default scheduler, it boosts resource utilization by 6% and shortens scheduling time by 3% during the scheduling period.
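A rough sketch of the demonstration-pretraining idea described above, under assumptions not taken from the paper: a best-fit placement heuristic generates (state, action) demonstrations, and a softmax policy is pretrained by behavior cloning so that subsequent RL exploration starts from heuristic-like behavior. The task, the state features, and the training loop are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

# Illustrative placement task: assign each incoming job to one of N_MACHINES.
# State = remaining capacity of each machine plus the job size (all made up).
N_MACHINES = 3

def best_fit_heuristic(caps, job):
    """Demonstration policy: pick the tightest machine that still fits the job."""
    feasible = [i for i in range(N_MACHINES) if caps[i] >= job]
    if not feasible:
        return int(np.argmax(caps))
    return min(feasible, key=lambda i: caps[i] - job)

# 1) Collect (state, action) demonstrations from the heuristic.
states, actions = [], []
for _ in range(2000):
    caps = rng.uniform(0, 1, N_MACHINES)
    job = rng.uniform(0, 0.5)
    states.append(np.append(caps, job))
    actions.append(best_fit_heuristic(caps, job))
X, y = np.array(states), np.array(actions)

# 2) Behavior-cloning pretraining: fit a softmax policy to the demonstrations.
W = np.zeros((X.shape[1], N_MACHINES))
for _ in range(300):
    logits = X @ W
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    grad = X.T @ (p - np.eye(N_MACHINES)[y]) / len(y)
    W -= 1.0 * grad                     # plain gradient descent on cross-entropy

acc = (np.argmax(X @ W, axis=1) == y).mean()
print(f"pretrained policy matches the heuristic on {acc:.0%} of demonstrations")
# 3) In the full approach, this pretrained policy would be fine-tuned with RL,
#    so exploration starts from safe, heuristic-like behavior.
```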
  5. Designers make information acquisition decisions, such as where to search and when to stop the search. Such decisions are typically made sequentially, so that at every search step designers gain information by learning about the design space. However, when designers begin acquiring information, their decisions are primarily based on their prior knowledge. Prior knowledge influences the initial set of assumptions that designers use to learn about the design space; these assumptions are collectively termed inductive biases. Identifying such biases can help us better understand how designers use their prior knowledge to solve problems in the light of uncertainty. Thus, in this study, we identify inductive biases of humans in sequential information acquisition tasks. To do so, we analyze experimental data from a set of behavioral experiments conducted in the past [1–5]. All of these experiments were designed to study various factors that influence sequential information acquisition behaviors. Across these studies, we identify similar decision-making behaviors in the participants' very first decision to "choose x". We find that their choices of "x" are not uniformly distributed in the design space. Since such experiments are abstractions of real design scenarios, this implies that further contextualization of such experiments would only increase the influence of these biases. Thus, we highlight the need to study the influence of such biases to better understand designer behaviors. We conclude that, in the context of Bayesian modeling of designers' behaviors, utilizing the identified inductive biases would enable us to better model designers' priors in design search contexts, compared to using non-informative priors.
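One way to read the last conclusion is that an informative prior fitted to observed first choices should predict designer behavior better than a non-informative one. The sketch below makes that comparison on simulated first-choice data with an assumed Beta prior; the data, the prior family, and its parameters are placeholders, not values from the study.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# Hypothetical first choices of x in a [0, 1] design space; in the study these
# would come from the behavioral experiments, but here they are simulated so
# the example runs (participants assumed to start near the middle of the range).
first_choices = np.clip(rng.normal(0.5, 0.15, size=40), 0.01, 0.99)

# Two candidate priors over the first choice:
noninformative = stats.uniform(0, 1)          # "know nothing" prior
informative = stats.beta(4, 4)                # inductive bias toward the center

# Compare how well each prior explains the observed first choices (log-likelihood).
ll_flat = noninformative.logpdf(first_choices).sum()
ll_bias = informative.logpdf(first_choices).sum()
print(f"log-likelihood  uniform prior: {ll_flat:.1f}   beta(4,4) prior: {ll_bias:.1f}")
```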