skip to main content


Title: Cooperative Inverse Decision Theory for Uncertain Preferences
Inverse decision theory (IDT) aims to learn a performance metric for classification by eliciting expert classifications on examples. However, elicitation in practical settings may require many classifications of potentially ambiguous examples. To improve the efficiency of elicitation, we propose the cooperative inverse decision theory (CIDT) framework as a formalization of the performance metric elicitation problem. In cooperative inverse decision theory, the expert and a machine play a game where both are rewarded according to the expert’s performance metric, but the machine does not initially know what this function is. We show that optimal policies in this framework produce active learning that leads to an exponential improvement in sample complexity over previous work. One of our key findings is that a broad class of sub-optimal experts can be represented as having uncertain preferences. We use this finding to show such experts naturally fit into our proposed framework extending inverse decision theory to efficiently deal with decision data that is sub-optimal due to noise, conflicting experts, or systematic error  more » « less
Award ID(s):
2205329 2046795
NSF-PAR ID:
10461017
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Proceedings of the International Workshop on Artificial Intelligence and Statistics
ISSN:
1525-531X
Page Range / eLocation ID:
5854-5868
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Inverse multiobjective optimization provides a general framework for the unsupervised learning task of inferring parameters of a multiobjective decision making problem (DMP), based on a set of observed decisions from the human expert. However, the performance of this framework relies critically on the availability of an accurate DMP, sufficient decisions of high quality, and a parameter space that contains enough information about the DMP. To hedge against the uncertainties in the hypothetical DMP, the data, and the parameter space, we investigate in this paper the distributionally robust approach for inverse multiobjective optimization. Specifically, we leverage the Wasserstein metric to construct a ball centered at the empirical distribution of these decisions. We then formulate a Wasserstein distributionally robust inverse multiobjective optimization problem (WRO-IMOP) that minimizes a worst-case expected loss function, where the worst case is taken over all distributions in the Wasserstein ball. We show that the excess risk of the WRO-IMOP estimator has a sub-linear convergence rate. Furthermore, we propose the semi-infinite reformulations of the WRO-IMOP and develop a cutting-plane algorithm that converges to an approximate solution in finite iterations. Finally, we demonstrate the effectiveness of our method on both a synthetic multiobjective quadratic program and a real world portfolio optimization problem. 
    more » « less
  2. null (Ed.)
    Inverse multiobjective optimization provides a general framework for the unsupervised learning task of inferring parameters of a multiobjective decision making problem (DMP), based on a set of observed decisions from the human expert. However, the performance of this framework relies critically on the availability of an accurate DMP, sufficient decisions of high quality, and a parameter space that contains enough information about the DMP. To hedge against the uncertainties in the hypothetical DMP, the data, and the parameter space, we investigate in this paper the distributionally robust approach for inverse multiobjective optimization. Specifically, we leverage the Wasserstein metric to construct a ball centered at the empirical distribution of these decisions. We then formulate a Wasserstein distributionally robust inverse multiobjective optimization problem (WRO-IMOP) that minimizes a worst-case expected loss function, where the worst case is taken over all distributions in the Wasserstein ball. We show that the excess risk of the WRO-IMOP estimator has a sub-linear convergence rate. Furthermore, we propose the semi-infinite reformulations of the WRO-IMOP and develop a cutting-plane algorithm that converges to an approximate solution in finite iterations. Finally, we demonstrate the effectiveness of our method on both a synthetic multiobjective quadratic program and a real world portfolio optimization problem. 
    more » « less
  3. Abstract Expert testimony varies in scientific quality and jurors have a difficult time evaluating evidence quality (McAuliff et al., 2009). In the current study, we apply Fuzzy Trace Theory principles, examining whether visual and gist aids help jurors calibrate to the strength of scientific evidence. Additionally we were interested in the role of jurors’ individual differences in scientific reasoning skills in their understanding of case evidence. Contrary to our preregistered hypotheses, there was no effect of evidence condition or gist aid on evidence understanding. However, individual differences between jurors’ numeracy skills predicted evidence understanding. Summary Poor-quality expert evidence is sometimes admitted into court (Smithburn, 2004). Jurors’ calibration to evidence strength varies widely and is not robustly understood. For instance, previous research has established jurors lack understanding of the role of control groups, confounds, and sample sizes in scientific research (McAuliff, Kovera, & Nunez, 2009; Mill, Gray, & Mandel, 1994). Still others have found that jurors can distinguish weak from strong evidence when the evidence is presented alone, yet not when simultaneously presented with case details (Smith, Bull, & Holliday, 2011). This research highlights the need to present evidence to jurors in a way they can understand. Fuzzy Trace Theory purports that people encode information in exact, verbatim representations and through “gist” representations, which represent summary of meaning (Reyna & Brainerd, 1995). It is possible that the presenting complex scientific evidence to people with verbatim content or appealing to the gist, or bottom-line meaning of the information may influence juror understanding of that evidence. Application of Fuzzy Trace Theory in the medical field has shown that gist representations are beneficial for helping laypeople better understand risk and benefits of medical treatment (Brust-Renck, Reyna, Wilhelms, & Lazar, 2016). Yet, little research has applied Fuzzy Trace Theory to information comprehension and application within the context of a jury (c.f. Reyna et. al., 2015). Additionally, it is likely that jurors’ individual characteristics, such as scientific reasoning abilities and cognitive tendencies, influence their ability to understand and apply complex scientific information (Coutinho, 2006). Methods The purpose of this study was to examine how jurors calibrate to the strength of scientific information, and whether individual difference variables and gist aids inspired by Fuzzy Trace Theory help jurors better understand complicated science of differing quality. We used a 2 (quality of scientific evidence: high vs. low) x 2 (decision aid to improve calibration - gist information vs. no gist information), between-subjects design. All hypotheses were preregistered on the Open Science Framework. Jury-eligible community participants (430 jurors across 90 juries; Mage = 37.58, SD = 16.17, 58% female, 56.93% White). Each jury was randomly assigned to one of the four possible conditions. Participants were asked to individually fill out measures related to their scientific reasoning skills prior to watching a mock jury trial. The trial was about an armed bank robbery and consisted of various pieces of testimony and evidence (e.g. an eyewitness testimony, police lineup identification, and a sweatshirt found with the stolen bank money). The key piece of evidence was mitochondrial DNA (mtDNA) evidence collected from hair on a sweatshirt (materials from Hans et al., 2011). Two experts presented opposing opinions about the scientific evidence related to the mtDNA match estimate for the defendant’s identification. The quality and content of this mtDNA evidence differed based on the two conditions. The high quality evidence condition used a larger database than the low quality evidence to compare to the mtDNA sample and could exclude a larger percentage of people. In the decision aid condition, experts in the gist information group presented gist aid inspired visuals and examples to help explain the proportion of people that could not be excluded as a match. Those in the no gist information group were not given any aid to help them understand the mtDNA evidence presented. After viewing the trial, participants filled out a questionnaire on how well they understood the mtDNA evidence and their overall judgments of the case (e.g. verdict, witness credibility, scientific evidence strength). They filled this questionnaire out again after a 45-minute deliberation. Measures We measured Attitudes Toward Science (ATS) with indices of scientific promise and scientific reservations (Hans et al., 2011; originally developed by National Science Board, 2004; 2006). We used Drummond and Fischhoff’s (2015) Scientific Reasoning Scale (SRS) to measure scientific reasoning skills. Weller et al.’s (2012) Numeracy Scale (WNS) measured proficiency in reasoning with quantitative information. The NFC-Short Form (Cacioppo et al., 1984) measured need for cognition. We developed a 20-item multiple-choice comprehension test for the mtDNA scientific information in the cases (modeled on Hans et al., 2011, and McAuliff et al., 2009). Participants were shown 20 statements related to DNA evidence and asked whether these statements were True or False. The test was then scored out of 20 points. Results For this project, we measured calibration to the scientific evidence in a few different ways. We are building a full model with these various operationalizations to be presented at APLS, but focus only on one of the calibration DVs (i.e., objective understanding of the mtDNA evidence) in the current proposal. We conducted a general linear model with total score on the mtDNA understanding measure as the DV and quality of scientific evidence condition, decision aid condition, and the four individual difference measures (i.e., NFC, ATS, WNS, and SRS) as predictors. Contrary to our main hypotheses, neither evidence quality nor decision aid condition affected juror understanding. However, the individual difference variables did: we found significant main effects for Scientific Reasoning Skills, F(1, 427) = 16.03, p <.001, np2 = .04, Weller Numeracy Scale, F(1, 427) = 15.19, p <.001, np2 = .03, and Need for Cognition, F(1, 427) = 16.80, p <.001, np2 = .04, such that those who scored higher on these measures displayed better understanding of the scientific evidence. In addition there was a significant interaction of evidence quality condition and scores on the Weller’s Numeracy Scale, F(1, 427) = 4.10, p = .04, np2 = .01. Further results will be discussed. Discussion These data suggest jurors are not sensitive to differences in the quality of scientific mtDNA evidence, and also that our attempt at helping sensitize them with Fuzzy Trace Theory-inspired aids did not improve calibration. Individual scientific reasoning abilities and general cognition styles were better predictors of understanding this scientific information. These results suggest a need for further exploration of approaches to help jurors differentiate between high and low quality evidence. Note: The 3rd author was supported by an AP-LS AP Award for her role in this research. Learning Objective: Participants will be able to describe how individual differences in scientific reasoning skills help jurors understand complex scientific evidence. 
    more » « less
  4. A prerequisite for social coordination is bidirectional communication between teammates, each playing two roles simultaneously: as receptive listeners and expressive speakers. For robots working with humans in complex situations with multiple goals that differ in importance, failure to fulfill the expectation of either role could undermine group performance due to misalignment of values between humans and robots. Specifically, a robot needs to serve as an effective listener to infer human users’ intents from instructions and feedback and as an expressive speaker to explain its decision processes to users. Here, we investigate how to foster effective bidirectional human-robot communications in the context of value alignment—collaborative robots and users form an aligned understanding of the importance of possible task goals. We propose an explainable artificial intelligence (XAI) system in which a group of robots predicts users’ values by taking in situ feedback into consideration while communicating their decision processes to users through explanations. To learn from human feedback, our XAI system integrates a cooperative communication model for inferring human values associated with multiple desirable goals. To be interpretable to humans, the system simulates human mental dynamics and predicts optimal explanations using graphical models. We conducted psychological experiments to examine the core components of the proposed computational framework. Our results show that real-time human-robot mutual understanding in complex cooperative tasks is achievable with a learning model based on bidirectional communication. We believe that this interaction framework can shed light on bidirectional value alignment in communicative XAI systems and, more broadly, in future human-machine teaming systems. 
    more » « less
  5. Bayesian inference allows the transparent communication and systematic updating of model uncertainty as new data become available. When applied to material flow analysis (MFA), however, Bayesian inference is undermined by the difficulty of defining proper priors for the MFA parameters and quantifying the noise in the collected data. We start to address these issues by first deriving and implementing an expert elicitation procedure suitable for generating MFA parameter priors. Second, we propose to learn the data noise concurrent with the parametric uncertainty. These methods are demonstrated using a case study on the 2012 US steel flow. Eight experts are interviewed to elicit distributions on steel flow uncertainty from raw materials to intermediate goods. The experts’ distributions are combined and weighted according to the expertise demonstrated in response to seeding questions. These aggregated distributions form our model parameters’ informative priors. Sensible, weakly informative priors are adopted for learning the data noise. Bayesian inference is then performed to update the parametric and data noise uncertainty given MFA data collected from the United States Geological Survey and the World Steel Association. The results show a reduction in MFA parametric uncertainty when incorporating the collected data. Only a modest reduction in data noise uncertainty was observed using 2012 data; however, greater reductions were achieved when using data from multiple years in the inference. These methods generate transparent MFA and data noise uncertainties learned from data rather than pre-assumed data noise levels, providing a more robust basis for decision-making that affects the system. 
    more » « less