Title: Current Challenges When Using Numbers in Patient Decision Aids: Advanced Concepts
Background Decision aid developers have to convey complex task-specific numeric information in a way that minimizes bias and promotes understanding of the options available within a particular decision. Whereas our companion paper summarizes fundamental issues, this article focuses on more complex, task-specific aspects of presenting numeric information in patient decision aids. Methods As part of the International Patient Decision Aids Standards third evidence update, we convened a panel of 9 international experts who revised and expanded the topics covered in the 2013 review, working in groups of 2 to 3 to update the evidence based on their expertise and targeted searches of the literature. The full panel then reviewed and provided additional revisions, reaching consensus on the final version. Results Five of the 10 topics addressed more complex task-specific issues. We found strong evidence for using independent event rates and/or incremental absolute risk differences for the effect size of test and screening outcomes. Simple visual formats can help to reduce common judgment biases and enhance comprehension but can be misleading if not well designed. Graph literacy can moderate the effectiveness of visual formats and hence should be considered in tool design. There is less evidence supporting the inclusion of personalized and interactive risk estimates. Discussion More complex numeric information, such as the size of the benefits and harms for decision options, can be better understood by using incremental absolute risk differences alongside well-designed visual formats that consider the graph literacy of the intended audience. More research is needed into when and how to use personalized and/or interactive risk estimates because their complexity and accessibility may affect their feasibility in clinical practice.
Award ID(s):
2017651
NSF-PAR ID:
10296673
Journal Name:
Medical Decision Making
Volume:
41
Issue:
7
ISSN:
0272-989X
Page Range / eLocation ID:
834 to 847
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    Background Shared decision making requires evidence to be conveyed to the patient in a way they can easily understand and compare. Patient decision aids facilitate this process. This article reviews the current evidence for how to present numerical probabilities within patient decision aids. Methods Following the 2013 review method, we assembled a group of 9 international experts on risk communication across Australia, Germany, the Netherlands, the United Kingdom, and the United States. We expanded the topics covered in the first review to reflect emerging areas of research. Groups of 2 to 3 authors reviewed the relevant literature based on their expertise and wrote each section before review by the full authorship team. Results Of 10 topics identified, we present 5 fundamental issues in this article. Although some topics resulted in clear guidance (presenting the chance an event will occur, addressing numerical skills), other topics (context/evaluative labels, conveying uncertainty, risk over time) continue to have evolving knowledge bases. We recommend presenting numbers over a set time period with a clear denominator, using consistent formats between outcomes and interventions to enable unbiased comparisons, and interpreting the numbers for the reader to meet the needs of varying numeracy. Discussion Understanding how different numerical formats can bias risk perception will help decision aid developers communicate risks in a balanced, comprehensible manner and avoid accidental “nudging” toward a particular option. Decisions between probability formats need to consider the available evidence and user skills. The review may be useful for other areas of science communication in which unbiased presentation of probabilities is important. 
  2. While large language models (LLMs) have demonstrated impressive capabilities across tasks in language understanding and interactive decision making, their abilities for reasoning (e.g. chain-of-thought prompting) and acting (e.g. action plan generation) have primarily been studied as separate topics. In this paper, we explore the use of LLMs to generate both reasoning traces and task-specific actions in an interleaved manner, allowing for greater synergy between the two: reasoning traces help the model induce, track, and update action plans as well as handle exceptions, while actions allow it to interface with external sources, such as knowledge bases or environments, to gather additional information. We apply our approach, named ReAct, to a diverse set of language and decision making tasks and demonstrate its effectiveness over state-of-the-art baselines, as well as improved human interpretability and trustworthiness over methods without reasoning or acting components. Concretely, on question answering (HotpotQA) and fact verification (Fever), ReAct overcomes issues of hallucination and error propagation prevalent in chain-of-thought reasoning by interacting with a simple Wikipedia API, and generates human-like task-solving trajectories that are more interpretable than baselines without reasoning traces. On two interactive decision making benchmarks (ALFWorld and WebShop), ReAct outperforms imitation and reinforcement learning methods by an absolute success rate of 34% and 10% respectively, while being prompted with only one or two in-context examples. 
  3. Theory—understanding mental processes that drive decisions—is important to help patients and providers make decisions that reflect medical advances and personal values. Building on a 2008 review, we summarize current tenets of fuzzy-trace theory (FTT) in light of new evidence that provides insight regarding mental representations of options and how such representations connect to values and evoke emotions. We discuss implications for communicating risks, preventing risky behaviors, discouraging misinformation, and choosing appropriate treatments. Findings suggest that simple, fuzzy but meaningful gist representations of information often determine decisions. Within minutes of conversing with their doctor, reading a health-related web post, or processing other health information, patients rely on gist memories of that information rather than verbatim details. This fuzzy-processing preference explains puzzles and paradoxes in how patients (and sometimes providers) think about probabilities (e.g., “50-50” chance), outcomes of treatment (e.g., with antibiotics), experiences of pain, end-of-life decisions, memories for medication instructions, symptoms of concussion, and transmission of viruses (e.g., in AIDS and COVID-19). As examples, participation in clinical trials or seeking treatments with low probabilities of success (e.g., with antibiotics or at the end of life) may indicate a defensibly different categorical gist perspective on risk as opposed to simply misunderstanding probabilities or failing to make prescribed tradeoffs. Thus, FTT explains why people avoid precise tradeoffs despite computing them. Facilitating gist representations of information offers an alternative approach that goes beyond providing uninterpreted “neutral” facts versus persuading or shifting the balance between fast versus slow thinking (or emotion vs. cognition). 
In contrast to either taking mental shortcuts or deliberating about details, gist processing facilitates application of advanced knowledge and deeply held values to choices. Highlights Fuzzy-trace theory (FTT) supports practical approaches to improving health and medicine. FTT differs in important respects from other theories of decision making, which has implications for how to help patients, providers, and health communicators. Gist mental representations emphasize categorical distinctions, reflect understanding in context, and help cue values relevant to health and patient care. Understanding the science behind theory is crucial for evidence-based medicine. 
  4. Abstract Expert testimony varies in scientific quality, and jurors have a difficult time evaluating evidence quality (McAuliff et al., 2009). In the current study, we apply Fuzzy Trace Theory principles, examining whether visual and gist aids help jurors calibrate to the strength of scientific evidence. Additionally, we were interested in the role of jurors’ individual differences in scientific reasoning skills in their understanding of case evidence. Contrary to our preregistered hypotheses, there was no effect of evidence condition or gist aid on evidence understanding. However, individual differences between jurors’ numeracy skills predicted evidence understanding. Summary Poor-quality expert evidence is sometimes admitted into court (Smithburn, 2004). Jurors’ calibration to evidence strength varies widely and is not robustly understood. For instance, previous research has established that jurors lack understanding of the role of control groups, confounds, and sample sizes in scientific research (McAuliff, Kovera, & Nunez, 2009; Mill, Gray, & Mandel, 1994). Still others have found that jurors can distinguish weak from strong evidence when the evidence is presented alone, yet not when simultaneously presented with case details (Smith, Bull, & Holliday, 2011). This research highlights the need to present evidence to jurors in a way they can understand. Fuzzy Trace Theory posits that people encode information in exact, verbatim representations and through “gist” representations, which represent a summary of meaning (Reyna & Brainerd, 1995). It is possible that presenting complex scientific evidence to people with verbatim content, or appealing to the gist (the bottom-line meaning) of the information, may influence juror understanding of that evidence. 
Application of Fuzzy Trace Theory in the medical field has shown that gist representations are beneficial for helping laypeople better understand the risks and benefits of medical treatment (Brust-Renck, Reyna, Wilhelms, & Lazar, 2016). Yet, little research has applied Fuzzy Trace Theory to information comprehension and application within the context of a jury (cf. Reyna et al., 2015). Additionally, it is likely that jurors’ individual characteristics, such as scientific reasoning abilities and cognitive tendencies, influence their ability to understand and apply complex scientific information (Coutinho, 2006). Methods The purpose of this study was to examine how jurors calibrate to the strength of scientific information, and whether individual difference variables and gist aids inspired by Fuzzy Trace Theory help jurors better understand complicated science of differing quality. We used a 2 (quality of scientific evidence: high vs. low) × 2 (decision aid to improve calibration: gist information vs. no gist information) between-subjects design. All hypotheses were preregistered on the Open Science Framework. Participants were jury-eligible community members (430 jurors across 90 juries; Mage = 37.58, SD = 16.17, 58% female, 56.93% White). Each jury was randomly assigned to one of the four possible conditions. Participants were asked to individually fill out measures related to their scientific reasoning skills prior to watching a mock jury trial. The trial was about an armed bank robbery and consisted of various pieces of testimony and evidence (e.g., an eyewitness testimony, police lineup identification, and a sweatshirt found with the stolen bank money). The key piece of evidence was mitochondrial DNA (mtDNA) evidence collected from hair on a sweatshirt (materials from Hans et al., 2011). Two experts presented opposing opinions about the scientific evidence related to the mtDNA match estimate for the defendant’s identification. 
The quality and content of this mtDNA evidence differed based on the two conditions. The high quality evidence condition used a larger database than the low quality evidence to compare to the mtDNA sample and could exclude a larger percentage of people. In the decision aid condition, experts in the gist information group presented gist aid inspired visuals and examples to help explain the proportion of people that could not be excluded as a match. Those in the no gist information group were not given any aid to help them understand the mtDNA evidence presented. After viewing the trial, participants filled out a questionnaire on how well they understood the mtDNA evidence and their overall judgments of the case (e.g. verdict, witness credibility, scientific evidence strength). They filled this questionnaire out again after a 45-minute deliberation. Measures We measured Attitudes Toward Science (ATS) with indices of scientific promise and scientific reservations (Hans et al., 2011; originally developed by National Science Board, 2004; 2006). We used Drummond and Fischhoff’s (2015) Scientific Reasoning Scale (SRS) to measure scientific reasoning skills. Weller et al.’s (2012) Numeracy Scale (WNS) measured proficiency in reasoning with quantitative information. The NFC-Short Form (Cacioppo et al., 1984) measured need for cognition. We developed a 20-item multiple-choice comprehension test for the mtDNA scientific information in the cases (modeled on Hans et al., 2011, and McAuliff et al., 2009). Participants were shown 20 statements related to DNA evidence and asked whether these statements were True or False. The test was then scored out of 20 points. Results For this project, we measured calibration to the scientific evidence in a few different ways. We are building a full model with these various operationalizations to be presented at APLS, but focus only on one of the calibration DVs (i.e., objective understanding of the mtDNA evidence) in the current proposal. 
We conducted a general linear model with total score on the mtDNA understanding measure as the DV and quality of scientific evidence condition, decision aid condition, and the four individual difference measures (i.e., NFC, ATS, WNS, and SRS) as predictors. Contrary to our main hypotheses, neither evidence quality nor decision aid condition affected juror understanding. However, the individual difference variables did: we found significant main effects for Scientific Reasoning Skills, F(1, 427) = 16.03, p < .001, ηp² = .04, Weller Numeracy Scale, F(1, 427) = 15.19, p < .001, ηp² = .03, and Need for Cognition, F(1, 427) = 16.80, p < .001, ηp² = .04, such that those who scored higher on these measures displayed better understanding of the scientific evidence. In addition, there was a significant interaction of evidence quality condition and scores on the Weller Numeracy Scale, F(1, 427) = 4.10, p = .04, ηp² = .01. Further results will be discussed. Discussion These data suggest jurors are not sensitive to differences in the quality of scientific mtDNA evidence, and also that our attempt at helping sensitize them with Fuzzy Trace Theory-inspired aids did not improve calibration. Individual scientific reasoning abilities and general cognition styles were better predictors of understanding this scientific information. These results suggest a need for further exploration of approaches to help jurors differentiate between high and low quality evidence. Note: The 3rd author was supported by an AP-LS AP Award for her role in this research. Learning Objective: Participants will be able to describe how individual differences in scientific reasoning skills help jurors understand complex scientific evidence. 
  5. Abstract Active surveillance (AS) is a suitable management option for newly diagnosed prostate cancer, which usually presents low to intermediate clinical risk. Patients enrolled in AS have their tumor monitored via longitudinal multiparametric MRI (mpMRI), PSA tests, and biopsies. Hence, treatment is prescribed when these tests identify progression to higher-risk prostate cancer. However, current AS protocols rely on detecting tumor progression through direct observation according to population-based monitoring strategies. This approach limits the design of patient-specific AS plans and may delay the detection of tumor progression. Here, we present a pilot study to address these issues by leveraging personalized computational predictions of prostate cancer growth. Our forecasts are obtained with a spatiotemporal biomechanistic model informed by patient-specific longitudinal mpMRI data (T2-weighted MRI and apparent diffusion coefficient maps from diffusion-weighted MRI). Our results show that our technology can represent and forecast the global tumor burden for individual patients, achieving concordance correlation coefficients from 0.93 to 0.99 across our cohort (n = 7). In addition, we identify a model-based biomarker of higher-risk prostate cancer: the mean proliferation activity of the tumor (P = 0.041). Using logistic regression, we construct a prostate cancer risk classifier based on this biomarker that achieves an area under the ROC curve of 0.83. We further show that coupling our tumor forecasts with this prostate cancer risk classifier enables the early identification of prostate cancer progression to higher-risk disease by more than 1 year. Thus, we posit that our predictive technology constitutes a promising clinical decision-making tool to design personalized AS plans for patients with prostate cancer.
    Significance: Personalization of a biomechanistic model of prostate cancer with mpMRI data enables the prediction of tumor progression, thereby showing promise to guide clinical decision-making during AS for each individual patient.