- Award ID(s):
- NSF-PAR ID:
- Date Published:
- Journal Name:
- Annual Conference of the American Psychology-Law Society.
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
Abstract Expert testimony varies in scientific quality and jurors have a difficult time evaluating evidence quality (McAuliff et al., 2009). In the current study, we apply Fuzzy Trace Theory principles, examining whether visual and gist aids help jurors calibrate to the strength of scientific evidence. Additionally we were interested in the role of jurors’ individual differences in scientific reasoning skills in their understanding of case evidence. Contrary to our preregistered hypotheses, there was no effect of evidence condition or gist aid on evidence understanding. However, individual differences between jurors’ numeracy skills predicted evidence understanding. Summary Poor-quality expert evidence is sometimes admitted into court (Smithburn, 2004). Jurors’ calibration to evidence strength varies widely and is not robustly understood. For instance, previous research has established jurors lack understanding of the role of control groups, confounds, and sample sizes in scientific research (McAuliff, Kovera, & Nunez, 2009; Mill, Gray, & Mandel, 1994). Still others have found that jurors can distinguish weak from strong evidence when the evidence is presented alone, yet not when simultaneously presented with case details (Smith, Bull, & Holliday, 2011). This research highlights the need to present evidence to jurors in a way they can understand. Fuzzy Trace Theory purports that people encode information in exact, verbatim representations and through “gist” representations, which represent summary of meaning (Reyna & Brainerd, 1995). It is possible that the presenting complex scientific evidence to people with verbatim content or appealing to the gist, or bottom-line meaning of the information may influence juror understanding of that evidence. Application of Fuzzy Trace Theory in the medical field has shown that gist representations are beneficial for helping laypeople better understand risk and benefits of medical treatment (Brust-Renck, Reyna, Wilhelms, & Lazar, 2016). Yet, little research has applied Fuzzy Trace Theory to information comprehension and application within the context of a jury (c.f. Reyna et. al., 2015). Additionally, it is likely that jurors’ individual characteristics, such as scientific reasoning abilities and cognitive tendencies, influence their ability to understand and apply complex scientific information (Coutinho, 2006). Methods The purpose of this study was to examine how jurors calibrate to the strength of scientific information, and whether individual difference variables and gist aids inspired by Fuzzy Trace Theory help jurors better understand complicated science of differing quality. We used a 2 (quality of scientific evidence: high vs. low) x 2 (decision aid to improve calibration - gist information vs. no gist information), between-subjects design. All hypotheses were preregistered on the Open Science Framework. Jury-eligible community participants (430 jurors across 90 juries; Mage = 37.58, SD = 16.17, 58% female, 56.93% White). Each jury was randomly assigned to one of the four possible conditions. Participants were asked to individually fill out measures related to their scientific reasoning skills prior to watching a mock jury trial. The trial was about an armed bank robbery and consisted of various pieces of testimony and evidence (e.g. an eyewitness testimony, police lineup identification, and a sweatshirt found with the stolen bank money). The key piece of evidence was mitochondrial DNA (mtDNA) evidence collected from hair on a sweatshirt (materials from Hans et al., 2011). Two experts presented opposing opinions about the scientific evidence related to the mtDNA match estimate for the defendant’s identification. The quality and content of this mtDNA evidence differed based on the two conditions. The high quality evidence condition used a larger database than the low quality evidence to compare to the mtDNA sample and could exclude a larger percentage of people. In the decision aid condition, experts in the gist information group presented gist aid inspired visuals and examples to help explain the proportion of people that could not be excluded as a match. Those in the no gist information group were not given any aid to help them understand the mtDNA evidence presented. After viewing the trial, participants filled out a questionnaire on how well they understood the mtDNA evidence and their overall judgments of the case (e.g. verdict, witness credibility, scientific evidence strength). They filled this questionnaire out again after a 45-minute deliberation. Measures We measured Attitudes Toward Science (ATS) with indices of scientific promise and scientific reservations (Hans et al., 2011; originally developed by National Science Board, 2004; 2006). We used Drummond and Fischhoff’s (2015) Scientific Reasoning Scale (SRS) to measure scientific reasoning skills. Weller et al.’s (2012) Numeracy Scale (WNS) measured proficiency in reasoning with quantitative information. The NFC-Short Form (Cacioppo et al., 1984) measured need for cognition. We developed a 20-item multiple-choice comprehension test for the mtDNA scientific information in the cases (modeled on Hans et al., 2011, and McAuliff et al., 2009). Participants were shown 20 statements related to DNA evidence and asked whether these statements were True or False. The test was then scored out of 20 points. Results For this project, we measured calibration to the scientific evidence in a few different ways. We are building a full model with these various operationalizations to be presented at APLS, but focus only on one of the calibration DVs (i.e., objective understanding of the mtDNA evidence) in the current proposal. We conducted a general linear model with total score on the mtDNA understanding measure as the DV and quality of scientific evidence condition, decision aid condition, and the four individual difference measures (i.e., NFC, ATS, WNS, and SRS) as predictors. Contrary to our main hypotheses, neither evidence quality nor decision aid condition affected juror understanding. However, the individual difference variables did: we found significant main effects for Scientific Reasoning Skills, F(1, 427) = 16.03, p <.001, np2 = .04, Weller Numeracy Scale, F(1, 427) = 15.19, p <.001, np2 = .03, and Need for Cognition, F(1, 427) = 16.80, p <.001, np2 = .04, such that those who scored higher on these measures displayed better understanding of the scientific evidence. In addition there was a significant interaction of evidence quality condition and scores on the Weller’s Numeracy Scale, F(1, 427) = 4.10, p = .04, np2 = .01. Further results will be discussed. Discussion These data suggest jurors are not sensitive to differences in the quality of scientific mtDNA evidence, and also that our attempt at helping sensitize them with Fuzzy Trace Theory-inspired aids did not improve calibration. Individual scientific reasoning abilities and general cognition styles were better predictors of understanding this scientific information. These results suggest a need for further exploration of approaches to help jurors differentiate between high and low quality evidence. Note: The 3rd author was supported by an AP-LS AP Award for her role in this research. Learning Objective: Participants will be able to describe how individual differences in scientific reasoning skills help jurors understand complex scientific evidence.more » « less
Abstract We investigate the link between individual differences in science reasoning skills and mock jurors’ deliberation behavior; specifically, how much they talk about the scientific evidence presented in a complicated, ecologically valid case during deliberation. Consistent with our preregistered hypothesis, mock jurors strong in scientific reasoning discussed the scientific evidence more during deliberation than those with weaker science reasoning skills. Summary With increasing frequency, legal disputes involve complex scientific information (Faigman et al., 2014; Federal Judicial Center, 2011; National Research Council, 2009). Yet people often have trouble consuming scientific information effectively (McAuliff et al., 2009; National Science Board, 2014; Resnick et al., 2016). Individual differences in reasoning styles and skills can affect how people comprehend complex evidence (e.g., Hans, Kaye, Dann, Farley, Alberston, 2011; McAuliff & Kovera, 2008). Recently, scholars have highlighted the importance of studying group deliberation contexts as well as individual decision contexts (Salerno & Diamond, 2010; Kovera, 2017). If individual differences influence how jurors understand scientific evidence, it invites questions about how these individual differences may affect the way jurors discuss science during group deliberations. The purpose of the current study was to examine how individual differences in the way people process scientific information affects the extent to which jurors discuss scientific evidence during deliberations. Methods We preregistered the data collection plan, sample size, and hypotheses on the Open Science Framework. Jury-eligible community participants (303 jurors across 50 juries) from Phoenix, AZ (Mage=37.4, SD=16.9; 58.8% female; 51.5% White, 23.7% Latinx, 9.9% African-American, 4.3% Asian) were paid $55 for a 3-hour mock jury study. Participants completed a set of individual questionnaires related to science reasoning skills and attitudes toward science prior to watching a 45-minute mock armed-robbery trial. The trial included various pieces of evidence and testimony, including forensic experts testifying about mitochondrial DNA evidence (mtDNA; based on Hans et al. 2011 materials). Participants were then given 45 minutes to deliberate. The deliberations were video recorded and transcribed to text for analysis. We analyzed the deliberation content for discussions related to the scientific evidence presented during trial. We hypothesized that those with stronger scientific and numeric reasoning skills, higher need for cognition, and more positive views towards science would discuss scientific evidence more than their counterparts during deliberation. Measures We measured Attitudes Toward Science (ATS) with indices of scientific promise and scientific reservations (Hans et al., 2011; originally developed by the National Science Board, 2004; 2006). We used Drummond and Fischhoff’s (2015) Scientific Reasoning Scale (SRS) to measure scientific reasoning skills. Weller et al.’s (2012) Numeracy Scale (WNS) measured proficiency in reasoning with quantitative information. The NFC-Short Form (Cacioppo et al., 1984) measured need for cognition. Coding We identified verbal utterances related to the scientific evidence presented in court. For instance, references to DNA evidence in general (e.g. nuclear DNA being more conclusive than mtDNA), the database that was used to compare the DNA sample (e.g. the database size, how representative it was), exclusion rates (e.g. how many other people could not be excluded as a possible match), and the forensic DNA experts (e.g. how credible they were perceived). We used word count to operationalize the extent to which each juror discussed scientific information. First we calculated the total word count for each complete jury deliberation transcript. Based on the above coding scheme we determined the number of words each juror spent discussing scientific information. To compare across juries, we wanted to account for the differing length of deliberation; thus, we calculated each juror’s scientific deliberation word count as a proportion of their jury’s total word count. Results On average, jurors discussed the science for about 4% of their total deliberation (SD=4%, range 0-22%). We regressed proportion of the deliberation jurors spend discussing scientific information on the four individual difference measures (i.e., SRS, NFC, WNS, ATS). Using the adjusted R-squared, the measures significantly accounted for 5.5% of the variability in scientific information deliberation discussion, SE=0.04, F(4, 199)=3.93, p=0.004. When controlling for all other variables in the model, the Scientific Reasoning Scale was the only measure that remained significant, b=0.003, SE=0.001, t(203)=2.02, p=0.045. To analyze how much variability each measure accounted for, we performed a stepwise regression, with NFC entered at step 1, ATS entered at step 2, WNS entered at step 3, and SRS entered at step 4. At step 1, NFC accounted for 2.4% of the variability, F(1, 202)=5.95, p=0.02. At step 2, ATS did not significantly account for any additional variability. At step 3, WNS accounted for an additional 2.4% of variability, ΔF(1, 200)=5.02, p=0.03. Finally, at step 4, SRS significantly accounted for an additional 1.9% of variability in scientific information discussion, ΔF(1, 199)=4.06, p=0.045, total adjusted R-squared of 0.055. Discussion This study provides additional support for previous findings that scientific reasoning skills affect the way jurors comprehend and use scientific evidence. It expands on previous findings by suggesting that these individual differences also impact the way scientific evidence is discussed during juror deliberations. In addition, this study advances the literature by identifying Scientific Reasoning Skills as a potentially more robust explanatory individual differences variable than more well-studied constructs like Need for Cognition in jury research. Our next steps for this research, which we plan to present at AP-LS as part of this presentation, incudes further analysis of the deliberation content (e.g., not just the mention of, but the accuracy of the references to scientific evidence in discussion). We are currently coding this data with a software program called Noldus Observer XT, which will allow us to present more sophisticated results from this data during the presentation. Learning Objective: Participants will be able to describe how individual differences in scientific reasoning skills affect how much jurors discuss scientific evidence during deliberation.more » « less
Abstract: 100 words Jurors are increasingly exposed to scientific information in the courtroom. To determine whether providing jurors with gist information would assist in their ability to make well-informed decisions, the present experiment utilized a Fuzzy Trace Theory-inspired intervention and tested it against traditional legal safeguards (i.e., judge instructions) by varying the scientific quality of the evidence. The results indicate that jurors who viewed high quality evidence rated the scientific evidence significantly higher than those who viewed low quality evidence, but were unable to moderate the credibility of the expert witness and apply damages appropriately resulting in poor calibration. Summary: <1000 words Jurors and juries are increasingly exposed to scientific information in the courtroom and it remains unclear when they will base their decisions on a reasonable understanding of the relevant scientific information. Without such knowledge, the ability of jurors and juries to make well-informed decisions may be at risk, increasing chances of unjust outcomes (e.g., false convictions in criminal cases). Therefore, there is a critical need to understand conditions that affect jurors’ and juries’ sensitivity to the qualities of scientific information and to identify safeguards that can assist with scientific calibration in the courtroom. The current project addresses these issues with an ecologically valid experimental paradigm, making it possible to assess causal effects of evidence quality and safeguards as well as the role of a host of individual difference variables that may affect perceptions of testimony by scientific experts as well as liability in a civil case. Our main goal was to develop a simple, theoretically grounded tool to enable triers of fact (individual jurors) with a range of scientific reasoning abilities to appropriately weigh scientific evidence in court. We did so by testing a Fuzzy Trace Theory-inspired intervention in court, and testing it against traditional legal safeguards. Appropriate use of scientific evidence reflects good calibration – which we define as being influenced more by strong scientific information than by weak scientific information. Inappropriate use reflects poor calibration – defined as relative insensitivity to the strength of scientific information. Fuzzy Trace Theory (Reyna & Brainerd, 1995) predicts that techniques for improving calibration can come from presentation of easy-to-interpret, bottom-line “gist” of the information. Our central hypothesis was that laypeople’s appropriate use of scientific information would be moderated both by external situational conditions (e.g., quality of the scientific information itself, a decision aid designed to convey clearly the “gist” of the information) and individual differences among people (e.g., scientific reasoning skills, cognitive reflection tendencies, numeracy, need for cognition, attitudes toward and trust in science). Identifying factors that promote jurors’ appropriate understanding of and reliance on scientific information will contribute to general theories of reasoning based on scientific evidence, while also providing an evidence-based framework for improving the courts’ use of scientific information. All hypotheses were preregistered on the Open Science Framework. Method Participants completed six questionnaires (counterbalanced): Need for Cognition Scale (NCS; 18 items), Cognitive Reflection Test (CRT; 7 items), Abbreviated Numeracy Scale (ABS; 6 items), Scientific Reasoning Scale (SRS; 11 items), Trust in Science (TIS; 29 items), and Attitudes towards Science (ATS; 7 items). Participants then viewed a video depicting a civil trial in which the defendant sought damages from the plaintiff for injuries caused by a fall. The defendant (bar patron) alleged that the plaintiff (bartender) pushed him, causing him to fall and hit his head on the hard floor. Participants were informed at the outset that the defendant was liable; therefore, their task was to determine if the plaintiff should be compensated. Participants were randomly assigned to 1 of 6 experimental conditions: 2 (quality of scientific evidence: high vs. low) x 3 (safeguard to improve calibration: gist information, no-gist information [control], jury instructions). An expert witness (neuroscientist) hired by the court testified regarding the scientific strength of fMRI data (high [90 to 10 signal-to-noise ratio] vs. low [50 to 50 signal-to-noise ratio]) and gist or no-gist information both verbally (i.e., fairly high/about average) and visually (i.e., a graph). After viewing the video, participants were asked if they would like to award damages. If they indicated yes, they were asked to enter a dollar amount. Participants then completed the Positive and Negative Affect Schedule-Modified Short Form (PANAS-MSF; 16 items), expert Witness Credibility Scale (WCS; 20 items), Witness Credibility and Influence on damages for each witness, manipulation check questions, Understanding Scientific Testimony (UST; 10 items), and 3 additional measures were collected, but are beyond the scope of the current investigation. Finally, participants completed demographic questions, including questions about their scientific background and experience. The study was completed via Qualtrics, with participation from students (online vs. in-lab), MTurkers, and non-student community members. After removing those who failed attention check questions, 469 participants remained (243 men, 224 women, 2 did not specify gender) from a variety of racial and ethnic backgrounds (70.2% White, non-Hispanic). Results and Discussion There were three primary outcomes: quality of the scientific evidence, expert credibility (WCS), and damages. During initial analyses, each dependent variable was submitted to a separate 3 Gist Safeguard (safeguard, no safeguard, judge instructions) x 2 Scientific Quality (high, low) Analysis of Variance (ANOVA). Consistent with hypotheses, there was a significant main effect of scientific quality on strength of evidence, F(1, 463)=5.099, p=.024; participants who viewed the high quality evidence rated the scientific evidence significantly higher (M= 7.44) than those who viewed the low quality evidence (M=7.06). There were no significant main effects or interactions for witness credibility, indicating that the expert that provided scientific testimony was seen as equally credible regardless of scientific quality or gist safeguard. Finally, for damages, consistent with hypotheses, there was a marginally significant interaction between Gist Safeguard and Scientific Quality, F(2, 273)=2.916, p=.056. However, post hoc t-tests revealed significantly higher damages were awarded for low (M=11.50) versus high (M=10.51) scientific quality evidence F(1, 273)=3.955, p=.048 in the no gist with judge instructions safeguard condition, which was contrary to hypotheses. The data suggest that the judge instructions alone are reversing the pattern, though nonsignificant, those who received the no gist without judge instructions safeguard awarded higher damages in the high (M=11.34) versus low (M=10.84) scientific quality evidence conditions F(1, 273)=1.059, p=.30. Together, these provide promising initial results indicating that participants were able to effectively differentiate between high and low scientific quality of evidence, though inappropriately utilized the scientific evidence through their inability to discern expert credibility and apply damages, resulting in poor calibration. These results will provide the basis for more sophisticated analyses including higher order interactions with individual differences (e.g., need for cognition) as well as tests of mediation using path analyses. [References omitted but available by request] Learning Objective: Participants will be able to determine whether providing jurors with gist information would assist in their ability to award damages in a civil trial.more » « less
The law expects jurors to weigh the facts and evidence of a case to inform the decision with which they are charged. However, evidence in legal cases is becoming increasingly complicated, and studies have raised questions about laypeople’s abilities to understand and use complex evidence to inform decisions. Compared to other studies that have looked at general evidence comprehension and expert credibility (e.g. Schweitzer & Saks, 2012), this experimental study investigated whether jurors can appropriately weigh strong vs. weak DNA evidence without special assistance. That is, without help to understand when DNA evidence is relatively weak, are jurors sensitive to the strength of weak DNA evidence as compared to strong DNA evidence? Responses from jury-eligible participants (N=346) were collected from Amazon Mechanical Turk (MTurk). Participants were presented with a summary of a robbery case before being asked a short questionnaire related to verdict preference and evidence comprehension. (Data is from the pilot of experiment 2 for the grant project). We hypothesized participants would not be able to distinguish high- from low-quality DNA evidence. We analyzed the data using Bayes Factors, which allows for directly testing the null hypothesis (Zyphur & Oswald, 2013). A Bayes Factor of 4-8 (depending on the priors used) was found supporting the null for participants’ rating of low vs. high quality scientific evidence. A Bayes Factor of 4 means that the null is four times as probable as an alternative hypothesis. Participants tended to rate the DNA evidence as “high quality” no matter the condition they were in. The Bayes Factor of 4-8 in this case gives good reason to believe that jury members are unable to discern what constitutes low quality DNA evidence without assistance. If jurors are unable to distinguish between different qualities of evidence, or if they are unaware that they may have to, they could give greater weight to low quality scientific evidence than is warranted. The current study supports the hypothesis that jurors have trouble distinguishing between complicated high vs. low quality evidence without help. Further attempts will be made to discover ways of presenting DNA evidence that could better calibrate jurors in their decisions. These future directions involve larger sample sizes in which jury-eligible participants will complete the study in person. Instead of reading about the evidence, they will watch a filmed mock jury trial. This plan also involves jury deliberation which will provide additional knowledge about how jurors come to conclusions as a group about different qualities of evidence. Acknowledging the potential issues in jury trials and working to solve these problems is a vital step in improving our justice system.more » « less
The jury is a defining component of the American criminal justice system, and the courts largely assume that the collaborative nature of jury deliberations will enhance jurors’ memory for important trial information. However, research suggests that this kind of collaboration, although sometimes improving memory, can also lead to incomplete and inaccurate “collective” memories. The present research examines whether jury deliberations, where individuals collaboratively recall and discuss trial evidence to render unanimous verdicts, might shape jurors’ memories through the robust phenomena of Within‐Individual and Socially Shared Retrieval‐Induced Forgetting (WI‐RIF and SS‐RIF, respectively). The results revealed no WI‐RIF or SS‐RIF. However, we did find evidence in the direction of Within‐Individual and Socially‐shared Retrieval Induced Facilitation (WI‐RIFA and SS‐RIFA, respectively) in speakers’ and listeners’ narrative and open‐ended recall of evidentiary details. The present results are discussed in terms of whether jurors’ goals during deliberation and the deliberation structure (e.g., six or more discussants) protect against forgetting, or whether possible methodological issues (e.g., the vast amount of information presented) eliminated WI‐RIF and SS‐RIF and, in turn, make drawing conclusions surrounding the mnemonic impact of jury deliberation difficult. Regardless, the present results suggest jury deliberations are quite limited in terms of how much evidence is actually discussed compared to the total of what could be discussed, and our methodology provides an ecologically valid baseline for future research to better understand the mnemonic consequences associated with jury deliberations and, in turn, jury decision making.