Title: A Cross-Sectional Investigation of Students’ Reasoning About Integer Addition and Subtraction: Ways of Reasoning, Problem Types, and Flexibility
In a cross-sectional study, 160 students in Grades 2, 4, 7, and 11 were interviewed about their reasoning when solving integer addition and subtraction open-number sentence problems. We applied our previously developed framework for 5 Ways of Reasoning (WoRs) to our data set to describe patterns within and across participant groups. Our analysis of the WoRs also led to the identification of 3 problem types: change-positive, all-negatives, and counterintuitive. We found that problem type influenced student performance and tended to evoke a different way of reasoning. We showed that those with more experience with negative numbers use WoRs more flexibly than those with less experience and that flexibility is correlated with accuracy. We provide 3 types of resources for educators: (a) WoRs and problem-types frameworks, (b) characterization of flexibility with integer addition and subtraction, and (c) development of a trajectory of learning about integers.
Award ID(s):
0918780
NSF-PAR ID:
10123107
Journal Name:
Journal for Research in Mathematics Education
Volume:
49
Issue:
5
ISSN:
1945-2306
Page Range / eLocation ID:
575 - 613
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract We investigate the link between individual differences in science reasoning skills and mock jurors’ deliberation behavior; specifically, how much they talk about the scientific evidence presented in a complicated, ecologically valid case during deliberation. Consistent with our preregistered hypothesis, mock jurors strong in scientific reasoning discussed the scientific evidence more during deliberation than those with weaker science reasoning skills. Summary With increasing frequency, legal disputes involve complex scientific information (Faigman et al., 2014; Federal Judicial Center, 2011; National Research Council, 2009). Yet people often have trouble consuming scientific information effectively (McAuliff et al., 2009; National Science Board, 2014; Resnick et al., 2016). Individual differences in reasoning styles and skills can affect how people comprehend complex evidence (e.g., Hans, Kaye, Dann, Farley, & Albertson, 2011; McAuliff & Kovera, 2008). Recently, scholars have highlighted the importance of studying group deliberation contexts as well as individual decision contexts (Salerno & Diamond, 2010; Kovera, 2017). If individual differences influence how jurors understand scientific evidence, it invites questions about how these individual differences may affect the way jurors discuss science during group deliberations. The purpose of the current study was to examine how individual differences in the way people process scientific information affect the extent to which jurors discuss scientific evidence during deliberations. Methods We preregistered the data collection plan, sample size, and hypotheses on the Open Science Framework. Jury-eligible community participants (303 jurors across 50 juries) from Phoenix, AZ (Mage=37.4, SD=16.9; 58.8% female; 51.5% White, 23.7% Latinx, 9.9% African-American, 4.3% Asian) were paid $55 for a 3-hour mock jury study. 
Participants completed a set of individual questionnaires related to science reasoning skills and attitudes toward science prior to watching a 45-minute mock armed-robbery trial. The trial included various pieces of evidence and testimony, including forensic experts testifying about mitochondrial DNA evidence (mtDNA; based on Hans et al. 2011 materials). Participants were then given 45 minutes to deliberate. The deliberations were video recorded and transcribed to text for analysis. We analyzed the deliberation content for discussions related to the scientific evidence presented during trial. We hypothesized that those with stronger scientific and numeric reasoning skills, higher need for cognition, and more positive views towards science would discuss scientific evidence more than their counterparts during deliberation. Measures We measured Attitudes Toward Science (ATS) with indices of scientific promise and scientific reservations (Hans et al., 2011; originally developed by the National Science Board, 2004; 2006). We used Drummond and Fischhoff’s (2015) Scientific Reasoning Scale (SRS) to measure scientific reasoning skills. Weller et al.’s (2012) Numeracy Scale (WNS) measured proficiency in reasoning with quantitative information. The NFC-Short Form (Cacioppo et al., 1984) measured need for cognition. Coding We identified verbal utterances related to the scientific evidence presented in court. For instance, references to DNA evidence in general (e.g. nuclear DNA being more conclusive than mtDNA), the database that was used to compare the DNA sample (e.g. the database size, how representative it was), exclusion rates (e.g. how many other people could not be excluded as a possible match), and the forensic DNA experts (e.g. how credible they were perceived). We used word count to operationalize the extent to which each juror discussed scientific information. First we calculated the total word count for each complete jury deliberation transcript. 
Based on the above coding scheme we determined the number of words each juror spent discussing scientific information. To compare across juries, we wanted to account for the differing length of deliberation; thus, we calculated each juror’s scientific deliberation word count as a proportion of their jury’s total word count. Results On average, jurors discussed the science for about 4% of their total deliberation (SD=4%, range 0-22%). We regressed the proportion of the deliberation jurors spent discussing scientific information on the four individual difference measures (i.e., SRS, NFC, WNS, ATS). Using the adjusted R-squared, the measures significantly accounted for 5.5% of the variability in scientific information deliberation discussion, SE=0.04, F(4, 199)=3.93, p=0.004. When controlling for all other variables in the model, the Scientific Reasoning Scale was the only measure that remained significant, b=0.003, SE=0.001, t(203)=2.02, p=0.045. To analyze how much variability each measure accounted for, we performed a stepwise regression, with NFC entered at step 1, ATS entered at step 2, WNS entered at step 3, and SRS entered at step 4. At step 1, NFC accounted for 2.4% of the variability, F(1, 202)=5.95, p=0.02. At step 2, ATS did not significantly account for any additional variability. At step 3, WNS accounted for an additional 2.4% of variability, ΔF(1, 200)=5.02, p=0.03. Finally, at step 4, SRS significantly accounted for an additional 1.9% of variability in scientific information discussion, ΔF(1, 199)=4.06, p=0.045, total adjusted R-squared of 0.055. Discussion This study provides additional support for previous findings that scientific reasoning skills affect the way jurors comprehend and use scientific evidence. It expands on previous findings by suggesting that these individual differences also impact the way scientific evidence is discussed during juror deliberations. 
In addition, this study advances the literature by identifying Scientific Reasoning Skills as a potentially more robust explanatory individual differences variable than more well-studied constructs like Need for Cognition in jury research. Our next steps for this research, which we plan to present at AP-LS as part of this presentation, include further analysis of the deliberation content (e.g., not just the mention of, but the accuracy of the references to scientific evidence in discussion). We are currently coding this data with a software program called Noldus Observer XT, which will allow us to present more sophisticated results from this data during the presentation. Learning Objective: Participants will be able to describe how individual differences in scientific reasoning skills affect how much jurors discuss scientific evidence during deliberation. 
  2. Reasoning about memory aliasing and mutation in software verification is a hard problem. This is especially true for systems using SMT-based automated theorem provers. Memory reasoning in SMT verification typically requires a nontrivial amount of manual effort to specify heap invariants, as well as extensive alias reasoning from the SMT solver. In this paper, we present a hybrid approach that combines linear types with SMT-based verification for memory reasoning. We integrate linear types into Dafny, a verification language with an SMT backend, and show that the two approaches complement each other. By separating memory reasoning from verification conditions, linear types reduce the SMT solving time. At the same time, the expressiveness of SMT queries extends the flexibility of the linear type system. In particular, it allows our linear type system to easily and correctly mix linear and nonlinear data in novel ways, encapsulating linear data inside nonlinear data and vice-versa. We formalize the core of our extensions, prove soundness, and provide algorithms for linear type checking. We evaluate our approach by converting the implementation of a verified storage system (about 24K lines of code and proof) written in Dafny, to use our extended Dafny. The resulting system uses linear types for 91% of the code and SMT-based heap reasoning for the remaining 9%. We show that the converted system has 28% fewer lines of proofs and 30% shorter verification time overall. We discuss the development overhead in the original system due to SMT-based heap reasoning and highlight the improved developer experience when using linear types. 
  3. Despite the early development of causal reasoning (CR), and its potential for shaping scientific literacy, we have little understanding of its structural origins. Specifically, is CR a unique capability that develops relatively independently or is it largely dependent on broader, more fundamental, cognitive abilities? Executive Functioning (EF) is an especially promising contributor to CR based on its already established role in related skills like planning and problem solving (e.g., Diamond, 2013). To begin exploring this potential relationship, we assessed 123 three (Mage = 3.42 years) and 64 five year olds’ (Mage = 5.36 years) performance on two CR tasks (counterfactual reasoning and causal inference), each of which we expected might be influenced in different ways by distinct EF skills. The counterfactual reasoning task (Guajardo & Turley-Ames, 2004) required children to generate alternative courses of action that would lead to different outcomes in fictional vignettes. The causal inference task (Das Gupta & Bryant, 1989) required children to compare pictures taken before and after a transformation (e.g., broken flowerpot and intact flowerpot) and to select a tool (e.g., glue) that could have caused it. We measured EF with three tasks: flanker (inhibition), count and label (working memory), and dimensional change card sort (cognitive flexibility). Finally, we measured children’s vocabulary and processing speed. To explore the relationship between EF and CR, we conducted a series of four linear regressions predicting causal inference and counterfactual reasoning ability in 3 and 5 year olds. Of all our measures, only vocabulary and inhibitory control emerged as significant predictors of causal inference ability for both 3 (βvocab = .04, p = .002, and βinhib = .04, p = .04) and 5 year olds (βvocab = .03, p = .01, and βinhib = .02, p = .04). 
Similarly, inhibitory control emerged as the only significant predictor of counterfactual reasoning in 3 year olds, βinhib = .03, p = .03. In contrast, for 5 year olds, working memory was the only significant predictor of counterfactual reasoning, βWM = .71, p = .02. These results suggest that causal inference skills are stably supported by inhibitory control throughout early childhood. The story for counterfactual reasoning, however, appears to be somewhat more complex. Consistent with previous work (Beck, Riggs & Gorniak, 2009), inhibitory control supported counterfactual reasoning ability in our 3-year-old sample. However, inhibitory control did not significantly predict counterfactual reasoning in 5 year olds; it was supported by working memory instead. One explanation for this difference might have to do with the sophistication of children’s counterfactual reasoning skills at these different ages. Taken together, these results suggest that CR does not develop as a unique capacity, but instead likely relies on EFs that influence different CR skills in distinct ways across development. This represents an initial step in understanding early CR skills, which are promising contributors to emerging scientific literacy. 
  4. Mathematical reasoning, a core ability of human intelligence, presents unique challenges for machines in abstract thinking and logical reasoning. Recent large pre-trained language models such as GPT-3 have achieved remarkable progress on mathematical reasoning tasks written in text form, such as math word problems (MWP). However, it is unknown if the models can handle more complex problems that involve math reasoning over heterogeneous information, such as tabular data. To fill the gap, we present Tabular Math Word Problems (TABMWP), a new dataset containing 38,431 open-domain grade-level problems that require mathematical reasoning on both textual and tabular data. Each question in TABMWP is aligned with a tabular context, which is presented as an image, semi-structured text, and a structured table. There are two types of questions: free-text and multi-choice, and each problem is annotated with gold solutions to reveal the multi-step reasoning process. We evaluate different pre-trained models on TABMWP, including the GPT-3 model in a few-shot setting. As earlier studies suggest, since few-shot GPT-3 relies on the selection of in-context examples, its performance is unstable and can degrade to near chance. This instability is more severe when handling complex problems like TABMWP. To mitigate this, we further propose a novel approach, PROMPTPG, which utilizes policy gradient to learn to select in-context examples from a small amount of training data and then constructs the corresponding prompt for the test example. Experimental results show that our method outperforms the best baseline by 5.31% on the accuracy metric and reduces the prediction variance significantly compared to random selection, which verifies its effectiveness in selecting in-context examples. 
  5. Abstract

    Investigating how children think about leadership may inform theories of the gender gaps in leadership among adults. In three studies (N = 492 U.S. children ages 5–10 years), we investigated (1) whether children expect those who claim leadership roles within a peer group to elicit social support and cooperation from the group, (2) children’s own interest and self-efficacy in such roles, and (3) the influence of contextual cues (e.g., how leader roles are described) on children’s reasoning about and interest in leadership. We also explored differences based on children’s race/ethnicity. In Study 1, girls expected lower social support for child leaders than boys did. However, in Study 2, we found no evidence that girls are less interested in leadership. In addition, interest in leadership increased with age among White girls but decreased among White boys and girls and boys of color. In Study 3, we tested whether interest in a leader role is boosted (particularly among girls) by describing the role as helpful for the group and by providing gender-balanced peer role models. Regardless of gender, children in the helpful or “communal” (vs. “agentic”) leader condition were more interested in the leader role, anticipated stronger social support and cooperation from others, and reported higher self-efficacy as leaders. The gender composition of role models had little impact. This research underscores the early development of children’s attitudes toward leadership and highlights the potential value in early interventions to nurture children’s leadership ambitions.

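The word-count normalization described in the first related abstract above (each juror's science-related word count as a proportion of their jury's total word count) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' analysis code: the function name, the tuple layout, and the whitespace-based word count are all assumptions made here for the example.

```python
from collections import defaultdict


def science_talk_proportions(utterances):
    """Return each juror's science-related words as a share of the jury's total words.

    `utterances` is a hypothetical input format: an iterable of
    (jury_id, juror_id, text, is_science) tuples, where `is_science`
    marks utterances coded as discussing the scientific evidence.
    """
    jury_totals = defaultdict(int)    # total words spoken per jury
    juror_science = defaultdict(int)  # science-coded words per (jury, juror)

    for jury_id, juror_id, text, is_science in utterances:
        n_words = len(text.split())   # assumption: words are whitespace-separated
        jury_totals[jury_id] += n_words
        # Add zero for non-science talk so every juror still appears in the result.
        juror_science[(jury_id, juror_id)] += n_words if is_science else 0

    return {
        key: count / jury_totals[key[0]]
        for key, count in juror_science.items()
        if jury_totals[key[0]] > 0
    }


# Hypothetical toy transcript for a single jury.
example = [
    ("J1", "a", "the mtDNA database was small", True),
    ("J1", "b", "I think he did it", False),
    ("J1", "a", "ok", False),
]
proportions = science_talk_proportions(example)
```

Dividing by the jury's total rather than the juror's own total is what makes values comparable across juries whose deliberations differ in length, which is the motivation stated in that abstract.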