
This content will become publicly available on April 1, 2023

Title: Measures of mathematics teachers’ behavior and affect: An examination of the assessment landscape
Although the paradigm wars between quantitative and qualitative research methods and the associated epistemologies may have settled down in recent years within the mathematics education research community, quantitative methods and randomized control trials remain the gold standard at the policy-making level (USDOE, 2008). Although diverse methods are valued in the mathematics education community, if mathematics educators hope to influence policy to cultivate more equitable education systems, then we must engage in rigorous quantitative research. However, quantitative research is limited in what it can measure by the quantitative tools that exist. In mathematics education, the development of quantitative tools and the study of their associated validity and reliability evidence appear to have lagged behind the important constructs that rich qualitative research has uncovered. The purpose of this study is to describe quantitative instruments related to mathematics teacher behavior and affect in order to better understand what currently exists in the field, what validity and reliability evidence has been published for such instruments, and what constructs each measures. The research questions are: 1. How many and what types of instruments of mathematics teacher behavior and affect exist? 2. What types of validity and reliability evidence are published for these instruments? 3. What constructs do these instruments measure? 4. To what extent have issues of equity been the focus of the instruments found?
Journal Name:
Annual meeting program American Educational Research Association
Sponsoring Org:
National Science Foundation
More Like this
  1. The Engineering Research Centers (ERCs), funded by the National Science Foundation (NSF), play an important role in improving engineering education, bridging engineering academia and broad communities, and promoting a culture of diversity and inclusion. Each ERC must partner with an independent evaluation team to annually assess its performance and impact on progressing education, connecting community, and building diversified culture. This evaluation is currently performed independently (and in isolation), which leads to inconsistent evaluations and a redundant investment of ERCs’ resources into such tasks (e.g., developing evaluation instruments). These isolated efforts by ERCs to quantitatively evaluate their education programs also typically lack adequate sample size within a single center, which limits the validity and reliability of the quantitative analyses. Three ERCs, all associated with a large southwest university in the United States, worked collaboratively to overcome sample size and measure inconsistency concerns by developing a common quantitative instrument that is capable of evaluating any ERC’s education and diversity impacts. The instrument is the result of a systematic process of comparing and contrasting each ERC’s existing evaluation tools, including surveys and interview protocols. This new, streamlined tool captures participants’ overall experience as part of the ERC by measuring various constructs including skillset development, perception of diversity and inclusion, future plans after participating in the ERC, and mentorship received from the ERC. Scales and embedded items were designed broadly for possible use with both yearlong (e.g., graduate and undergraduate students, and postdoctoral scholars) and summer program (Research Experience for Undergraduates, Research Experience for Teachers, and Young Scholar Program) participants. 
The instrument was distributed and tested during Summer 2019 with participants in the summer programs from all three ERCs. The forthcoming paper will present the new common cross-ERC evaluation instrument, demonstrate the effort of collecting data across all three ERCs, present preliminary findings, and discuss collaborative processes and challenges. The preliminary implication of this work is the ability to directly compare educational programs across ERCs. The authors also believe that this tool can provide a fast start for new ERCs on how to evaluate their educational programs.
  2. In this theory paper, we set out to consider, as a matter of methodological interest, the use of quantitative measures of inter-coder reliability (e.g., percentage agreement, correlation, Cohen’s Kappa, etc.) as necessary and/or sufficient correlates for quality within qualitative research in engineering education. It is well known that the phrase qualitative research represents a diverse body of scholarship conducted across a range of epistemological viewpoints and methodologies. Given this diversity, we concur with those who state that it is ill advised to propose recipes or stipulate requirements for achieving qualitative research validity and reliability. Yet, as qualitative researchers ourselves, we repeatedly find the need to communicate the validity and reliability, or quality, of our work to different stakeholders, including funding agencies and the public. One method for demonstrating quality, which is increasingly used in qualitative research in engineering education, is the practice of reporting quantitative measures of agreement between two or more people who code the same qualitative dataset. In this theory paper, we address this common practice in two ways. First, we identify instances in which inter-coder reliability measures may not be appropriate or adequate for establishing quality in qualitative research. We query research that suggests that the numerical measure itself is the goal of qualitative analysis, rather than the depth and texture of the interpretations that are revealed. Second, we identify complexities or methodological questions that may arise during the process of establishing inter-coder reliability, which are not often addressed in empirical publications. To achieve these purposes, in this paper we will ground our work in a review of qualitative articles, published in the Journal of Engineering Education, that have employed inter-rater or inter-coder reliability as evidence of research validity. 
In our review, we will examine the disparate measures and scores (from 40% agreement to 97% agreement) used as evidence of quality, as well as the theoretical perspectives within which these measures have been employed. Then, using our own comparative case study research as an example, we will highlight the questions and the challenges that we faced as we worked to meet rigorous standards of evidence in our qualitative coding analysis. We will explain the processes we undertook and the challenges we faced as we assigned codes to a large qualitative data set approached from a postpositivist perspective. We will situate these coding processes within the larger methodological literature and, in light of contrasting literature, we will describe the principled decisions we made while coding our own data. We will use this review of qualitative research and our own qualitative research experiences to elucidate inconsistencies and unarticulated issues related to evidence for qualitative validity as a means to generate further discussion regarding quality in qualitative coding processes.
  3. High levels of stress and anxiety are common amongst college students, particularly engineering students. Students report lack of sleep, grades, competition, change in lifestyle, and other significant stressors throughout their undergraduate education (1, 2). Stress and anxiety have been shown to negatively impact student experience (3-6), academic performance (6-8), and retention (9). Previous studies have focused on identifying factors that cause individual students stress while completing undergraduate engineering degree programs (1). However, it is not well understood how a culture of stress is perceived and propagated in engineering programs or how this culture impacts student levels of identification with engineering. Further, the impact of student stress has not been directly considered in engineering regarding recruitment, retention, and success. Therefore, our guiding research question is: Does the engineering culture create stress for students that hinders their engineering identity development? To answer our research question, we designed a sequential mixed methods study with equal priority of quantitative survey data and qualitative individual interviews. Our study participants are undergraduate engineering students across all levels and majors at a large, public university. Our sample goal is 2000 engineering student respondents. We combined three published surveys to build our quantitative data collection instrument, including the Depression Anxiety Stress Scales (DASS), Identification with engineering subscale, and Engineering Department Inclusion Level subscale. The objective of the quantitative instrument is to illuminate individual perceptions of the existence of an engineering stress culture (ESC) and create an efficient tool to measure the impact of ESC on engineering identity development. 
Specifically, we seek to understand the relationships among the following constructs: 1) identification with engineering, 2) stress and anxiety, and 3) feelings of inclusion within their department. This paper presents the results of the pilot of the proposed instrument with 20 participants and a detailed data collection and analysis process. In an effort to validate our instrument, we conducted a pilot study to refine our data collection process, and the results will guide the data collection for the larger study. In addition to identifying relationships among constructs, the survey data will be further analyzed to specify which demographics are mediating or moderating factors of these relationships. For example, does a student’s first-generation status influence their perception of stress or engineering identity development? Our analysis may identify discipline-specific stressors and characterize culture components that promote student anxiety and stress. Our objective is to validate our survey instrument and use it to inform the protocol for the follow-up interviews to gain a deeper understanding of the responses to the survey instrument. Understanding what students view as stressful and how students identify stress as an element of program culture will support the development of interventions to mitigate student stress. References 1. Schneider L (2007) Perceived stress among engineering students. A Paper Presented at St. Lawrence Section Conference. Toronto, Canada. Retrieved from: www.asee.morrisville.edu. 2. Ross SE, Niebling BC, & Heckert TM (1999) Sources of stress among college students. Social psychology 61(5):841-846. 3. Goldman CS & Wong EH (1997) Stress and the college student. Education 117(4):604-611. 4. Hudd SS, et al. (2000) Stress at college: Effects on health habits, health status and self-esteem. College Student Journal 34(2):217-228. 5. 
Macgeorge EL, Samter W, & Gillihan SJ (2005) Academic Stress, Supportive Communication, and Health. A version of this paper was presented at the 2005 International Communication Association convention in New York City. Communication Education 54(4):365-372. 6. Burt KB & Paysnick AA (2014) Identity, stress, and behavioral and emotional problems in undergraduates: Evidence for interaction effects. Journal of college student development 55(4):368-384. 7. Felsten G & Wilcox K (1992) Influences of stress and situation-specific mastery beliefs and satisfaction with social support on well-being and academic performance. Psychological Reports 70(1):291-303. 8. Pritchard ME & Wilson GS (2003) Using emotional and social factors to predict student success. Journal of college student development 44(1):18-28. 9. Zhang Z & RiCharde RS (1998) Prediction and Analysis of Freshman Retention. AIR 1998 Annual Forum Paper.
  4. The purpose of this working group is to bring together scholars with an interest in examining the research on quantitative tools and measures for gathering meaningful data, and to spark conversations and collaboration across individuals and groups with an interest in synthesizing the literature on large-scale tools used to measure student- and teacher-related outcomes. While syntheses of measures for use in mathematics education can be found in the literature, few can be described as a comprehensive analysis. The working group session will focus on (1) defining terms identified as critical (e.g., large-scale, quantitative, and validity evidence) for bounding the focus of the group, (2) initial development of a document of available tools and their associated validity evidence, and (3) identification of potential follow-up activities to continue the work to identify tools and develop related synthesis documents (e.g., the formation of sub-groups around potential topics of interest). The efforts of the group will be summarized and extended through both social media tools (e.g., creating a Facebook group) and online collaboration tools (e.g., Google Hangouts and documents) to further promote this work.
  5. Informal learning institutions, such as museums, science centers, and community-based organizations, play a critical role in providing opportunities for students to engage in science, technology, engineering, and mathematics (STEM) activities during out-of-school time hours. In recent years, thousands of studies, evaluations, and conference proceedings have been published measuring the impact that these programs have had on their participants. However, because studies of informal science education (ISE) programs vary considerably in how they are designed and in the quality of their designs, it is often quite difficult to assess their impact on participants. Knowing whether the outcomes reported by these studies are supported with sufficient evidence is important not only for maximizing participant impact, but also because there are considerable economic and human resources invested to support informal learning initiatives. To address this problem, I used the theories of impact analysis and triangulation as a framework for developing user-friendly rubrics for assessing quality of research designs and evidence of impact. I used two main sources, research-based recommendations from STEM governing bodies and feedback from a focus group, to identify criteria indicative of high-quality STEM research and study design. Accordingly, I developed three STEM Research Design Rubrics, one for quantitative studies, one for qualitative studies, and another for mixed methods studies, that can be used by ISE researchers, practitioners, and evaluators to assess research design quality. Likewise, I developed three STEM Impact Rubrics, one for quantitative studies, one for qualitative studies, and another for mixed methods studies, that can be used by ISE researchers, practitioners, and evaluators to assess evidence of outcomes. 
The rubrics developed in this study are practical tools that can be used by ISE researchers, practitioners, and evaluators to improve the field of informal science learning by increasing the quality of study design and by discerning whether studies or program evaluations are providing sufficient evidence of impact.