Title: Practices in instrument use and development in chemistry education research and practice 2010–2021
Assessment instruments that generate quantitative data on attributes (cognitive, affective, behavioral, etc.) of participants are commonly used in the chemistry education community to draw conclusions in research studies or inform practice. Recently, articles and editorials have stressed the importance of providing evidence for the validity and reliability of data collected with these instruments, following guidance from the Standards for Educational and Psychological Testing. This study examines how quantitative instruments have been used in the journal Chemistry Education Research and Practice (CERP) from 2010–2021. Of the 369 unique researcher-developed instruments used during this time frame, the majority appeared in only a single publication (89.7%) and were rarely reused. Cognitive topics were the most common target of the instruments (56.6%). Validity and/or reliability evidence was provided in 64.4% of instances where instruments were used in CERP publications. The most frequently reported evidence was single-administration reliability (e.g., coefficient alpha), appearing in 47.9% of instances. Only 37.2% of instances reported evidence of both validity and reliability. These results indicate that, as a field, opportunities exist to increase the amount of validity and reliability evidence available for data collected with instruments, and that reusing instruments may be one method of increasing this type of data-quality evidence for instruments used by the chemistry education community.
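The most frequently reported evidence type above, single-administration reliability via coefficient (Cronbach's) alpha, can be computed directly from a respondents-by-items score matrix. The following is a minimal illustrative sketch, not code from the study; the function name and response data are invented for demonstration.

```python
import numpy as np

def coefficient_alpha(scores: np.ndarray) -> float:
    """Coefficient (Cronbach's) alpha for a respondents-by-items score matrix.

    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)
    """
    k = scores.shape[1]                              # number of items
    item_variances = scores.var(axis=0, ddof=1)      # sample variance per item
    total_variance = scores.sum(axis=1).var(ddof=1)  # variance of summed scores
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical responses: 5 participants answering a 4-item Likert instrument.
responses = np.array([
    [4, 5, 4, 5],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [2, 3, 2, 3],
    [4, 4, 5, 4],
])
print(f"coefficient alpha = {coefficient_alpha(responses):.3f}")
```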
Award ID(s):
1914996
PAR ID:
10566301
Publisher / Repository:
Chemistry Education Research and Practice
Journal Name:
Chemistry Education Research and Practice
Volume:
24
Issue:
3
ISSN:
1109-4028
Page Range / eLocation ID:
882 to 895
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
1. Although the paradigm wars between quantitative and qualitative research methods and their associated epistemologies may have settled down in recent years within the mathematics education research community, quantitative methods and randomized controlled trials remain the gold standard at the policy-making level (USDOE, 2008). Although diverse methods are valued in the mathematics education community, if mathematics educators hope to influence policy to cultivate more equitable education systems, then we must engage in rigorous quantitative research. However, quantitative research is limited in what it can measure by the quantitative tools that exist. In mathematics education, the development of quantitative tools, and the study of their associated validity and reliability evidence, seems to have lagged behind the important constructs that rich qualitative research has uncovered. The purpose of this study is to describe quantitative instruments related to mathematics teacher behavior and affect in order to better understand what currently exists in the field, what validity and reliability evidence has been published for such instruments, and what constructs each measures. The study addresses four questions: 1. How many and what types of instruments of mathematics teacher behavior and affect exist? 2. What types of validity and reliability evidence are published for these instruments? 3. What constructs do these instruments measure? 4. To what extent have issues of equity been the focus of the instruments found?
2. In this theory paper, we set out to consider, as a matter of methodological interest, the use of quantitative measures of inter-coder reliability (e.g., percentage agreement, correlation, Cohen's kappa, etc.) as necessary and/or sufficient correlates for quality within qualitative research in engineering education. It is well known that the phrase qualitative research represents a diverse body of scholarship conducted across a range of epistemological viewpoints and methodologies. Given this diversity, we concur with those who state that it is ill-advised to propose recipes or stipulate requirements for achieving qualitative research validity and reliability. Yet, as qualitative researchers ourselves, we repeatedly find the need to communicate the validity and reliability, or quality, of our work to different stakeholders, including funding agencies and the public. One method for demonstrating quality, which is increasingly used in qualitative research in engineering education, is the practice of reporting quantitative measures of agreement between two or more people who code the same qualitative dataset. In this theory paper, we address this common practice in two ways. First, we identify instances in which inter-coder reliability measures may not be appropriate or adequate for establishing quality in qualitative research. We query research that treats the numerical measure itself as the goal of qualitative analysis, rather than the depth and texture of the interpretations that are revealed. Second, we identify complexities and methodological questions that may arise during the process of establishing inter-coder reliability, which are not often addressed in empirical publications. To achieve these purposes, in this paper we will ground our work in a review of qualitative articles, published in the Journal of Engineering Education, that have employed inter-rater or inter-coder reliability as evidence of research validity. In our review, we will examine the disparate measures and scores (from 40% agreement to 97% agreement) used as evidence of quality, as well as the theoretical perspectives within which these measures have been employed. Then, using our own comparative case study research as an example, we will highlight the questions and the challenges we faced as we worked to meet rigorous standards of evidence in our qualitative coding analysis. We will explain the processes we undertook and the challenges we faced as we assigned codes to a large qualitative data set approached from a post-positivist perspective. We will situate these coding processes within the larger methodological literature and, in light of contrasting literature, we will describe the principled decisions we made while coding our own data. We will use this review of qualitative research and our own qualitative research experiences to elucidate inconsistencies and unarticulated issues related to evidence for qualitative validity, as a means to generate further discussion regarding quality in qualitative coding processes.
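Because this abstract turns on percentage agreement and Cohen's kappa, a short self-contained sketch of how those two inter-coder reliability measures are computed may help; the coder labels below are hypothetical and do not come from the reviewed articles.

```python
from collections import Counter

def percent_agreement(codes_a, codes_b):
    """Fraction of units the two coders labeled identically."""
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)

def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa: agreement corrected for chance, (p_o - p_e) / (1 - p_e)."""
    n = len(codes_a)
    p_o = percent_agreement(codes_a, codes_b)  # observed agreement
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    # Expected chance agreement from each coder's marginal code frequencies.
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical codes assigned by two coders to ten interview excerpts.
coder_1 = ["theme_a", "theme_a", "theme_b", "theme_c", "theme_b",
           "theme_a", "theme_c", "theme_b", "theme_a", "theme_b"]
coder_2 = ["theme_a", "theme_b", "theme_b", "theme_c", "theme_b",
           "theme_a", "theme_a", "theme_b", "theme_a", "theme_c"]
print(f"percent agreement = {percent_agreement(coder_1, coder_2):.2f}")
print(f"Cohen's kappa     = {cohens_kappa(coder_1, coder_2):.2f}")
```

One reason the paper can query these numbers as quality evidence is visible even in this toy case: the same pair of coders scores 0.70 on raw agreement but only about 0.53 on kappa, so which threshold counts as "rigorous" depends entirely on the measure chosen.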
3. In contrast to traditional views of instructional design that are often focused on content development, researchers are increasingly exploring learning experience design (LXD) perspectives as a way to espouse a broader and more holistic view of learning. In addition to cognitive and affective perspectives, LXD includes perspectives on human–computer interaction, encompassing usability and other interactions (i.e., goal-directed user behavior). However, there is very little consensus about the quantitative instruments and surveys used to assess how learners interact with technology. This systematic review explored 627 usability studies in learning technology over the last decade in terms of the instruments used (RQ1), the domains studied (RQ2), and the number of users involved (RQ3). Findings suggest that many usability studies rely on self-created instruments, which raises questions about reliability and validity. Findings also suggest that usability studies are largely focused within the medical and STEM domains, with very little attention to educators' perspectives (pre-service and in-service teachers). Implications for theory and practice are discussed.
4. National Science Foundation (NSF) funded Engineering Research Centers (ERCs) must complement their technical research with education and outreach activities that: 1) improve and promote engineering education, both within the center and in the local community; 2) encourage underrepresented populations to participate in engineering activities; and 3) advance communication and collaboration between industry and academia. ERCs ought to perform adequate evaluation of their education and outreach programs to ensure that these goals are met. Each ERC has complete autonomy in conducting and reporting such evaluation. The evaluation tools used by individual ERCs are quite similar, but each ERC has designed its evaluation processes in isolation, including tools such as survey instruments, interview protocols, focus group protocols, and observation protocols. These isolated efforts have resulted in redundant expenditure of resources and a lack of outcome comparability across ERCs. Leaders from three ERCs initiated a collaborative effort to address this issue by building a suite of common evaluation instruments that all current and future ERCs can use. The leading group, which has worked together for two years, consists of education directors and external evaluators from all three partner ERCs along with engineering education researchers. The project addresses the four ERC program clusters: Broadening Participation in Engineering, Centers and Networks, Engineering Education, and Engineering Workforce Development. The instruments developed will attend to culture of inclusion, outreach activities, mentoring experience, and sustained interest in engineering. The project will deliver best practices in education program evaluation, which will not only support existing ERCs but will also serve as immediate tools for brand-new ERCs and similar large-scale research centers. Expanding the research beyond TEEC and sharing the developed instruments with NSF as well as other ERCs will also promote and encourage continual cross-ERC collaboration and research. Further, joint evaluation will increase evaluation consistency across all ERC education programs, and embedded instrument feedback loops will lead to continual improvement of ERC education performance and support the growth of an inclusive and innovative engineering workforce. Four major deliverables are planned: first, a common quantitative assessment instrument, named the Multi-ERC Instrument Inventory (MERCII); second, a set of qualitative instruments to complement MERCII; third, a web-based evaluation platform for MERCII; and fourth, an updated NSF ERC education program evaluation best-practice manual. Together, these deliverables will become part of, and be supplemented by, an ERC evaluator toolbox. This project strives to significantly change how ERCs evaluate their education and outreach programs. Studies based on a single ERC lack the sample size to truly test the validity of any evaluation instruments or measures; a common suite of instruments across ERCs would provide the opportunity for a large-scale assessment study. The online platform will further provide an easy-to-use tool for all ERCs to facilitate evaluation, share data, and report impacts.
  5. Vollstedt, M. (Ed.)
Can fostering mathematical creativity explicitly in a Calculus I course impact students' mathematical identity? As part of a larger research project exploring this question, a quantitative study was developed to explore six aspects of student mathematical identity along with student perceptions of creativity-fostering instructor behavior. Analysis of pre- and post-semester survey data indicated that the instruments measuring aspects of student identity had strong reliability and good structural validity. Correlational analysis of the six aspects of student identity provided evidence that students' views of mathematics as a creative endeavor influenced the formation of self-efficacy in mathematics. The instrument measuring creativity-fostering instruction demonstrated low reliability and internal inconsistencies. Methodological issues related to measuring creativity-fostering instruction and directions for future research on creativity-fostering instruction and student identity are discussed.
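As a rough illustration of the kind of correlational analysis described here, the sketch below computes a Pearson correlation matrix across six subscale scores; the subscale names and simulated responses are invented placeholders, not the study's actual instrument or data.

```python
import numpy as np

# Placeholder names for six identity aspects (not the study's actual subscales).
aspects = ["self_efficacy", "interest", "creativity_view",
           "belonging", "persistence", "value"]

# Simulated subscale means for 120 students; a real analysis would load
# the survey data instead of generating it.
rng = np.random.default_rng(0)
scores = rng.normal(loc=3.5, scale=0.6, size=(120, 6))

# 6x6 Pearson correlation matrix; each column is one identity aspect.
corr = np.corrcoef(scores, rowvar=False)
for name, row in zip(aspects, corr):
    print(name.ljust(16), np.round(row, 2))
```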