Title: An Inquiry into the Use of Intercoder Reliability Measures in Qualitative Research
In this theory paper, we set out to consider, as a matter of methodological interest, the use of quantitative measures of inter-coder reliability (e.g., percentage agreement, correlation, Cohen’s Kappa, etc.) as necessary and/or sufficient correlates for quality within qualitative research in engineering education. It is well known that the phrase qualitative research represents a diverse body of scholarship conducted across a range of epistemological viewpoints and methodologies. Given this diversity, we concur with those who state that it is ill-advised to propose recipes or stipulate requirements for achieving qualitative research validity and reliability. Yet, as qualitative researchers ourselves, we repeatedly find the need to communicate the validity and reliability—or quality—of our work to different stakeholders, including funding agencies and the public. One method for demonstrating quality, which is increasingly used in qualitative research in engineering education, is the practice of reporting quantitative measures of agreement between two or more people who code the same qualitative dataset. In this theory paper, we address this common practice in two ways. First, we identify instances in which inter-coder reliability measures may not be appropriate or adequate for establishing quality in qualitative research. We query research that suggests that the numerical measure itself is the goal of qualitative analysis, rather than the depth and texture of the interpretations that are revealed. Second, we identify complexities or methodological questions that may arise during the process of establishing inter-coder reliability, which are not often addressed in empirical publications. To achieve these purposes, in this paper we will ground our work in a review of qualitative articles, published in the Journal of Engineering Education, that have employed inter-rater or inter-coder reliability as evidence of research validity. In our review, we will examine the disparate measures and scores (from 40% agreement to 97% agreement) used as evidence of quality, as well as the theoretical perspectives within which these measures have been employed. Then, using our own comparative case study research as an example, we will highlight the questions and the challenges that we faced as we worked to meet rigorous standards of evidence in our qualitative coding analysis. We will explain the processes we undertook and the challenges we faced as we assigned codes to a large qualitative data set approached from a post-positivist perspective. We will situate these coding processes within the larger methodological literature and, in light of contrasting literature, we will describe the principled decisions we made while coding our own data. We will use this review of qualitative research and our own qualitative research experiences to elucidate inconsistencies and unarticulated issues related to evidence for qualitative validity, as a means to generate further discussion regarding quality in qualitative coding processes.
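As a concrete illustration of the measures named above, the following is a minimal Python sketch of the two most commonly reported statistics, percentage agreement and Cohen's Kappa, computed for two coders who have coded the same ten segments. The coder labels and data are hypothetical, not drawn from the study; the point is only that Kappa discounts the raw agreement figure for the agreement expected by chance.

```python
from collections import Counter

def percentage_agreement(codes_a, codes_b):
    """Fraction of segments to which both coders assigned the same code."""
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)

def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e), i.e., observed agreement
    corrected for the agreement expected by chance."""
    n = len(codes_a)
    p_o = percentage_agreement(codes_a, codes_b)
    # Chance agreement computed from each coder's marginal code frequencies.
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    p_e = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical codes assigned by two coders to the same ten segments.
coder_1 = ["A", "A", "B", "B", "C", "A", "B", "C", "C", "A"]
coder_2 = ["A", "B", "B", "B", "C", "A", "A", "C", "C", "A"]

print(f"percentage agreement: {percentage_agreement(coder_1, coder_2):.2f}")  # 0.80
print(f"Cohen's kappa:        {cohens_kappa(coder_1, coder_2):.2f}")          # ~0.70
```

The gap between the two numbers here (0.80 versus roughly 0.70) also illustrates why raw agreement percentages reported across different studies are hard to compare directly.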
Award ID(s):
1664228
NSF-PAR ID:
10089476
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
ASEE Annual Conference proceedings
ISSN:
1524-4644
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Who ensures, and by what means, that engineering education evolves to meet the ever-changing needs of our society? This and other papers presented by our research team at this conference offer our initial set of findings from an NSF-sponsored collaborative study on engineering education reform. Organized around the notion of higher education governance and the practice of educational reform, our open-ended study is based on conducting semi-structured interviews at over three dozen universities and engineering professional societies and organizations, along with a handful of scholars engaged in engineering education research. Organized as a multi-site, multi-scale study, our goal is to document differences in perspectives and interests that exist across organizational levels and institutions, and to describe the coordination that occurs (or fails to occur) in engineering education given the distributed structure of the engineering profession. This paper offers for all engineering educators and administrators a qualitative and retrospective analysis of ABET EC 2000 and its implementation. The paper opens with a historical background on the Engineers Council for Professional Development (ECPD) and engineering accreditation; the rise of quantitative standards during the 1950s as a result of the push to implement an engineering science curriculum appropriate to the Cold War era; EC 2000 and its call for greater emphasis on professional skill sets amidst concerns about US manufacturing productivity and national competitiveness; the development of outcomes assessment and its implementation; and the successive negotiations about assessment practice and the training of both program evaluators and assessment coordinators for the degree programs undergoing evaluation. It was these negotiations and the evolving practice of assessment that resulted in the latest set of changes in ABET engineering accreditation criteria (“1-7” versus “a-k”). To provide insight into the origins of EC 2000, the paper describes the “Gang of Six,” a group of individuals loyal to ABET who used the pressure exerted by external organizations, along with a shared rhetoric of national competitiveness, to forge a common vision organized around the expanded emphasis on professional skill sets. It was also significant that the Gang of Six was aware that the regional accreditation agencies were already contemplating a shift towards outcomes assessment; several also had a background in industrial engineering. However, this resulted in an assessment protocol for EC 2000 that remained ambiguous about whether the stated learning outcomes (Criterion 3) were something faculty had to demonstrate for all of their students, or whether EC 2000’s main emphasis was continuous improvement. When it proved difficult to demonstrate learning outcomes on the part of all students, ABET itself began to place greater emphasis on total quality management and continuous process improvement (TQM/CPI). This gave institutions an opening to begin using increasingly limited and proximate measures for the “a-k” student outcomes as evidence of effort and improvement. In what would be described in social scientific terms as “tactical” resistance to perceived oppressive structures, this enabled ABET coordinators and the faculty in charge of degree programs, many of whom had their own internal improvement processes, to begin referring to the a-k criteria as “difficult to achieve” and “ambiguous,” which they sometimes were.
Inconsistencies in evaluation outcomes enabled those most discontented with the a-k student outcomes to use ABET’s own organizational processes to drive the latest revisions to EAC accreditation criteria, although the organization’s own process for member and stakeholder input ultimately restored much of the professional skill sets found in the original EC 2000 criteria. Other refinements were also made to the standard, including a new emphasis on diversity. This said, many within our interview population believe that EC 2000 had already achieved many of the changes it set out to achieve, especially with regard to broader professional skills such as communication, teamwork, and design. Regular faculty review of curricula is now also a more routine part of the engineering education landscape. While programs vary in their engagement with ABET, many are skeptical about whether the new criteria will produce further improvements to their programs, with many arguing that their own internal processes are now the primary drivers for change.
  2. Research prior to 2005 found that no single framework existed that could capture the engineering design process fully or well and benchmark each element of the process to a commonly accepted set of referenced artifacts. Compounding the construction of a stepwise, artifact-driven framework is the fact that engineering design is typically practiced over time as a complex and iterative process. For both novice and advanced students, learning and applying the design process is often cumulative, with many informal and formal programmatic opportunities to practice essential elements. The Engineering Design Process Portfolio Scoring Rubric (EDPPSR) was designed to apply to any portfolio that is intended to document an individual- or team-driven process leading to an original attempt to design a product, process, or method to provide an optimal solution to a genuine and meaningful problem. In essence, the portfolio should be a detailed account or “biography” of a project and the thought processes that inform that project. Besides narrative and explanatory text, entries may include (but need not be limited to) drawings, schematics, photographs, notebook and journal entries, transcripts or summaries of conversations and interviews, and audio/video recordings. Such entries are likely to be necessary in order to convey accurately and completely the complex thought processes behind the planning, implementation, and self-evaluation of the project. The rubric comprises four main components, each in turn comprising three elements. Each element has its own holistic rubric. The process by which the EDPPSR was created gives evidence of the relevance and representativeness of the rubric and helps to establish validity. The EDPPSR model as originally rendered has a strong theoretical foundation, having been developed by reference to the literature on the steps of the design process, through focus groups, and through expert review by teachers, faculty, and researchers in performance-based portfolio rubrics and assessments. Using the unified construct validity framework, the EDPPSR’s validity was further established through expert reviewers (experts in engineering design) providing evidence supporting the content relevance and representativeness of the EDPPSR in representing the basic process of engineering design. This manuscript offers empirical evidence that supports the use of the EDPPSR model to evaluate student design-based projects in a reliable and valid manner. Intra-class correlation coefficients (ICC) were calculated to determine the inter-rater reliability (IRR) of the rubric. Given the small sample size, we also examined 95% confidence intervals to provide a range of values in which the estimate of inter-rater reliability is likely contained.
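The abstract above does not specify which ICC form was computed. As an illustration only, the sketch below implements one common choice, the one-way random-effects ICC(1,1) of Shrout and Fleiss (1979), together with its F-distribution-based 95% confidence interval; the portfolio scores are hypothetical. With a small sample the interval is wide, which is precisely the issue the authors address by reporting confidence intervals alongside the point estimate.

```python
import numpy as np
from scipy import stats

def icc_1_with_ci(ratings, alpha=0.05):
    """One-way random-effects ICC(1,1) with an F-based confidence interval
    (Shrout & Fleiss, 1979). `ratings` is an (n subjects x k raters) array."""
    ratings = np.asarray(ratings, dtype=float)
    n, k = ratings.shape
    grand_mean = ratings.mean()
    subject_means = ratings.mean(axis=1)
    # Between-subjects and within-subjects mean squares from one-way ANOVA.
    ms_between = k * np.sum((subject_means - grand_mean) ** 2) / (n - 1)
    ms_within = np.sum((ratings - subject_means[:, None]) ** 2) / (n * (k - 1))
    icc = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    # Confidence interval via the F distribution.
    f_obs = ms_between / ms_within
    f_lower = f_obs / stats.f.ppf(1 - alpha / 2, n - 1, n * (k - 1))
    f_upper = f_obs * stats.f.ppf(1 - alpha / 2, n * (k - 1), n - 1)
    lower = (f_lower - 1) / (f_lower + k - 1)
    upper = (f_upper - 1) / (f_upper + k - 1)
    return icc, (lower, upper)

# Hypothetical rubric scores: 6 portfolios scored 1-5 by 3 raters.
scores = [[4, 4, 5], [2, 3, 2], [5, 5, 4], [3, 3, 3], [1, 2, 1], [4, 5, 5]]
icc, (lo, hi) = icc_1_with_ci(scores)
print(f"ICC(1,1) = {icc:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```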
  3. Abstract Background

    The use of metacognition is critical to learning, especially in fields such as engineering that involve problem‐solving and difficult conceptual material. Due to limitations with current methodological approaches, new methods are needed to investigate engineering students' metacognitive engagement in learning situations that are self‐directed, such as study groups.

    Purpose

    Our purpose was to develop an approach to investigate the metacognitive engagement of undergraduate engineering students in self‐directed learning environments. The Naturalistic Observations of Metacognition in Engineering (NOME) Observational Protocol and Coding Strategy is a qualitative data collection method that allows researchers to observe the behaviors of students who are studying in groups to determine the students' engagement in different metacognitive practices. The NOME is intended to be used by researchers interested in studying online metacognitive behaviors without the direct interference of a methodological approach.

    Design/Method

    We observed three study groups where students were working on an engineering problem‐solving homework assignment. Using a taxonomic definition of metacognition, we coded episodes of observation transcripts to identify behaviors that represented key definitions in the taxonomy.

    Results

    We combined subcodes and descriptions of behaviors with key definitions to develop a coding strategy useful for future observational studies. Evidence of intercoder agreement and agreement in unitizing indicates that the coding strategy can reliably be used by multiple trained coders to identify metacognitive engagement.

    Conclusions

    The reliability evidence shows that the NOME may be a useful tool for researchers in engineering education interested in studying the metacognitive habits of engineering students in self‐directed study.
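The statistic used for agreement in unitizing is not named in the abstract above. One simple classical index for unitizing reliability is Guetzkow's U, which compares only the number of units two coders carve out of the same material; the sketch below is illustrative only (hypothetical counts, with the absolute value taken so that 0 indicates identical unitizing), not necessarily the index the NOME study used.

```python
def guetzkow_u(n_units_a, n_units_b):
    """Guetzkow's U: relative discrepancy in the number of units two coders
    identify in the same material; 0 means they unitized identically."""
    return abs(n_units_a - n_units_b) / (n_units_a + n_units_b)

# Hypothetical: coder A segments an episode into 48 units, coder B into 52.
print(f"U = {guetzkow_u(48, 52):.2f}")  # 0.04 -> very similar unitizing
```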

     
  4. We outline a process for using large coder teams (10+ coders) to code large-scale qualitative data sets. The process reflects our experience recruiting and managing large teams of novice and trainee coders for 18 projects in the last decade, each engaging a coding team of 12 (minimum) to 54 (maximum) coders. We identify four challenges unique to large coder teams that are not presently discussed in the methodological literature: (1) recruiting and training coders, (2) providing coder compensation and incentives, (3) maintaining data quality and ensuring coding reliability at scale, and (4) building team cohesion and morale. For each challenge, we provide associated guidance. We conclude with a discussion of the advantages and disadvantages of large coder teams for qualitative research and provide notes of caution for anyone considering hiring and/or managing large coder teams for research (whether in academia, government and non-profit sectors, or industry).
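On challenge (3), one measurement detail is worth making concrete: with 10+ coders, pairwise statistics such as Cohen's Kappa multiply quickly, so a multi-rater index is usually reported instead. The sketch below computes Fleiss' kappa, a standard generalization for a fixed number of raters per item; the counts are hypothetical and this is not necessarily the procedure the authors used.

```python
import numpy as np

def fleiss_kappa(counts):
    """Fleiss' kappa for a fixed number of raters per item.
    `counts` is an (n items x c categories) matrix where counts[i, j] is the
    number of raters who assigned item i to category j."""
    counts = np.asarray(counts, dtype=float)
    n_items, _ = counts.shape
    k = counts[0].sum()  # raters per item (must be constant across items)
    # Per-item observed agreement, then its average across items.
    p_i = (np.sum(counts ** 2, axis=1) - k) / (k * (k - 1))
    p_bar = p_i.mean()
    # Chance agreement from the overall category proportions.
    p_j = counts.sum(axis=0) / (n_items * k)
    p_e = np.sum(p_j ** 2)
    return (p_bar - p_e) / (1 - p_e)

# Hypothetical: 4 segments, each coded by 12 coders into 3 candidate codes.
counts = [[10, 1, 1],
          [4, 6, 2],
          [0, 12, 0],
          [3, 3, 6]]
print(f"Fleiss' kappa = {fleiss_kappa(counts):.2f}")  # ~0.34
```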
  5. Abstract Background

    The field of engineering education research is adopting an increasingly diverse range of qualitative methods. These developments necessitate a coherent language and conceptual framework to critically engage with questions of qualitative research quality.

    Purpose/Hypothesis

    This article advances discussions of qualitative research quality through sharing and analyzing a methodologically diverse, practice‐based exploration of research quality in the context of five engineering education research studies.

    Design/Method

    As a group of seven engineering education researchers, we drew on the collaborative inquiry method to systematically examine questions of qualitative research quality in our everyday research practice. We used a process‐based, theoretical framework for research quality as the anchor for these explorations.

    Results

    We constructed five practice explorations spanning grounded theory, interpretative phenomenological analysis, and various forms of narrative inquiry. Examining the individual contributions as a whole yielded four key insights: quality challenges require examination from multiple theoretical lenses; questions of research quality are implicitly infused in research practice; research quality extends beyond the objects, procedures, and products of research to concern the human context and local research setting; and research quality lies at the heart of introducing novices to interpretive research.

    Conclusions

    This study demonstrates the potential and further need for the engineering education community to advance methodological theory through purposeful and reflective engagement in research practice across the diverse methodological approaches currently being adopted.

     