

Title: Argumentation Surrounding Argument‐Based Validation: A Systematic Review of Validation Methodology in Peer‐Reviewed Articles
Abstract

Since it was formalized by Kane, the argument‐based approach to validation has been promoted as the preferred method for validating interpretations and uses of test scores. Because validation is discussed in terms of arguments, and arguments are both interactive and social, the present review systematically examines the scholarly arguments that appear in 83 papers on argument‐based validation methods published in peer‐reviewed journals. Findings suggest that scholars generally agree on the nature and importance of argument‐based validation but disagree on whether validation should be structured or unstructured, formal or informal. Implications are discussed, including promotion of the Standards for Educational and Psychological Testing (AERA, APA, and NCME) as a foundation for consensus in the field.

 
NSF-PAR ID:
10378874
Publisher / Repository:
Wiley-Blackwell
Journal Name:
Educational Measurement: Issues and Practice
Volume:
39
Issue:
4
ISSN:
0731-1745
Page Range / eLocation ID:
p. 116-130
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. As early as Descartes (1637/1970), logic and reason have been positioned as tools for individuals to advance their own understanding. By contrast, argumentation is an interactive, social exercise used for persuasion, collective cognition, and to advance shared knowledge (Mercier & Sperber, 2011, 2017). When one advances an argument, subjects it to the tests and challenges of others, and responds to questions and counterarguments, one’s thinking improves (Mercier & Sperber, 2017). Through argumentation, groups produce correct solutions more often than individuals (Moshman & Geil, 1998) and individual accuracy improves as well (Castelain, Girotto, Jamet, & Mercier, 2016). Since it was formally introduced by Kane (1990, 1992), the argument-based approach to validation has been promoted in the field of educational and psychological measurement as the preferred method for validating interpretations and uses of test scores (AERA, APA, & NCME, 2014; Kane, 2013; Schilling & Hill, 2007). Scholars continue to debate the best approaches for developing and supporting validity arguments, however (for examples, see Brennan, 2013; Kane, 2007). 
  2. This paper provides a brief introduction to the set of four manuscripts in the special issue. To provide a foundation for the issue, key terms are defined, a brief historical overview of validity is provided, and several different validation approaches used in the issue are described. Finally, the contribution of the manuscripts to further articulating argument-based validation approaches is discussed, along with questions for the field to consider.
  3. Abstract

    The ability to analyze arguments is critical for higher-level reasoning, yet previous research suggests that standard university education provides only modest improvements in students’ analytical-reasoning abilities. What pedagogical approaches are most effective for cultivating these skills? We investigated the effectiveness of a 12-week undergraduate seminar in which students practiced a software-based technique for visualizing the logical structures implicit in argumentative texts. Seminar students met weekly to analyze excerpts from contemporary analytic philosophy papers, completed argument visualization problem sets, and received individualized feedback on a weekly basis. We found that seminar students improved substantially more on LSAT Logical Reasoning test forms than did control students (d = 0.71, 95% CI: [0.37, 1.04], p < 0.001), suggesting that learning how to visualize arguments in the seminar led to large generalized improvements in students’ analytical-reasoning skills. Moreover, blind scoring of final essays from seminar students and control students, drawn from a parallel lecture course, revealed large differences in favor of seminar students (d = 0.87, 95% CI: [0.26, 1.48], p = 0.005). Seminar students understood the arguments better, and their essays were more accurate and effectively structured. Taken together, these findings deepen our understanding of how visualizations support logical reasoning and provide a model for improving analytical-reasoning pedagogy.

     
  4. Abstract

    We study the performance of Markov chains for the $q$-state ferromagnetic Potts model on random regular graphs. While the cases of the grid and the complete graph are by now well-understood, the case of random regular graphs has resisted a detailed analysis and, in fact, even analysing the properties of the Potts distribution has remained elusive. It is conjectured that the performance of Markov chains is dictated by metastability phenomena, i.e., the presence of “phases” (clusters) in the sample space where Markov chains with local update rules, such as the Glauber dynamics, are bound to take exponential time to escape, and therefore cause slow mixing. The phases that are believed to drive these metastability phenomena in the case of the Potts model emerge as local, rather than global, maxima of the so-called Bethe functional, and previous approaches of analysing these phases based on optimisation arguments fall short of the task. Our first contribution is to detail the emergence of the two relevant phases for the $q$-state Potts model on the $d$-regular random graph for all integers $q,d\ge 3$, and establish that for an interval of temperatures, delineated by the uniqueness and a broadcasting threshold on the $d$-regular tree, the two phases coexist (as possible metastable states). The proofs are based on a conceptual connection between spatial properties and the structure of the Potts distribution on the random regular graph, rather than complicated moment calculations. This significantly refines earlier results by Helmuth, Jenssen, and Perkins who had established phase coexistence for a small interval around the so-called ordered-disordered threshold (via different arguments) that applied for large $q$ and $d\ge 5$. Based on our new structural understanding of the model, our second contribution is to obtain metastability results for two classical Markov chains for the Potts model. We first complement recent fast mixing results for Glauber dynamics by Blanca and Gheissari below the uniqueness threshold, by showing an exponential lower bound on the mixing time above the uniqueness threshold. Then, we obtain tight results even for the non-local and more elaborate Swendsen–Wang chain, where we establish slow mixing/metastability for the whole interval of temperatures where the chain is conjectured to mix slowly on the random regular graph. The key is to bound the conductance of the chains using a random graph “planting” argument combined with delicate bounds on random-graph percolation.

     
  5. Abstract

    ¹³C‐Metabolic Flux Analysis (¹³C‐MFA) and Flux Balance Analysis (FBA) are widely used to investigate the operation of biochemical networks in both biological and biotechnological research. Both methods use metabolic reaction network models of metabolism operating at steady state so that reaction rates (fluxes) and the levels of metabolic intermediates are constrained to be invariant. They provide estimated (MFA) or predicted (FBA) values of the fluxes through the network in vivo, which cannot be measured directly. These fluxes can shed light on basic biology and have been successfully used to inform metabolic engineering strategies. Several approaches have been taken to test the reliability of estimates and predictions from constraint‐based methods and to compare alternative model architectures. Despite advances in other areas of the statistical evaluation of metabolic models, such as the quantification of flux estimate uncertainty, validation and model selection methods have been underappreciated and underexplored. We review the history and state‐of‐the‐art in constraint‐based metabolic model validation and model selection. Applications and limitations of the χ²‐test of goodness‐of‐fit, the most widely used quantitative validation and selection approach in ¹³C‐MFA, are discussed, and complementary and alternative forms of validation and selection are proposed. A combined model validation and selection framework for ¹³C‐MFA incorporating metabolite pool size information that leverages new developments in the field is presented and advocated for. Finally, we discuss how adopting robust validation and selection procedures can enhance confidence in constraint‐based modeling as a whole and ultimately facilitate more widespread use of FBA in biotechnology.

     