Title: A Bayesian nonparametric approach for handling item and examinee heterogeneity in assessment data
Abstract: We propose a novel nonparametric Bayesian item response theory model that estimates clusters at the question level, while simultaneously allowing for heterogeneity at the examinee level under each question cluster, characterized by a mixture of binomial distributions. The main contribution of this work is threefold. First, we present our new model and demonstrate that it is identifiable under a set of conditions. Second, we show that our model can correctly identify question‐level clusters asymptotically, and the parameters of interest that measure the proficiency of examinees in solving certain questions can be estimated at a rate (up to a log term). Third, we present a tractable sampling algorithm to obtain valid posterior samples from our proposed model. Compared with existing methods, our model parsimoniously reveals the multi‐dimensionality of examinees' proficiency in handling different types of questions by imposing a nested clustering structure. The proposed model is evaluated via a series of simulations and applied to an English proficiency assessment data set. This data analysis example illustrates how our model can be used by test makers to distinguish different types of students and aid in the design of future tests.
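The nested structure described in the abstract can be pictured with a small generative sketch. This is only an illustration under assumptions that are not taken from the paper: the number of question clusters and of examinee classes is fixed in advance (the paper's model is nonparametric and infers them), and the function name `simulate_nested_binomial_mixture` and all parameter values are hypothetical. Within each question cluster, every examinee is assigned a latent class, and the number of correct answers is drawn from a binomial distribution with that class's success probability.

```python
import numpy as np


def simulate_nested_binomial_mixture(n_examinees, question_cluster_sizes,
                                     class_weights, success_probs, seed=0):
    """Simulate per-cluster scores from a mixture of binomials (illustrative only).

    question_cluster_sizes[k]: number of questions in question cluster k.
    class_weights[k]: mixing weights of examinee classes under cluster k.
    success_probs[k]: per-question success probability for each class under cluster k.
    """
    rng = np.random.default_rng(seed)
    n_clusters = len(question_cluster_sizes)
    classes = np.empty((n_examinees, n_clusters), dtype=int)
    scores = np.empty((n_examinees, n_clusters), dtype=int)
    for k, n_items in enumerate(question_cluster_sizes):
        weights = np.asarray(class_weights[k], dtype=float)
        probs = np.asarray(success_probs[k], dtype=float)
        # Examinee heterogeneity: a separate latent class for each question cluster.
        classes[:, k] = rng.choice(len(weights), size=n_examinees, p=weights)
        # Number of correct answers on the n_items questions of cluster k.
        scores[:, k] = rng.binomial(n_items, probs[classes[:, k]])
    return classes, scores


# Hypothetical setup: two question clusters with 2 and 3 examinee classes.
classes, scores = simulate_nested_binomial_mixture(
    n_examinees=500,
    question_cluster_sizes=[10, 15],
    class_weights=[[0.6, 0.4], [0.3, 0.3, 0.4]],
    success_probs=[[0.3, 0.8], [0.2, 0.5, 0.9]],
)
print(scores[:5])
```

Because class membership is drawn separately for each question cluster, an examinee can be strong on one type of question and weak on another, which is the kind of multi-dimensional proficiency the abstract refers to.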
Award ID(s): 2412922; 2412923
PAR ID: 10499298
Publisher / Repository: Wiley
Journal Name: British Journal of Mathematical and Statistical Psychology
Volume: 77
Issue: 1
ISSN: 0007-1102
Page Range / eLocation ID: 196 to 211
Sponsoring Org: National Science Foundation
More Like this
  1. In computer‐based tests allowing revision and reviews, examinees' sequence of visits and answer changes to questions can be recorded. The variable‐length revision log data introduce new complexities to the collected data but, at the same time, provide additional information on examinees' test‐taking behavior, which can inform test development and instructions. In the current study, we used recently proposed statistical learning methods for sequence data to provide an exploratory analysis of item‐level revision and review log data. Based on the revision log data collected from computer‐based classroom assessments, common prototypes of revisit and review behavior were identified. The relationship between revision behavior and various item, test, and individual covariates was further explored under a Bayesian multivariate generalized linear mixed model.
  2. Question-asking is a crucial learning and teaching approach. It reveals different levels of students' understanding, application, and potential misconceptions. Previous studies have categorized question types into higher and lower orders, finding positive and significant associations between higher-order questions and students' critical thinking ability and their learning outcomes in different learning contexts. However, the diversity of higher-order questions, especially in collaborative learning environments, has left open the question of how they may differ from other types of dialogue that emerge from students' conversations. To address these questions, our study utilized natural language processing techniques to build a model and investigate the characteristics of students' higher-order questions. We interpreted these questions using Bloom's taxonomy, and our results reveal three types of higher-order questions during collaborative problem-solving. Students often use 'Why', 'How', and 'What If' questions to 1) understand the reason and thought process behind their partners' actions; 2) explore and analyze the project by pinpointing the problem; and 3) propose and evaluate ideas or alternative solutions. In addition, we found that dialogue labeled 'Social', 'Question - other', 'Directed at Agent', and 'Confusion/Help Seeking' shows similar underlying patterns to higher-order questions. Our findings provide insight into the different scenarios driving students' higher-order questions and inform the design of adaptive systems to deliver personalized feedback based on students' questions.
  3. Videos convey rich information. Dynamic spatio-temporal relationships between people/objects, and diverse multimodal events are present in a video clip. Hence, it is important to develop automated models that can accurately extract such information from videos. Answering questions on videos is one of the tasks which can evaluate such AI abilities. In this paper, we propose a video question answering model which effectively integrates multi-modal input sources and finds the temporally relevant information to answer questions. Specifically, we first employ dense image captions to help identify objects and their detailed salient regions and actions, and hence give the model useful extra information (in explicit textual format to allow easier matching) for answering questions. Moreover, our model also comprises dual-level attention (word/object and frame level), multi-head self/cross-integration for different sources (video and dense captions), and gates which pass more relevant information to the classifier. Finally, we also cast the frame selection problem as a multi-label classification task and introduce two loss functions, In-and-Out Frame Score Margin (IOFSM) and Balanced Binary Cross-Entropy (BBCE), to better supervise the model with human importance annotations. We evaluate our model on the challenging TVQA dataset, where each of our model components provides significant gains, and our overall model outperforms the state-of-the-art by a large margin (74.09% versus 70.52%). We also present several word, object, and frame level visualization studies.
  4. Current textual question answering (QA) models achieve strong performance on in-domain test sets, but often do so by fitting surface-level patterns, so they fail to generalize to out-of-distribution settings. To make a more robust and understandable QA system, we model question answering as an alignment problem. We decompose both the question and context into smaller units based on off-the-shelf semantic representations (here, semantic roles), and align the question to a subgraph of the context in order to find the answer. We formulate our model as a structured SVM, with alignment scores computed via BERT, and we can train end-to-end despite using beam search for approximate inference. Our use of explicit alignments allows us to explore a set of constraints with which we can prohibit certain types of bad model behavior arising in cross-domain settings. Furthermore, by investigating differences in scores across different potential answers, we can seek to understand what particular aspects of the input lead the model to choose the answer without relying on post-hoc explanation techniques. We train our model on SQuAD v1.1 and test it on several adversarial and out-of-domain datasets. The results show that our model is more robust than the standard BERT QA model, and constraints derived from alignment scores allow us to effectively trade off coverage and accuracy. 
  5. Directed graphs have been widely used in Community Question Answering services (CQAs) to model asymmetric relationships among different types of nodes in CQA graphs, e.g., question, answer, user. Asymmetric transitivity is an essential property of directed graphs, since it can play an important role in downstream graph inference and analysis. Question difficulty and user expertise follow the characteristic of asymmetric transitivity. Maintaining such properties, while reducing the graph to a lower dimensional vector embedding space, has been the focus of much recent research. In this paper, we tackle the challenge of directed graph embedding with asymmetric transitivity preservation and then leverage the proposed embedding method to solve a fundamental task in CQAs: how to appropriately route and assign newly posted questions to users with suitable expertise and interest in CQAs. The technique incorporates graph hierarchy and reachability information naturally by relying on a nonlinear transformation that operates on the core reachability and implicit hierarchy within such graphs. Subsequently, the methodology leverages a factorization-based approach to generate two embedding vectors for each node within the graph, to capture the asymmetric transitivity. Extensive experiments show that our framework consistently and significantly outperforms the state-of-the-art baselines on three diverse real-world tasks: link prediction, question difficulty estimation, and expert finding in online forums like Stack Exchange. In particular, our framework can support inductive embedding learning for newly posted questions (unseen nodes during training), and therefore can properly route and assign these kinds of questions to experts in CQAs.
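The idea in item 5 of giving each node two embedding vectors to capture asymmetric transitivity can be illustrated with a minimal sketch. It does not reproduce that paper's nonlinear transformation; it swaps in a standard construction, Katz proximity followed by a truncated SVD, and the function name `katz_embeddings` and the parameters `beta` and `dim` are assumptions made for illustration. The score of a directed pair u → v is the dot product of u's source vector with v's target vector, so the score of u → v can differ from that of v → u.

```python
import numpy as np


def katz_embeddings(adjacency, beta=0.1, dim=2):
    """Source/target embeddings from Katz proximity via truncated SVD (illustrative only)."""
    A = np.asarray(adjacency, dtype=float)
    n = A.shape[0]
    # Katz proximity: sum over path lengths l >= 1 of beta^l * A^l,
    # which equals (I - beta*A)^{-1} - I when beta < 1 / spectral_radius(A).
    S = np.linalg.inv(np.eye(n) - beta * A) - np.eye(n)
    U, sigma, Vt = np.linalg.svd(S)
    root = np.sqrt(sigma[:dim])
    source = U[:, :dim] * root      # role of a node as the start of a path
    target = Vt[:dim, :].T * root   # role of a node as the end of a path
    return source, target


# Directed chain 0 -> 1 -> 2: node 2 is reachable from node 0, but not the reverse.
chain = [[0, 1, 0],
         [0, 0, 1],
         [0, 0, 0]]
source, target = katz_embeddings(chain, beta=0.1, dim=2)
scores = source @ target.T
print(scores[0, 2], scores[2, 0])
```

In this chain the transitive pair 0 → 2 receives a positive score while the reverse pair 2 → 0 scores essentially zero, which a single symmetric embedding per node could not represent.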