This content will become publicly available on July 1, 2024

Title: Automated Search for Logistic Knowledge Tracing Models
This paper presents a tool for automatically constructing logistic regression student models. Such models have typically been built by expert selection of the appropriate terms, beginning with models as simple as IRT or AFM and, more recently, extending to highly complex models like BestLR. While alternative methods exist for selecting predictors in regression-based models (e.g., stepwise selection or LASSO), we are unaware of their application to student modeling. Such automatic methods of model creation offer the possibility of better student models with either reduced complexity or better fit, and they relieve experts of the burden, and the risk of human error, of searching for better models by hand. Our tool builds on the existing R package LKT. We explain our search methods using two datasets, demonstrating the advantages of stepwise regression and regularization (LASSO) for feature selection. For the stepwise method using BIC, the resulting models are simpler than alternatives like BestLR (due to the BIC penalty on the number of parameters), with little loss of fit. For the LASSO method, the models can be made simpler because the fitting procedure includes a regularization parameter that penalizes large absolute coefficient values. However, LASSO also offers the possibility of highly complex models with exceptional fit.
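The BIC trade-off described in the abstract can be illustrated with a minimal Python sketch. This is not the LKT package's R interface, and the model names, log-likelihoods, parameter counts, and observation count below are made-up illustrative numbers, not results from the paper. It only uses the standard definition BIC = k ln(n) - 2 ln(L), where lower values are preferred.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical candidate models (illustrative numbers only).
candidates = {
    "simple": {"log_likelihood": -5200.0, "n_params": 6},
    "complex": {"log_likelihood": -5180.0, "n_params": 40},
}
n_obs = 10000  # assumed number of observations

# Score every candidate and keep the one with the lowest BIC.
scores = {
    name: bic(m["log_likelihood"], m["n_params"], n_obs)
    for name, m in candidates.items()
}
best = min(scores, key=scores.get)
```

In this toy comparison the complex model fits slightly better, but its 34 extra parameters cost more under the ln(n) penalty than the fit improvement gains, so the simpler model is selected. A stepwise search would repeat this comparison each time a term is added or removed.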
Award ID(s):
1934745
NSF-PAR ID:
10447581
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings of the 16th International Conference on Educational Data Mining, 17–27
Sponsoring Org:
National Science Foundation
More Like this
  1. Summary

    Modern technologies are producing a wealth of data with complex structures. For instance, in two-dimensional digital imaging, flow cytometry and electroencephalography, matrix-type covariates frequently arise when measurements are obtained for each combination of two underlying variables. To address scientific questions arising from those data, new regression methods that take matrices as covariates are needed, and sparsity or other forms of regularization are crucial owing to the ultrahigh dimensionality and complex structure of the matrix data. The popular lasso and related regularization methods hinge on the sparsity of the true signal in terms of the number of its non-zero coefficients. However, for matrix data, the true signal often has, or can be well approximated by, a low rank structure. As such, the sparsity is frequently in the form of low rank of the matrix parameters, which may seriously violate the assumption of the classical lasso. We propose a class of regularized matrix regression methods based on spectral regularization. A highly efficient and scalable estimation algorithm is developed, and a degrees-of-freedom formula is derived to facilitate model selection along the regularization path. Superior performance of the proposed method is demonstrated on both synthetic and real examples.

     
  2. Integrating regularization methods with standard loss functions such as the least squares, hinge loss, etc., within a regression framework has become a popular choice for researchers to learn predictive models with lower variance and better generalization ability. Regularizers also aid in building interpretable models with high-dimensional data which makes them very appealing. It is observed that each regularizer is uniquely formulated in order to capture data-specific properties such as correlation, structured sparsity and temporal smoothness. The problem of obtaining a consensus among such diverse regularizers while learning a predictive model is extremely important in order to determine the optimal regularizer for the problem. The advantage of such an approach is that it preserves the simplicity of the final model learned by selecting a single candidate model which is not the case with ensemble methods as they use multiple candidate models for prediction. This is called the consensus regularization problem which has not received much attention in the literature due to the inherent difficulty associated with learning and selecting a model from an integrated regularization framework. To solve this problem, in this paper, we propose a method to generate a committee of non-convex regularized linear regression models, and use a consensus criterion to determine the optimal model for prediction. Each corresponding non-convex optimization problem in the committee is solved efficiently using the cyclic-coordinate descent algorithm with the generalized thresholding operator. Our Consensus RegularIzation Selection based Prediction (CRISP) model is evaluated on electronic health records (EHRs) obtained from a large hospital for the congestive heart failure readmission prediction problem. We also evaluate our model on high-dimensional synthetic datasets to assess its performance. The results indicate that CRISP outperforms several state-of-the-art methods such as additive, interactions-based and other competing non-convex regularized linear regression methods.
  3. Abstract

    In statistics, the least absolute shrinkage and selection operator (Lasso) is a regression method that performs both variable selection and regularization. A large body of literature discusses the statistical properties of the regression coefficients estimated by the Lasso method. However, a comprehensive review of the algorithms for solving the Lasso optimization problem is lacking. In this review, we summarize five representative algorithms for optimizing the Lasso objective function: the iterative shrinkage-thresholding algorithm (ISTA), the fast iterative shrinkage-thresholding algorithm (FISTA), the coordinate gradient descent algorithm (CGDA), the smooth L1 algorithm (SLA), and the path following algorithm (PFA). We also compare their convergence rates, as well as their potential strengths and weaknesses.

    This article is categorized under:

    Statistical Models > Linear Models

    Algorithms and Computational Methods > Numerical Methods

    Algorithms and Computational Methods > Computational Complexity

     
  4. Summary

    We describe a simple, computationally efficient, permutation-based procedure for selecting the penalty parameter in LASSO-penalized regression. The procedure, permutation selection, is intended for applications where variable selection is the primary focus, and can be applied in a variety of structural settings, including that of generalized linear models. We briefly discuss connections between permutation selection and existing theory for the LASSO. In addition, we present a simulation study and an analysis of real biomedical data sets in which permutation selection is compared with selection based on the following: cross-validation (CV), the Bayesian information criterion (BIC), scaled sparse linear regression, and a selection method based on recently developed testing procedures for the LASSO.

     
  5.
    Predictive models play a central role in decision making. Penalized regression approaches, such as least absolute shrinkage and selection operator (LASSO), have been widely used to construct predictive models and explain the impacts of the selected predictors, but the estimates are typically biased. Moreover, when data are ultrahigh-dimensional, penalized regression is usable only after applying variable screening methods to reduce the number of variables. We propose a stepwise procedure for fitting generalized linear models with ultrahigh dimensional predictors. Our procedure can provide a final model; control both false negatives and false positives; and yield consistent estimates, which are useful to gauge the actual effect size of risk factors. Simulations and applications to two clinical studies verify the utility of the method.
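Item 3 in the list above names ISTA among the algorithms for solving the Lasso problem. As a rough illustration only, here is a minimal pure-Python sketch of ISTA for the least-squares Lasso, assuming the objective (1/(2n))·||Xβ − y||² + λ·||β||₁ and a fixed step size no larger than the reciprocal of the largest eigenvalue of XᵀX/n. The function names and the toy data are hypothetical and are not taken from any of the listed articles.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside [-threshold, threshold]."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def ista(X, y, lam, step, iters=200):
    """ISTA for (1/(2n)) * ||X b - y||^2 + lam * ||b||_1 with a fixed step size."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        # Residuals of the current fit.
        resid = [sum(X[i][j] * beta[j] for j in range(p)) - y[i] for i in range(n)]
        # Gradient of the smooth least-squares part.
        grad = [sum(X[i][j] * resid[i] for i in range(n)) / n for j in range(p)]
        # Gradient step followed by soft-thresholding (the proximal step for the L1 penalty).
        beta = [soft_threshold(beta[j] - step * grad[j], step * lam) for j in range(p)]
    return beta


# Hypothetical toy data with orthogonal columns, so the result is easy to check by hand.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1, 2.0, 0.1]
beta = ista(X, y, lam=0.1, step=1.0)  # approximately [1.8, 0.0]
```

On this toy problem the weakly supported second coefficient is set exactly to zero while the first is shrunk slightly, which is the combined selection and shrinkage behaviour that the Lasso abstracts above describe.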