Title: Survival Analysis based Framework for Early Prediction of Student Dropouts
Retention of students at colleges and universities has been a concern among educators for many decades. The consequences of student attrition are significant for students, academic staff, and universities. Thus, increasing student retention is a long-term goal of any academic institution. The most vulnerable students are freshmen, who are at the highest risk of dropping out at the beginning of their studies. Therefore, the early identification of "at-risk" students is a crucial task that needs to be effectively addressed. In this paper, we develop a survival analysis framework for early prediction of student dropout using the Cox proportional hazards regression model (Cox). We also apply time-dependent Cox (TD-Cox), which captures time-varying factors and can leverage this information to provide more accurate prediction of student dropout. For this prediction task, our model utilizes different groups of variables, such as demographic, family background, financial, high school information, college enrollment, and semester-wise credits. The proposed framework addresses the challenge of predicting both which students will drop out and the semester in which the dropout will occur. This enables proactive interventions to be performed in a prioritized manner where limited academic resources are available. This is critical in the student retention problem because, for a focused intervention, it is important not only to correctly classify whether a student is going to drop out but also to determine when this is going to happen. We evaluate our method on real student data collected at Wayne State University. Results show that the proposed Cox-based framework can predict student dropouts and the semester of dropout with high accuracy and precision compared to other state-of-the-art methods.
Award ID(s):
1527827
PAR ID:
10021820
Journal Name:
Proceedings of the 25th ACM International on Conference on Information and Knowledge Management
Page Range / eLocation ID:
903 to 912
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. A well-established literature base identifies a portion of students enrolled in post-secondary General Chemistry as at-risk of failing the course based on incoming metrics. Learning about the experiences and factors that lead to this higher failure rate is essential toward improving retention in this course. This study examines the relationship between study habits and academic performance for at-risk students in General Chemistry. Students who were in the bottom quartile of SAT math scores were identified as at-risk students. The study habits of General Chemistry students, both those identified as at-risk and those not identified, were measured by text message inquiries. The text message asked “Have you studied for General Chemistry I in the past 48 hours? If so, how did you study?” twice a week throughout a semester. Student responses to the messages were used to calculate the frequency of studying throughout the term. The results from a multiple regression analysis showed that high frequency of studying could mitigate the difference between at-risk and non-at-risk students on final exam scores. Additionally, the quality of studying for six at-risk students was analyzed by student interviews in concert with their text message responses. The results indicated that the quality of studying is not necessarily linked to frequency of studying and both quality and frequency can play a role in at-risk students' academic performance. The results presented offer a path for at-risk students to succeed in General Chemistry and the methodology presented offers a potential avenue for evaluating future efforts to improve student success.
  2. Educating Engineering Students Innovatively (EESI, pronounced "easy") is a student support program for sophomores to seniors enrolled in an engineering major offered at the FAMU-FSU College of Engineering. The program is designed to: (1) foster a sense of community, (2) improve students’ engineering skill sets, and (3) provide each student with a direct path of interest from college to the STEM workforce. Universities spend much effort providing student support programs for first-year students, such as summer bridge programs. However, upper-level students are sometimes not offered the same level of support and can fall off the STEM pathway. Introducing experiential learning experiences centered on the safe space (or community) of students provides a model to address underrepresentation in the STEM workforce and graduate school. This case study of an experiential learning program will provide an option for universities considering retention methods for underrepresented minority upperclassmen. We will present data for students enrolled in an engineering major between 2018 and 2021, considering students' gender, first-generation status, and financial status. This paper will report the results of four (4) different cohorts of EESI Scholars who completed at least one semester in the student support program. As one measure of the program's success, we compare the retention rates, persistence, and academic performance of EESI Scholars with those of students who did not participate in the student support program. We then provide the best practices of the experiential learning program that led to students' persistence at ***** University. This paper could assist other colleges that would like to ensure the persistence of Black students, who have been historically underrepresented in STEM, in their engineering programs.
  3. The ability to predict student performance in introductory programming courses is important to help struggling students and enhance their persistence. However, for this prediction to be impactful, it is crucial that it remains transparent and accessible for both instructors and students, ensuring effective utilization of the predicted results. Machine learning models with explainable features provide an effective means for students and instructors to comprehend students' diverse programming behaviors and problem-solving strategies, elucidating the factors contributing to both successful and suboptimal performance. This study develops an explainable model that predicts student performance based on programming assignment submission information in different stages of the course to enable early explainable predictions. We extract data-driven features from student programming submissions and utilize a stacked ensemble model for predicting final exam grades. The experimental results suggest that our model successfully predicts student performance based on their programming submissions earlier in the semester. Employing SHAP, a game-theory-based framework, we explain the model's predictions, aiding stakeholders in understanding the influence of diverse programming behaviors on students' success. Additionally, we analyze crucial features, employing a mix of descriptive statistics and mixture models to identify distinct student profiles based on their problem-solving patterns, enhancing overall explainability. Furthermore, we dive deeper and analyze the profiles using different programming patterns of the students to elucidate the characteristics of different students where SHAP explanations are not comprehensible. Our explainable early prediction model elucidates common problem-solving patterns in students relative to their expertise, facilitating effective intervention and adaptive support. 