

Search for: All records

Award ID contains: 1636847


  1. In recent years, job openings in Science, Technology, Engineering, and Math (STEM)-related fields have increased continuously and rapidly. Unfortunately, these positions are not met with an equal number of workers ready to fill them. Efforts are being made to find durable solutions for this phenomenon, and they start by encouraging young students to enroll in STEM college majors. However, enrolling in a STEM major requires specific skills in math and science that are learned in school. Fortunately, institutions are adopting educational software that collects data from students' usage. This data can be analyzed to detect students' behaviors and to predict their performance and their eventual college enrollment. As we outline in this paper, we used data collected from students' usage of an Intelligent Tutoring System to predict whether they would pursue a career in STEM-related fields. We conducted two types of analysis, which we call the "problem-based approach" and the "skill-based approach". The problem-based approach evaluated students' actions based on the problems they solved. Likewise, the skill-based approach evaluated their usage based on the skills they had practiced. Furthermore, we investigated whether comparing students' features with those of their schoolmates can improve the prediction models in both the skill-based and the problem-based approaches. The experimental results showed that the skill-based approach with school aggregation achieved the best results with regard to a combination of two metrics: the Area Under the Receiver Operating Characteristic Curve (AUC) and the Root Mean Squared Error (RMSE).
  2. Given the increasing need for skilled workers in science, technology, engineering, and mathematics (STEM), there is burgeoning interest in encouraging young students to pursue a career in STEM fields. Middle school is an opportune time to guide students' interests towards STEM disciplines, as they begin to think about and plan for their career aspirations. Previous studies have shown that detectors of students' learning, affect, and engagement, measured from their interactions within an online tutoring system during middle school, are strongly predictive of their eventual choice to attend college and enroll in a STEM major (San Pedro et al., 2013; 2014). In this study, we extend prior work by examining how the constructs measured by these detectors relate to the decision to participate in a STEM career after college. Findings from this study suggest that subtle forms of disengagement (i.e., gaming the system, carelessness) are predictive and can potentially provide actionable information for teachers and counselors to apply early intervention in STEM learning. In general, this study sheds light on the relevant student factors that influence STEM participation years later, providing a more comprehensive understanding of student STEM trajectories.
  3. This special issue includes papers from some of the leading competitors in the ASSISTments Longitudinal Data Mining Competition 2017, as well as some research from non-competitors, using the same data set. In this competition, participants attempted to predict whether students would choose a career in a STEM field or not, making this prediction using a click-stream dataset from middle school students working on math assignments inside ASSISTments, an online tutoring platform. At the conclusion of the competition on December 3rd, 2017, there were 202 participants, 74 of whom submitted predictions at least once. In this special issue, some of the leading competitors present their results and what they have learned about the link between behavior in online learning and future STEM career development. 
  4. In this paper, we describe our solution for predicting student STEM career choices in the 2017 ASSISTments Data Mining Competition. We built a machine learning system that automatically reformats the data set, generates new features and prunes redundant ones, and performs model and feature selection. We designed the system to automatically find a model that optimizes prediction performance, yet the final model is a simple logistic regression that allows researchers to discover important features and study their effects on STEM career choices. We also compared our method to other methods, which revealed that the key to good prediction is proper feature enrichment in the beginning stage of the data analysis, while feature selection in a later stage allows a simpler final model.
  5. The 2nd Annual WPI-UMASS-UPENN EDM Data Mining Challenge required contestants to predict efficient test taking based on log data. In this paper, we describe our theory-driven and psychometric modeling approach. For feature engineering, we employed the Log-Normal Response Time Model for estimating latent person speed, and the Generalized Partial Credit Model for estimating latent person ability. Additionally, we adopted an n-gram feature approach for event sequences. For training a multi-label classifier, we distinguished inefficient test takers who were going too fast from those who were going too slow, instead of using the provided binary target label. Our best-performing ensemble classifier comprised three sets of low-dimensional classifiers, dominated by test-taker speed. While our classifier reached moderate performance relative to the competition leaderboard, our approach makes two important contributions. First, we show how explainable classifiers could provide meaningful predictions if results can be contextualized to test administrators who wish to intervene or take action. Second, our re-engineering of test scores enabled us to incorporate person ability into the estimation. However, ability was hardly predictive of efficient behavior, leading to the conclusion that the target label's validity needs to be questioned. The paper concludes with tools that are helpful for substantively meaningful log data mining.
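As an illustration of the n-gram feature approach for event sequences mentioned in the last abstract, the following is a minimal sketch only, not the authors' implementation. The function names (`ngram_counts`, `to_feature_vector`) and the event labels are hypothetical; the abstract does not specify the event vocabulary, the value of n, or how the counts were encoded.

```python
from collections import Counter
from typing import List, Sequence, Tuple

NGram = Tuple[str, ...]


def ngram_counts(events: Sequence[str], n: int = 2) -> Counter:
    """Count contiguous n-grams of event names in one test taker's log.

    A log shorter than n events yields an empty Counter.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    grams = (tuple(events[i:i + n]) for i in range(len(events) - n + 1))
    return Counter(grams)


def to_feature_vector(counts: Counter, vocabulary: Sequence[NGram]) -> List[int]:
    """Map n-gram counts onto a fixed vocabulary; unseen n-grams count as 0."""
    return [counts.get(gram, 0) for gram in vocabulary]


# Hypothetical event log for a single test taker (labels are assumptions).
log = ["start", "click", "next", "click", "next"]
bigrams = ngram_counts(log, n=2)

# A fixed vocabulary keeps feature vectors aligned across test takers.
vocab = [("click", "next"), ("next", "submit")]
features = to_feature_vector(bigrams, vocab)
```

Vectors built this way could then be passed, alongside other features, to whatever classifier is being trained; how the original work combined them with the speed and ability estimates is not stated in the abstract.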