



Creators/Authors contains: "Rus, V."


  1. This paper systematically investigates the generation of code explanations by Large Language Models (LLMs) for code examples commonly encountered in introductory programming courses. Our findings reveal significant variations in the nature of code explanations produced by LLMs, influenced by factors such as the wording of the prompt, the specific code examples under consideration, the programming language involved, the temperature parameter, and the version of the LLM. However, a consistent pattern emerges for Java and Python, where explanations exhibit a Flesch-Kincaid readability level of approximately grade 7-8 and a consistent lexical density, i.e., the proportion of meaningful words relative to the total explanation size. Additionally, the generated explanations consistently achieve high scores for correctness, but lower scores on three other metrics: completeness, conciseness, and specificity.
    Free, publicly-accessible full text available December 15, 2024
  2. Understanding a student's problem-solving strategy can have a significant impact on effective math learning using Intelligent Tutoring Systems (ITSs) and Adaptive Instructional Systems (AISs). For instance, the ITS/AIS can better personalize itself to correct specific misconceptions that are indicated by incorrect strategies; specific problems can be designed to improve strategies; and frustration can be minimized by adapting to a student's natural way of thinking rather than trying to fit a standard strategy for all. While it may be possible for human experts to identify strategies manually in classroom settings with sufficient student interaction, it is not possible to scale this up to big data. Therefore, we leverage advances in Machine Learning and AI methods to perform scalable strategy prediction that is also fair to students at all skill levels. Specifically, we develop an embedding called MVec where we learn a representation based on the mastery of students. We then cluster these embeddings with a non-parametric clustering method where each cluster contains instances that have approximately symmetrical strategies. The strategy prediction model is trained on instances sampled from these clusters, ensuring that we train the model over diverse strategies. Using real-world large-scale student interaction datasets from MATHia, we show that our approach can scale up to achieve high accuracy by training on a small sample of a large dataset and also has predictive equality, i.e., it can predict strategies equally well for learners at diverse skill levels.
  3. Frasson, C. ; Mylonas, P. ; Troussas, C. (Ed.)
    Domain modeling is an important task in designing, developing, and deploying intelligent tutoring systems and other adaptive instructional systems. We focus here on the more specific task of automatically extracting a domain model from textbooks. In particular, this paper explores using multiple textbook indexes to extract a domain model for computer programming. Our approach is based on the observation that different experts, i.e., authors of intro-to-programming textbooks in our case, break down a domain in slightly different ways, and identifying the commonalities and differences can be very revealing. To this end, we present automated approaches to extracting domain models from multiple textbooks and compare the resulting common domain model with a domain model created by experts. Specifically, we use approximate string-matching approaches to increase coverage of the resulting domain model and majority voting across different textbooks to discover common domain terms related to computer programming. Our results indicate that using approximate string matching gives more accurate domain models for computer programming with increased precision and recall. By automating our approach, we can significantly reduce the time and effort required to construct high-quality domain models, making it easy to develop and deploy tutoring systems. Furthermore, we obtain a common domain model that can serve as a benchmark or skeleton that can be used broadly and adapted to specific needs by others. 
  4. Understanding how students with varying capabilities think about problem solving can greatly help in improving personalized education, which can lead to significantly better learning outcomes. Here, we present the details of a system we call NeTra that we developed for discovering strategies that students follow in the context of math learning. Specifically, we developed this system from large-scale MATHia data that contains millions of student-tutor interactions. The goal of this system is to provide a visual interface for educators to understand the likely strategy a student will follow for problems that the student has yet to attempt. This predictive interface can help educators and tutors develop interventions that are personalized for students. Underlying the system is a powerful AI model based on Neuro-Symbolic learning that has shown promising results in predicting both strategies and the mastery over concepts used in the strategy.
  5. This paper provides an update of the Learner Data Institute (LDI; www.learnerdatainstitute.org) which is now in its third year since conceptualization. Funded as a conceptualization project, the LDI’s first two years had two major goals: (1) develop, implement, evaluate, and refine a framework for data-intensive science and engineering and (2) use the framework to start developing prototype solutions, based on data, data science, and science convergence, to a number of core challenges in learning science and engineering. One major focus in the third, current year is synthesizing efforts from the first two years to identify new opportunities for future research by various mutual interest groups within LDI, which have focused on developing a particular prototype solution to one or more related core challenges in learning science and engineering. In addition to highlighting emerging data-intensive solutions and innovations from the LDI’s first two years, including places where LDI researchers have received additional funding for future research, we highlight here various core challenges our team has identified as being at a “tipping point.” Tipping point challenges are those for which timely investment in data-intensive approaches has the maximum potential for a transformative effect. 
  6. This paper provides a progress report on the first 18 months of Phase 1, the conceptualization phase, of the Learner Data Institute (LDI; www.learnerdatainstitute.org). Phase 1 is to be followed by Phase 2, the institute or convergence phase. The current 2-year conceptualization phase has two major goals: (1) develop, implement, evaluate, and refine a framework for data-intensive science and engineering for the future institute, and (2) use the framework to provide prototype solutions, based on data, data science, and science convergence, to a number of core challenges in learning science and engineering. By targeting a critical mass of key challenges that are at a tipping point, LDI aims to start a chain reaction that will transform the whole learning ecosystem. We emphasize here the key elements of the LDI science convergence framework that our team developed, implemented, and is now evaluating and refining. We highlight important outcomes of the convergence framework and related processes, including a 5-year plan for the institute phase and data-intensive prototype solutions to transform the learning ecosystem.
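The domain-model abstract (item 3) combines approximate string matching with majority voting across textbook indexes. The Python sketch below illustrates that general idea only; it is not the authors' implementation. The function names `similar` and `common_terms`, the 0.85 similarity threshold, the use of `difflib.SequenceMatcher` as the matcher, and the toy index terms are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similar(a: str, b: str, threshold: float = 0.85) -> bool:
    """Approximate match: case-insensitive similarity ratio at or above threshold.

    The threshold value is an illustrative assumption, not taken from the paper.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def common_terms(indexes: list[list[str]], threshold: float = 0.85) -> list[str]:
    """Return terms found (approximately) in a strict majority of the indexes."""
    # Each group keeps the first spelling seen and the set of books containing it.
    groups: list[dict] = []
    for book_id, index in enumerate(indexes):
        for term in index:
            for group in groups:
                if similar(term, group["label"], threshold):
                    group["books"].add(book_id)
                    break
            else:
                # No existing group matched, so this term starts a new group.
                groups.append({"label": term, "books": {book_id}})
    majority = len(indexes) / 2
    return [g["label"] for g in groups if len(g["books"]) > majority]


# Hypothetical index terms from three intro-to-programming textbooks.
indexes = [
    ["for loop", "variables", "recursion"],
    ["for loops", "variable", "inheritance"],
    ["For Loop", "arrays", "recursion"],
]
print(common_terms(indexes))  # → ['for loop', 'variables', 'recursion']
```

With exact matching, "for loop", "for loops", and "For Loop" would count as three different terms and none would reach a majority; approximate matching merges them, which is one way the increased coverage described in the abstract can arise. Matching each term only against a group's first spelling is a simplification chosen to keep the sketch short.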