skip to main content

Search for: All records

Creators/Authors contains: "Wang, Chunpai"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. We propose a new approach, the calibrated nonparametric scan statistic (CNSS), for more accurate detection of anomalous patterns in large-scale, real-world graphs. Scan statistics identify connected subgraphs that are interesting or unexpected through maximization of a likelihood ratio statistic; in particular, nonparametric scan statistics (NPSSs) identify subgraphs with a higher than expected proportion of individually significant nodes. However, we show that recently proposed NPSS methods are miscalibrated, failing to account for the maximization of the statistic over the multiplicity of subgraphs. This results in both reduced detection power for subtle signals, and low precision of the detected subgraph even for stronger signals. Thus we develop a new statistical approach to recalibrate NPSSs, correctly adjusting for multiple hypothesis testing and taking the underlying graph structure into account. While the recalibration, based on randomization testing, is computationally expensive, we propose both an efficient (approximate) algorithm and new, closed-form lower bounds (on the expected maximum proportion of significant nodes for subgraphs of a given size, under the null hypothesis of no anomalous patterns). These advances, along with the integration of recent core-tree decomposition methods, enable CNSS to scale to large real-world graphs, with substantial improvement in the accuracy of detected subgraphs. Extensive experimentsmore »on both semi-synthetic and real-world datasets are demonstrated to validate the effectiveness of our proposed methods, in comparison with state-of-the-art counterparts.« less
    Free, publicly-accessible full text available June 30, 2023
  2. Knowledge Tracing (KT), which aims to model student knowledge level and predict their performance, is one of the most important applications of user modeling. Modern KT approaches model and maintain an up-to-date state of student knowledge over a set of course concepts according to students’ historical performance in attempting the problems. However, KT approaches were designed to model knowledge by observing relatively small problem-solving steps in Intelligent Tutoring Systems. While these approaches were applied successfully to model student knowledge by observing student solutions for simple problems, such as multiple-choice questions, they do not perform well for modeling complex problem solving in students. Most importantly, current models assume that all problem attempts are equally valuable in quantifying current student knowledge. However, for complex problems that involve many concepts at the same time, this assumption is deficient. It results in inaccurate knowledge states and unnecessary fluctuations in estimated student knowledge, especially if students guess the correct answer to a problem that they have not mastered all of its concepts or slip in answering the problem that they have already mastered all of its concepts. In this paper, we argue that not all attempts are equivalently important in discovering students’ knowledge state, andmore »some attempts can be summarized together to better represent student performance. We propose a novel student knowledge tracing approach, Granular RAnk based TEnsor factorization (GRATE), that dynamically selects student attempts that can be aggregated while predicting students’ performance in problems and discovering the concepts presented in them. Our experiments on three real-world datasets demonstrate the improved performance of GRATE, compared to the state-of-the-art baselines, in the task of student performance prediction. Our further analysis shows that attempt aggregation eliminates the unnecessary fluctuations from students’ discovered knowledge states and helps in discovering complex latent concepts in the problems.« less
  3. The state of the art knowledge tracing approaches mostly model student knowledge using their performance in assessed learning resource types, such as quizzes, assignments, and exercises, and ignore the non-assessed learning resources. However, many student activities are non-assessed, such as watching video lectures, participating in a discussion forum, and reading a section of a textbook, all of which potentially contributing to the students' knowledge growth. In this paper, we propose the  first novel deep learning based knowledge tracing model (DMKT) that explicitly model student's knowledge transitions over both assessed and non-assessed learning activities. With DMKT we can discover the underlying latent concepts of each non-assessed and assessed learning material and better predict the student performance in future assessed learning resources. We compare our proposed method with various state of the art knowledge tracing methods on four real-world datasets and show its effectiveness in predicting student performance, representing student knowledge, and discovering the underlying domain model.
  4. We propose a generalized optimization framework for detecting anomalous patterns (subgraphs that are interesting or unexpected) in interdependent networks, such as multi-layer networks, temporal networks, networks of networks, and many others. We frame the problem as a non-convex optimization that has a general nonlinear score function and a set of block-structured and non-convex constraints. We develop an effective, efficient, and parallelizable projection-based algorithm, namely Graph Block-structured Gradient Projection (GBGP), to solve the problem. It is proved that our algorithm 1) runs in nearly-linear time on the network size, and 2) enjoys a theoretical approximation guarantee. Moreover, we demonstrate how our framework can be applied to two very practical applications, and we conduct comprehensive experiments to show the effectiveness and efficiency of our proposed algorithm.