Title: Enhancing Predictive Modeling of Nested Spatial Data through Group-Level Feature Disaggregation
Multilevel modeling and multi-task learning are two widely used approaches for modeling nested (multi-level) data, which contain observations that can be clustered into groups, characterized by their group-level features. Despite the similarity of the problems they address, the explicit relationship between multilevel modeling and multi-task learning has not been carefully examined. In this paper, we present a comparative analysis between the two methods to illustrate their strengths and limitations when applied to two-level nested data. We provide a detailed analysis demonstrating the equivalence of their formulations under a mild condition from an optimization perspective. We also demonstrate their limitations in terms of their predictive performance and especially, their difficulty in identifying potential cross-scale interactions between the local and group-level features when applied to datasets with either a small number of groups or limited training examples per group. To overcome these limitations, we propose a novel method for disaggregating the coarse-scale values of the group-level features in the nested data. Experimental results on both synthetic and real-world data show that the disaggregated group-level features can help enhance the prediction accuracy of the models significantly and identify the cross-scale interactions more effectively.
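To make the term "cross-scale interaction" concrete, the following is a minimal sketch, not the method proposed in the paper. It simulates two-level nested data in which the effect of a local feature `x` depends on a group-level feature `z`, then recovers the interaction coefficient with a pooled ordinary least-squares fit. The function names, coefficient values, group counts, and noise level are all illustrative assumptions; the paper's multilevel, multi-task, and disaggregation models are not reproduced here.

```python
import numpy as np


def simulate_nested(n_groups=30, n_per_group=50, noise_sd=0.1, seed=0):
    """Simulate two-level nested data (all settings are hypothetical).

    Every observation has a local feature x; every group has a single
    coarse-scale feature z shared by all of its members.
    """
    rng = np.random.default_rng(seed)
    z_group = rng.normal(size=n_groups)            # one value per group
    group = np.repeat(np.arange(n_groups), n_per_group)
    z = z_group[group]                             # broadcast z to observations
    x = rng.normal(size=group.size)                # local (observation-level) feature
    # Assumed true coefficients: intercept, x, z, and the x*z interaction.
    true_coef = np.array([1.0, 2.0, -1.0, 0.5])
    noise = rng.normal(scale=noise_sd, size=group.size)
    y = (true_coef[0] + true_coef[1] * x + true_coef[2] * z
         + true_coef[3] * x * z + noise)
    return x, z, group, y, true_coef


def fit_with_interaction(x, z, y):
    """Pooled least-squares fit that includes the cross-scale term x*z."""
    X = np.column_stack([np.ones_like(x), x, z, x * z])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


if __name__ == "__main__":
    x, z, group, y, true_coef = simulate_nested()
    coef = fit_with_interaction(x, z, y)
    print("assumed true coefficients:", true_coef)
    print("estimated coefficients:   ", coef)
```

With many groups and many observations per group, as assumed here, the interaction is easy to estimate. The abstract's concern is the opposite regime, with few groups or few training examples per group, which could be explored by lowering `n_groups` or `n_per_group`.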
Award ID(s):
1638679 1638539
PAR ID:
10076359
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
KDD '18 Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
Page Range / eLocation ID:
1784 to 1793
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Constructs that reflect differences in variability are of interest to many researchers studying workplace phenomena. The aggregation methods typically used to investigate “variability-based” constructs suffer from several limitations, including the inability to include Level 1 predictors and a failure to account for uncertainty in the variability estimates. We demonstrate how mixed-effects location-scale (MELS) and heterogeneous variance models, which are direct extensions of traditional mixed-effects (or multilevel) models, can be used to test mean (location)- and variability (scale)-related hypotheses simultaneously. The aims of this article are to demonstrate (a) how the MELS and heterogeneous variance models can be estimated with both nested cross-sectional and longitudinal data to answer novel research questions about constructs of interest to organizational researchers, (b) how a Bayesian approach allows for the inclusion of random intercepts and slopes when predicting both variability and mean levels, and finally (c) how researchers can use a multilevel approach to predict between-group heterogeneous variances. In doing so, this article highlights the added value of viewing variability as more than a statistical nuisance in organizational research.
  2. Covariate-dependent graph learning has gained increasing interest in the graphical modeling literature for the analysis of heterogeneous data. This task, however, poses challenges to modeling, computational efficiency, and interpretability. The parameter of interest can be naturally represented as a 3-dimensional array with elements that can be grouped according to 2 directions, corresponding to node level and covariate level, respectively. In this article, we propose a novel dual group spike-and-slab prior that enables multi-level selection at covariate-level and node-level, as well as individual (local) level sparsity. We introduce a nested strategy with specific choices to address distinct challenges posed by the various grouping directions. For posterior inference, we develop a full Gibbs sampler for all parameters, which mitigates the difficulties of parameter tuning often encountered in high-dimensional graphical models and facilitates routine implementation. Through simulation studies, we demonstrate that the proposed model outperforms existing methods in its accuracy of graph recovery. We show the practical utility of our model via an application to microbiome data where we seek to better understand the interactions among microbes as well as how these are affected by relevant covariates.
  3. Irgens, G; Knight, S (Ed.)
    Wearable positioning sensors are enabling unprecedented opportunities to model students’ procedural and social behaviours during collaborative learning tasks in physical learning spaces. Emerging work in this area has mainly focused on modelling group-level interactions from low-level x-y positioning data. Yet, little work has utilised such data to automatically identify individual-level differences among students working in co-located groups in terms of procedural and social aspects such as task prioritisation and collaboration dynamics, respectively. To address this gap, this study characterised key differences among 124 students’ procedural and social behaviours according to their perceived stress, collaboration, and task satisfaction during a complex group task using wearable positioning sensors and ordered networked analysis. The results revealed that students who demonstrated more collaborative behaviours were associated with lower stress and higher collaboration satisfaction. Interestingly, students who worked individually on the primary and secondary learning tasks reported lower and higher task satisfaction, respectively. These findings can deepen our understanding of students’ individual-level behaviours and experiences while learning in groups. 
  4. We model coordination and coregulation patterns in 33 triads engaged in collaboratively solving a challenging computer programming task for approximately 20 minutes. Our goal is to prospectively model speech rate (words/sec), an important signal of turn taking and active participation, of one teammate (A or B or C) from time-lagged nonverbal signals (speech rate and acoustic-prosodic features) of the other two (i.e., A + B → C; A + C → B; B + C → A) and task-related context features. We trained feed-forward neural networks (FFNNs) and long short-term memory recurrent neural networks (LSTMs) using group-level nested cross-validation. LSTMs outperformed FFNNs and a chance baseline and could predict speech rate up to 6s into the future. A multimodal combination of speech rate, acoustic-prosodic, and task context features outperformed unimodal and bimodal signals. The extent to which the models could predict an individual’s speech rate was positively related to that individual’s scores on a subsequent posttest, suggesting a link between coordination/coregulation and collaborative learning outcomes. We discuss applications of the models for real-time systems that monitor the collaborative process and intervene to promote positive collaborative outcomes.
  5. Modeling player engagement is a key challenge in games. However, the gameplay signatures of engaged players can be highly context-sensitive, varying based on where the game is used or what population of players is using it. Traditionally, models of player engagement are investigated in a particular context, and it is unclear how effectively these models generalize to other settings and populations. In this work, we investigate a Bayesian hierarchical linear model for multi-task learning to devise a model of player engagement from a pair of datasets that were gathered in two complementary contexts: a Classroom Study with middle school students and a Laboratory Study with undergraduate students. Both groups of players used similar versions of Crystal Island, an educational interactive narrative game for science learning. Results indicate that the Bayesian hierarchical model outperforms both pooled and context-specific models in cross-validation measures of predicting player motivation from in-game behaviors, particularly for the smaller Classroom Study group. Further, we find that the posterior distributions of model parameters indicate that the coefficient for a measure of gameplay performance significantly differs between groups. Drawing upon their capacity to share information across groups, hierarchical Bayesian methods provide an effective approach for modeling player engagement with data from similar, but different, contexts. 