Title: Modeling Second-Language Learning from a Psychological Perspective
Psychological research on learning and memory has tended to emphasize small-scale laboratory studies. However, large datasets of people using educational software provide opportunities to explore these issues from a new perspective. In this paper we describe our approach to the Duolingo Second Language Acquisition Modeling (SLAM) competition, which was run in early 2018. We used a well-known class of algorithms (gradient boosted decision trees), with features partially informed by theories from the psychological literature. After detailing our modeling approach and a number of supplementary simulations, we reflect on the degree to which psychological theory aided the model, and the potential for cognitive science and predictive modeling competitions to gain from each other.
Award ID(s):
1631436
PAR ID:
10062637
Journal Name:
Proceedings of the NAACL-HLT Workshop on Innovative Use of NLP for Building Educational Applications (BEA).
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. While a robust literature on the psychology of conspiracy theories has identified dozens of characteristics correlated with conspiracy theory beliefs, much less attention has been paid to understanding the generalized predisposition towards interpreting events and circumstances as the product of supposed conspiracies. Using a unique national survey of 2,015 U.S. adults from October 2020, we investigate the relationship between this predisposition (conspiracy thinking) and 34 different psychological, political, and social correlates. Using conditional inference tree modeling, a machine learning-based approach designed to facilitate prediction using a flexible modeling methodology, we identify the characteristics that are most useful for orienting individuals along the conspiracy thinking continuum, including (but not limited to): anomie, Manicheanism, support for political violence, a tendency to share false information online, populism, narcissism, and psychopathy. Altogether, psychological characteristics are much more useful in predicting conspiracy thinking than are political and social characteristics, though even our robust set of correlates only partially accounts for variance in conspiracy thinking.
  2. Large-scale language datasets and advances in natural language processing offer opportunities for studying people’s cognitions and behaviors. We show how representations derived from language can be combined with laboratory-based word norms to predict implicit attitudes for diverse concepts. Our approach achieves substantially higher correlations than existing methods. We also show that our approach is more predictive of implicit attitudes than are explicit attitudes, and that it captures variance in implicit attitudes that is largely unexplained by explicit attitudes. Overall, our results shed light on how implicit attitudes can be measured by combining standard psychological data with large-scale language data. In doing so, we pave the way for highly accurate computational modeling of what people think and feel about the world around them. 
  3. Probability distributions over rankings are crucial for the modeling and design of a wide range of practical systems. In this work, we pursue a nonparametric approach that seeks to learn a distribution over rankings (aka the ranking model) that is consistent with the observed data and has the sparsest possible support (i.e., the smallest number of rankings with nonzero probability mass). We focus on first-order marginal data, which comprise information on the probability that item i is ranked at position j, for all possible item and position pairs. The observed data may be noisy. Finding the sparsest approximation requires brute force search in the worst case. To address this issue, we restrict our search to what we dub the signature family, and show that the sparsest model within the signature family can be found computationally efficiently compared with the brute force approach. We then establish that the signature family provides good approximations to popular ranking model classes, such as the multinomial logit and the exponential family classes, with support size that is small relative to the dimension of the observed data. We test our methods on two data sets: the ranked election data set from the American Psychological Association and the preference ordering data on 10 different sushi varieties.
  4. This paper evaluates the effects of being an only child in a family on psychological health, leveraging data on the One-Child Policy in China. We use an instrumental variable approach to address the potential unmeasured confounding between the fertility decision and psychological health, where the instrumental variable is an index on the intensity of the implementation of the One-Child Policy. We establish an analytical link between the local instrumental variable approach and principal stratification to accommodate the continuous instrumental variable. Within the principal stratification framework, we postulate a Bayesian hierarchical model to infer various causal estimands of policy interest while adjusting for the clustering data structure. We apply the method to the data from the China Family Panel Studies and find small but statistically significant negative effects of being an only child on self-reported psychological health for some subpopulations. Our analysis reveals treatment effect heterogeneity with respect to both observed and unobserved characteristics. In particular, urban males suffer the most from being only children, and the negative effect has larger magnitude if the families were more resistant to the One-Child Policy. We also conduct sensitivity analysis to assess the key instrumental variable assumption. 
  5. Lin, Chung-Ying (Ed.)
    Background: University students are increasingly recognized as a vulnerable population, suffering from higher levels of anxiety, depression, substance abuse, and disordered eating compared to the general population. Therefore, when the nature of their educational experience radically changes, such as sheltering in place during the COVID-19 pandemic, the burden on the mental health of this vulnerable population is amplified. The objectives of this study are to 1) identify the array of psychological impacts COVID-19 has on students, 2) develop profiles to characterize students' anticipated levels of psychological impact during the pandemic, and 3) evaluate potential sociodemographic, lifestyle-related, and awareness of people infected with COVID-19 risk factors that could make students more likely to experience these impacts. Methods: Cross-sectional data were collected through web-based questionnaires from seven U.S. universities. Representative and convenience sampling was used to invite students to complete the questionnaires in mid-March to early-May 2020, when most coronavirus-related sheltering in place orders were in effect. We received 2,534 completed responses, of which 61% were from women, 79% from non-Hispanic Whites, and 20% from graduate students. Results: Exploratory factor analysis on close-ended responses resulted in two latent constructs, which we used to identify profiles of students with latent profile analysis, including high (45% of sample), moderate (40%), and low (14%) levels of psychological impact. Bivariate associations showed students who were women, were non-Hispanic Asian, in fair/poor health, of below-average relative family income, or who knew someone infected with COVID-19 experienced higher levels of psychological impact. Students who were non-Hispanic White, above-average social class, spent at least two hours outside, or less than eight hours on electronic screens were likely to experience lower levels of psychological impact. Multivariate modeling (mixed-effects logistic regression) showed that being a woman, having fair/poor general health status, being 18 to 24 years old, spending 8 or more hours on screens daily, and knowing someone infected predicted higher levels of psychological impact when risk factors were considered simultaneously. Conclusion: Inadequate efforts to recognize and address college students’ mental health challenges, especially during a pandemic, could have long-term consequences on their health and education.