Title: Using deep neural networks as a guide for modeling human planning
Abstract: When developing models in cognitive science, researchers typically start with their own intuitions about human behavior in a given task and then build in mechanisms that explain additional aspects of the data. This refinement step is often hindered by how difficult it is to distinguish the unpredictable randomness of people’s decisions from meaningful deviations between those decisions and the model. One solution for this problem is to compare the model against deep neural networks trained on behavioral data, which can detect almost any pattern given sufficient data. Here, we apply this method to the domain of planning with a heuristic search model for human play in 4-in-a-row, a combinatorial game where participants think multiple steps into the future. Using a data set consisting of 10,874,547 games, we train deep neural networks to predict human moves and find that they accurately do so while capturing meaningful patterns in the data. Thus, deviations between the model and the best network allow us to identify opportunities for model improvement despite starting with a model that has undergone substantial testing in previous work. Based on this analysis, we add three extensions to the model that range from a simple opening bias to specific adjustments regarding endgame planning. Overall, our work demonstrates the advantages of model comparison with a high-performance deep neural network as well as the feasibility of scaling cognitive models to massive data sets for systematically investigating the processes underlying human sequential decision-making.
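The training setup described in the abstract can be sketched at toy scale: a network maps board positions to a distribution over candidate moves and is fit by cross-entropy against observed choices. Everything below is a hedged stand-in, not the authors' pipeline: the 36-cell board encoding, the small two-layer network, and the synthetic "human" moves (generated from a hidden linear rule) are illustrative assumptions.

```python
import numpy as np

# Toy sketch of move prediction: board states in, a softmax over
# candidate moves out, trained by cross-entropy against chosen moves.
# Board size, architecture, and data are illustrative assumptions.
rng = np.random.default_rng(0)
N_CELLS, HIDDEN, N_GAMES = 36, 64, 512

# Boards encoded as +1 (own piece), -1 (opponent), 0 (empty).
X = rng.choice([-1.0, 0.0, 1.0], size=(N_GAMES, N_CELLS))
# Stand-in "human" moves generated from a hidden linear rule.
y = np.argmax(X @ rng.normal(0.0, 1.0, (N_CELLS, N_CELLS)), axis=1)

W1 = rng.normal(0.0, 0.1, (N_CELLS, HIDDEN))
W2 = rng.normal(0.0, 0.1, (HIDDEN, N_CELLS))

def forward(X):
    h = np.maximum(X @ W1, 0.0)                      # ReLU hidden layer
    logits = h @ W2
    logits = logits - logits.max(axis=1, keepdims=True)
    p = np.exp(logits)
    return h, p / p.sum(axis=1, keepdims=True)

for _ in range(300):                                 # full-batch gradient descent
    h, p = forward(X)
    g = p.copy()
    g[np.arange(N_GAMES), y] -= 1.0                  # d(cross-entropy)/d(logits)
    g /= N_GAMES
    dW2 = h.T @ g
    dh = g @ W2.T
    dh[h <= 0.0] = 0.0                               # ReLU gradient mask
    W2 -= 1.0 * dW2
    W1 -= 1.0 * (X.T @ dh)

_, p = forward(X)
nll = -np.mean(np.log(p[np.arange(N_GAMES), y]))     # training cross-entropy
print(f"NLL {nll:.3f} vs chance {np.log(N_CELLS):.3f}")
```

A network trained this way (at vastly larger scale) supplies the reference predictions against which a cognitive model's remaining deviations can be measured.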
Award ID(s):
2008331 1132009
PAR ID:
10474868
Author(s) / Creator(s):
; ;
Publisher / Repository:
Nature Publishing Group
Date Published:
Journal Name:
Scientific Reports
Volume:
13
Issue:
1
ISSN:
2045-2322
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Models in cognitive science are often restricted for the sake of interpretability, and as a result may miss patterns in the data that are instead classified as noise. In contrast, deep neural networks can detect almost any pattern given sufficient data, but have only recently been applied to large-scale data sets and tasks for which there already exist process-level models to compare against. Here, we train deep neural networks to predict human play in 4-in-a-row, a combinatorial game of intermediate complexity, using a data set of 10,874,547 games. We compare these networks to a planning model based on a heuristic function and tree search, and make suggestions for model improvements based on this analysis. This work provides the foundation for estimating a noise ceiling on massive data sets as well as systematically investigating the processes underlying human sequential decision-making. 
  2. Explainability is essential for AI models, especially in clinical settings where understanding a model's decisions is crucial. Despite their impressive performance, black-box AI models are unsuitable for clinical use if their operations cannot be explained to clinicians. While deep neural networks (DNNs) represent the forefront of model performance, their explanations are often not easily interpreted by humans. Hand-crafted features that represent different aspects of the input data, paired with traditional machine learning models, are generally more understandable, but they often lack the effectiveness of advanced models because of human limitations in feature design. To address this, we introduce ExShall-CNN, an explainable shallow convolutional neural network for medical image processing that combines the interpretability of hand-crafted features with the performance of advanced deep convolutional networks such as U-Net for medical image segmentation. Built on recent advancements in machine learning, ExShall-CNN incorporates widely used kernels while ensuring transparency, making its decisions visually interpretable by physicians and clinicians. This balanced approach offers both the accuracy of deep learning models and the explainability needed for clinical applications. 
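The "widely used kernels" mentioned in item 2 can be illustrated with a hand-designed edge detector: a fixed Sobel kernel whose response a clinician can inspect directly. The kernel choice, toy image, and valid-mode convolution here are illustrative assumptions, not details of ExShall-CNN itself.

```python
import numpy as np

# A fixed, human-designed Sobel-x kernel: the kind of widely used,
# interpretable hand-crafted kernel the abstract describes.
sobel_x = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])

def conv2d_valid(img, kernel):
    """Valid-mode 2D cross-correlation with no padding."""
    kh, kw = kernel.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical step edge: the sobel_x response peaks at the boundary,
# so the feature map is directly readable by a human.
img = np.zeros((6, 6))
img[:, 3:] = 1.0
response = conv2d_valid(img, sobel_x)
print(response.max())  # → 4.0
```

Because each filter is a known, named operation rather than a learned black box, every intermediate map has a plain-language interpretation.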
  3. Subjective value has long been measured using binary choice experiments, yet responses like willingness-to-pay prices can be an effective and efficient way to assess individual differences in risk preferences and value. Tony Marley’s work illustrated that dynamic, stochastic models permit meaningful inferences about cognition from process-level data on paradigms beyond binary choice, yet many of these models remain difficult to use because their likelihoods must be approximated from simulation. In this paper, we develop and test an approach that uses deep neural networks to estimate the parameters of otherwise-intractable behavioral models. Once trained, these networks allow for accurate and instantaneous parameter estimation. We compare different network architectures and show that they accurately recover true risk preferences related to utility, response caution, anchoring, and non-decision processes. To illustrate the usefulness of the approach, we then applied it to estimate model parameters for a large, demographically representative sample of U.S. participants who completed a 20-question pricing task, an estimation problem that is not feasible with previous methods. The results illustrate the utility of machine-learning approaches for fitting cognitive and economic models, providing efficient methods for quantifying meaningful differences in risk preferences from sparse data. 
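The amortization idea in item 3 can be illustrated with a deliberately simplified stand-in: simulate responses from a toy pricing model across known parameters, fit a mapping from response summaries back to the generating parameter, and then estimation on fresh data is a single forward pass. The behavioral model, the summary statistics, and the linear regressor (standing in for the paper's deep network) are all assumptions for illustration.

```python
import numpy as np

# Toy amortized estimation: train once on simulations with known
# parameters, then estimate instantly on new data. Every modeling
# choice below is an illustrative assumption.
rng = np.random.default_rng(1)

def simulate(theta, n_trials=50):
    """Stated price = gamble value ** theta (a crude curvature knob) + noise."""
    values = rng.uniform(1.0, 10.0, n_trials)
    prices = values ** theta + rng.normal(0.0, 0.5, n_trials)
    return np.array([prices.mean(), prices.std(), 1.0])  # summaries + bias term

train_thetas = rng.uniform(0.3, 1.0, 2000)
S = np.stack([simulate(t) for t in train_thetas])
w, *_ = np.linalg.lstsq(S, train_thetas, rcond=None)     # "training" step

test_thetas = rng.uniform(0.3, 1.0, 200)
S_test = np.stack([simulate(t) for t in test_thetas])
estimates = S_test @ w                                    # instantaneous inference
r = np.corrcoef(test_thetas, estimates)[0, 1]
print(f"recovery correlation r = {r:.2f}")
```

The expensive part (simulation and fitting) happens once up front; applying the fitted mapping to a new participant's summaries is essentially free, which is what makes large-sample estimation feasible.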
  4. While neural networks are used for classification tasks across domains, a long-standing open problem in machine learning is determining whether neural networks trained using standard procedures are consistent for classification, i.e., whether such models minimize the probability of misclassification for arbitrary data distributions. In this work, we identify and construct an explicit set of neural network classifiers that are consistent. Since effective neural networks in practice are typically both wide and deep, we analyze infinitely wide networks that are also infinitely deep. In particular, using the recent connection between infinitely wide neural networks and neural tangent kernels, we provide explicit activation functions that can be used to construct networks that achieve consistency. Interestingly, these activation functions are simple and easy to implement, yet differ from commonly used activations such as ReLU or sigmoid. More generally, we create a taxonomy of infinitely wide and deep networks and show that these models implement one of three well-known classifiers depending on the activation function used: 1) 1-nearest neighbor (model predictions are given by the label of the nearest training example); 2) majority vote (model predictions are given by the label of the class with the greatest representation in the training set); or 3) singular kernel classifiers (a set of classifiers containing those that achieve consistency). Our results highlight the benefit of using deep networks for classification tasks, in contrast to regression tasks, where excessive depth is harmful. 
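Two of the three limiting classifiers in item 4's taxonomy can be stated directly in a few lines; the third (singular kernel classifiers) requires the paper's kernel construction and is omitted. The data points here are made up for illustration.

```python
import numpy as np

def one_nearest_neighbor(X_train, y_train, x):
    """Predict the label of the closest training example."""
    return y_train[np.argmin(np.linalg.norm(X_train - x, axis=1))]

def majority_vote(y_train):
    """Predict the most common training label, ignoring the input entirely."""
    labels, counts = np.unique(y_train, return_counts=True)
    return labels[np.argmax(counts)]

# Tiny illustrative training set: one point of class 0, three of class 1.
X = np.array([[0.0, 0.0], [1.0, 1.0], [1.0, 0.9], [0.9, 1.0]])
y = np.array([0, 1, 1, 1])

print(one_nearest_neighbor(X, y, np.array([0.1, 0.0])))  # → 0
print(majority_vote(y))                                   # → 1
```

The contrast makes the taxonomy concrete: depending on the activation function, an infinitely wide and deep network behaves like one of these simple rules, and only the singular-kernel regime achieves consistency.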
  5. Neocortex-wide neural activity is organized into distinct networks of areas engaged in different cognitive processes. To elucidate the mechanism underlying flexible network reconfiguration, we developed connectivity-constrained macaque and human whole-cortex models. In our model, within-area connectivity consists of a mixture of symmetric, asymmetric, and random motifs that give rise to stable (attractor) or transient (sequential) heterogeneous dynamics. Assuming sparse low-rank plus random inter-areal connectivity, we show that our model captures key aspects of the cognitive networks’ dynamics and interactions observed experimentally, in particular the anti-correlation between the default mode network and the dorsal attention network. Communication between networks is shaped by the alignment of long-range communication subspaces with local connectivity motifs and is switchable via a bottom-up, salience-dependent routing mechanism. Furthermore, the frontoparietal multiple-demand network displays a coexistence of stable and dynamic coding, suitable for top-down cognitive control. Our work provides a theoretical framework for understanding dynamic routing in cortical networks during cognition. 
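The "sparse low-rank plus random" inter-areal connectivity assumption in item 5 can be sketched numerically: a rank-1 structured component (an outer product of a receiving and a sending pattern) routes activity that is aligned with its communication subspace, while orthogonal activity passes only through the weak random component. The network size and weight scales below are illustrative, not fitted to any data.

```python
import numpy as np

# Rank-1 structured connectivity plus weak random background:
# a minimal sketch of subspace-aligned routing between two areas.
rng = np.random.default_rng(2)
N = 100                                    # neurons per area (toy size)

u = rng.normal(0.0, 1.0, N)                # receiving pattern in area B
v = rng.normal(0.0, 1.0, N)                # sending pattern in area A
W_struct = np.outer(u, v) / N              # rank-1 communication subspace
W_rand = rng.normal(0.0, 0.1 / np.sqrt(N), (N, N))
W = W_struct + W_rand

# Unit-norm activity aligned with the sending pattern, and a unit-norm
# direction orthogonal to it.
aligned = v / np.linalg.norm(v)
ortho = rng.normal(0.0, 1.0, N)
ortho -= (ortho @ aligned) * aligned
ortho /= np.linalg.norm(ortho)

gain_aligned = np.linalg.norm(W @ aligned)  # routed through the subspace
gain_ortho = np.linalg.norm(W @ ortho)      # mostly attenuated
print(f"aligned gain {gain_aligned:.2f} vs orthogonal gain {gain_ortho:.2f}")
```

Only activity patterns aligned with the low-rank subspace are transmitted with appreciable gain, which is the sense in which alignment between subspaces and local motifs shapes inter-network communication.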