Title: Spatial relation categorization in infants and deep neural networks
Spatial relations, such as above, below, between, and containment, are important mediators in children’s understanding of the world (Piaget, 1954). The development of these relational categories in infancy has been extensively studied (Quinn, 2003), yet little is known about their computational underpinnings. Using developmental tests, we examine the extent to which deep neural networks, pretrained on a standard vision benchmark or egocentric video captured from one baby’s perspective, form categorical representations for visual stimuli depicting relations. Notably, the networks did not receive any explicit training on relations. We then analyze whether these networks recover similar patterns to ones identified in development, such as reproducing the relative difficulty of categorizing different spatial relations and different stimulus abstractions. We find that the networks we evaluate tend to recover many of the patterns observed with the simpler relations of “above versus below” or “between versus outside”, but struggle to match developmental findings related to “containment”. We identify factors in the choice of model architecture, pretraining data, and experimental design that affect the extent to which the networks match developmental patterns, and highlight experimental predictions made by our modeling results. Our results open the door to modeling infants’ earliest categorization abilities with modern machine learning tools and demonstrate the utility and productivity of this approach.
Award ID(s):
1922658
PAR ID:
10535869
Publisher / Repository:
Cognition
Journal Name:
Cognition
Volume:
245
Issue:
C
ISSN:
0010-0277
Page Range / eLocation ID:
105690
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Networks have been widely used to represent the relations between objects, as in academic networks and social networks, and learning embeddings for networks has thus garnered plenty of research attention. Self-supervised network representation learning aims at extracting node embeddings without external supervision. Recently, maximizing the mutual information between the local node embedding and the global summary (e.g., Deep Graph Infomax, or DGI for short) has shown promising results on many downstream tasks such as node classification. However, DGI has two major limitations. Firstly, DGI considers only the extrinsic supervision signal (i.e., the mutual information between node embedding and global summary) while ignoring the intrinsic signal (i.e., the mutual dependence between node embedding and node attributes). Secondly, nodes in a real-world network are usually connected by multiple edges with different relations, while DGI does not fully explore the various relations among nodes. To address these problems, we propose a novel framework, called High-order Deep Multiplex Infomax (HDMI), for learning node embeddings on multiplex networks in a self-supervised way. More specifically, we first design a joint supervision signal containing both extrinsic and intrinsic mutual information by high-order mutual information, and we propose a High-order Deep Infomax (HDI) to optimize the proposed supervision signal. Then we propose an attention-based fusion module to combine node embeddings from different layers of the multiplex network. Finally, we evaluate the proposed HDMI on various downstream tasks such as unsupervised clustering and supervised classification. The experimental results show that HDMI achieves state-of-the-art performance on these tasks.
  2. Networks of stiff fibers govern the elasticity of biological structures such as the extracellular matrix of collagen. These networks are known to stiffen nonlinearly under shear or extensional strain. Recently, it has been shown that such stiffening is governed by a strain-controlled athermal but critical phase transition, from a floppy phase below the critical strain to a rigid phase above the critical strain. While this phase transition has been extensively studied numerically and experimentally, a complete analytical theory for this transition remains elusive. Here, we present an effective medium theory (EMT) for this mechanical phase transition of fiber networks. We extend a previous EMT appropriate for linear elasticity to incorporate nonlinear effects via an anharmonic Hamiltonian. The mean-field predictions of this theory, including the critical exponents, scaling relations, and non-affine fluctuations, qualitatively agree with previous experimental and numerical results.
  3. Objective: SNOMED CT is the largest clinical terminology worldwide. Quality assurance of SNOMED CT is of utmost importance to ensure that it provides accurate domain knowledge to various SNOMED CT-based applications. In this work, we introduce a deep learning-based approach to uncover missing is-a relations in SNOMED CT. Materials and Methods: Our focus is to identify missing is-a relations between concept-pairs exhibiting a containment pattern (i.e., the set of words of one concept being a proper subset of that of the other concept). We use hierarchically related containment concept-pairs as positive instances and hierarchically unrelated containment concept-pairs as negative instances to train a model predicting whether an is-a relation exists between 2 concepts with containment pattern. The model is a binary classifier leveraging concept name features, hierarchical features, enriched lexical attribute features, and logical definition features. We introduce a cross-validation inspired approach to identify missing is-a relations among all hierarchically unrelated containment concept-pairs. Results: We trained and applied our model on the Clinical finding subhierarchy of SNOMED CT (September 2019 US edition). Our model (based on the validation sets) achieved a precision of 0.8164, recall of 0.8397, and F1 score of 0.8279. Applying the model to predict actual missing is-a relations, we obtained a total of 1661 potential candidates. Domain experts performed evaluation on randomly selected 230 samples and verified that 192 (83.48%) are valid. Conclusions: The results showed that our deep learning approach is effective in uncovering missing is-a relations between containment concept-pairs in SNOMED CT.
  4. Holistic processing (HP) of faces refers to the obligatory, simultaneous processing of the parts and their relations, and it emerges over the course of development. HP is manifest in a decrement in the perception of inverted versus upright faces and a reduction in face processing ability when the relations between parts are perturbed. Here, adopting the HP framework for faces, we examined the developmental emergence of HP in another domain for which human adults have expertise, namely, visual word processing. Children, adolescents, and adults performed a lexical decision task and we used two established signatures of HP for faces: the advantage in perception of upright over inverted words and nonwords and the reduced sensitivity to increasing parts (word length). Relative to the other groups, children showed less of an advantage for upright versus inverted trials and lexical decision was more affected by increasing word length. Performance on these HP indices was strongly associated with age and with reading proficiency. Also, the emergence of HP for word perception was not simply a result of improved visual perception over the course of development as no group differences were observed on an object decision task. These results reveal the developmental emergence of HP for orthographic input, and reflect a further instance of experience-dependent tuning of visual perception. These results also add to existing findings on the commonalities of mechanisms of word and face recognition. 
  5. Finding diagrams that contain a specific part or a similar part is important in many engineering tasks. In this search task, the query part is expected to match only a small region in a complex image. This paper investigates several local matching networks that explicitly model local region-to-region similarities. Deep convolutional neural networks extract local features and model local matching patterns. Spatial convolution is employed to cross-match local regions at different scale levels, addressing cases where the target part appears at a different scale, position, and/or angle. A gating network automatically learns region importance, removing noise from sparse areas and visual metadata in engineering diagrams. Experimental results show that local matching approaches are more effective for engineering diagram search than global matching approaches. Suppressing unimportant regions via the gating network enhances accuracy. Matching across different scales via spatial convolution substantially improves robustness to scale and rotation changes. A pipelined architecture efficiently searches a large collection of diagrams by using a simple local matching network to identify a small set of candidate images and a more sophisticated network with convolutional cross-scale matching to re-rank candidates.