By middle childhood, humans are able to learn abstract semantic relations (e.g., antonym, synonym, category membership) and use them to reason by analogy. A deep theoretical challenge is to show how such abstract relations can arise from nonrelational inputs, thereby providing key elements of a protosymbolic representation system. We have developed a computational model that exploits the potential synergy between deep learning from “big data” (to create semantic features for individual words) and supervised learning from “small data” (to create representations of semantic relations between words). Given as inputs labeled pairs of lexical representations extracted by deep learning, the model […]
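The two-stage idea described above — pretrained word features from "big data", then supervised relation learning from a handful of labeled pairs — can be sketched minimally. Everything here is illustrative: the random vectors stand in for deep-learning embeddings, the relation labels and word list are invented, and the least-squares classifier is a stand-in for whatever supervised learner the model actually uses.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pretrained word vectors (stand-ins for features
# that would come from deep learning on a large corpus).
dim = 50
vocab = {w: rng.normal(size=dim)
         for w in ["hot", "cold", "big", "small", "dog", "animal"]}

def pair_features(a, b):
    """Represent a word pair by concatenating its two embeddings."""
    return np.concatenate([vocab[a], vocab[b]])

# "Small data": a few labeled pairs for two relations.
train = [("hot", "cold", 0), ("big", "small", 0),  # relation 0: antonym
         ("dog", "animal", 1)]                     # relation 1: category membership

X = np.stack([pair_features(a, b) for a, b, _ in train])
y = np.array([lab for _, _, lab in train])

# Minimal linear relation classifier via least squares (illustrative only;
# with more features than examples, the training pairs are fit exactly).
W, *_ = np.linalg.lstsq(X, np.eye(2)[y], rcond=None)

def predict(a, b):
    """Return the index of the most likely relation for a word pair."""
    return int(np.argmax(pair_features(a, b) @ W))
```

With these toy inputs, `predict` recovers the training labels, e.g. `predict("hot", "cold")` returns relation 0.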
This content will become publicly available on July 1, 2022
Visual analogy: Deep learning versus compositional models
Is analogical reasoning a task that must be learned from scratch, by applying deep learning models to massive numbers of reasoning problems? Or are analogies solved by computing similarities between structured representations of analogs? We address this question by comparing human performance on visual analogies created using images of familiar three-dimensional objects (cars and their subregions) with the performance of alternative computational models. Human reasoners achieved above-chance accuracy for all problem types, but made more errors in several conditions (e.g., when relevant subregions were occluded). We compared human performance to that of two recent deep learning models (Siamese Network and Relation Network) directly trained to solve these analogy problems, as well as to that of a compositional model that assesses relational similarity between part-based representations. The compositional model based on part representations, but not the deep learning models, generated qualitative performance similar to that of human reasoners.
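The contrast between the two approaches can be illustrated with a toy version of the compositional route: each analog is encoded as a set of (part, relation) tuples, and an A:B::C:? option is chosen by matching relation structure while ignoring the specific parts. This is a hypothetical encoding for illustration, not the representation used in the paper, and the part and relation labels below are invented.

```python
# Toy compositional analogy scorer: an analog is a list of
# (part, relation-to-whole) tuples; analogical similarity is the
# overlap of the relation sets, ignoring which parts fill them.

def relations(analog):
    """Extract the set of relations instantiated by an analog."""
    return {rel for _part, rel in analog}

def relational_similarity(source, target):
    """Jaccard overlap of the two relation sets."""
    a, b = relations(source), relations(target)
    return len(a & b) / len(a | b)

# Source pair instantiates "below" and "side"; the correct option
# instantiates the same relations with different parts.
source = [("wheel", "below"), ("door", "side")]
options = {
    "D1": [("window", "below"), ("mirror", "side")],   # same relations, new parts
    "D2": [("window", "above"), ("mirror", "front")],  # different relations
}
best = max(options, key=lambda k: relational_similarity(source, options[k]))
```

Here the scorer picks `D1`, the option sharing the source's relational structure — the kind of part-relation matching that a pixel-trained network has no built-in pressure to discover.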
- Editors:
- Fitch, T.; Lamm, C.; Leder, H.; Teßmar-Raible, K.
- Award ID(s):
- 1827374
- Publication Date:
- NSF-PAR ID:
- 10231806
- Journal Name:
- Proceedings of the 43rd Annual Meeting of the Cognitive Science Society
- Sponsoring Org:
- National Science Foundation
More Like this
-
We report a first effort to model the solution of meaningful four-term visual analogies, by combining a machine-vision model (ResNet50-A) that can classify pixel-level images into object categories, with a cognitive model (BART) that takes semantic representations of words as input and identifies semantic relations instantiated by a word pair. Each model achieves above-chance performance in selecting the best analogical option from a set of four. However, combining the visual and the semantic models increases analogical performance above the level achieved by either model alone. The contribution of vision to reasoning thus may extend beyond simply generating verbal representations from […]
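A minimal sketch of combining two such models: each assigns a score to every candidate answer, and the combined chooser maximizes a weighted sum. The combination rule, the weight, and the scores below are all assumptions for illustration; the paper's actual fusion method may differ.

```python
def combined_choice(visual_scores, semantic_scores, alpha=0.5):
    """Pick the option maximizing a convex combination of the two
    models' scores (alpha weights the visual model)."""
    assert visual_scores.keys() == semantic_scores.keys()
    return max(visual_scores,
               key=lambda k: alpha * visual_scores[k]
                             + (1 - alpha) * semantic_scores[k])

# Invented per-option scores: the two models disagree on their top pick.
visual   = {"opt1": 0.2, "opt2": 0.6, "opt3": 0.3, "opt4": 0.1}
semantic = {"opt1": 0.7, "opt2": 0.5, "opt3": 0.2, "opt4": 0.4}
choice = combined_choice(visual, semantic)
```

With these numbers the visual model alone favors `opt2` and the semantic model alone favors `opt1`; the combined score (0.55 for `opt2` vs. 0.45 for `opt1`) settles on `opt2`, showing how fusing the two score sources can differ from either alone.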
-
There is a large gap between the ability of experts and students in grasping spatial concepts and representations. Engineering and the geosciences require the highest expertise in spatial thinking, and weak spatial skills are a significant barrier to success for many students [1]. Spatial skills are also highly malleable [2]; therefore, a current challenge is to identify how to promote students’ spatial thinking. Interdisciplinary research on how students think about spatially-demanding problems in the geosciences has identified several major barriers for students and interventions to help scaffold learning at a variety of levels from high school through upper-level undergraduate […]
-
Compositional models represent patterns with hierarchies of meaningful parts and subparts. Their ability to characterize high-order relationships among body parts helps resolve low-level ambiguities in human pose estimation (HPE). However, prior compositional models make unrealistic assumptions about subpart-part relationships, making them incapable of characterizing complex compositional patterns. Moreover, the state spaces of their higher-level parts can be exponentially large, complicating both inference and learning. To address these issues, this paper introduces a novel framework, termed the Deeply Learned Compositional Model (DLCM), for HPE. It exploits deep neural networks to learn the compositionality of human bodies. This results in a novel network […]
-
Deep Learning (DL) has attracted a lot of attention for its ability to reach state-of-the-art performance in many machine learning tasks. The core principle of DL methods consists of training composite architectures in an end-to-end fashion, where inputs are associated with outputs trained to optimize an objective function. Because of their compositional nature, DL architectures naturally exhibit several intermediate representations of the inputs, which belong to so-called latent spaces. When treated individually, these intermediate representations are usually left unconstrained during the learning process, as it is unclear which properties should be favored. However, when processing a batch of […]