This content will become publicly available on September 17, 2024

Title: Two Computational Approaches to Visual Analogy: Task‐Specific Models Versus Domain‐General Mapping

Advances in artificial intelligence have raised a basic question about human intelligence: Is human reasoning best emulated by applying task‐specific knowledge acquired from a wealth of prior experience, or is it based on the domain‐general manipulation and comparison of mental representations? We address this question for the case of visual analogical reasoning. Using realistic images of familiar three‐dimensional objects (cars and their parts), we systematically manipulated viewpoints, part relations, and entity properties in visual analogy problems. We compared human performance to that of two recent deep learning models (Siamese Network and Relation Network) that were directly trained to solve these problems and to apply their task‐specific knowledge to analogical reasoning. We also developed a new model using part‐based comparison (PCM) by applying a domain‐general mapping procedure to learned representations of cars and their component parts. Across four‐term analogies (Experiment 1) and open‐ended analogies (Experiment 2), the domain‐general PCM model, but not the task‐specific deep learning models, generated performance similar in key aspects to that of human reasoners. These findings provide evidence that human‐like analogical reasoning is unlikely to be achieved by applying deep learning with big data to a specific type of analogy problem. Rather, humans do (and machines might) achieve analogical reasoning by learning representations that encode structural information useful for multiple tasks, coupled with efficient computation of relational similarity.

Journal Name:
Cognitive Science
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Fitch, T. ; Lamm, C. ; Leder, H. ; Teßmar-Raible, K. (Ed.)
    Is analogical reasoning a task that must be learned from scratch by applying deep learning models to massive numbers of reasoning problems? Or are analogies solved by computing similarities between structured representations of analogs? We address this question by comparing human performance on visual analogies created using images of familiar three-dimensional objects (cars and their subregions) with the performance of alternative computational models. Human reasoners achieved above-chance accuracy for all problem types, but made more errors in several conditions (e.g., when relevant subregions were occluded). We compared human performance to that of two recent deep learning models (Siamese Network and Relation Network) directly trained to solve these analogy problems, as well as to that of a compositional model that assesses relational similarity between part-based representations. The compositional model based on part representations, but not the deep learning models, generated performance qualitatively similar to that of human reasoners.
  2. We see the external world as consisting not only of objects and their parts, but also of relations that hold between them. Visual analogy, which depends on similarities between relations, provides a clear example of how perception supports reasoning. Here we report an experiment in which we quantitatively measured the human ability to find analogical mappings between parts of different objects, where the objects to be compared were drawn either from the same category (e.g., images of two mammals, such as a dog and a horse), or from two dissimilar categories (e.g., a chair image mapped to a cat image). Humans showed systematic mapping patterns, but with greater variability in mapping responses when objects were drawn from dissimilar categories. We simulated human analogical mapping responses using a computational model of mapping between 3D objects, visiPAM (visual Probabilistic Analogical Mapping). VisiPAM takes point-cloud representations of two 3D objects as inputs, and outputs the mapping between analogous parts of the two objects. VisiPAM consists of a visual module that constructs structural representations of individual objects, and a reasoning module that identifies a probabilistic mapping between parts of the two 3D objects. Model simulations not only capture the qualitative pattern of human mapping performance across conditions, but also approach human-level reliability in solving visual analogy problems.
  3. Analogical reasoning fundamentally involves exploiting redundancy in a given task, but there are many different ways an intelligent agent can choose to define and exploit redundancy, often resulting in very different levels of task performance. We explore such variations in analogical reasoning within the domain of geometric matrix reasoning tasks, namely on the Raven’s Standard Progressive Matrices intelligence test. We show how different analogical constructions used by the same basic visual-imagery-based computational model, varying only in how they “slice” a matrix problem into parts and do search and optimization within/across these parts, achieve very different levels of test performance, ranging from 13/60 correct all the way up to 57/60 correct. Our findings suggest that the ability to select or build effective high-level analogical constructions can be as important as an agent’s competencies in low-level reasoning skills, which raises interesting open questions about the extent to which building the “right” analogies might contribute to individual differences in human matrix reasoning performance, and how intelligent agents might learn to build or select from among different analogical constructions in the first place.
  4. Observations abound about the power of visual imagery in human intelligence, from how Nobel prize-winning physicists make their discoveries to how children understand bedtime stories. These observations raise an important question for cognitive science, which is, what are the computations taking place in someone’s mind when they use visual imagery? Answering this question is not easy and will require much continued research across the multiple disciplines of cognitive science. Here, we focus on a related and more circumscribed question from the perspective of artificial intelligence (AI): If you have an intelligent agent that uses visual imagery-based knowledge representations and reasoning operations, then what kinds of problem solving might be possible, and how would such problem solving work? We highlight recent progress in AI toward answering these questions in the domain of visuospatial reasoning, looking at a case study of how imagery-based artificial agents can solve visuospatial intelligence tests. In particular, we first examine several variations of imagery-based knowledge representations and problem-solving strategies that are sufficient for solving problems from the Raven’s Progressive Matrices intelligence test. We then look at how artificial agents, instead of being designed manually by AI researchers, might learn portions of their own knowledge and reasoning procedures from experience, including learning visuospatial domain knowledge, learning and generalizing problem-solving strategies, and learning the actual definition of the task in the first place.
  5. Analogy is a powerful tool for fostering conceptual understanding and transfer in STEM and other fields. Well‐constructed analogical comparisons focus attention on the causal‐relational structure of STEM concepts, and provide a powerful capability to draw inferences based on a well‐understood source domain that can be applied to a novel target domain. However, analogy must be applied with consideration to students' prior knowledge and cognitive resources. We briefly review theoretical and empirical support for incorporating analogy into education, and recommend five general principles to guide its application so as to maximize the potential benefits. For analogies to be effective, instructors should use well‐understood source analogs and explain correspondences fully; use visuospatial and verbal supports to emphasize shared structure among analogs; discuss the alignment between semantic and formal representations; reduce extraneous cognitive load imposed by analogical comparison; and encourage generation of inferences when students have some proficiency with the material. These principles can be applied flexibly to topics in a wide variety of domains.