skip to main content

Attention:

The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 11:00 PM ET on Thursday, October 10 until 2:00 AM ET on Friday, October 11 due to maintenance. We apologize for the inconvenience.


Title: Modeling Gestalt visual reasoning on Raven’s Matrices using generative image inpainting techniques.
Psychologists recognize Raven’s Progressive Matrices as a useful test of general human intelligence. While many computational models investigate various forms of top-down, deliberative reasoning on the test, there has been less research on bottom-up perceptual processes, like Gestalt image completion, that are also critical in human test performance. In this work, we investigate how Gestalt visual reasoning on the Raven’s test can be modeled using generative image inpainting techniques from computer vision. We demonstrate that a reasoning agent that has access to an off- the-shelf inpainting model trained only on photorealistic images of objects achieves a score of 27/36 on the Colored Progressive Matrices, which corresponds to average performance for nine-year-old children. We also show that when our agent uses inpainting models trained on other datasets (faces, places, and textures), it does not perform as well. Our results illustrate how learning visual regularities in real-world images can translate into successful reasoning about artificial test stimuli. On the flip side, our results also highlight the limitations of such transfer, which may contribute to explanations for why intelligence tests like the Raven’s are often sensitive to people’s individual sociocultural backgrounds.  more » « less
Award ID(s):
1730044
NSF-PAR ID:
10209969
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Proceedings of the Eighth Annual Conference on Advances in Cognitive Systems (ACS)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Analogical reasoning fundamentally involves exploiting redundancy in a given task, but there are many different ways an intelligent agent can choose to define and exploit redundancy, often resulting in very different levels of task performance. We explore such variations in analogical reasoning within the domain of geometric matrix reasoning tasks, namely on the Raven’s Standard Progressive Matrices intelligence test. We show how different analogical constructions used by the same basic visual-imagery-based computational model—varying only in how they “slice” a matrix problem into parts and do search and optimization within/across these parts—achieve very different levels of test performance, ranging from 13/60 correct all the way up to 57/60 correct. Our findings suggest that the ability to select or build effective high-level analogical constructions can be as important as an agent’s competencies in low-level reasoning skills, which raises interesting open questions about the extent to which building the “right” analogies might contribute to individual differences in human matrix reasoning performance, and how intelligent agents might learn to build or select from among different analogical constructions in the first place. 
    more » « less
  2. Observations abound about the power of visual imagery in human intelligence, from how Nobel prize-winning physicists make their discoveries to how children understand bedtime stories. These observations raise an important question for cognitive science, which is, what are the computations taking place in someone’s mind when they use visual imagery? Answering this question is not easy and will require much continued research across the multiple disciplines of cognitive science. Here, we focus on a related and more circumscribed question from the perspective of artificial intelligence (AI): If you have an intelligent agent that uses visual imagery-based knowledge representations and reasoning operations, then what kinds of problem solving might be possible, and how would such problem solving work? We highlight recent progress in AI toward answering these questions in the domain of visuospatial reasoning, looking at a case study of how imagery-based artificial agents can solve visuospatial intelligence tests. In particular, we first examine several variations of imagery-based knowledge representations and problem-solving strategies that are sufficient for solving problems from the Raven’s Progressive Matrices intelligence test. We then look at how artificial agents, instead of being designed manually by AI researchers, might learn portions of their own knowledge and reasoning procedures from experience, including learning visuospatial domain knowledge, learning and generalizing problem-solving strategies, and learning the actual definition of the task in the first place.

     
    more » « less
  3. Abstract

    While the brain’s functional network architecture is largely conserved between resting and task states, small but significant changes in functional connectivity support complex cognition. In this study, we used a modified Raven’s Progressive Matrices Task to examine symbolic and perceptual reasoning in human participants undergoing fMRI scanning. Previously, studies have focused predominantly on discrete symbolic versions of matrix reasoning, even though the first few trials of the Raven’s Advanced Progressive Matrices task consist of continuous perceptual stimuli. Our analysis examined the activation patterns and functional reconfiguration of brain networks associated with resting state and both symbolic and perceptual reasoning. We found that frontoparietal networks, including the cognitive control and dorsal attention networks, were significantly activated during abstract reasoning. We determined that these same task-active regions exhibited flexibly-reconfigured functional connectivity when transitioning from resting state to the abstract reasoning task. Conversely, we showed that a stable network core of regions in default and somatomotor networks was maintained across both resting and task states. We propose that these regionally-specific changes in the functional connectivity of frontoparietal networks puts the brain in a “task-ready” state, facilitating efficient task-based activation.

     
    more » « less
  4. Abstract

    Advances in artificial intelligence have raised a basic question about human intelligence: Is human reasoning best emulated by applying task‐specific knowledge acquired from a wealth of prior experience, or is it based on the domain‐general manipulation and comparison of mental representations? We address this question for the case of visual analogical reasoning. Using realistic images of familiar three‐dimensional objects (cars and their parts), we systematically manipulated viewpoints, part relations, and entity properties in visual analogy problems. We compared human performance to that of two recent deep learning models (Siamese Network and Relation Network) that were directly trained to solve these problems and to apply their task‐specific knowledge to analogical reasoning. We also developed a new model using part‐based comparison (PCM) by applying a domain‐general mapping procedure to learned representations of cars and their component parts. Across four‐term analogies (Experiment 1) and open‐ended analogies (Experiment 2), the domain‐general PCM model, but not the task‐specific deep learning models, generated performance similar in key aspects to that of human reasoners. These findings provide evidence that human‐like analogical reasoning is unlikely to be achieved by applying deep learning with big data to a specific type of analogy problem. Rather, humans do (and machines might) achieve analogical reasoning by learning representations that encode structural information useful for multiple tasks, coupled with efficient computation of relational similarity.

     
    more » « less
  5. Is intelligence realized by connectionist or classicist? While connectionist approaches have achieved superhuman performance, there has been growing evidence that such task-specific superiority is particularly fragile in systematic generalization. This observation lies in the central debate between connectionist and classicist, wherein the latter continually advocates an algebraic treatment in cognitive architectures. In this work, we follow the classicist’s call and propose a hybrid approach to improve systematic generalization in reasoning. Specifically, we showcase a prototype with algebraic representation for the abstract spatial-temporal reasoning task of Raven’s Progressive Matrices (RPM) and present the ALgebra-Aware Neuro-Semi-Symbolic (ALANS) learner. The ALANS learner is motivated by abstract algebra and the representation theory. It consists of a neural visual perception frontend and an algebraic abstract reasoning backend: the frontend summarizes the visual information from object-based representation, while the backend transforms it into an algebraic structure and induces the hidden operator on the fly. The induced operator is later executed to predict the answer’s representation, and the choice most similar to the prediction is selected as the solution. Extensive experiments show that by incorporating an algebraic treatment, the ALANS learner outperforms various pure connectionist models in domains requiring systematic generalization. We further show the generative nature of the learned algebraic representation; it can be decoded by isomorphism to generate an answer. 
    more » « less