

Search results: all records where Award ID contains 1730044


  1. Observations abound about the power of visual imagery in human intelligence, from how Nobel prize-winning physicists make their discoveries to how children understand bedtime stories. These observations raise an important question for cognitive science: what computations take place in someone’s mind when they use visual imagery? Answering this question is not easy and will require much continued research across the multiple disciplines of cognitive science. Here, we focus on a related and more circumscribed question from the perspective of artificial intelligence (AI): if an intelligent agent uses visual imagery-based knowledge representations and reasoning operations, then what kinds of problem solving might be possible, and how would such problem solving work? We highlight recent progress in AI toward answering these questions in the domain of visuospatial reasoning, looking at a case study of how imagery-based artificial agents can solve visuospatial intelligence tests. In particular, we first examine several variations of imagery-based knowledge representations and problem-solving strategies that are sufficient for solving problems from the Raven’s Progressive Matrices intelligence test. We then look at how artificial agents, instead of being designed manually by AI researchers, might learn portions of their own knowledge and reasoning procedures from experience, including learning visuospatial domain knowledge, learning and generalizing problem-solving strategies, and learning the actual definition of the task in the first place.

     
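    As a rough illustration of the imagery-based problem solving described above, here is a minimal Python sketch of one way an agent might complete a 2x2 matrix analogy using only image transformations; the transform set, function names, and similarity measure are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

# Candidate imagery operations over square binary images. This small set
# is an assumed stand-in for the richer transformation sets studied in
# the imagery-based Raven's Progressive Matrices literature.
TRANSFORMS = {
    "identity": lambda img: img,
    "flip_lr": np.fliplr,
    "flip_ud": np.flipud,
    "rot90": lambda img: np.rot90(img, 1),
    "rot180": lambda img: np.rot90(img, 2),
}

def similarity(a, b):
    """Fraction of pixels on which two same-shape binary images agree."""
    return float(np.mean(a == b))

def solve_2x2(A, B, C, options):
    """Complete the analogy A : B :: C : ? by imagery transformation."""
    # 1. Find the transform that best explains the A -> B pair.
    name = max(TRANSFORMS, key=lambda n: similarity(TRANSFORMS[n](A), B))
    # 2. Apply the same transform to C to "imagine" the missing cell.
    predicted = TRANSFORMS[name](C)
    # 3. Answer with the option most similar to the imagined image.
    best = max(range(len(options)), key=lambda i: similarity(predicted, options[i]))
    return best, name
```

    The key property of this sketch is that every step operates directly on images rather than on symbolic descriptions, which is the defining commitment of the imagery-based approaches the abstract surveys.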
  2. Many cognitive assessments are limited by their reliance on relatively sparse measures of performance, like per-item accuracy or reaction time. Capturing more detailed behavioral measurements from cognitive assessments will enhance their utility in many settings, from individual clinical evaluations to large-scale research studies. We demonstrate the feasibility of combining scene and gaze cameras with supervised learning algorithms to automatically measure key behaviors on the block design test, a widely used test of visuospatial cognitive ability. We also discuss how this block-design measurement system could enhance the assessment of many critical cognitive and meta-cognitive functions such as attention, planning, progress monitoring, and strategy selection.
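    To make the measurement idea concrete, the following sketch shows how per-window features from scene and gaze cameras could feed a standard supervised classifier. The feature set, behavior labels, and synthetic stand-in data are all assumptions for illustration, not the paper's actual pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data: in a real system, each row would summarize one
# short time window with features such as gaze position, whether gaze is
# on the model card, hand speed, and blocks placed so far (all assumed).
rng = np.random.default_rng(0)
X = rng.random((600, 5))        # 600 windows x 5 behavioral features
y = rng.integers(0, 3, 600)     # assumed labels: 0=examine, 1=search, 2=place

clf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)   # ~chance level on noise
print(f"cross-validated accuracy: {scores.mean():.2f}")
```

    With real labeled recordings in place of the synthetic arrays, the same few lines yield per-behavior accuracy estimates, which is the kind of automatic measurement the abstract argues for.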
  3. Visuospatial reasoning refers to a diverse set of skills that involve thinking about space and time. An artificial agent with access to a sufficiently large set of visuospatial reasoning skills might be able to generalize its reasoning ability to an unprecedented expanse of tasks, including portions of many popular intelligence tests. In this paper, we stress the importance of a developmental approach to the study of visuospatial reasoning, with an emphasis on fundamental skills. A comprehensive benchmark, with properties we outline in this paper including breadth, depth, explainability, and domain-specificity, would encourage and measure the genesis of such a skillset. Lacking an existing benchmark that satisfies these properties, we outline the design of a novel test in this paper. Such a benchmark would allow for an expanded analysis of how well existing datasets and agents apply to the problem of generalized visuospatial reasoning.
  4. Psychologists recognize Raven’s Progressive Matrices as a useful test of general human intelligence. While many computational models investigate various forms of top-down, deliberative reasoning on the test, there has been less research on bottom-up perceptual processes, like Gestalt image completion, that are also critical in human test performance. In this work, we investigate how Gestalt visual reasoning on the Raven’s test can be modeled using generative image inpainting techniques from computer vision. We demonstrate that a reasoning agent that has access to an off-the-shelf inpainting model trained only on photorealistic images of objects achieves a score of 27/36 on the Colored Progressive Matrices, which corresponds to average performance for nine-year-old children. We also show that when our agent uses inpainting models trained on other datasets (faces, places, and textures), it does not perform as well. Our results illustrate how learning visual regularities in real-world images can translate into successful reasoning about artificial test stimuli. On the flip side, our results also highlight the limitations of such transfer, which may contribute to explanations for why intelligence tests like the Raven’s are often sensitive to people’s individual sociocultural backgrounds.
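    The following minimal sketch shows the shape of such an agent, assuming some pretrained inpainting model is available behind a generic `inpaint` callable; the abstract does not name a specific network, so the wrapper, box convention, and error metric here are all assumptions.

```python
import numpy as np

def gestalt_solve(matrix_img, cell_box, options, inpaint):
    """Answer a Raven's item by Gestalt completion.

    `inpaint` is any image-completion function, e.g. a wrapper around a
    pretrained inpainting network; it takes an image with a blanked-out
    hole and returns a completed image of the same shape (assumed API).
    """
    r0, r1, c0, c1 = cell_box           # bounding box of the missing cell
    masked = matrix_img.copy()
    masked[r0:r1, c0:c1] = 0.0          # blank the cell to create the hole
    completed = inpaint(masked)         # bottom-up perceptual completion
    predicted = completed[r0:r1, c0:c1] # the "imagined" missing cell
    # Choose the answer option closest to the imagined patch.
    errors = [float(np.mean((predicted - opt) ** 2)) for opt in options]
    return int(np.argmin(errors))
```

    Swapping in inpainting models trained on different datasets (objects, faces, places, textures) changes only the `inpaint` argument, which makes the transfer comparison described in the abstract straightforward to run.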
  5. Analogical reasoning fundamentally involves exploiting redundancy in a given task, but there are many different ways an intelligent agent can choose to define and exploit redundancy, often resulting in very different levels of task performance. We explore such variations in analogical reasoning within the domain of geometric matrix reasoning tasks, namely on the Raven’s Standard Progressive Matrices intelligence test. We show how different analogical constructions used by the same basic visual-imagery-based computational model—varying only in how they “slice” a matrix problem into parts and do search and optimization within/across these parts—achieve very different levels of test performance, ranging from 13/60 correct all the way up to 57/60 correct. Our findings suggest that the ability to select or build effective high-level analogical constructions can be as important as an agent’s competencies in low-level reasoning skills, which raises interesting open questions about the extent to which building the “right” analogies might contribute to individual differences in human matrix reasoning performance, and how intelligent agents might learn to build or select from among different analogical constructions in the first place.
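    A minimal sketch of the "slicing" idea follows, assuming a 3x3 matrix given as eight numpy arrays plus answer options; the two constructions, index pairs, and constant-pixel-change rule are deliberately crude illustrations of how changing only the analogical construction can change the chosen answer.

```python
import numpy as np

# Two alternative analogical constructions over a 3x3 matrix, written as
# (source, target) cell-index pairs; cells 0..7 are given, 8 is missing.
# These particular index sets are illustrative assumptions.
ROW_WISE = [(0, 1), (3, 4), (6, 7), (7, 8)]
COL_WISE = [(0, 3), (1, 4), (2, 5), (5, 8)]

def fit(cells, option, pairs):
    """Error of one candidate answer under one construction, where the
    shared "rule" is crudely approximated as a constant pixel change."""
    grid = list(cells) + [option]
    rule = np.mean([grid[t] - grid[s] for s, t in pairs[:-1]], axis=0)
    s, t = pairs[-1]                # the pair involving the missing cell
    return float(np.mean((grid[t] - (grid[s] + rule)) ** 2))

def solve(cells, options):
    """Try every (construction, option) pair; keep the cleanest fit."""
    best = min((fit(cells, opt, pairs), i, name)
               for name, pairs in (("rows", ROW_WISE), ("cols", COL_WISE))
               for i, opt in enumerate(options))
    return best[1], best[2]         # option index, winning construction
```

    Even in this toy form, the same low-level machinery (pixel arithmetic and error minimization) can return different answers depending on which construction wins, mirroring the performance spread the abstract reports.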
  6. In this paper, we present the Visuospatial Reasoning Environment for Experimentation (VREE). VREE provides a simulated environment where intelligent agents interact with virtual objects while solving different visuospatial reasoning tasks. This paper shows how VREE is valuable for studying the sufficiency of visual imagery approaches for a large number of visuospatial reasoning tasks as well as how diverse strategies can be represented and studied within a single task. We present results from computational experiments using VREE on the block design task and on numerous subtests from the Leiter-R test battery on nonverbal intelligence.
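    The abstract does not spell out VREE's interface, but simulated task environments of this kind are often exposed in a gym-like reset/step style; the sketch below is an assumed illustration of what a simulated block design task could look like, with all names hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class BlockDesignEnv:
    """Hypothetical gym-style wrapper for a simulated block design task."""
    target: List[int] = field(default_factory=list)   # goal pattern of faces
    board: List[Optional[int]] = field(default_factory=list)

    def reset(self, target: List[int]) -> List[Optional[int]]:
        self.target = list(target)
        self.board = [None] * len(target)
        return list(self.board)

    def step(self, position: int, face: int):
        """Place one block face; the episode ends when the design matches."""
        self.board[position] = face
        done = self.board == self.target
        return list(self.board), float(done), done

env = BlockDesignEnv()
env.reset([0, 1, 1, 0])              # a flattened 2x2 target design
obs, reward, done = env.step(0, 0)   # agent places face 0 at position 0
```

    Different solution strategies (placing blocks row by row, matching high-contrast edges first, and so on) then become different policies over the same step interface, which is how a single task can host many strategies.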
  7. Nonverbal task learning is defined here as a variant of interactive task learning in which an agent learns the definition of a new task without any verbal information such as task instructions. Instead, the agent must 1) learn the task definition using only a single solved example problem as its training input, and then 2) generalize this definition in order to successfully parse new problems. In this paper, we present a conceptual framework for nonverbal task learning, and we compare and contrast this type of learning with existing learning paradigms in AI. We also discuss nonverbal task learning in the context of nonverbal human intelligence tests, which are standardized tests designed to be given without any verbal instructions so that they can be used by people with language difficulties.
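    A minimal sketch of the two-step framing, under heavy assumptions: the agent holds a small hypothesis space of candidate task rules, filters it against the single solved example, and applies a surviving rule to new problems. The two rules below are hypothetical stand-ins for a realistic hypothesis space.

```python
import numpy as np

# Candidate task definitions: each maps (input image, answer options) to
# a chosen option index. Both rules here are illustrative assumptions.
def rule_match(inp, options):
    return max(range(len(options)), key=lambda i: np.mean(options[i] == inp))

def rule_mirror(inp, options):
    flipped = np.fliplr(inp)
    return max(range(len(options)), key=lambda i: np.mean(options[i] == flipped))

CANDIDATES = [rule_match, rule_mirror]

def learn_task(inp, options, answer):
    """Step 1: keep only the rules consistent with one solved example."""
    return [rule for rule in CANDIDATES if rule(inp, options) == answer]

def solve_new(rules, inp, options):
    """Step 2: generalize by applying a surviving rule to an unseen problem."""
    return rules[0](inp, options) if rules else None
```

    The interesting questions the paper raises live in how to build and prune a much larger hypothesis space from a single solved example, rather than the trivial filter shown here.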
  8. Understanding how a person thinks, i.e., measuring a single individual’s cognitive characteristics, is challenging because cognition is not directly observable. Practically speaking, standardized cognitive tests (tests of IQ, memory, attention, etc.), with results interpreted by expert clinicians, represent the state of the art in measuring a person’s cognition. Three areas of AI show particular promise for improving the effectiveness of this kind of cognitive testing: 1) behavioral sensing, to more robustly quantify individual test-taker behaviors; 2) data mining, to identify and extract meaningful patterns from behavioral datasets; and 3) cognitive modeling, to help map observed behaviors onto hypothesized cognitive strategies. We bring these three areas of AI research together in a unified conceptual framework and provide a sampling of recent work in each area. Continued research at the nexus of AI and cognitive testing has potentially far-reaching implications for society in virtually every context in which measuring cognition is important, including research across many disciplines of cognitive science as well as applications in clinical, educational, and workforce settings.
  9. Do people have dispositions towards visual or verbal thinking styles, i.e., a tendency towards one default representational modality versus the other? The problem in trying to answer this question is that visual/verbal thinking styles are challenging to measure. Subjective, introspective measures are the most common but often show poor reliability and validity; neuroimaging studies can provide objective evidence but are intrusive and resource-intensive. In previous work, we observed that for a purely behavioral testing method to objectively evaluate a person’s visual/verbal thinking style, 1) the task must be solvable equally well using either visual or verbal mental representations, and 2) it must offer a secondary behavioral marker, in addition to primary performance measures, that indicates which modality is being used. We collected four such tasks from the psychology literature and conducted a small pilot study with adult participants to see the extent to which visual/verbal thinking styles can be differentiated using an individual’s results on these tasks.