Title: AI, visual imagery, and a case study on the challenges posed by human intelligence tests
Observations abound about the power of visual imagery in human intelligence, from how Nobel prize-winning physicists make their discoveries to how children understand bedtime stories. These observations raise an important question for cognitive science, which is, what are the computations taking place in someone’s mind when they use visual imagery? Answering this question is not easy and will require much continued research across the multiple disciplines of cognitive science. Here, we focus on a related and more circumscribed question from the perspective of artificial intelligence (AI): If you have an intelligent agent that uses visual imagery-based knowledge representations and reasoning operations, then what kinds of problem solving might be possible, and how would such problem solving work? We highlight recent progress in AI toward answering these questions in the domain of visuospatial reasoning, looking at a case study of how imagery-based artificial agents can solve visuospatial intelligence tests. In particular, we first examine several variations of imagery-based knowledge representations and problem-solving strategies that are sufficient for solving problems from the Raven’s Progressive Matrices intelligence test. We then look at how artificial agents, instead of being designed manually by AI researchers, might learn portions of their own knowledge and reasoning procedures from experience, including learning visuospatial domain knowledge, learning and generalizing problem-solving strategies, and learning the actual definition of the task in the first place.
Ainooson, J.; Michelson, J.; Sanyal, D.; Palmer, J.; Kunda, M.
(Proceedings of the Eighth Annual Conference on Advances in Cognitive Systems (ACS))
In this paper, we present the Visuospatial Reasoning Environment for Experimentation (VREE). VREE provides a simulated environment where intelligent agents interact with virtual objects while solving different visuospatial reasoning tasks. This paper shows how VREE is valuable for studying the sufficiency of visual imagery approaches for a large number of visuospatial reasoning tasks as well as how diverse strategies can be represented and studied within a single task. We present results from computational experiments using VREE on the block design task and on numerous subtests from the Leiter-R test battery on nonverbal intelligence.
Michelson, J.; Sanyal, D.; Ainooson, J.; Kunda, M.
(Proceedings of the Eighth Annual Conference on Advances in Cognitive Systems (ACS))
Visuospatial reasoning refers to a diverse set of skills that involve thinking about space and time. An artificial agent with access to a sufficiently large set of visuospatial reasoning skills might be able to generalize its reasoning ability to an unprecedented expanse of tasks including portions of many popular intelligence tests. In this paper, we stress the importance of a developmental approach to the study of visuospatial reasoning, with an emphasis on fundamental skills. A comprehensive benchmark, with properties we outline in this paper including breadth, depth, explainability, and domain-specificity, would encourage and measure the genesis of such a skillset. Lacking an existing benchmark that satisfies these properties, we outline the design of a novel test in this paper. Such a benchmark would allow for expanding analysis of existing datasets’ and agents’ applicability to the problem of generalized visuospatial reasoning.
Sampat, Shailaja; Banerjee, Pratyay; Yang, Yezhou; and Baral, Chitta.
(Findings of EMNLP 2022)
‘Actions’ play a vital role in how humans interact with the world. Thus, autonomous agents that would assist us in everyday tasks also require the capability to perform ‘Reasoning about Actions & Change’ (RAC). This has been an important research direction in Artificial Intelligence (AI) in general, but the study of RAC with visual and linguistic inputs is relatively recent. CLEVR_HYP is one such testbed for hypothetical vision-language reasoning with actions as the key focus. In this work, we propose a novel learning strategy that can improve reasoning about the effects of actions. We implement an encoder-decoder architecture to learn the representation of actions as vectors. We combine this encoder-decoder architecture with existing modality parsers and a scene graph question answering model to evaluate our proposed system on the CLEVR_HYP dataset. We conduct thorough experiments to demonstrate the effectiveness of our proposed approach and discuss its advantages over previous baselines in terms of performance, data efficiency, and generalization capability.
Yang, Y.; McGreggor, K.; Kunda, M.
(Proceedings of the Eighth Annual Conference on Advances in Cognitive Systems (ACS))
Analogical reasoning fundamentally involves exploiting redundancy in a given task, but there are many different ways an intelligent agent can choose to define and exploit redundancy, often resulting in very different levels of task performance. We explore such variations in analogical reasoning within the domain of geometric matrix reasoning tasks, namely on the Raven’s Standard Progressive Matrices intelligence test. We show how different analogical constructions used by the same basic visual-imagery-based computational model—varying only in how they “slice” a matrix problem into parts and do search and optimization within/across these parts—achieve very different levels of test performance, ranging from 13/60 correct all the way up to 57/60 correct. Our findings suggest that the ability to select or build effective high-level analogical constructions can be as important as an agent’s competencies in low-level reasoning skills, which raises interesting open questions about the extent to which building the “right” analogies might contribute to individual differences in human matrix reasoning performance, and how intelligent agents might learn to build or select from among different analogical constructions in the first place.
Artificial intelligence in the workplace is becoming increasingly common. These tools are sometimes used to aid users in performing their task, for example, when an artificial intelligence tool assists a radiologist in their search for abnormalities in radiographic images. The use of artificial intelligence brings a wealth of benefits, such as increasing the efficiency and efficacy of performance. However, little research has been conducted to determine how the use of artificial intelligence assistants might affect the user’s cognitive skills. In this theoretical perspective, we discuss how artificial intelligence assistants might accelerate skill decay among experts and hinder skill acquisition among learners. Further, we discuss how AI assistants might also prevent experts and learners from recognizing these deleterious effects. We then discuss the types of questions that use-inspired basic cognitive researchers, applied researchers, and computer science researchers should seek to answer. We conclude that multidisciplinary research from use-inspired basic cognitive research, domain-specific applied research, and technical research (e.g., human factors research, computer science research) is needed to (a) understand these potential consequences, (b) design artificial intelligence systems to mitigate these impacts, and (c) develop training and use protocols to prevent negative impacts on users’ cognitive skills. Only by answering these questions from multidisciplinary perspectives can we harness the benefits of artificial intelligence in the workplace while preventing negative impacts on users’ cognitive skills.
Kunda, Maithilee. AI, visual imagery, and a case study on the challenges posed by human intelligence tests. Proceedings of the National Academy of Sciences, 117 (47). https://doi.org/10.1073/pnas.1912335117
@article{osti_10202822,
title = {AI, visual imagery, and a case study on the challenges posed by human intelligence tests},
url = {https://par.nsf.gov/biblio/10202822},
DOI = {10.1073/pnas.1912335117},
abstractNote = {Observations abound about the power of visual imagery in human intelligence, from how Nobel prize-winning physicists make their discoveries to how children understand bedtime stories. These observations raise an important question for cognitive science, which is, what are the computations taking place in someone’s mind when they use visual imagery? Answering this question is not easy and will require much continued research across the multiple disciplines of cognitive science. Here, we focus on a related and more circumscribed question from the perspective of artificial intelligence (AI): If you have an intelligent agent that uses visual imagery-based knowledge representations and reasoning operations, then what kinds of problem solving might be possible, and how would such problem solving work? We highlight recent progress in AI toward answering these questions in the domain of visuospatial reasoning, looking at a case study of how imagery-based artificial agents can solve visuospatial intelligence tests. In particular, we first examine several variations of imagery-based knowledge representations and problem-solving strategies that are sufficient for solving problems from the Raven’s Progressive Matrices intelligence test. We then look at how artificial agents, instead of being designed manually by AI researchers, might learn portions of their own knowledge and reasoning procedures from experience, including learning visuospatial domain knowledge, learning and generalizing problem-solving strategies, and learning the actual definition of the task in the first place.},
journal = {Proceedings of the National Academy of Sciences},
volume = {117},
number = {47},
publisher = {Proceedings of the National Academy of Sciences},
author = {Kunda, Maithilee},
}