skip to main content

Title: Emergence of analogy from relation learning

By middle childhood, humans are able to learn abstract semantic relations (e.g., antonym, synonym, category membership) and use them to reason by analogy. A deep theoretical challenge is to show how such abstract relations can arise from nonrelational inputs, thereby providing key elements of a protosymbolic representation system. We have developed a computational model that exploits the potential synergy between deep learning from “big data” (to create semantic features for individual words) and supervised learning from “small data” (to create representations of semantic relations between words). Given as inputs labeled pairs of lexical representations extracted by deep learning, the model creates augmented representations by remapping features according to the rank of differences between values for the two words in each pair. These augmented representations aid in coping with the feature alignment problem (e.g., matching those features that make “love-hate” an antonym with the different features that make “rich-poor” an antonym). The model extracts weight distributions that are used to estimate the probabilities that new word pairs instantiate each relation, capturing the pattern of human typicality judgments for a broad range of abstract semantic relations. A measure of relational similarity can be derived and used to solve simple verbal analogies with more » human-level accuracy. Because each acquired relation has a modular representation, basic symbolic operations are enabled (notably, the converse of any learned relation can be formed without additional training). Abstract semantic relations can be induced by bootstrapping from nonrelational inputs, thereby enabling relational generalization and analogical reasoning.

« less
Authors:
; ;
Award ID(s):
1827374
Publication Date:
NSF-PAR ID:
10086056
Journal Name:
Proceedings of the National Academy of Sciences
Volume:
116
Issue:
10
Page Range or eLocation-ID:
p. 4176-4181
ISSN:
0027-8424
Publisher:
Proceedings of the National Academy of Sciences
Sponsoring Org:
National Science Foundation
More Like this
  1. The ability to recognize and make inductive inferences based on relational similarity is fundamental to much of human higher cognition. However, relational similarity is not easily defined or measured, which makes it difficult to determine whether individual differences in cognitive capacity or semantic knowledge impact relational processing. In two experiments, we used a multi-arrangement task (previously applied to individual words or objects) to efficiently assess similarities between word pairs instantiating various abstract relations. Experiment 1 established that the method identifies word pairs expressing the same relation as more similar to each other than to those expressing different relations. Experiment 2more »extended these results by showing that relational similarity measured by the multi-arrangement task is sensitive to more subtle distinctions. Word pairs instantiating the same specific subrelation were judged as more similar to each other than to those instantiating different subrelations within the same general relation type. In addition, Experiment 2 found that individual differences in both fluid intelligence and crystalized verbal intelligence correlated with differentiation of relation similarity judgments.« less
  2. Definition Extraction (DE) is one of the well-known topics in Information Extraction that aims to identify terms and their corresponding definitions in unstructured texts. This task can be formalized either as a sentence classification task (i.e., containing term-definition pairs or not) or a sequential labeling task (i.e., identifying the boundaries of the terms and definitions). The previous works for DE have only focused on one of the two approaches, failing to model the inter-dependencies between the two tasks. In this work, we propose a novel model for DE that simultaneously performs the two tasks in a single framework to benefitmore »from their inter-dependencies. Our model features deep learning architectures to exploit the global structures of the input sentences as well as the semantic consistencies between the terms and the definitions, thereby improving the quality of the representation vectors for DE. Besides the joint inference between sentence classification and sequential labeling, the proposed model is fundamentally different from the prior work for DE in that the prior work has only employed the local structures of the input sentences (i.e., word-to-word relations), and not yet considered the semantic consistencies between terms and definitions. In order to implement these novel ideas, our model presents a multi-task learning framework that employs graph convolutional neural networks and predicts the dependency paths between the terms and the definitions. We also seek to enforce the consistency between the representations of the terms and definitions both globally (i.e., increasing semantic consistency between the representations of the entire sentences and the terms/definitions) and locally (i.e., promoting the similarity between the representations of the terms and the definitions). The extensive experiments on three benchmark datasets demonstrate the effectiveness of our approach.1« less
  3. Scene graph generation refers to the task of automatically mapping an image into a semantic structural graph, which requires correctly labeling each extracted object and their interaction relationships. Despite the recent success in object detection using deep learning techniques, inferring complex contextual relationships and structured graph representations from visual data remains a challenging topic. In this study, we propose a novel Attentive Relational Network that consists of two key modules with an object detection backbone to approach this problem. The first module is a semantic transformation module utilized to capture semantic embedded relation features, by translating visual features and linguisticmore »features into a common semantic space. The other module is a graph self-attention module introduced to embed a joint graph representation through assigning various importance weights to neighboring nodes. Finally, accurate scene graphs are produced by the relation inference module to recognize all entities and the corresponding relations. We evaluate our proposed method on the widely-adopted Visual Genome dataset, and the results demonstrate the effectiveness and superiority of our model.« less
  4. Computational models of verbal analogy and relational similarity judgments can employ different types of vector representations of word meanings (embeddings) generated by machine-learning algorithms. An important question is whether human-like relational processing depends on explicit representations of relations (i.e., representations separable from those of the concepts being related), or whether implicit relation representations suffice. Earlier machine-learning models produced static embeddings for individual words, identical across all contexts. However, more recent Large Language Models (LLMs), which use transformer architectures applied to much larger training corpora, are able to produce contextualized embeddings that have the potential to capture implicit knowledge of semanticmore »relations. Here we compare multiple models based on different types of embeddings to human data concerning judgments of relational similarity and solutions of verbal analogy problems. For two datasets, a model that learns explicit representations of relations, Bayesian Analogy with Relational Transformations (BART), captured human performance more successfully than either a model using static embeddings (Word2vec) or models using contextualized embeddings created by LLMs (BERT, RoBERTa, and GPT-2). These findings support the proposal that human thinking depends on representations that separate relations from the concepts they relate.« less
  5. Analogy problems involving multiple ordered relations of the same type create mapping ambiguity, requiring some mechanism for relational integration to achieve mapping accuracy. We address the question of whether the integration of ordered relations depends on their logical form alone, or on semantic representations that differ across relation types. We developed a triplet mapping task that provides a basic paradigm to investigate analogical reasoning with simple relational structures. Experimental results showed that mapping performance differed across orderings based on category, linear order, and causal relations, providing evidence that each transitive relation has its own semantic representation. Hence, human analogical mappingmore »of ordered relations does not depend solely on their formal property of transitivity. Instead, human ability to solve mapping problems by integrating relations relies on the semantics of relation representations. We also compared human performance to the performance of several vector-based computational models of analogy. These models performed above chance but fell short of human performance for some relations, highlighting the need for further model development.« less