

Title: ChainNet: Structured Metaphor and Metonymy in WordNet
The senses of a word exhibit rich internal structure. In a typical lexicon, this structure is overlooked: a word's senses are encoded as a list, without inter-sense relations. We present ChainNet, a lexical resource which, for the first time, explicitly identifies these structures by expressing how senses in the Open English WordNet are derived from one another. In ChainNet, every nominal sense of a word is either connected to another sense by metaphor or metonymy, or is disconnected (in the case of homonymy). Because WordNet senses are linked to resources which capture information about their meaning, ChainNet represents the first dataset of grounded metaphor and metonymy.
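The derivation structure described above can be pictured as a small linked data structure: each sense points to the sense it is derived from, labeled with the deriving relation. The sketch below is illustrative only; the class and field names, and the example senses, are hypothetical and not ChainNet's actual schema.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Sense:
    """One nominal sense of a word, with an optional derivation link."""
    synset_id: str                          # WordNet synset identifier
    parent: Optional["Sense"] = None        # sense this one is derived from
    relation: Optional[str] = None          # "metaphor", "metonymy", or None

def chain_root(sense: Sense) -> Sense:
    """Follow derivation links back to the core (underived) sense."""
    while sense.parent is not None:
        sense = sense.parent
    return sense

# Illustrative example: a river's "mouth" as a metaphorical extension
# of the anatomical sense (sense IDs are made up for the sketch).
core = Sense("mouth.n.01")
river = Sense("mouth.n.02", parent=core, relation="metaphor")
assert chain_root(river) is core
```

A homonymous sense would simply have `parent=None`, mirroring the "disconnected" case in the abstract.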
Award ID(s):
2019805
PAR ID:
10586887
Author(s) / Creator(s):
; ; ;
Editor(s):
Calzolari, N; Kan, M; Hoste, V; Lenci, A; Sakti, S; Xue, N
Publisher / Repository:
ELRA and ICCL
Date Published:
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. A word in natural language can be polysemous, having multiple meanings, as well as synonymous, meaning the same thing as other words. Word sense induction attempts to find the senses of polysemous words. Synonymy detection attempts to find when two words are interchangeable. We combine these tasks, first inducing word senses and then detecting similar senses to form word-sense synonym sets (synsets) in an unsupervised fashion. Given pairs of images and text with noun phrase labels, we perform synset induction to produce collections of underlying concepts described by one or more noun phrases. We find that considering multi-modal features from both visual and textual context yields better induced synsets than using either context alone. Human evaluations show that our unsupervised, multi-modally induced synsets are comparable in quality to annotation-assisted ImageNet synsets, achieving about 84% of ImageNet synsets' approval.
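The synset-induction idea in the abstract above, grouping noun phrases by fused visual and textual features, can be sketched minimally as follows. The concatenation fusion and the greedy cosine-threshold clustering are illustrative simplifications, not the authors' actual method.

```python
import numpy as np

def induce_synsets(phrases, text_vecs, image_vecs, threshold=0.8):
    """Greedily group phrases whose fused (text + image) vectors are similar.

    Each phrase joins the first existing group whose centroid it matches
    above `threshold` in cosine similarity, else it starts a new group.
    """
    fused = [np.concatenate([t, v]) for t, v in zip(text_vecs, image_vecs)]
    fused = [f / np.linalg.norm(f) for f in fused]   # unit-normalize
    synsets = []                                     # (members, centroid) pairs
    for phrase, vec in zip(phrases, fused):
        for members, centroid in synsets:
            sim = float(np.dot(vec, centroid / np.linalg.norm(centroid)))
            if sim >= threshold:
                members.append(phrase)
                centroid += vec                      # running centroid (in place)
                break
        else:
            synsets.append(([phrase], vec.copy()))
    return [members for members, _ in synsets]
```

With toy features where "cat" and "kitty" are close in both modalities and "car" is not, the function yields two groups, mirroring the paper's intuition that agreement across modalities signals a shared concept.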
  2. Abstract: The use of metaphor in cybersecurity discourse has become a topic of interest because of its ability to aid communication about abstract security concepts. In this paper, we borrow from existing metaphor identification algorithms and general theories to create a lightweight metaphor identification algorithm, which uses only one external source of knowledge. The algorithm also introduces a real-time corpus builder for extracting collocates; that is, identifying words that appear together more frequently than chance would predict. We implement several variations of the introduced algorithm and empirically evaluate the output using the TroFi dataset, a de facto evaluation dataset in metaphor research. We find, first, contrary to our expectation, that adding word sense disambiguation to our metaphor identification algorithm decreases its performance. Second, we find that our lightweight algorithms perform comparably to their existing, more complex counterparts. Finally, we present the results of several case studies to observe the utility of the algorithm for future research in linguistic metaphor identification in texts related to cybersecurity and cyber threats.
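Collocate extraction as defined in the abstract above (word pairs that co-occur more often than chance) is commonly scored with pointwise mutual information (PMI). The following is a minimal stand-in sketch, not the paper's real-time corpus builder; window size and thresholds are illustrative.

```python
import math
from collections import Counter

def collocates(sentences, window=2, min_count=2):
    """Score co-occurring word pairs with PMI over a small corpus.

    sentences: list of token lists. Returns {(w, v): pmi} for pairs seen
    at least `min_count` times within `window` tokens of each other.
    """
    word_freq = Counter()
    pair_freq = Counter()
    total = 0
    for tokens in sentences:
        word_freq.update(tokens)
        total += len(tokens)
        for i, w in enumerate(tokens):
            for v in tokens[i + 1 : i + 1 + window]:
                pair_freq[tuple(sorted((w, v)))] += 1
    scores = {}
    for (w, v), n in pair_freq.items():
        if n < min_count:
            continue
        # PMI: log of observed co-occurrence rate over chance expectation
        p_pair = n / total
        p_w, p_v = word_freq[w] / total, word_freq[v] / total
        scores[(w, v)] = math.log(p_pair / (p_w * p_v))
    return scores
```

A positive score means the pair co-occurs more often than independent draws would predict; pairs seen too rarely are filtered out, as their PMI estimates are unreliable.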
  3. Contextual word representations, typically trained on unstructured, unlabeled text, do not contain any explicit grounding to real-world entities and are often unable to remember facts about those entities. We propose a general method to embed multiple knowledge bases (KBs) into large-scale models, and thereby enhance their representations with structured, human-curated knowledge. For each KB, we first use an integrated entity linker to retrieve relevant entity embeddings, then update contextual word representations via a form of word-to-entity attention. In contrast to previous approaches, the entity linkers and self-supervised language modeling objective are jointly trained end-to-end in a multitask setting that combines a small amount of entity linking supervision with a large amount of raw text. After integrating WordNet and a subset of Wikipedia into BERT, the knowledge-enhanced BERT (KnowBert) demonstrates improved perplexity, ability to recall facts as measured in a probing task, and downstream performance on relationship extraction, entity typing, and word sense disambiguation. KnowBert's runtime is comparable to BERT's, and it scales to large KBs.
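The word-to-entity attention step described above can be sketched in a few lines: each contextual word vector attends over candidate entity embeddings and is updated with the attention-weighted mixture. This toy version (dot-product scores, a simple additive update) illustrates the idea only; it is not KnowBert's actual architecture, which also involves entity linking and joint training.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def word_to_entity_attention(words, entities):
    """words: (n, d) contextual vectors; entities: (m, d) KB embeddings.

    Returns word vectors enhanced with an attention-weighted summary of
    the entity embeddings.
    """
    scores = words @ entities.T     # (n, m) attention logits
    weights = softmax(scores)       # each word attends over all entities
    knowledge = weights @ entities  # (n, d) entity-weighted summary
    return words + knowledge        # knowledge-enhanced representations
```

In the real system the candidate entities come from an entity linker rather than the whole KB, which is what lets the approach scale to large knowledge bases.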
  4. Animacy is a necessary property for a referent to be an agent, and thus animacy detection is useful for a variety of natural language processing tasks, including word sense disambiguation, co-reference resolution, semantic role labeling, and others. Prior work treated animacy as a word-level property, and has developed statistical classifiers to classify words as either animate or inanimate. We discuss why this approach to the problem is ill-posed, and present a new approach based on classifying the animacy of co-reference chains. We show that simple voting approaches to inferring the animacy of a chain from its constituent words perform relatively poorly, and then present a hybrid system merging supervised machine learning (ML) and a small number of hand-built rules to compute the animacy of referring expressions and co-reference chains. This method achieves state-of-the-art performance. The supervised ML component leverages features such as word embeddings over referring expressions, parts of speech, and grammatical and semantic roles. The rules take into consideration parts of speech and the hypernymy structure encoded in WordNet. The system achieves an F1 of 0.88 for classifying the animacy of referring expressions, which is comparable to state-of-the-art results for classifying the animacy of words, and achieves an F1 of 0.75 for classifying the animacy of co-reference chains themselves. We release our training and test dataset, which includes 142 texts (all narratives) comprising 156,154 words, 34,698 referring expressions, and 10,941 co-reference chains. We test the method on a subset of the OntoNotes dataset, showing, using manual sampling, that animacy classification is 90% +/- 2% accurate for co-reference chains, and 92% +/- 1% for referring expressions.
The data also contains 46 folktales, which present an interesting challenge because they often involve characters who are members of traditionally inanimate classes (e.g., stoves that walk, trees that talk). We show that our system is able to detect the animacy of these unusual referents with an F1 of 0.95. 
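The simple voting baseline that the abstract reports as performing poorly can be sketched as follows: take word-level animacy judgments for a chain's mentions and let the majority decide. The toy lexicon lookup here is a hypothetical stand-in for a word-level classifier; the paper's hybrid ML-plus-rules system replaces exactly this step.

```python
# Hypothetical toy lexicon standing in for a word-level animacy classifier.
ANIMATE_WORDS = {"dog", "girl", "wizard", "stove"}

def word_is_animate(word: str) -> bool:
    """Word-level animacy judgment (lexicon lookup as a stand-in)."""
    return word.lower() in ANIMATE_WORDS

def chain_is_animate(chain):
    """Majority vote over a co-reference chain's constituent words."""
    votes = sum(word_is_animate(w) for w in chain)
    return votes * 2 > len(chain)
```

The weakness of this baseline is visible even in the sketch: pronouns and descriptive mentions ("she", "the old one") carry no lexicon signal, so long chains of uninformative mentions swamp the vote, which is one reason the hybrid system over referring expressions does better.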
  5.
    Abstract: Phonological alternations are often specific to morphosyntactic context. For example, stress shift in English occurs in the presence of some suffixes, such as -al, but not others, such as -ing. In some cases a phonological process applies only in words of certain lexical categories. Previous theories have stipulated that such morphosyntactically conditioned phonology is word-bounded. In this paper we present a number of long-distance morphologically conditioned phonological effects: cases where phonological processes within one word are conditioned by another word or by the presence of a morpheme in another word. We provide a model, Cophonologies by Phase, which extends Cophonology Theory, intended to capture word-internal and lexically specified phonological alternations, to cyclically generated syntactic constituents. We show that Cophonologies by Phase makes better predictions about the long-distance morphologically conditioned phonological effects we find across languages than previous frameworks. Furthermore, Cophonologies by Phase derives such effects without requiring the phonological component to directly reference syntactic features or structure.