Title: Detection of Temporal Shifts in Semantics Using Local Graph Clustering
Many changes in our digital corpus have been brought about by the interplay between rapid advances in digital communication and the current environment characterized by pandemics, political polarization, and social unrest. One such change is the pace at which new words enter the mass vocabulary and the frequency at which the meanings, perceptions, and interpretations of existing expressions change. Current state-of-the-art algorithms do not allow for an intuitive and rigorous detection of these changes in word meanings over time. We propose a dynamic graph-theoretic approach to inferring the semantics of words and phrases (“terms”) and detecting temporal shifts. Our approach represents each term as a stochastic, time-evolving set of contextual words and is, by nature, a count-based distributional semantic model. We use local clustering techniques to assess structural changes in a given word’s contextual words. We demonstrate the efficacy of our method by investigating changes in the semantics of the phrase “Chinavirus”. We conclude that the term took on a much more pejorative meaning when the White House used it in the second half of March 2020, although the effect appears to have been temporary. We make both the dataset and the code used to generate this paper’s results publicly available.
Award ID(s): 2154564
PAR ID: 10417882
Journal Name: Machine Learning and Knowledge Extraction
Volume: 5
Issue: 1
ISSN: 2504-4990
Page Range / eLocation ID: 128–143
Format(s): Medium: X
Sponsoring Org: National Science Foundation
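The abstract above does not give implementation details, but its core idea — model a term by the local cluster of its contextual words in a co-occurrence graph, then track how that cluster changes across time slices — can be sketched roughly. Everything below (the window size, personalized PageRank as the local-clustering primitive, and Jaccard overlap as the drift score) is an illustrative assumption, not the authors' actual method.

```python
from collections import Counter, defaultdict

def cooccurrence_graph(sentences, window=5):
    """Weighted co-occurrence graph: edge weight counts how often two
    words appear within `window` tokens of each other."""
    g = defaultdict(Counter)
    for sent in sentences:
        toks = sent.lower().split()
        for i, w in enumerate(toks):
            for v in toks[i + 1 : i + 1 + window]:
                if v != w:
                    g[w][v] += 1
                    g[v][w] += 1
    return g

def personalized_pagerank(g, seed, alpha=0.15, iters=50):
    """Random walk with restart from `seed`; high scores mark the
    seed's local cluster of contextual words."""
    scores = {seed: 1.0}
    for _ in range(iters):
        nxt = defaultdict(float)
        nxt[seed] += alpha  # restart mass always returns to the seed
        for u, s in scores.items():
            total = sum(g[u].values()) or 1
            for v, w in g[u].items():
                nxt[v] += (1 - alpha) * s * w / total
        scores = dict(nxt)
    return scores

def context_set(g, term, k=10):
    """Top-k contextual words for `term` by personalized PageRank score."""
    pr = personalized_pagerank(g, term)
    pr.pop(term, None)  # the term itself is not its own context
    return {w for w, _ in sorted(pr.items(), key=lambda x: -x[1])[:k]}

def semantic_drift(slice_a, slice_b, term, k=10):
    """1 minus the Jaccard overlap of the term's context sets in two
    time slices: 0 = identical contexts, 1 = disjoint contexts."""
    a = context_set(cooccurrence_graph(slice_a), term, k)
    b = context_set(cooccurrence_graph(slice_b), term, k)
    return 1 - (len(a & b) / len(a | b) if a | b else 1.0)
```

On two toy time slices where "virus" co-occurs with epidemiological words in one period and accusatory words in the next, the context sets become disjoint and the drift score rises toward 1 — the qualitative signature the paper looks for around March 2020.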
More Like this
  1. Images can give us insights into the contextual meanings of words, but current image-text grounding approaches require detailed annotations. Such granular annotation is rare, expensive, and unavailable in most domain-specific contexts. In contrast, unlabeled multi-image, multi-sentence documents are abundant. Can lexical grounding be learned from such documents, even though they have significant lexical and visual overlap? Working with a case study dataset of real estate listings, we demonstrate the challenge of distinguishing highly correlated grounded terms, such as “kitchen” and “bedroom”, and introduce metrics to assess this document similarity. We present a simple unsupervised clustering-based method that increases precision and recall beyond object detection and image tagging baselines when evaluated on labeled subsets of the dataset. The proposed method is particularly effective for local contextual meanings of a word, for example associating “granite” with countertops in the real estate dataset and with rocky landscapes in a Wikipedia dataset. 
  2. Computational models of distributional semantics can analyze a corpus to derive representations of word meanings in terms of each word’s relationship to all other words in the corpus. While these models are sensitive to topic (e.g., tiger and stripes) and synonymy (e.g., soar and fly), the models have limited sensitivity to part of speech (e.g., book and shirt are both nouns). By augmenting a holographic model of semantic memory with additional levels of representations, we present evidence that sensitivity to syntax is supported by exploiting associations between words at varying degrees of separation. We find that sensitivity to associations at three degrees of separation reinforces the relationships between words that share part-of-speech and improves the ability of the model to construct grammatical sentences. Our model provides evidence that semantics and syntax exist on a continuum and emerge from a unitary cognitive system. 
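The idea in the abstract above — associations between words at varying degrees of separation — can be illustrated with walk counts on a word-adjacency matrix: words that never co-occur directly can still be strongly linked through shared neighbors. The toy corpus, bigram adjacency, and matrix-power formulation below are illustrative assumptions, not the paper's holographic model.

```python
import numpy as np

def build_adjacency(sentences):
    """Symmetric first-degree association matrix built from
    adjacent-word (bigram) pairs."""
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    idx = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(vocab)))
    for s in sentences:
        toks = s.lower().split()
        for a, b in zip(toks, toks[1:]):
            A[idx[a], idx[b]] += 1
            A[idx[b], idx[a]] += 1
    return A, idx

def assoc(A, idx, w1, w2, degree):
    """Association at `degree` steps of separation: the number of
    walks of that length linking the two words (entry of A**degree)."""
    return np.linalg.matrix_power(A, degree)[idx[w1], idx[w2]]
```

In a corpus like `["the cat sat", "the dog sat"]`, "cat" and "dog" never co-occur directly (zero first-degree association) but share the frame "the _ sat", so they connect at two degrees of separation; higher powers capture longer-range links. This indirect association between words filling the same slot is the kind of signal that tracks shared part of speech.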
  3. Knowing the perceived economic value of words is often desirable for applications such as product naming and pricing. However, the underlying economic worth of words remains poorly understood, even though there have been breakthroughs in learning the semantics of words. In this work, we bridge this gap by proposing a joint-task neural network model, the Word Worth Model (WWM), to learn word embeddings that capture underlying economic worth. Through the design of WWM, we incorporate contextual factors, e.g., a product’s brand name or a restaurant’s city, that may affect the aggregated monetary value of a textual item. Via a comprehensive evaluation, we show that, compared with other baselines, WWM accurately predicts missing words when given target words. We also show, through various visualization analyses, that the learned embeddings of both words and contextual factors reflect underlying economic worth well.
  4. Human reasoning goes beyond knowledge about individual entities, extending to inferences based on relations between entities. Here we focus on the use of relations in verbal analogical mapping, sketching a general approach based on assessing similarity between patterns of semantic relations between words. This approach combines research in artificial intelligence with work in psychology and cognitive science, with the aim of minimizing hand coding of text inputs for reasoning tasks. The computational framework takes as inputs vector representations of individual word meanings, coupled with semantic representations of the relations between words, and uses these inputs to form semantic-relation networks for individual analogues. Analogical mapping is operationalized as graph matching under cognitive and computational constraints. The approach highlights the central role of semantics in analogical mapping. 
  5. To what extent do people attribute meanings to “nonsense” words? How general is such attribution of meaning? We used a set of words lacking conventional meanings to elicit drawings of made‐up creatures. Separate groups of participants rated the nonsense words and the drawings on several semantic dimensions, and selected what name best corresponded to each creature. Despite lacking conventional meanings, “nonsense” words elicited a high level of consistency in the produced drawings. Meaning attributions made to nonsense words corresponded with meaning attributions made by separate people to drawings that were inspired by the name. Naïve participants were able to recover the name that inspired the drawing with greater‐than‐chance accuracy. These results suggest that people make liberal and consistent use of non‐arbitrary relationships between forms and meanings. 