
Title: Causal Event Graph-Guided Language-based Spatiotemporal Question Answering
Large Language Models have excelled at encoding and leveraging language patterns in large text-based corpora for various tasks, including spatiotemporal event-based question answering (QA). However, because they encode only a text-based projection of the world, they have also been shown to lack a full-bodied understanding of such events, e.g., a sense of intuitive physics and of cause-and-effect relationships among events. In this work, we propose using causal event graphs (CEGs) to enhance language models' understanding of spatiotemporal events, with a novel approach that also provides proofs of the model's capture of the CEGs. A CEG consists of nodes denoting events and edges denoting cause-and-effect relationships among those events. We evaluate our approach on benchmark spatiotemporal QA tasks and show effective performance, both quantitative and qualitative, over state-of-the-art baseline methods.
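    The abstract describes a CEG only at the level of nodes and edges, so the following is a minimal sketch of what such a structure might look like when linearized for a language-model prompt; the CausalEventGraph class, its fields, and the to_prompt format are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a causal event graph (CEG): nodes are events, directed
# edges denote cause -> effect relations. Class name and serialization format
# are illustrative assumptions, not the paper's code.
from dataclasses import dataclass, field

@dataclass
class CausalEventGraph:
    events: dict = field(default_factory=dict)   # event_id -> description
    causes: list = field(default_factory=list)   # (cause_id, effect_id) pairs

    def add_event(self, event_id, description):
        self.events[event_id] = description

    def add_cause(self, cause_id, effect_id):
        self.causes.append((cause_id, effect_id))

    def to_prompt(self):
        """Linearize the CEG into text that could be prepended to a QA prompt."""
        return "\n".join(
            f"{self.events[c]} -> causes -> {self.events[e]}"
            for c, e in self.causes
        )

# Example: a two-event graph for a spatiotemporal QA prompt.
ceg = CausalEventGraph()
ceg.add_event("e1", "the ball rolls off the table")
ceg.add_event("e2", "the ball falls to the floor")
ceg.add_cause("e1", "e2")
print(ceg.to_prompt())
```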
Award ID(s):
2335967
PAR ID:
10530765
Author(s) / Creator(s):
; ; ; ; ;
Publisher / Repository:
AAAI
Date Published:
Journal Name:
Proceedings of the AAAI Symposium Series
Volume:
3
Issue:
1
ISSN:
2994-4317
Page Range / eLocation ID:
227 to 233
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    Tracking entities throughout a procedure described in a text is challenging due to the dynamic nature of the world described in the process. Firstly, we propose to formulate this task as a question answering problem. This enables us to use transformer-based language models pre-trained on other QA benchmarks by adapting them to procedural text understanding. Secondly, since transformer-based language models cannot encode the flow of events by themselves, we propose a Time-Stamped Language Model (TSLM) to encode event information in the LM architecture by introducing a timestamp encoding. Evaluated on the ProPara dataset, our model improves on the published state-of-the-art results with a 3.1% increase in F1 score. Moreover, our model yields better results on the location prediction task on the NPN-Cooking dataset. These results indicate that our approach is effective for procedural text understanding in general.
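    As a rough illustration of the timestamp-encoding idea (the exact TSLM formulation is not given in this abstract), one way to expose step information to a QA-style model is to tag each step of the procedure relative to the query timestep before encoding; the tag vocabulary and prompt layout below are assumptions for illustration only.

```python
# Hedged sketch: mark each step of a procedure as past/current/future relative
# to a query timestep, so a QA model can condition on the flow of events.
# The tag names and layout are assumptions, not the TSLM paper's exact scheme.
def timestamp_encode(steps, current_step, question):
    tagged = []
    for i, step in enumerate(steps):
        if i < current_step:
            tag = "[PAST]"
        elif i == current_step:
            tag = "[NOW]"
        else:
            tag = "[FUTURE]"
        tagged.append(f"{tag} {step}")
    # The tagged procedure plus the question form the input to the QA model.
    return question + "\n" + " ".join(tagged)

steps = ["Water is poured into a kettle.",
         "The kettle is heated.",
         "Steam rises from the kettle."]
print(timestamp_encode(steps, current_step=1, question="Where is the water?"))
```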
  2. Recent work has shown that prompting language models with code-like representations of natural language leads to performance improvements on structured reasoning tasks. However, such tasks comprise only a small subset of all natural language tasks. In our work, we seek to answer whether or not code-prompting is the preferred way of interacting with language models in general. We compare code and text prompts across three popular GPT models (davinci, code-davinci-002, and text-davinci-002) on a broader selection of tasks (e.g., QA, sentiment, summarization) and find that with few exceptions, code prompts do not consistently outperform text prompts. Furthermore, we show that the style of code prompt has a large effect on performance for some (but not all) tasks and that fine-tuning on text instructions leads to better relative performance of code prompts. 
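    For a concrete sense of the comparison, here is one way a text prompt and a code-like prompt for the same sentiment task might differ; both templates are illustrative assumptions rather than the prompts used in the paper.

```python
# Illustrative contrast between a plain-text prompt and a code-like prompt for
# the same task; these templates are assumptions, not the paper's prompts.
review = "The movie was slow but the acting was superb."

text_prompt = (
    f"Review: {review}\n"
    "Question: Is the sentiment of this review positive or negative?\n"
    "Answer:"
)

code_prompt = (
    "# Classify the sentiment of the review as 'positive' or 'negative'.\n"
    f'review = "{review}"\n'
    "sentiment ="
)

# Either string would be sent to a completion-style model (e.g., one of the
# GPT variants compared in the paper) and the continuation parsed as the label.
print(text_prompt)
print(code_prompt)
```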
  3. Commonsense question answering has primarily been tackled through supervised transfer learning, where a language model pre-trained on large amounts of data is used as the starting point. While successful, the approach requires large amounts of labeled question-answer pairs, with increasingly larger amounts of data required as the complexity of scenarios or tasks such as commonsense QA increases. In this paper, we hypothesize that large-scale pre-training of language models encodes the necessary commonsense knowledge to answer common questions in context without labeled data. We propose a novel framework called Iterative Self Distillation for QA (ISD-QA), which extracts the “dark knowledge” encoded during large-scale pre-training of language models to provide supervision for commonsense question answering. We show that the approach can be used to train common neural QA models for commonsense question answering by distilling knowledge from language models in an unsupervised manner. With no bells and whistles, we achieve an average of 68% of the performance of fully supervised QA models while requiring no labeled training data. Extensive experiments on three public benchmarks (OpenBookQA, HellaSWAG, and CommonsenseQA) show the effectiveness of the proposed approach.
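    The notion of extracting "dark knowledge" can be pictured as using a pre-trained LM's own likelihoods over answer choices as soft training targets for a student QA model; the sketch below, including the likelihood scorer and temperature, is an assumed simplification rather than the ISD-QA procedure itself.

```python
# Hedged sketch: score each answer choice by an LM's likelihood of the
# question-answer pair, then use the softmax over those scores as soft labels
# for an unsupervised (self-distilled) QA model. A simplification, not ISD-QA itself.
import math

def soft_labels(question, choices, lm_log_likelihood, temperature=2.0):
    """lm_log_likelihood(text) -> total log-probability under a pre-trained LM."""
    scores = [lm_log_likelihood(f"{question} {c}") for c in choices]
    exps = [math.exp(s / temperature) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]  # distribution to distill into a student QA model

# Toy stand-in for an LM scorer (in practice, sum token log-probs from a model).
fake_lm = lambda text: -len(text) / 10.0
print(soft_labels("Plants produce energy through",
                  ["photosynthesis", "combustion", "fermentation"], fake_lm))
```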
  4. Retrieval and recommendation are two essential tasks in modern search tools. This paper introduces a novel retrieval-reranking framework leveraging large language models to enhance the mining and recommendation of spatiotemporally and semantically associated, relevant, and unusual climate and environmental events described in news articles and web posts. This framework uses advanced natural language processing techniques to address the limitations of traditional manual curation methods, namely high labor costs and lack of scalability. Specifically, we explore an optimized solution that employs cutting-edge embedding models to semantically analyze spatiotemporal events (news) and propose a Geo-Time Re-ranking strategy that integrates multi-faceted criteria, including spatial proximity, temporal association, semantic similarity, and category-instructed similarity, to rank and identify similar spatiotemporal events. We apply the proposed framework to a dataset of four thousand local environmental observer network events, achieving top performance in recommending similar events among multiple cutting-edge dense retrieval models. The search and recommendation pipeline can be applied to a wide range of similar data search tasks dealing with geospatial and temporal data. We hope that by linking relevant events, we can better help the general public gain an enhanced understanding of climate change and its impact on different communities.
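    The multi-faceted ranking criterion described above can be pictured as a weighted combination of the four listed signals; the weights, decay constants, and function names below are assumptions for illustration, not the paper's Geo-Time Re-ranking formula.

```python
# Hedged sketch of a Geo-Time style re-ranking score combining spatial proximity,
# temporal association, semantic similarity, and category agreement.
# Weights and normalization constants are illustrative assumptions.
import math
from datetime import date

def rerank_score(query, candidate, w_space=0.3, w_time=0.2, w_sem=0.4, w_cat=0.1):
    # Spatial proximity: decay with distance (toy degrees-to-km conversion).
    dist_km = math.dist(query["latlon"], candidate["latlon"]) * 111.0
    spatial = math.exp(-dist_km / 500.0)

    # Temporal association: decay with the gap in days between event dates.
    gap_days = abs((query["date"] - candidate["date"]).days)
    temporal = math.exp(-gap_days / 90.0)

    # Semantic similarity: assumed to come from embedding cosine similarity upstream.
    semantic = candidate["cosine_sim"]

    # Category-instructed similarity: 1 if categories match, else 0 (toy version).
    category = 1.0 if query["category"] == candidate["category"] else 0.0

    return w_space * spatial + w_time * temporal + w_sem * semantic + w_cat * category

q = {"latlon": (64.8, -147.7), "date": date(2023, 6, 1), "category": "wildfire"}
c = {"latlon": (65.0, -147.0), "date": date(2023, 6, 20), "category": "wildfire",
     "cosine_sim": 0.82}
print(rerank_score(q, c))
```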
  5.
    Automated event extraction in social science applications often requires corpus-level evaluations: for example, aggregating text predictions across metadata and unbiased estimates of recall. We combine corpus-level evaluation requirements with a real-world, social science setting and introduce the IndiaPoliceEvents corpus—all 21,391 sentences from 1,257 English-language Times of India articles about events in the state of Gujarat during March 2002. Our trained annotators read and label every document for mentions of police activity events, allowing for unbiased recall evaluations. In contrast to other datasets with structured event representations, we gather annotations by posing natural questions, and evaluate off-the-shelf models for three different tasks: sentence classification, document ranking, and temporal aggregation of target events. We present baseline results from zero-shot BERT-based models fine-tuned on natural language inference and passage retrieval tasks. Our novel corpus-level evaluations and annotation approach can guide creation of similar social-science-oriented resources in the future. 
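    As an example of the zero-shot, NLI-based baseline setup described, sentence-level classification can be run without task-specific training by casting the event question as an entailment hypothesis; the checkpoint and label phrasing below are assumptions, not the specific models evaluated in the paper.

```python
# Hedged sketch: zero-shot sentence classification via an NLI-fine-tuned model,
# posing the event question as an entailment hypothesis. The checkpoint and
# hypothesis wording are assumptions, not the paper's exact configuration.
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

sentence = "Police fired tear gas to disperse the crowd near the market."
result = classifier(
    sentence,
    candidate_labels=["police used force", "no police activity"],
    hypothesis_template="This sentence describes an event where {}.",
)
print(result["labels"][0], result["scores"][0])
```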