
Title: The Curious Case of Combining Text and Visualization
Visualization research has made significant progress in demonstrating the value of graphical data representation. Even so, the value added by static visualization is disputed in some areas. When presenting Bayesian reasoning information, for example, some studies suggest that combining text and visualizations could have an interactive effect. In this paper, we use eye tracking to compare how people extract information from text and visualization. Using a Bayesian reasoning problem as a test bed, we provide evidence that visualization makes it easier to identify critical information, but that once identified as critical, information is more easily extracted from the text. These tendencies persist even when text and visualization are presented together, indicating that users do not integrate information well across the two representation types. We discuss these findings and argue that effective representations should consider the ease of both information identification and extraction.
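For readers unfamiliar with the test bed, a Bayesian reasoning problem of this kind asks for a posterior probability given a base rate, a hit rate, and a false-alarm rate. The sketch below works one such calculation; the numbers are illustrative and are not the stimulus used in the study.

    # Illustrative Bayesian reasoning problem of the kind used in such studies.
    # The specific values are hypothetical, not the paper's actual stimulus.

    def posterior(base_rate: float, hit_rate: float, false_alarm_rate: float) -> float:
        """P(condition | positive test) via Bayes' rule."""
        p_positive = hit_rate * base_rate + false_alarm_rate * (1 - base_rate)
        return hit_rate * base_rate / p_positive

    # Example: 1% base rate, 80% hit rate, 9.6% false-alarm rate.
    print(f"P(condition | positive) = {posterior(0.01, 0.80, 0.096):.3f}")  # ~0.078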
Award ID(s):
1755734
PAR ID:
10334141
Author(s) / Creator(s):
Date Published:
Journal Name:
EuroVis 2019 - Short Papers
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Spatial reasoning over text is challenging because models not only need to extract direct spatial information from the text but must also reason over it and infer implicit spatial relations. Recent studies highlight the struggles even large language models encounter when performing spatial reasoning over text. In this paper, we explore the potential benefits of disentangling the processes of information extraction and reasoning in models to address this challenge. To explore this, we design various models that disentangle extraction and reasoning (either symbolic or neural) and compare them with state-of-the-art (SOTA) baselines with no explicit design for these parts. Our experimental results consistently demonstrate the efficacy of disentangling, showcasing its ability to enhance models' generalizability within realistic data domains.
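A minimal sketch of the disentangling idea described above, assuming a two-stage pipeline: a (hypothetical) extractor produces spatial-relation triples, and a separate symbolic reasoner closes them under simple rules to surface implicit relations. This is not the paper's models, only an illustration of separating extraction from reasoning.

    # Illustrative sketch (not the paper's models): separate spatial-relation
    # extraction from symbolic reasoning over the extracted facts.
    from itertools import product

    # Step 1 (extraction): relations a hypothetical extractor pulled from text.
    facts = {("book", "left_of", "lamp"), ("lamp", "left_of", "clock")}

    def infer(facts):
        """Step 2 (reasoning): close the facts under simple spatial rules."""
        inferred = set(facts)
        changed = True
        while changed:
            changed = False
            # Transitivity: left_of(a, b) and left_of(b, c) -> left_of(a, c)
            for (a, r1, b), (c, r2, d) in product(list(inferred), repeat=2):
                if r1 == r2 == "left_of" and b == c and (a, "left_of", d) not in inferred:
                    inferred.add((a, "left_of", d))
                    changed = True
            # Inverse: left_of(a, b) -> right_of(b, a)
            for (a, r, b) in list(inferred):
                if r == "left_of" and (b, "right_of", a) not in inferred:
                    inferred.add((b, "right_of", a))
                    changed = True
        return inferred

    print(infer(facts) - facts)  # implicit relations, e.g. ("book", "left_of", "clock")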
  2. This thesis investigates the computational modeling of belief and related cognitive states as expressed in text and speech. Understanding how speakers or authors convey commitment, certainty, and emotions is crucial for language understanding, yet poses significant challenges for current NLP systems. We present a comprehensive study spanning multiple facets of belief prediction. We begin by re-examining the widely used FactBank corpus, correcting a critical projection error and establishing new state-of-the-art results for author-only belief prediction through multi-task learning and error analysis. We then tackle the more complex task of source-and-target belief prediction, introducing a novel generative framework using Flan-T5. This includes developing a structured database representation for FactBank and proposing a linearized tree generation approach, culminating in the BeLeaf system for visualization and analysis, which achieves state-of-the-art performance on both FactBank and the MDP corpus. With the rise of large language models (LLMs), we investigate their zero-shot capabilities for the source-and-target belief task. We propose Unified and Hybrid prompting frameworks, finding that while current LLMs struggle, particularly with nested beliefs, our Hybrid approach paired with reasoning-focused LLMs achieves new state-of-the-art results on FactBank. Finally, we explore the role of multimodality among multiple cognitive states. We present the first study on multimodal belief prediction using the CB-Prosody corpus, demonstrating that integrating audio features via fine-tuned Whisper models significantly improves performance over text-only BERT models. We further introduce Synthetic Audio Data (SAD), showing that even synthetic audio generated by TTS systems provides orthogonal, beneficial signals for various cognitive state tasks (belief, emotion, sentiment). We conclude by presenting OmniVox, the first systematic evaluation of omni-LLMs for zero-shot emotion recognition directly from audio, demonstrating their competitiveness with fine-tuned models and analyzing their acoustic reasoning capabilities. 
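To make the source-and-target belief task concrete, the sketch below shows one possible representation of belief annotations and a simple linearization of the kind a seq2seq model could emit. The label strings follow FactBank's factuality values (e.g., CT+, PS+), but the nested-source notation and the linearized format are illustrative, not the thesis's actual scheme.

    # Illustrative source-and-target belief representation and a simple
    # linearization, in the spirit of (but not identical to) the thesis's approach.
    from dataclasses import dataclass

    @dataclass
    class Belief:
        source: str   # who holds the belief, e.g. "AUTHOR" or nested "AUTHOR::John"
        target: str   # the event or proposition the belief is about
        label: str    # a factuality label such as "CT+" (certain it holds)

    def linearize(beliefs):
        """Flatten belief annotations into one string a seq2seq model could generate."""
        return " | ".join(f"({b.source} -> {b.target} : {b.label})" for b in beliefs)

    beliefs = [
        Belief("AUTHOR", "said", "CT+"),          # the author is certain the saying event happened
        Belief("AUTHOR::John", "leave", "PS+"),   # according to the author, John considers leaving possible
    ]
    print(linearize(beliefs))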
  3. Proc. 2023 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining
    Representation learning on networks aims to derive a meaningful vector representation for each node, thereby facilitating downstream tasks such as link prediction, node classification, and node clustering. In heterogeneous text-rich networks, this task is more challenging due to (1) presence or absence of text: Some nodes are associated with rich textual information, while others are not; (2) diversity of types: Nodes and edges of multiple types form a heterogeneous network structure. As pretrained language models (PLMs) have demonstrated their effectiveness in obtaining widely generalizable text representations, a substantial amount of effort has been made to incorporate PLMs into representation learning on text-rich networks. However, few of them can jointly consider heterogeneous structure (network) information as well as rich textual semantic information of each node effectively. In this paper, we propose Heterformer, a Heterogeneous Network-Empowered Transformer that performs contextualized text encoding and heterogeneous structure encoding in a unified model. Specifically, we inject heterogeneous structure information into each Transformer layer when encoding node texts. Meanwhile, Heterformer is capable of characterizing node/edge type heterogeneity and encoding nodes with or without texts. We conduct comprehensive experiments on three tasks (i.e., link prediction, node classification, and node clustering) on three large-scale datasets from different domains, where Heterformer outperforms competitive baselines significantly and consistently. 
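A minimal sketch of the core idea stated above, namely injecting network-structure information into text encoding inside a Transformer layer. Here an aggregated neighbor embedding is prepended as an extra token so text tokens can attend to structural context; this is a simplified stand-in, not the authors' Heterformer implementation.

    # Minimal sketch (not the authors' implementation): let text tokens attend to
    # a learned "neighbor" embedding inside a Transformer layer.
    import torch
    import torch.nn as nn

    class StructureAwareLayer(nn.Module):
        def __init__(self, dim: int = 64, heads: int = 4):
            super().__init__()
            self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
            self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            self.norm1, self.norm2 = nn.LayerNorm(dim), nn.LayerNorm(dim)

        def forward(self, text_tokens, neighbor_emb):
            # Prepend the aggregated-neighbor embedding as an extra token so that
            # every text token can attend to structural context.
            x = torch.cat([neighbor_emb.unsqueeze(1), text_tokens], dim=1)
            attn_out, _ = self.attn(x, x, x)
            x = self.norm1(x + attn_out)
            x = self.norm2(x + self.ffn(x))
            return x[:, 1:, :]  # drop the structure token; keep contextualized text

    # Toy usage: batch of 2 nodes, 10 text tokens each, 64-dim embeddings.
    layer = StructureAwareLayer()
    text = torch.randn(2, 10, 64)        # token embeddings of a node's text
    neighbors = torch.randn(2, 64)       # e.g. mean-pooled neighbor-node embeddings
    print(layer(text, neighbors).shape)  # torch.Size([2, 10, 64])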
  4. Knowledge representation and reasoning (KRR) is key to the vision of the intelligent Web. Unfortunately, wide deployment of KRR is hindered by the difficulty of specifying the requisite knowledge, which requires skills that most domain experts lack. A way around this problem could be to acquire knowledge automatically from documents. The difficulty is that KRR requires high-precision knowledge and is sensitive even to small amounts of errors. Although most automatic information extraction systems developed for general text understanding have achieved remarkable results, their accuracy is still woefully inadequate for logical reasoning. A promising alternative is to ask the domain experts to author knowledge in Controlled Natural Language (CNL). Nonetheless, the quality of knowledge construction even through CNL is still grossly inadequate, the main obstacle being the multiplicity of ways the same information can be described even in a controlled language. Our previous work addressed the problem of high-accuracy knowledge authoring for KRR from CNL documents by introducing the Knowledge Authoring Logic Machine (KALM). This paper develops the query aspect of KALM with the aim of obtaining high-precision answers to CNL questions against previously authored knowledge while being tolerant to linguistic variations in the queries. To make queries more expressive and easier to formulate, we propose a hybrid CNL, i.e., a CNL with elements borrowed from formal query languages. We show that KALM achieves superior accuracy in semantic parsing of such queries.
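To illustrate the general idea of a "hybrid CNL" (controlled English with a borrowed formal element such as an aggregate), here is a purely hypothetical toy translator from two fixed question patterns into Prolog-style query strings. The syntax and patterns below are invented for illustration and are not KALM's actual query language.

    # Purely hypothetical illustration of a hybrid-CNL query: controlled English
    # plus a borrowed formal element (count). NOT KALM's actual syntax.
    import re

    def parse_query(cnl: str) -> str:
        """Translate one fixed hybrid-CNL pattern into a Prolog-style query string."""
        m = re.fullmatch(r"Which (\w+) bought a (\w+)\?", cnl)
        if m:
            role, item = m.groups()
            return f"?- {role}(X), bought(X, Y), {item}(Y)."
        m = re.fullmatch(r"count\(Which (\w+) bought a (\w+)\?\)", cnl)
        if m:
            role, item = m.groups()
            return f"?- aggregate_all(count, ({role}(X), bought(X, Y), {item}(Y)), N)."
        raise ValueError("unsupported query pattern")

    print(parse_query("Which customer bought a laptop?"))
    print(parse_query("count(Which customer bought a laptop?)"))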
  5. Spatial Reasoning from language is essential for natural language understanding. Supporting it requires a representation scheme that can capture spatial phenomena encountered in language as well as in images and videos. Existing spatial representations are not sufficient for describing spatial configurations used in complex tasks. This paper extends the capabilities of existing spatial representation languages and increases coverage of the semantic aspects that are needed to ground spatial meaning of natural language text in the world. Our spatial relation language is able to represent a large, comprehensive set of spatial concepts crucial for reasoning and is designed to support composition of static and dynamic spatial configurations. We integrate this language with the Abstract Meaning Representation (AMR) annotation schema and present a corpus annotated by this extended AMR. To exhibit the applicability of our representation scheme, we annotate text taken from diverse datasets and show how we extend the capabilities of existing spatial representation languages with fine-grained decomposition of semantics and blend it seamlessly with AMRs of sentences and discourse representations as a whole. 
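As a rough illustration of annotating spatial meaning within AMR's PENMAN notation, the sketch below shows a graph for a simple spatial sentence and pulls out its concept nodes. The be-located-at-91 frame is standard AMR; the added :spatial-value role is an example stand-in, not the paper's actual extended schema.

    # Illustrative AMR-style graph (PENMAN notation) for a spatial sentence.
    # The :spatial-value role is a hypothetical extension, not the actual schema.
    import re

    # "The book is on the table."
    amr = """
    (b / be-located-at-91
       :ARG1 (k / book)
       :ARG2 (t / table)
       :spatial-value (o / on))
    """

    # Pull out (variable, concept) pairs to show the graph's nodes.
    nodes = re.findall(r"\((\w+) / ([\w-]+)", amr)
    print(nodes)  # [('b', 'be-located-at-91'), ('k', 'book'), ('t', 'table'), ('o', 'on')]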