skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Finding Support for Tabular LLM Outputs
With the emerging advancements of AI, validating data generated by AI models becomes a key challenge. In this work, we tackle the problem of validating tabular data generated by large language models (LLMs). By leveraging a recently proposed technique called Gen-T, we present a technique to verify if the data in the LLM table can be reclaimed (reproduced) using tables available in a given data lake (for example, tables used to train the LLM). Specically, we measure the number of data lake tables that support tuples (or partial tuples) in a generated table. We further provide suggestions for value replacements if a generated value cannot be reclaimed. Using this approach, users can evaluate their LLM-generated tables, consider potential modications for table values, and gauge how much support the modied table has from the data lake. We discuss two case studies showing that table values annotated with reclama- tion support scores, along with possible value replacements, can help users assess the trustworthiness of LLM-generated tables.  more » « less
Award ID(s):
2325632 2107248 1956096
PAR ID:
10539018
Author(s) / Creator(s):
; ;
Publisher / Repository:
PVLDB Workshop on Tabular Data Analysis (TaDA)
Date Published:
Subject(s) / Keyword(s):
Data Management
Format(s):
Medium: X
Location:
China
Sponsoring Org:
National Science Foundation
More Like this
  1. EDBT (Ed.)
    Unionable table search techniques input a query table from a user and search for data lake tables that can contribute additional rows to the query table. The definition of unionability is gener- ally based on similarity measures which may include similarity between columns (e.g., value overlap or semantic similarity of the values in the columns) or tables (e.g., similarity of table embed- dings). Due to this and the large redundancy in many data lakes (which can contain many copies and versions of the same table), the most unionable tables may be identical or nearly identical to the query table and may contain little new information. Hence, we introduce the problem of identifying unionable tuples from a data lake that are diverse with respect to the tuples already present in a query table. We perform an extensive experimen- tal analysis of well-known diversity algorithms applied to this novel problem and identify a gap that we address with a novel, clustering-based tuple diversity algorithm called DUST. DUST uses a novel embedding model to represent unionable tuples that outperforms other tuple representation models by at least 15% when representing unionable tuples. Using real data lake bench- marks, we show that our diversification algorithm is more than six times faster than the most efficient diversification baseline. We also show that it is more effective in diversifying unionable tuples than existing diversification algorithms. 
    more » « less
  2. We introduce the problem of Table Reclamation. Given a Source Table and a large table repository, reclamation finds a set of tables that, when integrated, reproduce the source table as closely as possible. Unlike query discovery problems like Query-by-Example or by-Target, Table Reclamation focuses on reclaiming the data in the Source Table as fully as possible using real tables that may be incomplete or inconsistent. To do this, we define a new measure of table similarity, called error-aware instance similarity, to measure how close a reclaimed table is to a Source Table, a measure grounded in instance similarity used in data exchange. Our search covers not only SELECT-PROJECT- JOIN queries, but integration queries with unions, outerjoins, and the unary operators subsumption and complementation that have been shown to be important in data integration and fusion. Using reclamation, a data scientist can understand if any tables in a repository can be used to exactly reclaim a tuple in the Source. If not, one can understand if this is due to differences in values or to incompleteness in the data. Our solution, Gen- T, performs table discovery to retrieve a set of candidate tables from the table repository, filters these down to a set of originating tables, then integrates these tables to reclaim the Source as closely as possible. We show that our solution, while approximate, is accurate, efficient and scalable in the size of the table repository with experiments on real data lakes containing up to 15K tables, where the average number of tuples varies from small (web tables) to extremely large (open data tables) up to 1M tuples. 
    more » « less
  3. The prevalence and strong capability of large language models (LLMs) present significant safety and ethical risks if exploited by malicious users. To prevent the potentially deceptive usage of LLMs, recent work has proposed algorithms to detect LLM-generated text and protect LLMs. In this paper, we investigate the robustness and reliability of these LLM detectors under adversarial attacks. We study two types of attack strategies: 1) replacing certain words in an LLM’s output with their synonyms given the context; 2) automatically searching for an instructional prompt to alter the writing style of the generation. In both strategies, we leverage an auxiliary LLM to generate the word replacements or the instructional prompt. Different from previous works, we consider a challenging setting where the auxiliary LLM can also be protected by a detector. Experiments reveal that our attacks effectively compromise the performance of all detectors in the study with plausible generations, underscoring the urgent need to improve the robustness of LLM-generated text detection systems. Code is available at https://github.com/shizhouxing/LLM-Detector-Robustness 
    more » « less
  4. While offering the potential to support learning interactions, emerging AI applications like Large Language Models (LLMs) come with ethical concerns. Grounding technology design in human values can address AI ethics and ensure adoption. To this end, we apply Value‐Sensitive Design—involving empirical, conceptual and technical investigations—to centre human values in the development and evaluation of LLM‐based chatbots within a high school environmental science curriculum. Representing multiple perspectives and expertise, the chatbots help students refine their causal models of climate change's impact on local marine ecosystems, communities and individuals. We first perform an empirical investigation leveraging participatory design to explore the values that motivate students and educators to engage with the chatbots. Then, we conceptualize the values that emerge from the empirical investigation by grounding them in research in ethical AI design, human values, human‐AI interactions and environmental education. Findings illuminate considerations for the chatbots to support students' identity development, well‐being, human–chatbot relationships and environmental sustainability. We further map the values onto design principles and illustrate how these principles can guide the development and evaluation of the chatbots. Our research demonstrates how to conduct contextual, value‐sensitive inquiries of emergent AI technologies in educational settings. Practitioner notesWhat is already known about this topicGenerative artificial intelligence (GenAI) technologies like Large Language Models (LLMs) can not only support learning, but also raise ethical concerns such as transparency, trust and accountability.Value‐sensitive design (VSD) presents a systematic approach to centring human values in technology design.What this paper addsWe apply VSD to design LLM‐based chatbots in environmental education and identify values central to supporting students' learning.We map the values emerging from the VSD investigations to several stages of GenAI technology development: conceptualization, development and evaluation.Implications for practice and/or policyIdentity development, well‐being, human–AI relationships and environmental sustainability are key values for designing LLM‐based chatbots in environmental education.Using educational stakeholders' values to generate design principles and evaluation metrics for learning technologies can promote technology adoption and engagement. 
    more » « less
  5. EDBT (Ed.)
    In data lakes, one of the core challenges remains finding rele- vant tables. We introduce the notion of semantic data lakes, i.e., repositories where datasets are linked to concepts and entities described in a knowledge graph (KG). We formalize the problem of semantic table search, i.e., retrieving tables containing informa- tion semantically related to a given set of entities, and provide the first formal definition of semantic relatedness of a dataset to tuples of entities. Our solution offers the first general framework to compute the semantic relevance of the contents of a table w.r.t. entity tuples, as well as efficient algorithms (exploiting seman- tic signals, such as entity types and embeddings) to scale the semantic search to repositories with hundreds of thousands of distinct tables. Our extensive experiments on both real-world and synthetic benchmarks show that our approach is able to retrieve more relevant tables (up to 5.4 times higher recall) in comparison to existing methods while ensuring fast response times (up to 17 times faster with LSH). 
    more » « less