skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


This content will become publicly available on May 1, 2026

Title: Beyond Text-to-SQL for IoT Defense: A Comprehensive Framework for Querying and Classifying IoT Threats
Recognizing the promise of natural language interfaces to databases, prior studies have emphasized the development of text-to-SQL systems. Existing research has generally focused on generating SQL statements from text queries, and the broader challenge lies in inferring new information about the returned data. Our research makes two major contributions to address this gap. First, we introduce a novel Internet-of-Things (IoT) text-to-SQL dataset comprising 10,985 text-SQL pairs and 239,398 rows of network traffic activity. The dataset contains additional query types limited in prior text-to-SQL datasets, notably, temporal-related queries. Our dataset is sourced from a smart building’s IoT ecosystem exploring sensor read and network traffic data. Second, our dataset allows two-stage processing, where the returned data (network traffic) from a generated SQL can be categorized as malicious or not. Our results show that joint training to query and infer information about the data improves overall text-to-SQL performance, nearly matching that of substantially larger models. We also show that current large language models (e.g., GPT3.5) struggle to infer new information about returned data (i.e., they are bad at tabular data understanding), thus our dataset provides a novel test bed for integrating complex domain-specific reasoning into LLMs.  more » « less
Award ID(s):
2145357
PAR ID:
10587598
Author(s) / Creator(s):
; ; ; ; ; ; ; ; ; ; ; ; ;
Publisher / Repository:
Proceedings of the Workshop on Trustworthy NLP (TrustNLP 2025@NAACL)
Date Published:
Page Range / eLocation ID:
1-12
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Text-to-SQL systems empower users to interact with databases using natural language, automatically translating queries into executable SQL code. However, their reliance on database schema information for SQL generation exposes them to significant security vulnerabilities, particularly schema inference attacks that can lead to unauthorized data access or manipulation. In this paper, we introduce a novel zero-knowledge framework for reconstructing the underlying database schema of text-to-SQL models without any prior knowledge of the database. Our approach systematically probes text-to-SQL models with specially crafted questions and leverages a surrogate GPT-4 model to interpret the outputs, effectively uncovering hidden schema elements—including tables, columns, and data types. We demonstrate that our method achieves high accuracy in reconstructing table names, with F1 scores of up to .99 for generative models and .78 for fine-tuned models, underscoring the severity of schema leakage risks. We also show that our attack can steal prompt information in non-text-to-SQL models. Furthermore, we propose a simple protection mechanism for generative models and empirically show its limitations in mitigating these attacks. 
    more » « less
  2. We introduce ThalamusDB, a novel approximate query processing system that processes complex SQL queries on multi-modal data. ThalamusDB supports SQL queries integrating natural language predicates on visual, audio, and text data. To answer such queries, ThalamusDB exploits a collection of zero-shot models in combination with relational processing. ThalamusDB utilizes deterministic approximate query processing, harnessing the relative efficiency of relational processing to mitigate the computational demands of machine learning inference. For evaluating a natural language predicate, ThalamusDB requests a small number of labels from users. User can specify their preferences on the performance objective regarding the three relevant metrics: approximation error, computation time, and labeling overheads. The ThalamusDB query optimizer chooses optimized plans according to user preferences, prioritizing data processing and requested labels to maximize impact. Experiments with several real-world data sets, taken from Craigslist, YouTube, and Netflix, show that ThalamusDB achieves an average speedup of 35.0x over MindsDB, an exact processing baseline, and outperforms ABAE, a sampling-based method, in 78.9% of cases. 
    more » « less
  3. As IoT device adoption grows, ensuring cybersecurity compliance with IoT standards, like National Institute of Standards and Technology Interagency (NISTIR) 8259A, has become increasingly complex. These standards are typically presented in lengthy, text-based formats that are difficult to process and query automatically. We built a knowledge graph to address this challenge to represent the key concepts, relationships, and references within NISTIR 8259A. We further integrate this knowledge graph with Retrieval-Augmented Generation (RAG) techniques that can be used by large language models (LLMs) to enhance the accuracy and contextual relevance of information retrieval. Additionally, we evaluate the performance of RAG using both graph-based queries and vector database embeddings. Our framework, implemented in Neo4j, was tested using multiple LLMs, including LLAMA2, Mistral-7B, and GPT-4. Our findings show that combining knowledge graphs with RAG significantly improves query precision and contextual relevance compared to unstructured vector-based retrieval methods. While traditional rule-based compliance tools were not evaluated in this study, our results demonstrate the advantages of structured, graph driven querying for security standards like NISTIR 8259A. 
    more » « less
  4. Recent advances in cyber-physical systems, artificial intelligence, and cloud computing have driven the wide deployments of Internet-of-things (IoT) in smart homes. As IoT devices often directly interact with the users and environments, this paper studies if and how we could explore the collective insights from multiple heterogeneous IoT devices to infer user activities for home safety monitoring and assisted living. Specifically, we develop a new system, namely IoTMosaic, to first profile diverse user activities with distinct IoT device event sequences, which are extracted from smart home network traffic based on their TCP/IP data packet signatures. Given the challenges of missing and out-of-order IoT device events due to device malfunctions or varying network and system latencies, IoTMosaic further develops simple yet effective approximate matching algorithms to identify user activities from real-world IoT network traffic. Our experimental results on thousands of user activities in the smart home environment over two months show that our proposed algorithms can infer different user activities from IoT network traffic in smart homes with the overall accuracy, precision, and recall of 0.99, 0.99, and 1.00, respectively. 
    more » « less
  5. State-of-the-art Text-to-SQL models rely on fine-tuning or few-shot prompting to help LLMs learn from training datasets containing mappings from natural language (NL) queries to SQL statements. Consequently, the quality of the dataset can greatly affect the accuracy of these Text-to-SQL models. Unlike other NL tasks, Text-to-SQL datasets are prone to errors despite extensive manual efforts due to the subtle semantics of SQL. Our study has found a non-negligible (>30%) portion of incorrect NL to SQL mapping cases exists in popular datasets Spider and BIRD. This paper aims to improve the quality of Text-to-SQL training datasets and thereby increase the accuracy of the resulting models. To do so, we propose a necessary correctness condition called execution consistency. For a given database instance, an NL to SQL mapping satisfies execution consistency if the execution result of an NL query matches that of the corresponding SQL. We develop SQLDriller to detect incorrect NL to SQL mappings based on execution consistency in a best-effort manner by crafting database instances that likely result in violations of execution consistency. It generates multiple candidate SQL predictions that differ in their syntax structures. Using a SQL equivalence checker, SQLDriller obtains counterexample database instances that can distinguish non-equivalent candidate SQLs. It then checks the execution consistency of an NL to SQL mapping under this set of counterexamples. The evaluation shows SQLDriller effectively detects and fixes incorrect mappings in the Text-to-SQL dataset, and it improves the model accuracy by up to 13.6%. 
    more » « less