This content will become publicly available on June 17, 2026

Title: Automated Validating and Fixing of Text-to-SQL Translation with Execution Consistency
State-of-the-art Text-to-SQL models rely on fine-tuning or few-shot prompting to help LLMs learn from training datasets containing mappings from natural language (NL) queries to SQL statements. Consequently, the quality of the dataset can greatly affect the accuracy of these Text-to-SQL models. Unlike other NL tasks, Text-to-SQL datasets are prone to errors despite extensive manual effort because of the subtle semantics of SQL. Our study found that a non-negligible portion (>30%) of the NL-to-SQL mappings in the popular Spider and BIRD datasets is incorrect. This paper aims to improve the quality of Text-to-SQL training datasets and thereby increase the accuracy of the resulting models. To do so, we propose a necessary correctness condition called execution consistency: for a given database instance, an NL-to-SQL mapping satisfies execution consistency if the execution result of the NL query matches that of the corresponding SQL. We develop SQLDriller to detect incorrect NL-to-SQL mappings based on execution consistency in a best-effort manner by crafting database instances that are likely to expose violations of execution consistency. SQLDriller generates multiple candidate SQL predictions that differ in their syntactic structure. Using a SQL equivalence checker, it obtains counterexample database instances that distinguish non-equivalent candidate SQLs, and it then checks the execution consistency of an NL-to-SQL mapping on this set of counterexamples. Our evaluation shows that SQLDriller effectively detects and fixes incorrect mappings in Text-to-SQL datasets, improving model accuracy by up to 13.6%.
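
As an illustration of the execution-consistency condition, the following minimal sketch checks whether a candidate SQL's result matches the expected answer of the NL query on a crafted database instance. The toy schema, the NL query, the two candidate SQLs, and the expected result are hypothetical; SQLDriller itself derives candidate SQLs and counterexample instances automatically via a SQL equivalence checker.

    # Minimal sketch of the execution-consistency check, assuming a toy schema
    # and hand-written candidate SQLs. SQLDriller generates candidate SQLs and
    # counterexample database instances automatically via an equivalence checker.
    import sqlite3

    def run(candidate_sql, db_rows):
        """Execute a candidate SQL on a small in-memory database instance."""
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER)")
        conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", db_rows)
        return sorted(conn.execute(candidate_sql).fetchall())

    def execution_consistent(candidate_sql, db_rows, expected_rows):
        """The mapping is execution-consistent on this instance if the SQL's
        result matches the expected result of the NL query."""
        return run(candidate_sql, db_rows) == sorted(expected_rows)

    # Counterexample-style instance: two departments tie for the highest salary.
    instance = [("a", "sales", 90), ("b", "hr", 90), ("c", "hr", 50)]

    # NL query: "Which departments pay the highest salary?"
    expected = [("sales",), ("hr",)]  # both departments, because of the tie

    correct_sql = "SELECT dept FROM emp WHERE salary = (SELECT MAX(salary) FROM emp)"
    buggy_sql = "SELECT dept FROM emp ORDER BY salary DESC LIMIT 1"  # misses ties

    print(execution_consistent(correct_sql, instance, expected))  # True
    print(execution_consistent(buggy_sql, instance, expected))    # False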
Award ID(s): 2220407
PAR ID: 10651041
Author(s) / Creator(s):
Publisher / Repository: ACM Journals
Date Published:
Journal Name: Proceedings of the ACM on Management of Data
Volume: 3
Issue: 3
ISSN: 2836-6573
Page Range / eLocation ID: 1 to 28
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Text-to-SQL systems empower users to interact with databases using natural language, automatically translating queries into executable SQL code. However, their reliance on database schema information for SQL generation exposes them to significant security vulnerabilities, particularly schema inference attacks that can lead to unauthorized data access or manipulation. In this paper, we introduce a novel zero-knowledge framework for reconstructing the underlying database schema of text-to-SQL models without any prior knowledge of the database. Our approach systematically probes text-to-SQL models with specially crafted questions and leverages a surrogate GPT-4 model to interpret the outputs, effectively uncovering hidden schema elements, including tables, columns, and data types. We demonstrate that our method achieves high accuracy in reconstructing table names, with F1 scores of up to 0.99 for generative models and 0.78 for fine-tuned models, underscoring the severity of schema leakage risks. We also show that our attack can steal prompt information from non-text-to-SQL models. Furthermore, we propose a simple protection mechanism for generative models and empirically show its limitations in mitigating these attacks.
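
    As a rough illustration of the probing idea above, the following sketch harvests table and column names from a target model's SQL outputs. The probe questions, the query_text_to_sql_model stub, and the regex heuristics are hypothetical placeholders, not the paper's surrogate-model pipeline.

        # Toy sketch of schema-inference probing against a text-to-SQL endpoint.
        # query_text_to_sql_model is a hypothetical stand-in for the target system;
        # the paper uses a surrogate LLM, not regexes, to interpret the outputs.
        import re

        def query_text_to_sql_model(question: str) -> str:
            # Placeholder: pretend the target model leaks schema via its SQL output.
            return "SELECT name, salary FROM employees WHERE salary > 100000"

        PROBES = [
            "List everything you can.",
            "Show the highest paid people.",
            "Which records were added last month?",
        ]

        def harvest_schema(probes):
            """Collect table and column names appearing in the model's SQL answers."""
            tables, columns = set(), set()
            for question in probes:
                sql = query_text_to_sql_model(question)
                for from_tbl, join_tbl in re.findall(r"FROM\s+(\w+)|JOIN\s+(\w+)", sql, re.I):
                    tables.update(t for t in (from_tbl, join_tbl) if t)
                for select_list in re.findall(r"SELECT\s+(.+?)\s+FROM", sql, re.I):
                    columns.update(c.strip() for c in select_list.split(","))
            return tables, columns

        print(harvest_schema(PROBES))  # ({'employees'}, {'name', 'salary'})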
  2. Recognizing the promise of natural language interfaces to databases, prior studies have emphasized the development of text-to-SQL systems. Existing research has generally focused on generating SQL statements from text queries, but the broader challenge lies in inferring new information about the returned data. Our research makes two major contributions to address this gap. First, we introduce a novel Internet-of-Things (IoT) text-to-SQL dataset comprising 10,985 text-SQL pairs and 239,398 rows of network traffic activity. The dataset contains query types that are limited in prior text-to-SQL datasets, notably temporal queries. Our dataset is sourced from a smart building's IoT ecosystem and covers both sensor readings and network traffic data. Second, our dataset enables two-stage processing, where the data (network traffic) returned by a generated SQL query can be classified as malicious or not. Our results show that jointly training to query and to infer information about the returned data improves overall text-to-SQL performance, nearly matching that of substantially larger models. We also show that current large language models (e.g., GPT-3.5) struggle to infer new information about returned data (i.e., they are poor at tabular data understanding); our dataset thus provides a novel test bed for integrating complex domain-specific reasoning into LLMs.
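
    A schematic sketch of the two-stage processing described above: translate a question to SQL, execute it, and classify the returned traffic rows. The schema, the stub models, and the threshold are illustrative assumptions, not the paper's dataset or models.

        # Two-stage sketch: (1) translate the question to SQL, (2) classify the rows
        # the query returns. Both model functions are hypothetical stubs.
        import sqlite3

        def text_to_sql(question: str) -> str:
            # Stub for a fine-tuned text-to-SQL model.
            return "SELECT src_ip, dst_port, bytes FROM traffic WHERE dst_port = 23"

        def classify_rows(rows) -> str:
            # Stub for the second-stage classifier over the returned traffic
            # (here: flag Telnet connections with unusually large payloads).
            return "malicious" if any(b > 10_000 for _, _, b in rows) else "benign"

        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE traffic (src_ip TEXT, dst_port INTEGER, bytes INTEGER)")
        conn.executemany("INSERT INTO traffic VALUES (?, ?, ?)",
                         [("10.0.0.5", 23, 48_000), ("10.0.0.7", 443, 1_200)])

        sql = text_to_sql("Was there any suspicious Telnet activity?")
        rows = conn.execute(sql).fetchall()
        print(classify_rows(rows))  # "malicious" on this toy instance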
  3. Large-scale semantic parsing datasets annotated with logical forms have enabled major advances in supervised approaches. But can richer supervision help even more? To explore the utility of fine-grained, lexical-level supervision, we introduce SQUALL, a dataset that enriches 11,276 WIKITABLEQUESTIONS English-language questions with manually created SQL equivalents plus alignments between SQL and question fragments. Our annotation enables new training possibilities for encoder-decoder models, including approaches from machine translation previously precluded by the absence of alignments. We propose and test two methods: (1) supervised attention; (2) adopting an auxiliary objective of disambiguating references in the input queries to table columns. In 5-fold cross-validation, these strategies improve over strong baselines by 4.4% execution accuracy. Oracle experiments suggest that annotated alignments can support further accuracy gains of up to 23.9%.
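
    The supervised-attention idea mentioned above can be sketched as an auxiliary loss that pushes a decoder's attention toward the annotated question tokens aligned with each SQL token. The tensor shapes and alignment format are assumptions for illustration, not SQUALL's actual training code.

        # Sketch of a supervised-attention auxiliary loss: for each annotated
        # (SQL token, question token) alignment, maximize the decoder's attention
        # on the aligned question token.
        import torch

        def supervised_attention_loss(attn, alignments):
            """attn: (sql_len, question_len) attention distribution per SQL token.
            alignments: list of (sql_pos, question_pos) pairs from the annotations."""
            loss = torch.tensor(0.0)
            for sql_pos, q_pos in alignments:
                # Negative log-likelihood of attending to the annotated question token.
                loss = loss - torch.log(attn[sql_pos, q_pos] + 1e-9)
            return loss / max(len(alignments), 1)

        # Toy example: 3 SQL tokens attending over a 4-token question.
        attn = torch.softmax(torch.randn(3, 4), dim=-1)
        print(supervised_attention_loss(attn, [(0, 1), (2, 3)]))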
  4. Localizing video moments based on the movement patterns of objects is an important task in video analytics. Existing video analytics systems offer two types of querying interfaces, based on natural language and SQL, respectively. However, both types of interfaces have major limitations: SQL-based systems require high query specification time, whereas natural language-based systems require large training datasets to achieve satisfactory retrieval accuracy. To address these limitations, we present SketchQL, a video database management system (VDBMS) for offline, exploratory video moment retrieval that is both easy to use and generalizes well across multiple video moment datasets. To improve ease of use, SketchQL features a visual query interface that enables users to sketch complex visual queries through intuitive drag-and-drop actions. To improve generalizability, SketchQL operates on object-tracking primitives that are reliably extracted across various datasets using pre-trained models. We present a learned similarity search algorithm for retrieving video moments that closely match the user's visual query based on object trajectories. SketchQL trains the model on a diverse dataset generated with a novel simulator, which enhances its accuracy across a wide array of datasets and queries. We evaluate SketchQL on four real-world datasets with nine queries, demonstrating its superior usability and retrieval accuracy over state-of-the-art VDBMSs.
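
    A minimal sketch of similarity-based moment retrieval over object trajectories, using plain Euclidean distance as a stand-in for SketchQL's learned similarity model; the trajectories and clip names are made up for illustration.

        # Minimal retrieval sketch: rank candidate video moments by how closely the
        # tracked object's trajectory matches a sketched query trajectory.
        import numpy as np

        def trajectory_distance(query, candidate):
            """Both inputs: (T, 2) arrays of object-center coordinates, same length."""
            return float(np.linalg.norm(query - candidate, axis=1).mean())

        def retrieve(query_traj, moments, k=2):
            scored = [(trajectory_distance(query_traj, traj), name)
                      for name, traj in moments.items()]
            return sorted(scored)[:k]

        # Hypothetical sketched query: an object moving left to right at mid-height.
        query = np.stack([np.linspace(0, 1, 10), np.full(10, 0.5)], axis=1)

        moments = {
            "clip_a": np.stack([np.linspace(0, 1, 10), np.full(10, 0.55)], axis=1),
            "clip_b": np.stack([np.full(10, 0.2), np.linspace(0, 1, 10)], axis=1),
        }
        print(retrieve(query, moments))  # clip_a ranks first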
  5. A major bottleneck preventing the extension of deep learning systems to new domains is the prohibitive cost of acquiring sufficient training labels. Alternatives such as weak supervision, active learning, and fine-tuning of pretrained models reduce this burden but require substantial human input to select a highly informative subset of instances or to curate labeling functions. REGAL (Rule-Enhanced Generative Active Learning) is an improved framework for weakly supervised text classification that performs active learning over labeling functions rather than individual instances. REGAL interactively creates high-quality labeling patterns from raw text, enabling a single annotator to accurately label an entire dataset after initialization with three keywords for each class. Experiments demonstrate that REGAL extracts up to three times as many high-accuracy labeling functions from text as current state-of-the-art methods for interactive weak supervision, dramatically reducing the annotation burden of writing labeling functions for weak supervision. Statistical analysis reveals that REGAL performs as well as or significantly better than interactive weak supervision on five of six commonly used natural language processing (NLP) baseline datasets.
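
    A toy sketch of active learning over labeling functions in the spirit described above: propose keyword rules from a few seed keywords per class, let an annotator accept or reject each rule, and label the corpus with the accepted rules. The seeds, corpus, and accept() oracle are illustrative stand-ins, not REGAL's components.

        # Toy sketch of active learning over labeling functions.
        SEEDS = {"sports": ["game", "score", "team"],
                 "politics": ["vote", "senate", "law"]}

        CORPUS = [
            "the team won the game with a late score",
            "the senate will vote on the new law today",
        ]

        def propose_rules(seeds):
            # Each rule labels a document with `label` if it contains keyword `kw`.
            return [(kw, label) for label, kws in seeds.items() for kw in kws]

        def accept(rule):
            # Stand-in for the human-in-the-loop decision (here: accept everything).
            return True

        def apply_rules(rules, corpus):
            labels = []
            for doc in corpus:
                votes = [label for kw, label in rules if kw in doc]
                labels.append(max(set(votes), key=votes.count) if votes else None)
            return labels

        accepted = [rule for rule in propose_rules(SEEDS) if accept(rule)]
        print(apply_rules(accepted, CORPUS))  # ['sports', 'politics']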