NSF PAR Search | NSF Public Access Repository

Membership Testing for Semantic Regular Expressions

https://doi.org/10.1145/3729300

Huang, Yifei; Amini, Matin; Le_Glaunec, Alexis; Mamouras, Konstantinos; Raghothaman, Mukund (June 2025, Proceedings of the ACM on Programming Languages)

This paper is about semantic regular expressions (SemREs). This is a concept that was recently proposed by Smore (Chen et al. 2023) in which classical regular expressions are extended with a primitive to query external oracles such as databases and large language models (LLMs). SemREs can be used to identify lines of text containing references to semantic concepts such as cities, celebrities, political entities, etc. The focus in their paper was on automatically synthesizing semantic regular expressions from positive and negative examples. In this paper, we study the membership testing problem. First, we present a two-pass NFA-based algorithm to determine whether a string w matches a SemRE r in O(|r|²|w|² + |r||w|³) time, assuming the oracle responds to each query in unit time. In common situations, where oracle queries are not nested, we show that this procedure runs in O(|r|²|w|²) time. Experiments with a prototype implementation of this algorithm validate our theoretical analysis, and show that the procedure massively outperforms a dynamic programming-based baseline, and incurs a ≈ 2× overhead over the time needed for interaction with the oracle. Second, we establish connections between SemRE membership testing and the triangle finding problem from graph theory, which suggest that developing algorithms which are simultaneously practical and asymptotically faster might be challenging. Furthermore, algorithms for classical regular expressions primarily aim to optimize their time and memory consumption. In contrast, an important consideration in our setting is to minimize the cost of invoking the oracle. We demonstrate an Ω(|w|²) lower bound on the number of oracle queries necessary to make this determination.
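To make the membership testing problem concrete, the following is a minimal sketch of a memoized, dynamic-programming-style matcher of the kind the abstract describes as a baseline. It is not the paper's two-pass NFA-based algorithm. The AST classes (`Char`, `Any`, `Cat`, `Alt`, `Star`, `Query`), the `matches` function, and the convention that an oracle is a Python predicate on substrings are all assumptions made for illustration.

```python
from dataclasses import dataclass
from functools import lru_cache
from typing import Callable, Dict, List, Tuple, Union


@dataclass(frozen=True)
class Char:
    c: str                      # matches exactly this one character

@dataclass(frozen=True)
class Any:
    pass                        # matches any single character

@dataclass(frozen=True)
class Cat:
    left: "SemRE"
    right: "SemRE"

@dataclass(frozen=True)
class Alt:
    left: "SemRE"
    right: "SemRE"

@dataclass(frozen=True)
class Star:
    inner: "SemRE"

@dataclass(frozen=True)
class Query:
    inner: "SemRE"              # the substring must match this expression...
    name: str                   # ...and be accepted by the oracle with this name

SemRE = Union[Char, Any, Cat, Alt, Star, Query]


def matches(r: SemRE, w: str,
            oracles: Dict[str, Callable[[str], bool]]) -> Tuple[bool, List[str]]:
    """Return (does w match r, list of substrings sent to an oracle)."""
    calls: List[str] = []

    @lru_cache(maxsize=None)    # memoize on (node, i, j): the DP table
    def m(node: SemRE, i: int, j: int) -> bool:
        if isinstance(node, Char):
            return j == i + 1 and w[i] == node.c
        if isinstance(node, Any):
            return j == i + 1
        if isinstance(node, Alt):
            return m(node.left, i, j) or m(node.right, i, j)
        if isinstance(node, Cat):
            # Try every split point k of w[i:j].
            return any(m(node.left, i, k) and m(node.right, k, j)
                       for k in range(i, j + 1))
        if isinstance(node, Star):
            if i == j:
                return True
            # Peel off a non-empty first iteration so recursion terminates.
            return any(m(node.inner, i, k) and m(node, k, j)
                       for k in range(i + 1, j + 1))
        if isinstance(node, Query):
            if not m(node.inner, i, j):
                return False
            calls.append(w[i:j])            # each entry is one oracle invocation
            return oracles[node.name](w[i:j])
        raise TypeError(f"unknown node: {node!r}")

    return m(r, 0, len(w)), calls


# Hypothetical example: "some city name, followed by '!'".
oracles = {"city": lambda s: s in {"Paris", "Rome"}}
city_then_bang = Cat(Query(Star(Any()), "city"), Char("!"))

ok, calls = matches(city_then_bang, "Paris!", oracles)
print(ok, calls)
```

Even on this small input the matcher queries the oracle on several candidate prefixes before it finds the accepted one, which illustrates why the number of oracle invocations is a cost worth minimizing.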
Free, publicly-accessible full text available June 10, 2026
IRIS: LLM-Assisted Static Analysis for Detecting Security Vulnerabilities

Li, Ziyang; Dutta, Saikat; Naik, Mayur (April 2025, ICLR 2025)

Free, publicly-accessible full text available April 24, 2026
Localized Explanations for Automatically Synthesized Network Configurations

https://doi.org/10.1145/3696348.3696888

Nazari, Amirmohammad; Zhang, Yongzheng; Raghothaman, Mukund; Chen, Haoxian (November 2024, ACM)

Full Text Available
Generating Function Names to Improve Comprehension of Synthesized Programs

https://doi.org/10.1109/VL/HCC60511.2024.00035

Nazari, Amirmohammad; Swayamdipta, Swabha; Chattopadhyay, Souti; Raghothaman, Mukund (September 2024, IEEE)

Full Text Available
Explainable Program Synthesis by Localizing Specifications

https://doi.org/10.1145/3622874

Nazari, Amirmohammad; Huang, Yifei; Samanta, Roopsha; Radhakrishna, Arjun; Raghothaman, Mukund (October 2023, Proceedings of the ACM on Programming Languages)

The traditional formulation of the program synthesis problem is to find a program that meets a logical correctness specification. When synthesis is successful, there is a guarantee that the implementation satisfies the specification. Unfortunately, synthesis engines are typically monolithic algorithms, and obscure the correspondence between the specification, implementation and user intent. In contrast, humans often include comments in their code to guide future developers towards the purpose and design of different parts of the codebase. In this paper, we introduce subspecifications as a mechanism to augment the synthesized implementation with explanatory notes of this form. In this model, the user may ask for explanations of different parts of the implementation; the subspecification generated in response is a logical formula that describes the constraints induced on that subexpression by the global specification and surrounding implementation. We develop algorithms to construct and verify subspecifications and investigate their theoretical properties. We perform an experimental evaluation of the subspecification generation procedure, and measure its effectiveness and running time. Finally, we conduct a user study to determine whether subspecifications are useful: we find that subspecifications greatly aid in understanding the global specification, in identifying alternative implementations, and in debugging faulty implementations.
Mobius: Synthesizing Relational Queries with Recursive and Invented Predicates

https://doi.org/10.1145/3622847

Thakkar, Aalok; Sands, Nathaniel; Petrou, George; Alur, Rajeev; Naik, Mayur; Raghothaman, Mukund (October 2023, Proceedings of the ACM on Programming Languages)

Synthesizing relational queries from data is challenging in the presence of recursion and invented predicates. We propose a fully automated approach to synthesize such queries. Our approach comprises two steps: it first synthesizes a non-recursive query consistent with the given data, and then identifies recursion schemes in it and thereby generalizes to arbitrary data. This generalization is achieved by an iterative predicate unification procedure which exploits the notion of data provenance to accelerate convergence. In each iteration of the procedure, a constraint solver proposes a candidate query, and a query evaluator checks if the proposed program is consistent with the given data. The data provenance for a failed query allows us to construct additional constraints for the constraint solver and refine the search. We have implemented our approach in a tool named Mobius. On a suite of 21 challenging recursive query synthesis tasks, Mobius outperforms three state-of-the-art baselines, Gensynth, ILASP, and Popper, both in terms of runtime and accuracy. We also demonstrate that the synthesized queries generalize well to unseen data.
Relational Query Synthesis ⋈ Decision Tree Learning

https://doi.org/10.14778/3626292.3626306

Naik, Aaditya; Thakkar, Aalok; Stein, Adam; Alur, Rajeev; Naik, Mayur (October 2023, Proceedings of the VLDB Endowment)

We study the problem of synthesizing a core fragment of relational queries called select-project-join (SPJ) queries from input-output examples. Search-based synthesis techniques are suited to synthesizing projections and joins by navigating the network of relational tables but require additional supervision for synthesizing comparison predicates. On the other hand, decision tree learning techniques are suited to synthesizing comparison predicates when the input database can be summarized as a single labelled relational table. In this paper, we adapt and interleave methods from the domains of relational query synthesis and decision tree learning, and present an end-to-end framework for synthesizing relational queries with categorical and numerical comparison predicates. Our technique guarantees the completeness of the synthesis procedure and strongly encourages minimality of the synthesized program. We present Libra, an implementation of this technique and evaluate it on a benchmark suite of 1,475 instances of queries over 159 databases with multiple tables. Libra solves 1,361 of these instances in an average of 59 seconds per instance. It outperforms state-of-the-art program synthesis tools Scythe and PatSQL in terms of both the running time and the quality of the synthesized programs.
Full Text Available
Sporq: An Interactive Environment for Exploring Code using Query-by-Example

https://doi.org/3472749.3474737

Naik, Aaditya; Mendelson, Jonathan; Sands, Nate; Wang, Yuepeng; Naik, Mayur; Raghothaman, Mukund (October 2021, UIST)

Full Text Available
Example-Guided Synthesis of Relational Queries

Thakkar, Aalok; Naik, Aaditya; Sands, Nate; Alur, Rajeev; Naik, Mayur; Raghothaman, Mukund (June 2021, PLDI 2021)

Full Text Available