An empirical evaluation of pre-trained large language models for repairing declarative formal specifications

Alhanahnah, Mohannad; Rashedul_Hasan, Md; Xu, Lisong; Bagheri, Hamid

doi:10.1007/s10664-025-10687-1

This content will become publicly available on September 1, 2026

Abstract: Automatic Program Repair (APR) has garnered significant attention as a practical research domain focused on automatically fixing bugs in programs. While existing APR techniques primarily target imperative programming languages like C and Java, there is a growing need for effective solutions applicable to declarative software specification languages. This paper systematically investigates the capacity of Large Language Models (LLMs) to repair declarative specifications in Alloy, a declarative formal language used for software specification. We designed six different repair settings, encompassing single-agent and dual-agent paradigms and utilizing various LLMs. These configurations also incorporate different levels of feedback, including an auto-prompting mechanism that uses LLMs to generate prompts autonomously. Our study reveals that the dual-agent setup with auto-prompting outperforms the other settings, albeit with a marginal increase in the number of iterations and token usage. This dual-agent setup demonstrated superior effectiveness compared to state-of-the-art Alloy APR techniques when evaluated on a comprehensive set of benchmarks. This work is the first to empirically evaluate the capability of LLMs to repair declarative specifications while taking into account recent trending LLM concepts such as LLM-based agents, feedback, auto-prompting, and tools, thus paving the way for future agent-based techniques in software engineering.

Award ID(s): 2124116, 2139845

PAR ID: 10618575

Author(s) / Creator(s): Alhanahnah, Mohannad; Rashedul_Hasan, Md; Xu, Lisong; Bagheri, Hamid

Publisher / Repository: Springer Nature

Date Published: 2025-09-01

Journal Name: Empirical Software Engineering

Volume: 30

Issue: 5

ISSN: 1382-3256

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Journal Article:
https://doi.org/10.1007/s10664-025-10687-1
