skip to main content

Title: G2SAT: Learning to Generate SAT Formulas
The Boolean Satisfiability (SAT) problem is the canonical NP-complete problem and is fundamental to computer science, with a wide array of applications in planning, verification, and theorem proving. Developing and evaluating practical SAT solvers relies on extensive empirical testing on a set of real-world benchmark formulas. However, the availability of such real-world SAT formulas is limited. While these benchmark formulas can be augmented with synthetically generated ones, existing approaches for doing so are heavily hand-crafted and fail to simultaneously capture a wide range of characteristics exhibited by real-world SAT instances. In this work, we present G2SAT, the first deep generative framework that learns to generate SAT formulas from a given set of input formulas. Our key insight is that SAT formulas can be transformed into latent bipartite graph representations which we model using a specialized deep generative neural network. We show that G2SAT can generate SAT formulas that closely resemble given real-world SAT instances, as measured by both graph metrics and SAT solver behavior. Further, we show that our synthetic SAT formulas could be used to improve SAT solver performance on real-world benchmarks, which opens up new opportunities for the continued development of SAT solvers and a deeper understanding of their performance.
; ; ; ;
Award ID(s):
Publication Date:
Journal Name:
Advances in neural information processing systems
Sponsoring Org:
National Science Foundation
More Like this
  1. The globalization of the IC supply chain has raised many security threats, especially when untrusted parties are involved. This has created a demand for a dependable logic obfuscation solution to combat these threats. Amongst a wide range of threats and countermeasures on logic obfuscation in the 2010s decade, the Boolean satisfiability (SAT) attack, or one of its derivatives, could break almost all state-of-the-art logic obfuscation countermeasures. However, in some cases, particularly when the logic locked circuits contain complex structures, such as big multipliers, large routing networks, or big tree structures, the logic locked circuit is hard-to-be-solved for the SAT attack.more »Usage of these structures for obfuscation may lead a strong defense, as many SAT solvers fail to handle such complexity. However, in this paper, we propose a neural-network-guided SAT attack (NNgSAT), in which we examine the capability and effectiveness of a message-passing neural network (MPNN) for solving these complex structures (SAT-hard instances). In NNgSAT, after being trained as a classifier to predict SAT/UNSAT on a SAT problem (NN serves as a SAT solver), the neural network is used to guide/help the actual SAT solver for finding the SAT assignment(s). By training NN on conjunctive normal forms (CNFs) corresponded to a dataset of logic locked circuits, as well as fine-tuning the confidence rate of the NN prediction, our experiments show that NNgSAT could solve 93.5% of the logic locked circuits containing complex structures within a reasonable time, while the existing SAT attack cannot proceed the attack flow in them.« less
  2. Problem solvers vary their approaches to solving problems depending on the context of the problem, the requirements of the solution, and the ways in which the problems and material to solve the problem are represented, or representations. Representations take many forms (i.e. tables, graphs, figures, images, formulas, visualizations, and other similar contexts) and are used to communicate information to a problem solver. Engagement with certain representations varies between problem solvers and can influence design and solution quality. A problem solver’s evaluation of representations and the reasons for using a representation can be considered factors in problem-solving heuristics. These factors describemore »unique problem-solving behaviors that can help understand problem solvers. These behaviors may lead to important relationships between a problem solver’s decisions and their ability to solve a problem and overall quality of the solution. Therefore, we pose the following research question: How do factors of problem-solving heuristics describe the unique behaviors of engineering students as they solve multiple problems? To answer this question, we interviewed 16 undergraduate engineering students studying civil engineering. The interviews consisted of a problem-solving portion that was followed immediately by a semi-structured retrospective interview with probing questions created based on the real time monitoring of the problem-solving interview using eye tracking techniques. The problem-solving portion consisted of solving three problems related to the concept of headloss in fluid flow through pipes. Each of the three problems included the same four representations that were used by the students as approaches to solving the problem. The representations are common ways to present the concept of headloss in pipe flow and included two formulas, a set of tables, and a graph. This paper presents a set of common reasons for why decisions were made during the problem-solving process that help to understand more about the problem-solving behavior of engineering students.« less
  3. Reusable symbolic evaluators are a key building block of solver-aided verification and synthesis tools. A reusable evaluator reduces the semantics of all paths in a program to logical constraints, and a client tool uses these constraints to formulate a satisfiability query that is discharged with SAT or SMT solvers. The correctness of the evaluator is critical to the soundness of the tool and the domain properties it aims to guarantee. Yet so far, the trust in these evaluators has been based on an ad-hoc foundation of testing and manual reasoning. This paper presents the first formal framework for reasoning aboutmore »the behavior of reusable symbolic evaluators. We develop a new symbolic semantics for these evaluators that incorporates state merging. Symbolic evaluators use state merging to avoid path explosion and generate compact encodings. To accommodate a wide range of implementations, our semantics is parameterized by a symbolic factory, which abstracts away the details of merging and creation of symbolic values. The semantics targets a rich language that extends Core Scheme with assumptions and assertions, and thus supports branching, loops, and (first-class) procedures. The semantics is designed to support reusability, by guaranteeing two key properties: legality of the generated symbolic states, and the reducibility of symbolic evaluation to concrete evaluation. Legality makes it simpler for client tools to formulate queries, and reducibility enables testing of client tools on concrete inputs. We use the Lean theorem prover to mechanize our symbolic semantics, prove that it is sound and complete with respect to the concrete semantics, and prove that it guarantees legality and reducibility. To demonstrate the generality of our semantics, we develop Leanette, a reference evaluator written in Lean, and Rosette 4, an optimized evaluator written in Racket. We prove Leanette correct with respect to the semantics, and validate Rosette 4 against Leanette via solver-aided differential testing. To demonstrate the practicality of our approach, we port 16 published verification and synthesis tools from Rosette 3 to Rosette 4. Rosette 3 is an existing reusable evaluator that implements the classic merging semantics, adopted from bounded model checking. Rosette 4 replaces the semantic core of Rosette 3 but keeps its optimized symbolic factory. Our results show that Rosette 4 matches the performance of Rosette 3 across a wide range of benchmarks, while providing a cleaner interface that simplifies the implementation of client tools.« less
  4. The Multi-Agent Path Finding (MAPF) problem arises in many real-world applications, ranging from automated warehousing to multi-drone delivery. Solving the MAPF problem optimally is NP-hard, and existing optimal and bounded-suboptimal MAPF solvers thus usually do not scale to large MAPF instances. Greedy MAPF solvers scale to large MAPF instances, but their solution qualities are often bad. In this paper, we therefore propose a novel MAPF solver, Hierarchical Multi-Agent Path Planner (HMAPP), which creates a spatial hierarchy by partitioning the environment into multiple regions and decomposes a MAPF instance into smaller MAPF sub-instances for each region. For each sub-instance, it usesmore »a bounded-suboptimal MAPF solver to solve it with good solution quality. Our experimental results show that HMAPP solves as large MAPF instances as greedy MAPF solvers while achieving better solution qualities on various maps.« less
  5. Supervised learning, while deployed in real-life scenarios, often encounters instances of unknown classes. Conventional algorithms for training a supervised learning model do not provide an option to detect such instances, so they miss-classify such instances with 100% probability. Open Set Recognition (OSR) and Non-Exhaustive Learning (NEL) are potential solutions to overcome this problem. Most existing methods of OSR first classify members of existing classes and then identify instances of new classes. However, many of the existing methods of OSR only makes a binary decision, i.e., they only identify the existence of the unknown class. Hence, such methods cannot distinguish testmore »instances belonging to incremental unseen classes. On the other hand, the majority of NEL methods often make a parametric assumption over the data distribution, which either fail to return good results, due to the reason that real-life complex datasets may not follow a well-known data distribution. In this paper, we propose a new online non-exhaustive learning model, namely, Non-Exhaustive Gaussian Mixture Generative Adversarial Networks (NE-GM-GAN) to address these issues. Our proposed model synthesizes Gaussian mixture based latent representation over a deep generative model, such as GAN, for incremental detection of instances of emerging classes in the test data. Extensive experimental results on several benchmark datasets show that NE-GM-GAN significantly outperforms the state-of-the-art methods in detecting instances of novel classes in streaming data.« less