skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Developer’s Responsibility or Database’s Responsibility? Rethinking Concurrency Control in Databases
Many database applications execute transactions under a weaker isolation level, such as READ COMMITTED. This often leads to concurrency bugs that look like race conditions in multi-threaded programs. While this problem is well known, philosophies of how to address this problem vary a lot, ranging from making a SERIALIZABLE database faster to living with weaker isolation and the consequence of concurrency bugs. This paper studies the consequences, root causes, and how developers fix 93 real-world concurrency bugs in database applications. We observe that, on the one hand, developers still prefer preventing these bugs from happening. On the other hand, database systems are not providing sufficient support for this task, so developers often fix these bugs using ad-hoc solutions, which are often complicated and not fully correct. We further discuss research opportunities to improve concurrency control in database implementations.  more » « less
Award ID(s):
1816577
PAR ID:
10402625
Author(s) / Creator(s):
; ; ; ; ;
Date Published:
Journal Name:
13th Annual Conference on Innovative Data Systems Research (CIDR ’23). January 8-11, 2023, Amsterdam, The Netherlands.
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Serializability is a well-understood correctness criterion that simplifies reasoning about the behavior of concurrent transactions by ensuring they are isolated from each other while they execute. However, enforcing serializable isolation comes at a steep cost in performance because it necessarily restricts opportunities to exploit concurrency even when such opportunities would not violate application-specific invariants. As a result, database systems in practice support, and often encourage, developers to implement transactions using weaker alternatives. These alternatives break the strong isolation guarantees offered by serializable transactions to permit greater concurrency. Unfortunately, the semantics of weak isolation is poorly understood, and usually explained only informally in terms of low-level implementation artifacts. Consequently, verifying high-level correctness properties in such environments remains a challenging problem. To address this issue, we present a novel program logic that enables compositional reasoning about the behavior of concurrently executing weakly-isolated transactions. Recognizing that the proof burden necessary to use this logic may dissuade application developers, we also describe an inference procedure based on this foundation that ascertains the weakest isolation level that still guarantees the safety of high-level consistency assertions associated with such transactions. The key to effective inference is the observation that weakly-isolated transactions can be viewed as functional (monadic) computations over an abstract database state, allowing us to treat their operations as state transformers over the database. This interpretation enables automated verification using off-the-shelf SMT solvers. Our development is parametric over a transaction’s specific isolation semantics, allowing it to be applicable over a range of concurrency control mechanisms. Case studies and experiments on real-world applications (written in an embedded DSL in OCaml) demonstrate the utility of our approach, and provide strong evidence that automated verification of weakly-isolated transactions can be placed on the same formal footing as their strongly-isolated serializable counterparts. 
    more » « less
  2. Database applications frequently use weaker isolation levels, such as Read Committed, for better performance, which may lead to bugs that do not happen under Serializable. Although a number of works have proposed methods to identify such isolation-related bugs, the difficulty of analyzing reported bugs is often underestimated, since these bugs often involve multiple complicated transactions interleaved in a specific order and they often require users' feedback to improve the accuracy of bug analysis. This paper presents IsoBugView, a tool to visualize isolation bugs and incorporate users' feedback: to address the challenge that a complicated bug may include much information and thus is hard to present, IsoBugView displays a high-level overview of the bug first and displays further information of individual pieces if the developer needs further investigation. To incorporate users' feedback, IsoBugView embeds hook functions into the backend analysis tool to preprocess a dependency graph and postprocess a found cycle and further allows a user to apply predefined hook functions in its graphic user interface. Our experience shows that IsoBugView has greatly improved our productivity of analyzing isolation bugs. 
    more » « less
  3. The emergence of database-as-a-service platforms has made deploying database applications easier than before. Now, developers can quickly create scalable applications. However, designing performant, maintainable, and accurate applications is challenging. Developers may unknowingly introduce anti-patterns in the application's SQL statements. These anti-patterns are design decisions that are intended to solve a problem, but often lead to other problems by violating fundamental design principles. In this paper, we present SQLCheck, a holistic toolchain for automatically finding and fixing anti-patterns in database applications. We introduce techniques for automatically (1) detecting anti-patterns with high precision and recall, (2) ranking the anti-patterns based on their impact on performance, maintainability, and accuracy of applications, and (3) suggesting alternative queries and changes to the database design to fix these anti-patterns. We demonstrate the prevalence of these anti-patterns in a large collection of queries and databases collected from open-source repositories. We introduce an anti-pattern detection algorithm that augments query analysis with data analysis. We present a ranking model for characterizing the impact of frequently occurring anti-patterns. We discuss how SQLCheck suggests fixes for high-impact anti-patterns using rule-based query refactoring techniques. Our experiments demonstrate that SQLCheck enables developers to create more performant, maintainable, and accurate applications. 
    more » « less
  4. We will demonstrate a prototype of sqlcheck, a holistic toolchain for automatically finding and fixing anti-patterns in database appli- cations. The advent of modern database-as-a-service platforms has made it easy for developers to quickly create scalable applications. However, it is still challenging for developers to design performant, maintainable, and accurate applications. This is because develop- ers may unknowingly introduce anti-patterns in the application’s SQL statements. These anti-patterns are design decisions that are intended to solve a problem, but often lead to other problems by violating fundamental design principles. sqlcheck leverages techniques for automatically: (1) detecting anti-patterns with high accuracy, (2) ranking them based on their impact on performance, maintainability, and accuracy of applica- tions, and (3) suggesting alternative queries and changes to the database design to fix these anti-patterns. We will demonstrate that sqlcheck enables developers to create more performant, maintain- able, and accurate applications. We will show the prevalence of these anti-patterns in a large collection of queries and databases collected from open-source repositories. 
    more » « less
  5. Concurrency bugs are hard to discover and reproduce, even in well-synchronized programs that are free of data races. Thankfully, prior work on controlled concurrency testing (CCT) has developed sophisticated algorithms---such as partial-order based and selectively uniform sampling---to effectively search over the space of thread interleavings. Unfortunately, in practice, these techniques cannot easily be applied to real-world Java programs due to the difficulties of controlling concurrency in the presence of the managed runtime and complex synchronization primitives. So, mature Java projects that make heavy use of concurrency still rely on naive repeated stress testing in a loop. In this paper, we take a first-principles approach for elucidating the requirements and design space to enable CCT on arbitrary real-world JVM applications. We identify practical challenges with classical design choices described in prior work---such as concurrency mocking, VM hacking, and OS-level scheduling---that affect bug-finding effectiveness and/or the scope of target applications that can be easily supported. Based on these insights, we present Fray, a new platform for performing push-button concurrency testing (beyond data races) of JVM programs. The key design principle behind Fray is to orchestrate thread interleavings without replacing existing concurrency primitives, using a concurrency control mechanism called shadow locking for faithfully expressing the set of all possible program behaviors. With full concurrency control, Fray can test applications using a number of search algorithms from a simple random walk to sophisticated techniques like PCT, POS, and SURW. In an empirical evaluation on 53 benchmark programs with known bugs (SCTBench and JaConTeBe), Fray with random walk finds 70% more bugs than JPF and 77% more bugs than RR's chaos mode. We also demonstrate Fray's push-button applicability on 2,664 tests from Apache Kafka, Lucene, and Google Guava. In these mature projects, Fray successfully discovered 18 real-world concurrency bugs that can cause 371 of the existing tests to fail under specific interleavings. We believe that Fray serves as a bridge between classical academic research and industrial practice--- empowering developers with advanced concurrency testing algorithms that demonstrably uncover more bugs, while simultaneously providing researchers a platform for large-scale evaluation of search techniques. 
    more » « less