Modern web applications have stringent latency requirements while processing an ever-increasing amount of user data. To address these challenges and improve programmer productivity, Object Relational Mapping (ORM) frameworks have been developed to allow developers writing database processing code in an object-oriented manner. Despite such frameworks, prior work found that developers still struggle in developing performant ORM-based web applications. This paper presents PowerStation, a RubyMine IDE plugin for optimizing web applications developed using the Ruby on Rails ORM. Using automated static analysis, PowerStation detects ORMrelated inefficiency problems and suggests fixes to developers. Our evaluation on 12 real-world applications shows that PowerStation can automatically detects 1221 performance issues across all of them. We have uploaded a tutorial on using PowerStation plugin to https://youtu.be/v_uY5bjGuK0.
more »
« less
How not to structure your database-backed web applications: a study of performance bugs in the wild
Many web applications use databases for persistent data storage, and using Object Relational Mapping (ORM) frameworks is a common way to develop such database-backed web applications. Unfortunately, developing efficient ORM applications is challenging, as the ORM framework hides the underlying database query generation and execution. This problem is becoming more severe as these applications need to process an increasingly large amount of persistent data. Recent research has targeted specific aspects of performance problems in ORM applications. However, there has not been any systematic study to identify common performance anti-patterns in real-world such applications, how they affect resulting application performance, and remedies for them. In this paper, we try to answer these questions through a comprehensive study of 12 representative real-world ORM applications. We generalize 9 ORM performance anti-patterns from more than 200 performance issues that we obtain by studying their bug-tracking systems and profiling their latest versions. To prove our point, we manually fix 64 performance issues in their latest versions and obtain a median speedup of 2× (and up to 39× max) with fewer than 5 lines of code change in most cases. Many of the issues we found have been confirmed by developers, and we have implemented ways to identify other code fragments with similar issues as well.
more »
« less
- Award ID(s):
- 1546083
- PAR ID:
- 10067547
- Date Published:
- Journal Name:
- ICSE '18 Proceedings of the 40th International Conference on Software Engineering
- Page Range / eLocation ID:
- 800 to 810
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Testing database-backed web applications is chal- lenging because their behaviors (e.g., control flow) are highly dependent on data returned from SQL queries. Without a database containing sufficient and realistic data, it is challenging to reach potentially vulnerable code snippets, limiting various existing dynamic-based security testing approaches. However, obtaining such a database for testing is difficult in practice as it often contains sensitive information. Sharing it can lead to data leaks and privacy issues. In this paper, we present SYNTHDB, a program analysis- based database generation technique for database-backed PHP applications. SYNTHDB leverages a concolic execution engine to identify interactions between PHP codebase and the SQL queries. It then collects and solves various constraints to reconstruct a database that can enable exploring uncovered program paths without violating database integrity. Our evaluation results show that the database generated by SYNTHDB outperforms state-of- the-arts database generation techniques in terms of code and query coverage in 17 real-world PHP applications. Specifically, SYNTHDB generated databases achieve 62.9% code and 77.1% query coverages, which are 14.0% and 24.2% more in code and query coverages than the state-of-the-art techniques. Fur- thermore, our security analysis results show that SYNTHDB effectively aids existing security testing tools: Burp Suite, Wfuzz, and webFuzz. Burp Suite aided by SYNTHDB detects 76.8% of vulnerabilities while other existing techniques cover 55.7% or fewer. Impressively, with SYNTHDB, Burp Suite discovers 33 pre- viously unknown vulnerabilities from 5 real-world applications.more » « less
-
Server-side web applications are vulnerable to request races. While some previous studies of real-world request races exist, they primarily focus on the root cause of these bugs. To better combat request races in server-side web applications, we need a deep understanding of their characteristics. In this paper, we provide a complementary focus on race effects and fixes with an enlarged set of request races from web applications developed with Object-Relational Mapping (ORM) frameworks. We revisit characterization questions used in previous studies on newly included request races, distinguish the external and internal effects of request races, and relate requestrace fixes with concurrency control mechanisms in languages and frameworks for developing server-side web applications. Our study reveals that: (1) request races from ORM-based web applications share the same characteristics as those from raw-SQL web applications; (2) request races violating application semantics without explicit crashes and error messages externally are common, and latent request races, which only corrupt some shared resource internally but require extra requests to expose the misbehavior, are also common; and (3) various fix strategies other than using synchronization mechanisms are used to fix request races. We expect that our results can help developers better understand request races and guide the design and development of tools for combating request races.more » « less
-
Exploiting the relationships among data is a classical query optimization technique. As persistent data is increasingly being created and maintained programmatically, prior work that infers data relationships from data statistics misses an important opportunity. We present Coco, the first tool that identifies data relationships by analyzing database-backed applications. Once identified, Coco leverages the constraints to optimize the application's physical design and query execution. Instead of developing a fixed set of predefined rewriting rules, Coco employs an enumerate-test-verify technique to automatically exploit the discovered data constraints to improve query execution. Each resulting rewrite is provably equivalent to the original query. Using 14 real-world web applications, our experiments show that Coco can discover numerous data constraints from code analysis and improve real-world application performance significantly.more » « less
-
The emergence of database-as-a-service platforms has made deploying database applications easier than before. Now, developers can quickly create scalable applications. However, designing performant, maintainable, and accurate applications is challenging. Developers may unknowingly introduce anti-patterns in the application's SQL statements. These anti-patterns are design decisions that are intended to solve a problem, but often lead to other problems by violating fundamental design principles. In this paper, we present SQLCheck, a holistic toolchain for automatically finding and fixing anti-patterns in database applications. We introduce techniques for automatically (1) detecting anti-patterns with high precision and recall, (2) ranking the anti-patterns based on their impact on performance, maintainability, and accuracy of applications, and (3) suggesting alternative queries and changes to the database design to fix these anti-patterns. We demonstrate the prevalence of these anti-patterns in a large collection of queries and databases collected from open-source repositories. We introduce an anti-pattern detection algorithm that augments query analysis with data analysis. We present a ranking model for characterizing the impact of frequently occurring anti-patterns. We discuss how SQLCheck suggests fixes for high-impact anti-patterns using rule-based query refactoring techniques. Our experiments demonstrate that SQLCheck enables developers to create more performant, maintainable, and accurate applications.more » « less
An official website of the United States government

