skip to main content


Title: Qualitative Analysis of the Relationship Between Design Smells and Software Engineering Challenges
Software design debt aims to elucidate the rectification attempts of the present design flaws and studies the influence of those to the cost and time of the software. Design smells are a key cause of incurring design debt. Although the impact of design smells on design debt have been predominantly considered in current literature, how design smells are caused due to not following software engineering best practices require more exploration. This research provides a tool which is used for design smell detection in Java software by analyzing large volume of source codes. More specifically, 409,539 Lines of Code (LoC) and 17,760 class files of open source Java software are analyzed here. Obtained results show desirable precision values ranging from 81.01% to 93.43%. Based on the output of the tool, a study is conducted to relate the cause of the detected design smells to two software engineering challenges namely "irregular team meetings" and "scope creep". As a result, the gained information will provide insight to the software engineers to take necessary steps of design remediation actions.  more » « less
Award ID(s):
1724898 1842054 2007829
NSF-PAR ID:
10392788
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Proceedings of ACM European Symposium on Software Engineering (ESSE 2022)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Software design debt aims to elucidate the rectification attempts of the present design flaws and studies the influence of those to the cost and time of the software. Design smells are a key cause of incurring design debt. Although the impact of design smells on design debt have been predominantly considered in current literature, how design smells are caused due to not following software engineering best practices require more exploration. This research provides a tool which is used for design smell detection in Java software by analyzing large volume of source codes. More specifically, 409,539 Lines of Code (LoC) and 17,760 class files of open source Java software are analyzed here. Obtained results show desirable precision values ranging from 81.01% to 93.43%. Based on the output of the tool, a study is conducted to relate the cause of the detected design smells to two software engineering challenges namely "irregular team meetings" and "scope creep". As a result, the gained information will provide insight to the software engineers to take necessary steps of design remediation actions. 
    more » « less
  2. One of the most significant impediments to the long-term maintainability of software applications is code smells. Keeping up with the best coding practices can be difficult for software developers, which might lead to performance throttling or code maintenance concerns. As a result, it is imperative that large applications be regularly monitored for performance issues and code smells, so that these issues can be corrected promptly. Resolving code smells in software systems can be done in a variety of ways, but doing so all at once would be prohibitively expensive and can be out of budget. Prioritizing these solutions are therefore critical. The majority of current research prioritizes code smells according to the type of smell they cause. This method, however, is not sufficient because of a lack of knowledge regarding the frequency of code usage and code changeability behavior. Even the most complex programs have some components that are more important than others. Maintaining the functionality of certain parts is essential since they are often used. Identifying and correcting code smells in places that are frequently utilized and subject to rapid change should take precedence over other code smells. A novel strategy is proposed for finding frequently used and change-prone areas in a codebase by combining business logic, heat map information, and commit history analysis in this study. It examines the codebase, commits, and log files of Java applications to identify business processes, heat map graphs, and severity levels of various types of code smells and their commit history. This is done in order to present a comprehensive, efficient, and resource-friendly technique for identifying and prioritizing performance throttling with also handling code maintenance concerns. 
    more » « less
  3. null ; null ; null (Ed.)
    Microservice Architecture (MSA) is rapidly taking over modern software engineering and becoming the predominant architecture of new cloud-based applications (apps). There are many advantages to using MSA, but there are many downsides to using a more complex architecture than a typical monolithic enterprise app. Beyond the normal bad coding practices and code-smells of a typical app, MSA specific code-smells are difficult to discover within a distributed app. There are many static code analysis tools for monolithic apps, but no tool exists to offer code-smell detection for MSA-based apps. This paper proposes a new approach to detect code smells in distributed apps based on MSA. We develop an open-source tool, MSANose, which can accurately detect up to eleven different types of MSA specific code smells. We demonstrate our tool through a case study on a benchmark MSA app and verify its accuracy. Our results show that it is possible to detect code-smells within MSA apps using bytecode and or source code analysis throughout the development or before deployment to production. 
    more » « less
  4. In a software system’s development lifecycle, engineers make numerous design decisions that subsequently cause architectural change in the system. Previous studies have shown that, more often than not, these architectural changes are unintentional by-products of continual software maintenance tasks. The result of inadvertent architectural changes is accumulation of technical debt and deterioration of software quality. Despite their important implications, there is a relative shortage of techniques, tools, and empirical studies pertaining to architectural design decisions. In this paper, we take a step toward addressing that scarcity by using the information in the issue and code repositories of open-source software systems to investigate the cause and frequency of such architectural design decisions. Furthermore, building on these results, we develop a predictive model that is able to identify the architectural significance of newly submitted issues, thereby helping engineers to prevent the adverse effects of architectural decay. The results of this study are based on the analysis of 21,062 issues affecting 301 versions of 5 large open-source systems for which the code changes and issues were publicly accessible. 
    more » « less
  5. null (Ed.)
    In modern Machine Learning, model training is an iterative, experimental process that can consume enormous computation resources and developer time. To aid in that process, experienced model developers log and visualize program variables during training runs. Exhaustive logging of all variables is infeasible, so developers are left to choose between slowing down training via extensive conservative logging, or letting training run fast via minimalist optimistic logging that may omit key information. As a compromise, optimistic logging can be accompanied by program checkpoints; this allows developers to add log statements post-hoc, and "replay" desired log statements from checkpoint---a process we refer to as hindsight logging. Unfortunately, hindsight logging raises tricky problems in data management and software engineering. Done poorly, hindsight logging can waste resources and generate technical debt embodied in multiple variants of training code. In this paper, we present methodologies for efficient and effective logging practices for model training, with a focus on techniques for hindsight logging. Our goal is for experienced model developers to learn and adopt these practices. To make this easier, we provide an open-source suite of tools for Fast Low-Overhead Recovery (flor) that embodies our design across three tasks: (i) efficient background logging in Python, (ii) adaptive periodic checkpointing, and (iii) an instrumentation library that codifies hindsight logging for efficient and automatic record-replay of model-training. Model developers can use each flor tool separately as they see fit, or they can use flor in hands-free mode, entrusting it to instrument their code end-to-end for efficient record-replay. Our solutions leverage techniques from physiological transaction logs and recovery in database systems. Evaluations on modern ML benchmarks demonstrate that flor can produce fast checkpointing with small user-specifiable overheads (e.g. 7%), and still provide hindsight log replay times orders of magnitude faster than restarting training from scratch. 
    more » « less