skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Code Smell Prioritization with Business Process Mining and Static Code Analysis: A Case Study
One of the most significant impediments to the long-term maintainability of software applications is code smells. Keeping up with the best coding practices can be difficult for software developers, which might lead to performance throttling or code maintenance concerns. As a result, it is imperative that large applications be regularly monitored for performance issues and code smells, so that these issues can be corrected promptly. Resolving code smells in software systems can be done in a variety of ways, but doing so all at once would be prohibitively expensive and can be out of budget. Prioritizing these solutions are therefore critical. The majority of current research prioritizes code smells according to the type of smell they cause. This method, however, is not sufficient because of a lack of knowledge regarding the frequency of code usage and code changeability behavior. Even the most complex programs have some components that are more important than others. Maintaining the functionality of certain parts is essential since they are often used. Identifying and correcting code smells in places that are frequently utilized and subject to rapid change should take precedence over other code smells. A novel strategy is proposed for finding frequently used and change-prone areas in a codebase by combining business logic, heat map information, and commit history analysis in this study. It examines the codebase, commits, and log files of Java applications to identify business processes, heat map graphs, and severity levels of various types of code smells and their commit history. This is done in order to present a comprehensive, efficient, and resource-friendly technique for identifying and prioritizing performance throttling with also handling code maintenance concerns.  more » « less
Award ID(s):
1854049
PAR ID:
10393896
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Electronics
Volume:
11
Issue:
12
ISSN:
2079-9292
Page Range / eLocation ID:
1880
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null; null; null (Ed.)
    Microservice Architecture (MSA) is rapidly taking over modern software engineering and becoming the predominant architecture of new cloud-based applications (apps). There are many advantages to using MSA, but there are many downsides to using a more complex architecture than a typical monolithic enterprise app. Beyond the normal bad coding practices and code-smells of a typical app, MSA specific code-smells are difficult to discover within a distributed app. There are many static code analysis tools for monolithic apps, but no tool exists to offer code-smell detection for MSA-based apps. This paper proposes a new approach to detect code smells in distributed apps based on MSA. We develop an open-source tool, MSANose, which can accurately detect up to eleven different types of MSA specific code smells. We demonstrate our tool through a case study on a benchmark MSA app and verify its accuracy. Our results show that it is possible to detect code-smells within MSA apps using bytecode and or source code analysis throughout the development or before deployment to production. 
    more » « less
  2. Background: Some developer activity traditionally performed manually, such as making code commits, opening, managing, or closing issues is increasingly subject to automation in many OSS projects. Specifically, such activity is often performed by tools that react to events or run at specific times. We refer to such automation tools as bots and, in many software mining scenarios related to developer productivity or code quality it is desirable to identify bots in order to separate their actions from actions of individuals. Aim: Find an automated way of identifying bots and code committed by these bots, and to characterize the types of bots based on their activity patterns. Method and Result: We propose BIMAN, a systematic approach to detect bots using author names, commit messages, files modified by the commit, and projects associated with the ommits. For our test data, the value for AUC-ROC was 0.9. We also characterized these bots based on the time patterns of their code commits and the types of files modified, and found that they primarily work with documentation files and web pages, and these files are most prevalent in HTML and JavaScript ecosystems. We have compiled a shareable dataset containing detailed information about 461 bots we found (all of whom have more than 1000 commits) and 13,762,430 commits they created. 
    more » « less
  3. null; null; null (Ed.)
    Code clones are fragments of code that are duplicated in the codebase of an application. They create problems with maintainability, duplicate buggy code, and increase the size of the repository. To combat these issues, there currently exists a multitude of programs to detect duplicated code segments. However, there are not many varieties of languages among the benchmarks for code clone detection tools. Without covering enough languages for modern software development, the development of code-clone detection tools remains stunted. This paper describes a novel tool that will take a seed of Python source code and generate Type 1, 2, and 3 code clones in Python. As one of the most used and rapidly-growing languages in modern software development, our testbed will provide the opportunity for Python code-clone detection tools to be developed and tested. 
    more » « less
  4. Commit messages, which summarize the source code changes in natural language, are essential for program comprehension and software evolution understanding. Unfortunately, due to the lack of direct motivation, commit messages are sometimes neglected by developers, making it necessary to automatically generate such messages. State-of-the-art adopts learning based approaches such as neural machine translation models for the commit message generation problem. However, they tend to ignore the code structure information and suffer from the out-of-vocabulary issue. In this paper, we propose CoDiSum to address the above two limitations. In particular, we first extract both code structure and code semantics from the source code changes, and then jointly model these two sources of information so as to better learn the representations of the code changes. Moreover, we augment the model with copying mechanism to further mitigate the out-of-vocabulary issue. Experimental evaluations on real data demonstrate that the proposed approach significantly outperforms the state-of-the-art in terms of accurately generating the commit messages. 
    more » « less
  5. Software applications and workloads, especially within the domains of Cloud computing and large-scale AI model training, exert considerable demand on computing resources, thus contributing significantly to the overall energy footprint of the IT industry. In this paper, we present an in-depth analysis of certain software coding practices that can play a substantial role in increasing the application’s overall energy consumption, primarily stemming from the suboptimal utilization of computing resources. Our study encompasses a thorough investigation of 16 distinct code smells and other coding malpractices across 31 real-world open-source applications written in Java and Python. Through our research, we provide compelling evidence that various common refactoring techniques, typically employed to rectify specific code smells, can unintentionally escalate the application’s energy consumption. We illustrate that a discerning and strategic approach to code smell refactoring can yield substantial energy savings. For selective refactorings, this yields a reduction of up to 13.1% of energy consumption and 5.1% of carbon emissions per workload on average. These findings underscore the potential of selective and intelligent refactoring to substantially increase energy efficiency of Cloud software systems. 
    more » « less