Background: Bots help automate many of the tasks performed by software developers and are widely used to commit code in various social coding platforms. At present, it is not clear what types of activities these bots perform and understanding it may help design better bots, and find application areas which might benefit from bot adoption. Aim: We aim to categorize the Bot Commits by the type of change (files added, deleted, or modified), find the more commonly changed file types, and identify the groups of file types that tend to get updated together. Method: 12,326,137 commits made by 461 popular bots (that made at least 1000 commits) were examined to identify the frequency and the type of files added/ deleted/ modified by the commits, and association rule mining was used to identify the types of files modified together. Result: Majority of the bot commits modify an existing file, a few of them add new files, while deletion of a file is very rare. Commits involving more than one type of operation are even rarer. Files containing data, configuration, and documentation are most frequently updated, while HTML is the most common type in terms of the number of files added, deleted, and modified. Files of the type "Markdown", "Ignore List", "YAML", "JSON" were the types that are updated together with other types of files most frequently. Conclusion: We observe that majority of bot commits involve single file modifications, and bots primarily work with data, configuration, and documentation files. A better understanding if this is a limitation of the bots and, if overcome, would lead to different kinds of bots remains an open question. 
                        more » 
                        « less   
                    
                            
                            How Do Code Changes Evolve in Different Platforms? A Mining-Based Investigation
                        
                    
    
            Software developed in different platforms has different characteristics and needs. More specifically, code changes are differently performed in the mobile platform compared to non-mobile platforms (e.g., desktop and Web platforms). Prior works have investigated the differences in specific platforms. However, we still lack a deeper understanding of how code changes evolve across different software platforms. In this paper, we present a study aiming at investigating the frequency of changes and how source code changes, build changes and test changes co-evolve in mobile and non-mobile platforms. We developed linear regression models to explain which factors influence the frequency of changes in different platforms and applied the Apriori algorithm to find types of changes that frequently occur together. Our findings show that non-mobile repositories have a higher number of commits per month compared to mobile and our regression models suggest that being mobile significantly impacts on the number of commits in a negative direction when controlling for confound factors, such as code size. We also found that developers do not usually change source code files together with build files or test files. We argue that our results can provide valuable information for developers on how changes are performed in different platforms so that practices adopted in successful software systems can be followed. 
        more » 
        « less   
        
    
                            - Award ID(s):
- 1813598
- PAR ID:
- 10180344
- Date Published:
- Journal Name:
- Proceedings of the 35th International Conference on Software Maintenance and Evolution (ICSME)
- Page Range / eLocation ID:
- 218 to 222
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- 
            
- 
            Continuous integration (CI) has become a popular method for automating code changes, testing, and software project delivery. However, sufficient testing prior to code submission is crucial to prevent build breaks. Additionally, testing must provide developers with quick feedback on code changes, which requires fast testing times. While regression test selection (RTS) has been studied to improve the cost-effectiveness of regression testing for lower-level tests (i.e., unit tests), it has not been applied to the testing of user interfaces (UI) in application domains such as mobile apps. UI testing at the UI level requires different techniques such as impact analysis and automated test execution. In this paper, we examine the use of RTS in CI settings for UI testing across various open-source mobile apps. Our analysis focuses on using Frequency Analysis to understand the need for RTS, Cost Analysis to evaluate the cost of impact analysis and test case selection algorithms, and Test Reuse Analysis to determine the reusability of UI test sequences for automation. The insights from this study will guide practitioners and researchers in developing advanced RTS techniques that can be adapted to CI environments for mobile apps.more » « less
- 
            Recent work on open source sustainability shows that successful trajectories of projects in the Apache Software Foundation Incubator (ASFI) can be predicted early on, using a set of socio-technical measures. Because OSS projects are socio-technical systems centered around code artifacts,we hypothesize that sustainable projects may exhibit different code and process patterns than unsustainable ones, and that those patterns can grow more apparent as projects evolve over time. Here we studied the code and coding processes of over 200 ASFI projects, and found that ASFI graduated projects have different patterns of code quality and complexity than retired ones. Likewise for the coding processes – e.g., feature commits or bug-fixing commits are correlated with project graduation success. We find that minor contributors and major contributors (who contribute <5%, respectively >=95% commits) associate with graduation outcomes, implying that having also developers who contribute fewer commits are important for a project’s success. This study provides evidence that OSS projects, especially nascent ones, can benefit from introspection and instrumentation using multidimensional modeling of the whole system, including code, processes, and code quality measures, and how they are interconnected over time.more » « less
- 
            Background: Some developer activity traditionally performed manually, such as making code commits, opening, managing, or closing issues is increasingly subject to automation in many OSS projects. Specifically, such activity is often performed by tools that react to events or run at specific times. We refer to such automation tools as bots and, in many software mining scenarios related to developer productivity or code quality it is desirable to identify bots in order to separate their actions from actions of individuals. Aim: Find an automated way of identifying bots and code committed by these bots, and to characterize the types of bots based on their activity patterns. Method and Result: We propose BIMAN, a systematic approach to detect bots using author names, commit messages, files modified by the commit, and projects associated with the ommits. For our test data, the value for AUC-ROC was 0.9. We also characterized these bots based on the time patterns of their code commits and the types of files modified, and found that they primarily work with documentation files and web pages, and these files are most prevalent in HTML and JavaScript ecosystems. We have compiled a shareable dataset containing detailed information about 461 bots we found (all of whom have more than 1000 commits) and 13,762,430 commits they created.more » « less
- 
            Code changes are often reviewed before they are deployed. Popular source control systems aid code review by presenting textual differences between old and new versions of the code, leaving developers with the difficult task of determining whether the differences actually produced the desired behavior. Fortunately, we can mine such information from code repositories. We propose aiding code review with inter-version semantic differential analysis. During review of a new commit, a developer is presented with summaries of both code differences and behavioral differences, which are expressed as diffs of likely invariants extracted by running the system's test cases. As a result, developers can more easily determine that the code changes produced the desired effect. We created an invariant-mining tool chain, GETTY, to support our concept of semantically-assisted code review. To validate our approach, 1) we applied GETTY to the commits of 6 popular open source projects, 2) we assessed the performance and cost of running GETTY in different configurations, and 3) we performed a comparative user study with 18 developers. Our results demonstrate that semantically-assisted code review is feasible, effective, and that real programmers can leverage it to improve the quality of their reviews.more » « less
 An official website of the United States government
An official website of the United States government 
				
			 
					 
					
 
                                    