Abstract Developing sustainable software for the scientific community requires expertise in software engineering and domain science. This can be challenging due to the unique needs of scientific software, the insufficient resources for software engineering practices in the scientific community, and the complexity of developing for evolving scientific contexts. While open‐source software can partially address these concerns, it can introduce complicating dependencies and delay development. These issues can be reduced if scientists and software developers collaborate. We present a case study wherein scientists from the SuperNova Early Warning System collaborated with software developers from the Scalable Cyberinfrastructure for Multi‐Messenger Astrophysics project. The collaboration addressed the difficulties of open‐source software development, but presented additional risks to each team. For the scientists, there was a concern of relying on external systems and lacking control in the development process. For the developers, there was a risk in supporting a user‐group while maintaining core development. These issues were mitigated by creating a second Agile Scrum framework in parallel with the developers' ongoing Agile Scrum process. This Agile collaboration promoted communication, ensured that the scientists had an active role in development, and allowed the developers to evaluate and implement the scientists' software requirements. The collaboration provided benefits for each group: the scientists actuated their development by using an existing platform, and the developers utilized the scientists' use‐case to improve their systems. This case study suggests that scientists and software developers can avoid scientific computing issues by collaborating and that Agile Scrum methods can address emergent concerns. 
                    
                            
                            Collaboration challenges in building ML-enabled systems: communication, documentation, engineering, and process
                        
                    
    
            The introduction of machine learning (ML) components in software projects has created the need for software engineers to collaborate with data scientists and other specialists. While collaboration can always be challenging, ML introduces additional challenges with its exploratory model development process, additional skills and knowledge needed, difficulties testing ML systems, need for continuous evolution and monitoring, and non-traditional quality requirements such as fairness and explainability. Through interviews with 45 practitioners from 28 organizations, we identified key collaboration challenges that teams face when building and deploying ML systems into production. We report on common collaboration points in the development of production ML systems for requirements, data, and integration, as well as corresponding team patterns and challenges. We find that most of these challenges center around communication, documentation, engineering, and process, and collect recommendations to address these challenges. 
        
    
- Award ID(s): 2131477
- PAR ID: 10355592
- Date Published:
- Journal Name: ICSE '22: Proceedings of the 44th International Conference on Software Engineering
- Page Range / eLocation ID: 413 to 425
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- 
            Many organizations seek to ensure that machine learning (ML) and artificial intelligence (AI) systems work as intended in production but currently do not have a cohesive methodology in place to do so. To fill this gap, we propose MLTE (Machine Learning Test and Evaluation, colloquially referred to as "melt"), a framework and implementation to evaluate ML models and systems. The framework compiles state-of-the-art evaluation techniques into an organizational process for interdisciplinary teams, including model developers, software engineers, system owners, and other stakeholders. MLTE tooling supports this process by providing a domain-specific language that teams can use to express model requirements, an infrastructure to define, generate, and collect ML evaluation metrics, and the means to communicate results.
- 
            Implicit Requirements (IMR) identification is part of the Requirements Engineering (RE) phase in Software Engineering, during which data is gathered to create SRS (Software Requirements Specifications) documents. Unlike explicit requirements, which are clearly stated, IMRs constitute subtle data and need to be inferred. Research has shown that IMRs are crucial to the success of software development. Many software systems can encounter failures due to a lack of IMR data management. SRS documents are large, often hundreds of pages, so manual identification of IMRs by software engineers is not feasible. Moreover, such data is ever-growing due to the expansion of software systems. It is thus important to address the crucial issue of IMR data management. This article presents a survey on IMRs in SRS documents with the definition and overview of IMR data, a detailed taxonomy of IMRs with explanation and examples, practices in managing IMR data, and tools for IMR identification. In addition to reviewing classical and state-of-the-art approaches, we highlight trends and challenges and point out open issues for future research. This survey article is of interest from the perspectives of data quality, hidden information retrieval, veracity and salience, and knowledge discovery from large textual documents with complex heterogeneous data.
- 
            With the emergence of social coding platforms, collaboration has become a key and dynamic aspect of the success of software projects. On such platforms, developers have to collaborate and deal with issues of collaboration in open-source software development. Although collaboration is challenging, collaborative development produces better software systems than any developer could produce alone. Several approaches have investigated collaboration challenges, for instance, by proposing or evaluating models and tools to support collaborative work. Despite the undeniable importance of the existing efforts in this direction, there are few works on collaboration from the perspective of developers. In this work, we aim to investigate the perceptions of open-source software developers on collaboration, such as motivations, techniques, and tools to support global, productive, and collaborative development. Following an ad hoc literature review and an exploratory interview study with 12 open-source software developers from GitHub, our novel approach to this problem also relies on an extensive survey with 121 developers to confirm or refute the interview results. We found different collaborative contributions, such as managing change requests. In addition, we observed that most collaborators prefer to collaborate with the core team instead of their peers. We also found that most collaboration happens in software development (60%) and maintenance (47%) tasks. Furthermore, despite personal preferences to work independently, developers still consider collaborating with others in specific task categories, for instance, software development. Finally, developers also expressed the importance of social coding platforms, such as GitHub, in supporting maintainers and contributors in making decisions and developing tasks of the projects. Therefore, these findings may help project leaders optimize the collaborations among developers and reduce entry barriers. Moreover, these findings may support the project collaborators in understanding the collaboration process and engaging others in the project.
- 
            Grewe, Lynne L.; Blasch, Erik P.; Kadar, Ivan (Ed.) Sensor fusion combines data from a suite of sensors into an integrated solution that represents the target environment more accurately than that produced by individual sensors. New developments in Machine Learning (ML) algorithms are leading to increased accuracy, precision, and reliability in sensor fusion performance. However, these increases are accompanied by increases in system costs. Aircraft sensor systems have limited computing, storage, and bandwidth resources, which must balance monetary, computational, and throughput costs, sensor fusion performance, aircraft safety, data security, robustness, and modularity system objectives while meeting strict timing requirements. Performing trade studies of these system objectives should come before incorporating new ML models into the sensor fusion software. A scalable and automated solution is needed to quickly analyze the effects on the system’s objectives of providing additional resources to the new inference models. Given that model-based systems engineering (MBSE) is a focus of the majority of the aerospace industry for designing aircraft mission systems, it follows that leveraging these system models can provide scalability to the system analyses needed. This paper proposes adding empirically derived sensor fusion RNN performance and cost measurement data to machine-readable Model Cards. Furthermore, this paper proposes a scalable and automated sensor fusion system analysis process for ingesting SysML system model information and RNN Model Cards for system analyses. The value of this process is the integration of data analysis and system design that enables rapid enhancements of sensor system development.
 An official website of the United States government