This content will become publicly available on November 20, 2026

Title: OSSPREY: AI-Driven Forecasting and Intervention for OSS Project Sustainability
Open source software (OSS) underpins modern software infrastructure, yet many projects struggle with long-term sustainability. We introduce OSSPREY, an AI-powered platform that can predict the sustainability of any GitHub-hosted project. OSSPREY collects longitudinal socio-technical data, such as commits, issues, and contributor interactions, and uses a transformer-based model to generate month-by-month sustainability forecasts. When project downturns are detected, it recommends evidence-based interventions drawn from published software engineering studies. OSSPREY integrates scraping, forecasting, and actionable guidance into an interactive dashboard, enabling maintainers to monitor project health, anticipate decline, and respond with targeted strategies. By connecting real-time project data with research-backed insights, OSSPREY offers a practical tool for sustaining OSS projects at scale. The codebase is linked to the project website at: https://oss-prey.github.io/OSSPREY-Website/ The screencast is available at: https://www.youtube.com/watch?v=N7a0v4hPylU
Award ID(s):
2020751
PAR ID:
10639618
Publisher / Repository:
ASE 2025
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Open Source Software (OSS) is a major component of our digital infrastructure, yet more than 80% of such projects fail. Seeking less uncertainty, many OSS projects join established software communities, e.g., the Apache Software Foundation (ASF), with established rules and community support to guide projects toward sustainability. In their nascent stage, ASF projects are incubated in the ASF incubator (ASFI), which provides systematic mentorship toward long-term sustainability. Projects in ASFI eventually conclude their incubation by either graduating, if successful, or retiring, if not. Time-stamped traces of developer activities are publicly available from ASF, and can be used for monitoring project trajectories toward sustainability. Here we present a web app dashboard tool, APEX, that allows internal and external stakeholders to monitor and explore ASFI project sustainability trajectories, including social and technical networks.
  2. Mentoring has been a subject of study for 50 years. Most studies of mentoring programs evaluate the effect of the program on the participants but do not evaluate if different mentors have different effects on mentees. Open-source software (OSS) is software with a license that allows it to be freely used by other people. Such software has become foundational to the world economy. However, many OSS projects get abandoned by their creators. Various nonprofit organizations have arisen to help OSS projects become sustainable. One of the key services offered by many of these nonprofit organizations is a mentorship program where experienced OSS developers advise nascent projects on how to achieve sustainability. We use data from the Apache Software Foundation Incubator program where 303 mentors have mentored 286 projects, with most mentoring more than one project, to address this question: Is who a project has as a mentor associated with variation in project success? Who a project has as a mentor accounts for 45% of the variation in project outcomes, with some mentors being associated with positive and some with negative outcomes. These mentors could offer insights into how to improve the mentoring program. This result also demonstrates, more broadly, that the nature of specific mentoring relationships may be important to understanding how mentors impact outcomes in other mentoring programs. 
  3. Open Source Software (OSS), especially successful and sustainable projects, forms much of the fabric of our digital society. But many OSS projects do not become sustainable, resulting in abandonment and even risks for the world's digital infrastructure. Prior work has looked at the reasons for this mainly from two very different perspectives. In software engineering, the focus has been on understanding success and sustainability from the socio-technical perspective: the OSS programmers' day-to-day activities and the artifacts they create. In institutional analysis, on the other hand, emphasis has been on institutional designs (e.g., policies, rules, and norms) that structure project governance. Even though each is necessary for a comprehensive understanding of OSS projects, the connection and interaction between the two approaches have been barely explored. In this paper, we make the first effort toward understanding OSS project sustainability using a dual-view analysis, by combining institutional analysis with socio-technical systems analysis. In particular, we (i) use linguistic approaches to extract institutional rules and norms from OSS contributors' communications to represent the evolution of their governance systems, and (ii) construct socio-technical networks based on longitudinal collaboration records to represent each project's organizational structure. We combined the two methods and applied them to a dataset of developer digital traces from 253 nascent OSS projects within the Apache Software Foundation (ASF) incubator. We find that the socio-technical and institutional features relate to each other, and provide complementary views into the progress of the ASF's OSS projects. Refining these combined analyses can help provide a more precise understanding of the synchronization between the evolution of institutional governance and organizational structure.
  4. Open Source Software (OSS) forms an infrastructure on which numerous (often critical) software applications are based. Substantial research has investigated central projects such as the Linux kernel, but we have only a limited understanding of how the periphery of the larger OSS ecosystem is interconnected through technical dependencies, code sharing, and knowledge flows. We aim to close this gap by a) creating a nearly complete and rapidly updateable collection of version control data for FLOSS projects; b) cleaning, correcting, and augmenting the data to measure several types of dependencies among code, developers, and projects; c) creating models that rely on the resulting supply chains to investigate structural and dynamic properties of the entire OSS. The current implementation can be updated each month and occupies over 300Tb of disk space, with 1.5B commits and 12B git objects. Highly accurate algorithms to correct identity data and extract dependencies from the source code are used to characterize the current structure of OSS and the way it has evolved. In particular, models of technology spread demonstrate the implicit factors developers use when choosing software components. We expect the resulting research platform will spur investigations on how the huge periphery in OSS both sustains and is sustained by the central OSS projects and, as a result, will increase the resiliency and effectiveness of OSS.
  5. Open source software (OSS), a form of Digital or Knowledge Commons, underlies much of the technology that we use in our daily lives. The existence and continuation of OSS relies on the contribution of private resources – personal time, volunteer energy, and effort of numerous actors (e.g., software developers’ time as a common-pool resource) – to public goods, the benefits of which are enjoyed by everyone. Nonprofit organizations such as the Apache Software Foundation (ASF) attempt to aid this process by providing various collective services to OSS projects, acting as a second-order actor in the production of the public good. To this end, the ASF Incubator has created policies – essentially rules or norms – that serve to protect its interests and, as they say, increase the sustainability of the projects. Each policy requires investment by ASF (in terms of money or the use of volunteer time) or an incubating project (in terms of taking project personnel time), the benefits of which can accrue to either party. Such policies may impose additional costs on incubating projects, leading to a decreased production of the OSS public good. Using the ASF Incubator policy documents, we construct a dataset that records who – ASF or an incubating project – bears the cost and who enjoys the benefit of each policy and procedure. We can code most policy statements as costing one party and benefiting one party. The distribution of costs and benefits according to party indicates whether the second-order actor is contributing to an increase in the public good and if they are doing so sustainably. Through a two-way ANOVA, we characterize the impact of ASF policies on the production of public goods (OSS). Being a part of ASF imposes some costs on projects, but these costs may make projects more sustainable. Our analysis shows that the distribution of costs and benefits is fairly symmetric between the ASF and incubating projects. 
Thus, the configuration of policies or the “institutional design” of the ASF could aid in producing the OSS public good by providing services that projects require. 