This content will become publicly available on May 1, 2024

Title: Gender Representation Among Contributors to Open-Source Infrastructure : An Analysis of 20 Package Manager Ecosystems
While the severe underrepresentation of women and non-binary people in open source is widely recognized, there is little empirical data on how the situation has changed over time and which subcommunities have been more effective at reducing the gender imbalance. To obtain a clearer picture of gender representation in open source, we compiled and synthesized existing empirical data from the literature, and computed historical trends in the representation of women across 20 open source ecosystems. While inherently limited by the ability of automatic name-based gender inference to capture true gender identities at an individual level, our census still provides valuable population-level insights. Both overall and within most ecosystems, we observed a promising upward trend in the percentage of women among code contributors over time, but also high variation in the percentage of women contributors across ecosystems. We also found that, in most ecosystems, women withdraw earlier from open-source participation than men.
Award ID(s):
2107298
NSF-PAR ID:
10433706
Journal Name:
2023 IEEE/ACM 45th International Conference on Software Engineering: Software Engineering in Society (ICSE-SEIS)
Page Range / eLocation ID:
180 to 187
Sponsoring Org:
National Science Foundation
More Like this
  1. Open Source Software (OSS) Foundations and projects are investing in creating Diversity and Inclusion (D&I) initiatives. However, little is known about contributors’ perceptions of the usefulness and success of such initiatives. We aim to close this gap by investigating how contributors perceive the state of D&I in their community. In collaboration with the Apache Software Foundation (ASF), we surveyed 600+ OSS contributors and conducted 11 follow-up interviews. We used mixed methods to analyze our data: quantitative analysis of Likert-scale questions, and qualitative analysis of open-ended survey questions and the interviews, to understand contributors’ perceptions and critiques of the D&I initiative and how to improve it. Our results indicate that the ASF contributors felt that the state of D&I was still lacking, especially regarding gender, seniority, and English proficiency. Regarding the D&I initiative, some participants felt that the effort was unnecessary, while others agreed with the effort but critiqued its implementation. These findings show that D&I initiatives in OSS communities are a good start, but there is room for improvement. Our results can inspire the creation of new initiatives and the refinement of current ones. Open Source Software (OSS) is widely used in society (e.g., Linux, Chrome, and Firefox), and contributing to these projects helps individuals learn and showcase their skills, so much so that contribution histories are increasingly being analyzed by hirers. However, the people who contribute to OSS are predominantly men (about 90%). This means that women and other minorities lose out on job opportunities and OSS projects lose out on diversity of thought. 
OSS organizations such as the Apache Software Foundation (ASF) promote a variety of initiatives to increase diversity and inclusion (D&I) in their projects, but these initiatives are piecemeal and little is known about contributors’ perceptions of their usefulness and success. Here, we surveyed and interviewed ASF contributors to understand their perceptions of the state of D&I in the ASF and the effectiveness of existing D&I initiatives. Our findings show that individuals who are in the minority face challenges (e.g., stereotyping, and the lack of a peer network and of representation in decision making) and that contributors’ perceptions of the D&I initiative are a mixed bag, ranging from commending the current efforts to considering them to be “lip service”. These findings suggest that current D&I initiatives in OSS communities are a good start, but much needs to be done in terms of creating new successful initiatives and refining current ones. 
  2. University faculty divide their time among their main academic responsibilities, typically identified as teaching, research, service, and, at institutions with strong ties to their surrounding community, outreach. Most studies of time allocation have focused on faculty at Primarily White Institutions. The present study investigated how faculty at five Historically Black Universities (HBUs) allocate their time to their academic responsibilities. Data were analyzed based on faculty members’ tenure status, gender, and representation in science, technology, engineering, and mathematics. Faculty estimated the percentage of time they currently allocate (current), the time they would ideally allocate (ideal), and the time they estimate their institution expects them to allocate (expected) to each academic responsibility. Across all demographics, there were discrepancies between current and ideal time allocation to research and teaching and, in some demographics, outreach. The greatest discrepancy between current and expected time allocation was observed in time allocated to research, with women and untenured faculty also showing a discrepancy in time allocated to teaching, and underrepresented faculty showing no discrepancies between current and expected time allocation. Women, untenured, and underrepresented faculty reported that their time allocation patterns were guided by external factors rather than personal preferences. The surveyed faculty also stated that the patterns of effort distribution expected to obtain tenure were not necessarily guided by the faculty handbooks at their institution. Although this study is limited by its relatively small sample size, it provides insight into how faculty at HBUs divide their time and their reasons for doing so. 
  3. Miller, Eva (Ed.)
    Nascent Professional Identity Development in Freshman Architecture, Engineering, and Construction (AEC) Women
    Increasing the persistence of talented women into male-dominated architecture, engineering, and construction (AEC) professions could reduce prevailing workforce shortages and improve gender diversity in the AEC industry. Identity theorists advocate that professional identity development (PID) improves students’ persistence to become professionals. However, little empirical research exists to inform and guide AEC educators and professionals on AEC-PID in undergraduate AEC women. As the preliminary part of a larger nationwide and longitudinal research study investigating PID processes in undergraduate AEC women, the objective of this research is to examine the characteristics and nascent AEC-PID in 69 women enrolled in freshman AEC courses in five U.S. institutions. A purposive sampling approach ensures participants have a wide range of demographic characteristics. Data from a recruitment survey is analyzed using the NVivo qualitative data analysis software. Content and relational inductive open coding are conducted vertically for each participant and horizontally across different participants. Results indicate that passion/interest, inherent abilities, significant others, benefits from industry, and desire to contribute to industry influence decisions to pursue AEC careers. With 52% of participants having science, technology, engineering, art, and math (STEAM) subject preferences, an in vivo code, Perfect Middle Ground, demonstrated the quest to combine STEM and visual art preferences in AEC career decisions. A participant noted that ‘this major (civil engineering) is the perfect middle ground because I can be creative, but still use my strong gift which happens to be math’. Girls with STEAM strengths and passion, particularly in math and fine art, are most likely to develop nascent AEC-PID. 
Beyond STEM pre-college programs, AEC educators should consider recruiting from sports, as well as visual and performing arts events for pre-college students. Participants’ positive views focus on the importance and significant societal impact of the AEC industry, while negative views focus on the lack of gender and racial diversity. A combination of participants’ AEC professional experiences and views reveals four increasing levels of nascent AEC-PID, which are categorized as the 4Ps: Plain, Passive, Progressive, and Proactive. As a guide to AEC education and professional communities, recommendations are made to increase the AEC-PID of women in each category. With the highest nascent AEC-PID, women in the Proactive category should serve as leaders in AEC classrooms and student organizations. Considering their AEC professional experience and enthusiasm, they should serve as peer mentors to other students, particularly AEC women. Furthermore, they should be given the opportunity to step into more complex roles during internships and encouraged to pursue co-op opportunities. Insights can guide more targeted recruitment, mentoring, preparation, and retention interventions that strengthen the persistence of the next generation of AEC women professionals. In the long term, this could reduce AEC workforce shortages, improve gender diversity, and foster the innovation and development of more gender-friendly AEC products and services. 
  4. Women are underrepresented in Open Source Software (OSS) projects. As a result, not only do women lose career and skill development opportunities, but the projects themselves suffer from a lack of diversity of perspectives. Practitioners and researchers need to understand more about the phenomenon; however, studies about women in open source are spread across multiple fields, including information systems, software engineering, and social science. This paper systematically maps, aggregates, and synthesizes the state-of-the-art on women’s participation in OSS. It focuses on women contributors’ representation and demographics, how they contribute, their motivations and challenges, and strategies employed by communities to attract and retain women. We identified 51 articles (published between 2000 and 2021) that investigated women’s participation in OSS. We found evidence in these papers about who the women contributors are, what motivates them to contribute, what types of contributions they make, the challenges they face, and strategies proposed to support their participation. According to these studies, only about 5% of projects were reported to have women as core developers, and women authored less than 5% of pull requests, but had similar or even higher pull request acceptance rates than men. Women make both code and non-code contributions, and their motivations to contribute include learning new skills, altruism, reciprocity, and kinship. Challenges that women face in OSS are mainly social, including lack of peer parity and non-inclusive communication from a toxic culture. We found ten strategies reported in the literature, which we mapped to the reported challenges. Based on these results, we provide guidelines for future research and practice. 
  5. Cyberbullying is rapidly becoming one of the most serious online risks for adolescents. This has motivated work on machine learning methods to automate the process of cyberbullying detection, which have so far mostly viewed cyberbullying as one-off incidents that occur at a single point in time. Comparatively less is known about how cyberbullying behavior occurs and evolves over time. This oversight highlights a crucial open challenge for cyberbullying-related research, given that cyberbullying is typically defined as intentional acts of aggression via electronic communication that occur repeatedly and persistently. In this article, we center our discussion on the challenge of modeling temporal patterns of cyberbullying behavior. Specifically, we investigate how temporal information within a social media session, which has an inherently hierarchical structure (e.g., words form a comment and comments form a session), can be leveraged to facilitate cyberbullying detection. Recent findings from interdisciplinary research suggest that the temporal characteristics of bullying sessions differ from those of non-bullying sessions and that the temporal information from users’ comments can improve cyberbullying detection. The proposed framework consists of three distinctive features: (1) a hierarchical structure that reflects how a social media session is formed in a bottom-up manner; (2) attention mechanisms applied at the word and comment levels to differentiate the contributions of words and comments to the representation of a social media session; and (3) the incorporation of temporal features in modeling cyberbullying behavior at the comment level. Quantitative and qualitative evaluations are conducted on a real-world dataset collected from Instagram, the social networking site with the highest percentage of users reporting cyberbullying experiences. 
Results from empirical evaluations show the significance of the proposed methods, which are tailored to capture temporal patterns of cyberbullying detection. 