<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Open science, communal culture, and women’s participation in the movement to improve science</title></titleStmt>
			<publicationStmt>
				<publisher></publisher>
				<date>09/29/2020</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10211529</idno>
					<idno type="doi">10.1073/pnas.1921320117</idno>
					<title level='j'>Proceedings of the National Academy of Sciences</title>
<idno>0027-8424</idno>
<biblScope unit="volume">117</biblScope>
<biblScope unit="issue">39</biblScope>					

					<author>Mary C. Murphy</author><author>Amanda F. Mejia</author><author>Jorge Mejia</author><author>Xiaoran Yan</author><author>Sapna Cheryan</author><author>Nilanjana Dasgupta</author><author>Mesmin Destin</author><author>Stephanie A. Fryberg</author><author>Julie A. Garcia</author><author>Elizabeth L. Haines</author><author>Judith M. Harackiewicz</author><author>Alison Ledgerwood</author><author>Corinne A. Moss-Racusin</author><author>Lora E. Park</author><author>Sylvia P. Perry</author><author>Kate A. Ratliff</author><author>Aneeta Rattan</author><author>Diana T. Sanchez</author><author>Krishna Savani</author><author>Denise Sekaquaptewa</author><author>Jessi L. Smith</author><author>Valerie Jones Taylor</author><author>Dustin B. Thoman</author><author>Daryl A. Wout</author><author>Patricia L. Mabry</author><author>Susanne Ressl</author><author>Amanda B. Diekman</author><author>Franco Pestilli</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[Science is undergoing rapid change with the movement to improve science focused largely on reproducibility/replicability and open science practices. This moment of change, in which science turns inward to examine its methods and practices, provides an opportunity to address its historic lack of diversity and noninclusive culture. Through network modeling and semantic analysis, we provide an initial exploration of the structure, cultural frames, and women’s participation in the open science and reproducibility literatures (n = 2,926 articles and conference proceedings). Network analyses suggest that the open science and reproducibility literatures are emerging relatively independently of each other, sharing few common papers or authors. We next examine whether the literatures differentially incorporate collaborative, prosocial ideals that are known to engage members of underrepresented groups more than independent, winner-takes-all approaches. We find that open science has a more connected, collaborative structure than does reproducibility. Semantic analyses of paper abstracts reveal that these literatures have adopted different cultural frames: open science includes more explicitly communal and prosocial language than does reproducibility. Finally, consistent with literature suggesting the diversity benefits of communal and prosocial purposes, we find that women publish more frequently in high-status author positions (first or last) within open science (vs. reproducibility). Furthermore, this finding is further patterned by team size and time. Women are more represented in larger teams within reproducibility, and women’s participation is increasing in open science over time and decreasing in reproducibility. We conclude with actionable suggestions for cultivating a more prosocial and diverse culture of science.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>open science | reproducibility | replicability | women | culture</p><p>Significance. Science is rapidly changing with the current movement to improve science focused largely on reproducibility/replicability and open science practices. Through network modeling and semantic analysis, this article provides an initial exploration of the structure, cultural frames of collaboration and prosociality, and representation of women in the open science and reproducibility literatures. Network analyses reveal that the open science and reproducibility literatures are emerging relatively independently with few common papers or authors. Open science has a more collaborative structure and includes more explicit language reflecting communality and prosociality than does reproducibility. Finally, women publish more frequently in high-status author positions within open science compared with reproducibility. Implications for cultivating a diverse, collaborative culture of science are discussed.</p><p>At the current moment, science is undergoing a "revolution" to better itself <ref type="bibr">(1)</ref>. The aim of this revolution is bold. At its core, the movement to improve science encompasses two primary goals: 1) understanding the flaws, weaknesses, and reproducibility of past scientific processes and findings (e.g., evaluating the strength of the evidence) and 2) improving research practices through greater rigor and transparency (e.g., open sharing of data, code, resources; standardized statistical procedures; preregistration). As with any revolution, a time of unrest can also be a time of opportunity. Indeed, researchers involved in the efforts to improve science have acknowledged a gender diversity problem <ref type="bibr">(2,</ref><ref type="bibr">3)</ref>, and this time of reform offers the opportunity to reinvent scientific culture in a more inclusive mode. 
If the movement to improve science perpetuates the traditional scientific culture that prioritizes independent, dominant, or adversarial values, it risks continuing to leave many talented individuals at the margins, feeling unwelcome and excluded (4)-exacerbating a global problem that the sciences are trying to solve <ref type="bibr">(5)</ref><ref type="bibr">(6)</ref><ref type="bibr">(7)</ref><ref type="bibr">(8)</ref>. In its efforts to improve its methods and replicability, we wondered whether science might also be achieving improvements in the gender representation and inclusivity of the movement itself. This article applies cultural and network analysis to examine the emerging cultures in the movement to improve science-specifically in the reproducibility and open science literatures-and to investigate the representation of women in these emerging subcultures. We discuss implications of these different cultural avenues for science going forward.</p><p>In cultural analyses, the actions and cognitions of individuals both rise from and produce the norms and practices of groups and institutions <ref type="bibr">(9)</ref>. Further, the "who" and the "how" of cultural practices are inextricably intertwined: "how" a subculture operates influences "who" engages in the subculture, and "who" engages in the subculture influences "how" a subculture operates. The cultural practices of the current scientific reform movements influence who engages. The emerging reform movements have their roots in the broader culture of science, technology, engineering, and math (STEM) that can serve as a barrier to the inclusion and advancement of women <ref type="bibr">(10)</ref><ref type="bibr">(11)</ref><ref type="bibr">(12)</ref>. The culture of science has long valued individual brilliance, competition, and a winner-take-all model of success <ref type="bibr">(13)</ref>. 
In particular, people inside and outside of STEM perceive STEM fields as affording more opportunities for individual success and achievement than for prosociality and collaboration <ref type="bibr">(14)</ref>.</p><p>The scientific practice of rewarding individual achievement has perhaps unwittingly fostered a more independent, competitive culture that ignores and possibly even disincentivizes cooperation <ref type="bibr">(15,</ref><ref type="bibr">16)</ref>. These cultural practices have implications for who joins and advances within scientific fields. For example, the perceived lack of prosocial and collaborative culture in STEM has been shown to deter women especially <ref type="bibr">(14,</ref><ref type="bibr">17)</ref>. Indeed, the presence of collaborative practices and prosocial purposes may be particularly important in fields focused on scientific reform: critiquing established authors or practices-no matter how well intended or delicately stated-is often interpreted as criticism and puts the critiqued in a defensive position.</p><p>The role of critic may be particularly risky and unappealing to female scientists. First, women may feel less able to voice dissent (particularly when in the numerical minority) against established figures, because this conflict-prone stance violates gender role expectations <ref type="bibr">(18)</ref>. Women who are perceived as self-promoting or aggressive face more negative evaluations than their male counterparts <ref type="bibr">(19)</ref>; thus, engaging in critiques or debates can elicit more backlash toward women than men, and the mere anticipation of backlash can inhibit women's engagement in these spheres. Second, women may prefer a collective approach for pragmatic and principled reasons. 
Pragmatically, there is psychological safety in numbers <ref type="bibr">(20)</ref><ref type="bibr">(21)</ref><ref type="bibr">(22)</ref>, and women's critiques may be more likely to be offered and listened to when they are part of a larger scientific team. Further, because combative and adversarial behaviors are perceived as masculine, women may be less socialized to engage in these behaviors than men and/or view them as off-putting and less likely to be productive <ref type="bibr">(23)</ref>. In principle, a collectivist orientation may disfavor challenges to the establishment when framed as for the benefit of the challenger (i.e., gaining recognition) rather than for the collective good (i.e., improving and advancing science).</p><p>However, we draw attention to another causal pathway as well: subcultures that include a larger proportion of women (or other underrepresented group members) could engage in different practices than more homogenous subcultures. For example, legislative bodies that include greater proportions of women legislators engage more with policies related to education and health care <ref type="bibr">(24)</ref><ref type="bibr">(25)</ref><ref type="bibr">(26)</ref>. Culture is a cyclical process, and thus greater inclusion and advancement of women foster norms and behaviors that in turn can contribute to increasing gender diversity <ref type="bibr">(6,</ref><ref type="bibr">27,</ref><ref type="bibr">28)</ref>.</p><p>The movement to improve science, to date, can be characterized by two contrasting motifs-both aimed to improve science. One focus centers on the assessment of the reproducibility and replicability of previously published scientific results. We note that the National Academies of Sciences, Engineering, and Medicine has only recently formalized a distinction between reproducibility and replicability <ref type="bibr">(29)</ref>. 
Before this formalization, the two terms had historically been used with different conventions in different fields, with a prevalence of the term reproducibility <ref type="bibr">(29)</ref><ref type="bibr">(30)</ref><ref type="bibr">(31)</ref><ref type="bibr">(32)</ref><ref type="bibr">(33)</ref><ref type="bibr">(34)</ref><ref type="bibr">(35)</ref><ref type="bibr">(36)</ref>. For this reason, our analysis (that uses historical data across fields) does not separate the two; instead, throughout the report, we use the term "reproducibility" to refer to the literature that we analyze.* A second approach aimed to improve science consists of "open science" practices that facilitate the sharing and reuse of research assets (e.g., data, code) in order to improve rigor and accelerate the rate of scientific discovery <ref type="bibr">(37)</ref><ref type="bibr">(38)</ref><ref type="bibr">(39)</ref>. For shorthand, we refer to these two literatures as "reproducibility" and "open science." Indeed, both literatures aim to improve science, are led by scientists, and engage in deep analysis and critique of current scientific practices while offering guidance and suggestions for how to improve scientific practices. Here, we explored whether the reproducibility and open science literatures exhibit different 1) collaborative structures, 2) explicitly prosocial foci, and 3) engagement of female scientists. We anticipated that this initial investigation would reveal evidence of different emerging cultures in the reproducibility vs. open science literatures-with implications for the future representation and practices of these movements.</p><p>Our team conducted network analyses of the open science and reproducibility literatures and found that these literatures have few common papers and authors-suggesting these improvement approaches have developed relatively independently from each other. Given this, we compared these literatures for hallmarks of collaborative and prosocial culture. 
We find a more interconnected authorship network within open science compared with reproducibility, and semantic text analyses of article abstracts reveal that the open science and reproducibility literatures appear to be adopting different explicit cultural frames. Open science includes significantly more language that reflects the cultural values of prosociality compared with reproducibility. We then examine the configuration of women's participation in these literatures. We find patterns of women's participation consistent with the theoretical idea that women's participation is less constrained in more collaborative and prosocial cultures (i.e., in open science than in reproducibility). Women scholars are more likely to occupy high-status author positions (taking the first or last author position) within open science compared with reproducibility (see Fig. <ref type="figure">3</ref>); further, women's high-status authorship occurs less frequently in smaller teams within reproducibility (compared with open science). In larger teams, which might offer greater collective safety or communal purpose, there is little difference in women's representation in leadership roles between the two literatures. Finally, we find that women's participation in high-status authorship positions is increasing over time in open science, whereas it is decreasing in reproducibility.</p><p>*Today, it is acknowledged that reproducibility can have different meanings in different fields of science <ref type="bibr">(29)</ref><ref type="bibr">(30)</ref><ref type="bibr">(31)</ref>. We explored how different approaches to reproducibility (e.g., repeatability, data sharing) were categorized by our process. We found that all papers with the MAG field of study tag "repeatability" were categorized by our method as "reproducibility" papers, in line with the National Academy of Sciences (NAS) conceptualization of reproducibility <ref type="bibr">(29)</ref>. Furthermore, almost all papers with the MAG field of study tags "open data" or "data sharing" were categorized by our method as "open science" papers, as intended (SI Appendix, Table <ref type="table">S1</ref>). We should also note that the dataset for this report was compiled in 2018 (SI Appendix), 1 y before the distinction between reproducibility and replicability was formalized by the NAS report <ref type="bibr">(29)</ref>.</p><p>Taken together, we find that despite current controversies <ref type="bibr">(2)</ref>, the open science focus of the movement to improve science has the seed of an interconnected and prosocial culture that, if further cultivated, may continue to attract greater participation by women. We believe that the collaborative, forward-looking focus of open science has the potential to facilitate greater diversity and inclusiveness. While our focus on author gender in this article was motivated, in part, by the ability to apply validated, automated coding methods (that are highly reproducible) to determine author gender, we would nevertheless predict similar findings for scholars from other underrepresented groups. When fields are more adversarial and less prosocial, individuals from underrepresented groups (including women) may be less motivated to engage <ref type="bibr">(40)</ref> due at least in part to the power dynamics described above. In contrast, fields that emphasize collaborative and prosocial norms inspire greater participation among underrepresented groups <ref type="bibr">(41)</ref>. It should be noted that both adversarial and collaborative cultures can engage in rigorous debate and criticism. However, collaborative cultures may afford more constructive criticism, which is a hallmark of good, forward-thinking science and what all scientists expect of peers in the field. 
If we wish to improve and advance the field of science, then the onus is on investigators to nurture a culture that attracts and retains a diversity of people <ref type="bibr">(42)</ref><ref type="bibr">(43)</ref><ref type="bibr">(44)</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Results</head><p>We performed both network science and semantic text analyses to establish the structural landscape and cultural foci of the open science and reproducibility literatures and women's participation in them. To do so, our team analyzed data from Microsoft Academic Graph (MAG) <ref type="bibr">(45)</ref>, consisting of 2,926 scientific articles and conference proceedings (hereafter referred to as "papers") published between 2010 and 2017 that included "open science" or "reproducibility" as a field of study code (Methods and SI Appendix). This sample consisted of 879 open science papers and 2,047 reproducibility papers. Only 2.3% of papers shared "open science" and "reproducibility" field codes, suggesting these approaches are developing relatively independently (see SI Appendix for more details).</p><p>Open Science and Reproducibility Differ in Their Network Community Structures. We analyzed a total of 3,157 unique article author identification numbers (IDs) in the open science literature and 8,766 in the reproducibility literature. We built two collaboration networks using these author IDs from MAG (Fig. <ref type="figure">1</ref>). Nodes in these networks represent scientific articles; edges represent shared authorship such that two nodes share an edge if at least one author appears in both papers (see Methods for details). Results revealed that the open science network contained 879 nodes and 389 edges, while the reproducibility network contained 2,047 nodes and 856 edges. 
Importantly, the open science network is more edge-dense (0.101%) than the reproducibility network (0.041%), demonstrating a higher degree of interconnectedness, which suggests a denser collaborative network within the open science literature (one-sided Fisher's exact test: P &lt; 0.01).</p><p>We also performed a connected components analysis of each literature <ref type="bibr">(47,</ref><ref type="bibr">48)</ref> to measure the degree of isolation of individual subnetworks of papers within each literature (Methods). Results show that the reproducibility network (1,641; 0.80 components per article) contains more isolated articles (sharing fewer authors) than the open science network (661; 0.75 components per article). This components analysis indicates that the reproducibility literature's network is more fragmented. Examining the component size differences of the two networks as another indicator of connectedness, we find that the average component size (ACS) is also higher for the open science network (ACS: 1.33 vs. 1.25). Fig. <ref type="figure">1</ref> visualizes the two networks to facilitate interpretation of the observed network connectedness and fragmentation differences between the two literatures. In sum, the open science literature was found to have a greater number of connections (shared authors) between papers, whereas the reproducibility literature contains more isolated and smaller paper networks, and these differences between the two literatures are statistically significant (P &lt; 0.01, as reported above). As a robustness check, we conducted the same analyses excluding all solo-authored papers. Results revealed that these findings are robust to this alternative analysis (see SI Appendix for details).</p></div>
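<div xmlns="http://www.tei-c.org/ns/1.0"><p>The network quantities reported above (edge density, number of connected components, and average component size) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names and the toy author sets are hypothetical, and only the node and edge counts passed in the usage note come from the text. Papers are nodes, and two papers share an edge when at least one author ID appears on both.</p>
<eg><![CDATA[
```python
from collections import defaultdict
from math import comb


def build_paper_edges(paper_authors):
    """Return undirected paper-paper edges; papers are linked by a shared author.

    paper_authors is a list of sets of author IDs (hypothetical input format);
    the list index is used as the paper ID.
    """
    papers_by_author = defaultdict(list)
    for paper_id, authors in enumerate(paper_authors):
        for author in authors:
            papers_by_author[author].append(paper_id)
    edges = set()
    for papers in papers_by_author.values():
        for i in range(len(papers)):
            for j in range(i + 1, len(papers)):
                edges.add((papers[i], papers[j]))
    return edges


def edge_density(n_nodes, n_edges):
    """Fraction of all possible node pairs that are connected by an edge."""
    return n_edges / comb(n_nodes, 2)


def count_components(n_nodes, edges):
    """Count connected components with a simple union-find structure."""
    parent = list(range(n_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
    return len({find(node) for node in range(n_nodes)})


def average_component_size(n_nodes, n_components):
    """Mean number of papers per connected component."""
    return n_nodes / n_components
```
]]></eg>
<p>With the counts reported in the text, edge_density(879, 389) is about 0.101% and edge_density(2047, 856) is about 0.041%; average_component_size(879, 661) is about 1.33 and average_component_size(2047, 1641) is about 1.25. The significance test itself (Fisher's exact test) is not reproduced in this sketch.</p></div>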
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Semantic Text Analyses Suggest That the Explicit Cultures of the Open Science and Reproducibility Literatures Are Different</head><p>Using a validated text-mining dictionary (49), we measured the presence of communal and prosocial constructs (e.g., contribute, encourage, help, nurture; see SI Appendix, Table <ref type="table">S2</ref> for the list of constructs used) in the abstracts of the papers from both literatures. We excluded papers with no available abstract and those with non-English titles. The resulting dataset included 595 open science papers and 1,169 reproducibility papers. In the open science dataset, 76% of the articles used words associated with communal and prosocial constructs, whereas in the reproducibility dataset, only 44% of the articles did (two-sided test for equality of binomial proportions, P &lt; 0.001). We computed the "prosocial word density" (PWD) within each dataset as the percentage of words in each abstract that reflect communal and prosocial constructs (Fig. <ref type="figure">2</ref> and Methods). The open science abstracts included more communal and prosocial words than the reproducibility abstracts (open science: mean PWD of 2.4%, median PWD of 1.8%; reproducibility: mean PWD of 0.9%, median PWD of 0.0%). A two-sided permutation test for differences in the mean and median PWD in each dataset shows that the open science literature includes significantly more frequent use of communal and prosocial words than does the reproducibility literature (P &lt; 0.001 for mean and median PWD). Thus, we find that abstracts in the open science literature include significantly more words associated with communality and prosociality than those in the reproducibility literature.</p><p>An alternative hypothesis is that these textual differences are simply driven by disciplinary field. 
To examine this possibility, we stratified the model by academic field of study (i.e., computer science, engineering, medicine) and found similar effects (see SI Appendix, Fig. <ref type="figure">S5</ref> for details). Thus, the finding that open science incorporates more explicitly prosocial language compared with reproducibility is robust to disciplinary field.</p><p>We first analyzed single-author papers with identifiable author gender (we used an algorithm that employs census data to classify author names into the gender binary [SI Appendix], while acknowledging that gender is a complex and multidimensional social construct). As in scientific publishing more broadly <ref type="bibr">(50)</ref><ref type="bibr">(51)</ref><ref type="bibr">(52)</ref>, results revealed that, overall, women are significantly less likely than men to publish single-author papers in both literatures. An exact one-sided binomial test indicated that the percentage of female single authors is 33.0% in the open science literature and 28.1% in reproducibility; both are lower than 50%, the proportion that would indicate gender parity (P &lt; 0.001 for both tests). This suggests that women are equally engaged with each topic area in single-author roles, although underrepresented in both literatures compared with their single-author male colleagues.</p><p>For the remaining analyses, we focus on multiauthor papers. Women hold high-status authorship positions in 60.6% of the multiple-author papers in the open science literature, compared with 57.9% in the reproducibility literature. 
Note that with gender parity, the expected percentage of multiple-author papers with a woman in a high-status (first or last) author position would be 75% (composed of a 25% chance of a woman in both the first and last positions, a 25% chance of woman-first and man-last, and a 25% chance of man-first and woman-last).</p><p>We performed a regression analysis to better understand gender differences in high-status authorship positions across the two literatures. Specifically, we fit a logistic spline regression model controlling for time trends, team size, and manuscript type (i.e., journal article or conference proceeding). For this analysis, we used a subset of multiauthored papers for which we were able to conclude whether or not a woman holds a high-status position (i.e., where with some degree of confidence, the gender of the first and last author could be determined, or the gender of the first or last author could be identified as female even if the other could not be identified). We also excluded 28 open science papers and 40 reproducibility papers with more than 12 authors to avoid giving these papers disproportionate influence on regression fit. The resulting dataset consisted of 454 open science papers and 955 reproducibility papers. After controlling for team size, year of publication, and manuscript type, we found that multiauthor papers in the reproducibility literature have 61% lower odds of having a woman in a high-status authorship position compared with the open science literature (P &lt; 0.001; SI Appendix, Table <ref type="table">S3</ref>). Thus, whereas women are underrepresented in high-status author positions on multiauthored papers in both literatures (relative to gender parity), there is significantly greater representation of women authors in high-status author positions in the open science (vs. 
reproducibility) literature.</p><p>However, again, an alternative hypothesis is that these gender differences in high-status author positions are simply driven by disciplinary field. To examine this possibility, we fit the model controlling for the academic field of study and found similar effects (see SI Appendix for details). Thus, the gender representation difference in high-status authorship positions in open science (vs. reproducibility) is robust to disciplinary field.</p><p>Women's high-status authorship is more constrained by team size in reproducibility than in open science. Women's high-status authorship is differently patterned by team size in these literatures. Within multiauthored papers, women's likelihood of authoring in high-status positions in the open science literature is greatest in smaller teams (two- to three-author papers; Fig. <ref type="figure">4</ref>) and remains relatively consistent as teams become larger (Fig. <ref type="figure">5</ref>, Left). However, within the reproducibility literature, women are less likely to author in high-status positions in smaller teams (two- to three-author papers) and more likely to do so in larger teams (six- to seven-author papers). Regression analyses confirm this difference after controlling for other important variables, including publication year and manuscript type (Fig. <ref type="figure">5</ref>, Left).</p><p>We also considered the alternative hypothesis that field differences could be driving the observed relationship between women's participation and team size. To examine this, we conducted the same regression analyses stratified by field and found that the results were largely robust across fields. That is, women are underrepresented in high-status author positions on smaller teams in the reproducibility literature (compared with the open science literature; see SI Appendix for a detailed description of these analyses and findings). 
Taken together, we find that women's participation in high-status author positions is more constrained in reproducibility than in open science and occurs more frequently in larger teams within the reproducibility literature.</p><p>Women's representation in high-status author positions is increasing in open science over time and decreasing in reproducibility. Further regression analyses reveal that in the open science literature, the representation of women in high-status authorship positions has grown over time, while it has declined or failed to increase in the reproducibility literature. We find that the odds of a woman holding a high-status position in the open science literature have grown at a rate of &#8764;15.6% (P &lt; 0.01) year-over-year from 2010 to 2017 (SI Appendix, Table <ref type="table">S3</ref>), controlling for team size and manuscript type. In the reproducibility literature, over the same time period the representation of women in high-status positions has declined at an estimated rate of &#8764;3.6%, although this decline is not statistically significant (P = 0.20). Examining the difference between these slopes reveals a statistically significant difference in women's representation over time between these literatures (P &lt; 0.01). Fig. <ref type="figure">5</ref>, Right illustrates the difference in trends over time between the two literatures on the probability scale.</p><p>Finally, we again explored the alternative field hypothesis: that women's participation over time was driven by field differences. Specifically, we conducted the same regression analyses stratified by field and found that the results were largely robust across fields. That is, we found growing participation of women in open science over time and decreasing participation of women in reproducibility in every field except psychology, where women's participation has grown over time in reproducibility (53) (see SI Appendix for detailed field analyses and findings).</p></div>
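<div xmlns="http://www.tei-c.org/ns/1.0"><p>Two pieces of arithmetic behind the authorship results above can be checked with a short sketch: the 75% gender-parity baseline for a woman in the first or last author position, and how a constant year-over-year change in odds compounds. This is illustrative only and is not the fitted logistic spline model; the function names are hypothetical, and the parity baseline assumes independent 50/50 gender assignment for the first and last author positions.</p>
<eg><![CDATA[
```python
from itertools import product


def parity_baseline():
    """Share of (first, last) author gender pairs that include a woman.

    Assumes each position is independently a woman ("W") or a man ("M")
    with equal probability, so the four pairs are equally likely.
    """
    pairs = list(product("WM", repeat=2))
    with_woman = [pair for pair in pairs if "W" in pair]
    return len(with_woman) / len(pairs)


def prob_to_odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)


def odds_to_prob(odds):
    """Convert odds back to a probability."""
    return odds / (1 + odds)


def compound_odds(start_odds, yearly_change, n_years):
    """Odds after n_years of a constant proportional change per year.

    yearly_change is a fraction: 0.156 for +15.6%, -0.036 for -3.6%.
    """
    return start_odds * (1 + yearly_change) ** n_years
```
]]></eg>
<p>Here parity_baseline() gives 0.75, matching the three 25% cases listed in the text. Under the simplifying assumption of a constant rate, compound_odds(1.0, 0.156, 7) shows that a 15.6% yearly increase over the seven yearly steps from 2010 to 2017 multiplies the odds by roughly 2.8; converting odds to probabilities requires a baseline, which the text reports only through Fig. 5.</p></div>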
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Discussion</head><p>Our results reveal that the movement to improve science consists of two relatively independent groups of investigators with differing approaches: 1) open science and 2) reproducibility. These literatures have relatively few common papers and authors, indicating they are distinct, nonoverlapping communities.</p><p>Given these findings, we argue that there are strong reasons for science generally, including both subcultures of science reform, to adopt inclusive and prosocial cultures. First, a culture that portrays science as noncommunal does not reflect how scientific work actually unfolds, particularly with today's emphasis on grand challenges, transdisciplinary investigations, and network science. Indeed, the (false) prototype of a scientist is one in which an individual scientist (usually a white male) toils away alone in his laboratory until a flash of insight occurs in a "eureka!" moment <ref type="bibr">(54)</ref><ref type="bibr">(55)</ref><ref type="bibr">(56)</ref>. This culture is epitomized by some of our most prestigious awards that celebrate individual efforts and contributions over those of teams (e.g., Nobel prize, MacArthur Fellowship Award, NIH Director's Pioneer Award, NSF Career Award; NIH "independent investigator" categorization). Moreover, faculty evaluations for tenure and promotion continue to prize individual performance almost exclusively, in some cases requiring scientists to show their independent contribution to collaborative projects and/or calculating the number of first- or last-authored (vs. coauthored) publications <ref type="bibr">(57)</ref>. Today's science relies on teams coordinating their efforts to share insights and methods, build on past work, and develop new questions and approaches <ref type="bibr">(58)</ref>. 
These collaborative and complementary processes occur locally (e.g., direct work with other laboratories) as well as globally (e.g., broadening the scientific community, sharing equipment, data, and access) <ref type="bibr">(59)</ref>. Science today is more likely to be a collaborative than an individual endeavor, one in which team size can matter. Indeed, larger and more diverse teams may be necessary to realize higher impact <ref type="bibr">(60)</ref>. A problem, however, is that while science is increasingly team-based, homophily processes mean that many teams are likely to be relatively homogeneous with regard to sociodemographic, behavioral, and intrapersonal characteristics <ref type="bibr">(61)</ref>. Attention should be paid, proactively, to the composition of teams.</p><p>Second, and consistent with the point above, there is an increasing appreciation among scientists and funding agencies that multidisciplinary "team science" is required to tackle the most pressing scientific, social, and health problems of our times. Over the last decade, organizations including NIH, NSF, and others have dedicated resources to facilitating team science. This work is evidenced by interdisciplinary and multidisciplinary team requirements in federal funding announcements and programs (e.g., National Institute of General Medical Sciences Collaborative Program Grant for Multidisciplinary Teams, NSF Office of Multidisciplinary Activities, NIH Interdisciplinary Program in the Common Fund and its predecessors in the NIH Roadmap, the National Cancer Institute's Science of Team Science Toolkit, NSF Big Data Regional Innovation Hubs Program, NSF Collaborative Computational Neuroscience Program, and many other programs under the NSF and NIH roadmaps and priorities). 
Moreover, funders are actively attempting to address the underrepresentation of women and minorities (e.g., NSF Broadening Participation), although there are still inequities in these processes <ref type="bibr">(62)</ref>.</p><p>Indeed, the complexity of the problems we are now facing in science demands the expertise of multiple disciplines working in coordinated fashion <ref type="bibr">(63,</ref><ref type="bibr">64)</ref>. For example, addressing the problem of opioid addiction requires the integrated knowledge of researchers who specialize in pain, addiction, neuroscience, economics, computer science, psychology, sociology, biochemistry, demography, medicine, and public health, to name just a few. Intellectually diverse, multidisciplinary teams create new insights by combining existing knowledge in innovative ways <ref type="bibr">(65,</ref><ref type="bibr">66)</ref>. In fact, data from the US Patent and Trademark Office show that patents generated by teams represented more breakthroughs, landing among the top 95% of all cited patents, than those from lone inventors, suggesting their generative nature <ref type="bibr">(67)</ref>. Similarly, multiauthored articles are more often cited than single-authored articles <ref type="bibr">(60,</ref><ref type="bibr">68,</ref><ref type="bibr">69)</ref>, and while some have argued that this could be due to self-citation, others have suggested that it is more likely that highly collaborative projects include more diverse data and higher quality ideas, which result in greater impact <ref type="bibr">(70)</ref>. Importantly, it has also been suggested that whereas large teams advance science and technology, small teams can disrupt the established scientific understanding. Both types of contributions seem to be of fundamental importance <ref type="bibr">(71,</ref><ref type="bibr">72)</ref>. 
In any case, if diverse team science is the future, institutions must reconsider incentive structures built around individuals, because rapid progress is unlikely if scientists remain tied to individual incentives.</p><p>Finally, a third reason to prefer a prosocial scientific culture, consistent with our findings and those of other research, is that noncommunal practices and values may deter people who value communal, interdependent, and prosocial goals, including women <ref type="bibr">(14)</ref>, underrepresented minorities <ref type="bibr">(41,</ref><ref type="bibr">73)</ref>, first-generation college students <ref type="bibr">(73)</ref>, and communally oriented men <ref type="bibr">(14)</ref>. If the movement to improve science is to harness this diversity, the open science focus currently appears to be more welcoming and inclusive than reproducibility. However, both foci have the common goal of improving our knowledge, rigor, and understanding. These contributions are likely enhanced when a diverse range of scientists are fully participating in either approach's efforts.</p><p>Lack of Diversity Can Be Problematic for Science. Lack of social diversity (e.g., gender and racial diversity) within scientific teams can be detrimental to science. There are many case studies where homogeneous teams have produced serious failures of knowledge with regard to critical outcomes. For example, with no women on engineering and development teams, heart valves and seat belts are made that only fit men's bodies (significantly increasing mortality rates for women) (74), voice-recognition software only recognizes the voices of men <ref type="bibr">(74)</ref>, and image-recognition software tags Black people as apes <ref type="bibr">(75)</ref>. Including and heeding the voices and experiences of a range of people can foster outcomes that benefit a wider range of people. 
While teams with more gender and cultural diversity are more likely to develop new products and introduce radical innovations to market <ref type="bibr">(76,</ref><ref type="bibr">77)</ref>, and while papers authored by diverse scientific teams have more citations and higher impact factors <ref type="bibr">(78)</ref>, the mere presence of social diversity is not always sufficient to foster equal participation of diverse social groups. For example, a large-scale analysis of contemporary scientific articles found that women were significantly more likely to be associated with technical tasks, whereas men were associated with conceptual tasks <ref type="bibr">(79)</ref>. Similarly, in gender-diverse engineering teams of students, women were underrepresented in presenting technical content, while men were overrepresented <ref type="bibr">(80)</ref>. Indeed, the potential of social diversity often goes untapped, leading to null or negative results on group performance <ref type="bibr">(81)</ref><ref type="bibr">(82)</ref><ref type="bibr">(83)</ref><ref type="bibr">(84)</ref>.</p><p>To capitalize on the potential of social diversity, teams need to directly address the challenges that can accompany it. For example, interactions and communication within diverse teams may be more difficult, especially at first <ref type="bibr">(85)</ref><ref type="bibr">(86)</ref><ref type="bibr">(87)</ref>. However, social diversity has great potential, particularly in complex tasks. Socially diverse teams encode and process information more accurately <ref type="bibr">(88)</ref>, especially when the sharing of disparate facts is a requirement for success <ref type="bibr">(89)</ref>. The mere presence of people from socially diverse backgrounds alters the cognition and behavior of majority group members to foster improved and accurate thinking and communication <ref type="bibr">(90)</ref>. 
In the presence of social diversity, majority group members raise more facts and make fewer factual errors, and when errors are made, they are more likely to be corrected <ref type="bibr">(90)</ref>. When questions and dissent are raised in socially diverse teams, they provoke more thought and consideration than when the exact same concerns are raised in homogeneous teams <ref type="bibr">(91)</ref>. Finally, the presence of underrepresented group members can foster greater participation from other underrepresented group members. One example is that gender-diverse teams with more women foster women's active participation in team projects, whereas teams that are composed mostly of men often render women silent <ref type="bibr">(86)</ref>.</p><p>The Emerging Movements to Improve Science. The psychological and brain sciences (PBS) are at the forefront of efforts to redefine the rules and standards of science <ref type="bibr">(92,</ref><ref type="bibr">93)</ref>. There is much to learn from this emerging movement, and several other fields <ref type="bibr">(94)</ref><ref type="bibr">(95)</ref><ref type="bibr">(96)</ref><ref type="bibr">(97)</ref><ref type="bibr">(98)</ref> are similarly taking stock, including biostatistics <ref type="bibr">(99,</ref><ref type="bibr">100)</ref>, computer science <ref type="bibr">(101)</ref>, and medicine <ref type="bibr">(102,</ref><ref type="bibr">103)</ref>. For example, the team science approach to improving science can be observed in theoretical and experimental physics where investigative necessity has promoted large-scale consortia and successful models of scientific collaboration <ref type="bibr">(104)</ref>. Similarly, the collaborative discipline of structural biology established standards for sharing and deposition of code and data (see Collaborative Computational Project No. 
4 and Research Collaboratory for Structural Bioinformatics), and these communal practices coincided with a broader participation of women in the field over its ten decades <ref type="bibr">(105)</ref>.</p><p>In sum, open science has the seed of a communal and sharing culture that, if cultivated, may continue to foster the inclusion and participation of women. We suggest that pivoting toward this cultural style could help to diversify the reproducibility movement without detracting from its core goals. We believe that the collaborative, forward-looking aspect of open science has the potential to facilitate diversity and inclusiveness in two ways. First, the sharing of code, data, and resources lowers the barriers and entry cost to participate in science, thus establishing a more equal playing field and enhancing the inclusion of underrepresented groups, for example, scientists working in minority-serving institutions with less access to funding and other resources <ref type="bibr">(106)</ref>. Second, a culture of sharing, interdependence, and collaboration is consistent with research (cited above) that suggests these cultural features are more attractive to women, people of color, people from lower socioeconomic backgrounds, and communally oriented men.</p><p>Some aspects of the movements to improve science have explicitly focused on cultural values and practices to promote inclusivity. For example, the Society for the Improvement of Psychological Science explicitly includes working toward an inclusive culture in its mission statement, and the online methods and practices discussion group PsychMAP was founded to provide a more collaborative and communal space for discussion (see community ground rules). To be sure, reflecting and learning from within a cultural shift is difficult. The analysis we offer here suggests that we can still do more to improve science through social diversity. 
We propose that the benefits of team science will be realized when such teams are both socially and intellectually diverse and operate in contexts that welcome and pursue diversity, so that innovation, creativity, and the quality of science can flourish, despite an initial period of adjustment and discomfort. Science needs the participation of women and other underrepresented groups. The goals and ideals of open science have the potential to promote diversity and broader scientific participation. However, the promise of these emerging cultural trends is not yet a certainty; indeed, some features of the dominant scientific culture can deter participation among the very individuals who may contribute to the strength of diverse thinking. By fostering cultural change toward prosocial values, sharing, education, and cross-disciplinary cooperation, rather than independence and competitiveness, the movement to improve science may lead to greater knowledge generation, democratization, and inclusiveness in science.</p><p>Specific steps can be, and are being, taken to facilitate and advance the diversity we are promoting. Departments, institutions, and professional societies can create communal and prosocial structures for open science, such as open infrastructure and initiatives to allow for establishing educational networks, training, resources, and data sharing. Other specific examples include the development of Transparency and Openness Promotion Guidelines <ref type="bibr">(39)</ref> and the establishment of cloud-based platforms and associated user communities for research asset sharing. Examples in PBS include data in OpenNeuro.org <ref type="bibr">(107)</ref>, analyses in brainlife.io <ref type="bibr">(108)</ref><ref type="bibr">(109)</ref><ref type="bibr">(110)</ref>, and study registrations in Open Science Framework <ref type="bibr">(39)</ref>. 
Individual researchers can learn about the who, when, how, and why of their teams, including attending to the range of people represented, identifying opportunities to include diverse voices, and analyzing reasons and barriers for groups' or individuals' participation. Organizations that highlight the collaborative and communal aspects of scientific processes and success can feature connections in science, acknowledging how others help overcome stumbling blocks and rewarding teams that embody the values of open science. Each researcher can work toward broadening their collaboration and mentoring networks. We encourage readers and all members of the scientific community to embrace a learning mindset regarding team science and socially diverse teams. Science continually has more to teach, and the rewards of a cultural shift are not free; they come from investments of time, energy, understanding, and action.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Methods</head><p>Data Sources. A total of 11,338 original papers were collected using the snapshot of MAG (<ref type="url">https://academic.microsoft.com</ref>) on February 23, 2018. To collect the datasets, we searched MAG for all publications with the specific "field of study tags" "open science" or "reproducibility." The field of study tags are produced by an internal Microsoft algorithm based on the contents and metadata (e.g., abstracts) of each paper (not author-generated; see ref. <ref type="bibr">111</ref> for details). Among all of the records, only 68 papers were categorized as both "open science" and "reproducibility". Moreover, of the 36,296 unique author IDs represented in these literatures, very few (n = 457) have published in both literatures. These findings suggest that the two literatures are developing rather independently. For the purposes of our analyses, we removed papers that were categorized as both "open science" and "reproducibility" to avoid double-counting papers and skewing analyses. The resulting dataset included 3,431 open science papers and 7,839 reproducibility papers.</p><p>Among the remaining records, we only considered formal published papers of the type "journal" or "conference" (document types "book," "book chapter," and "patent" were removed). We also removed 43 papers with duplicate titles. We examined the remaining number of papers published each year within each literature (SI Appendix, Fig. <ref type="figure">S1</ref>). As very few open science papers were published prior to 2010, and few papers in either field were published in 2018, we only use data for papers published between 2010 and 2017, which includes 2,926 papers in total, with 879 open science papers and 2,047 reproducibility papers. 
This is the final dataset used for all analyses, except where otherwise noted.</p><p>Data compiled for the analyses can be found at Open Science Framework (<ref type="url">https://osf.io/97vcx</ref>) <ref type="bibr">(112)</ref>, and the code used for this work is available at GitHub (<ref type="url">https://github.com/everyxs/openScience</ref>).</p><p>Based on the sample between 2010 and 2017, we constructed the paper coauthorship networks for 879 open science papers and 2,047 reproducibility papers. Each node represents a scientific article. Two nodes share an edge if at least one author appears in both papers. Based on MAG author IDs, we identified 3,157 unique author names in the open science literature and 8,766 in the reproducibility literature. The open science network contains 389 edges (i.e., pairs of papers with at least one author in common), and the reproducibility network contains 856 edges.</p><p>Network Analysis. For both networks, we conducted an edge density and connected components analysis as follows. Edge density. For an undirected network with n nodes and m edges, the edge density is defined as:</p><formula>\rho = \frac{2m}{n(n-1)}</formula><p>To test whether the open science network has higher edge density than the reproducibility network, we conducted a one-sided Fisher's exact test. We assumed a binomial edge generation process between all pairs of nodes and tested the hypothesis that the odds ratio of the two networks is greater than one. We estimated the odds ratio using the edge density of both networks,</p><formula>\mathrm{OR} = \frac{\rho_1 / (1 - \rho_1)}{\rho_2 / (1 - \rho_2)},</formula><p>where &#961;_1 represents the edge density of the open science network and &#961;_2 the edge density of the reproducibility network. The odds ratio test was used to handle the small values of the network density (0.057 and 0.047%), as opposed to a test utilizing a linear scale. The test rejects the null hypothesis that the open science network does not have higher edge density than the reproducibility network with a P value of 7.35e-5. Connected components. 
We performed an additional analysis to estimate how connected (or isolated) the subcomponents of each network are. For an undirected network, a connected component is defined as a maximal subgraph in which any two nodes are connected to each other by a sequence of edges. In our case, both networks are sparse with many separate connected components. We compared the two networks in terms of the size of the largest connected component, as well as the average component size (ACS), which is defined as the network size divided by the number of connected components. The connected components analysis was conducted using the software Gephi <ref type="bibr">(46)</ref>. As a robustness check, we conducted the same edge density and connected components analysis among the multiauthored papers only (excluding single-authored papers). These analyses and visualizations can be found in SI Appendix, Fig. <ref type="figure">S2</ref>.</p><p>Semantic Text Analysis of Abstracts. Starting with the 2,926 papers from both open science and reproducibility described above, we first removed papers without available abstracts (205 open science and 815 reproducibility papers) and then removed those with non-English titles (79 open science and 63 reproducibility papers), as determined using the R textcat package <ref type="bibr">(113)</ref>. The resulting dataset used in the text analysis consisted of 1,764 papers, including 595 open science papers and 1,169 reproducibility papers. We then performed standard text preprocessing using the SentimentAnalysis R package: we removed stop words and punctuation, applied stemming, and converted the text to lowercase. We measured prosocial constructs in the text by counting the frequency of occurrence of 127 words in a validated dictionary (113) (e.g., contribute, encourage, help, nurture; SI Appendix, Table <ref type="table">S2</ref>). This dictionary has been shown to have acceptable agreement with human judges (r = 0.67) <ref type="bibr">(114)</ref>. 
The prosocial word density is calculated as the ratio of the number of prosocial words to the total number of words in each abstract. Semantic text analysis stratified by field is described in SI Appendix, Fig. <ref type="figure">S5</ref>.</p><p>Gender Participation Analyses. We performed a traditional gender (male, female) analysis by identifying the gender of the first and last authors given their name. To do so, we used the gender R package (<ref type="url">https://github.com/ropensci/gender</ref>) <ref type="bibr">(115)</ref> to estimate the probability that the first and last authors were female. The gender package uses historical data on gender to predict the gender of a person based on their given name(s) and birth year or year range. For each paper, we assumed birth year to be such that the author would be between the ages of 25 and 65 at the time of publication. To identify the first name of each author, we first identified the components of each author name by assuming that each name component was separated by one space in the data. We then considered the first and middle names (when available) and excluded all other initials to perform gender detection. We computed the probability of being female for each author with at least one full (noninitial) first or middle name part. Authors with probability over 0.5 were labeled "female" and those with probability below 0.5 were labeled "male." We used the "ssa" option of the gender package, which looks up names in the US Social Security Administration baby name data from the period 1932 to 2012.</p><p>For Figs. <ref type="figure">3</ref> and <ref type="figure">4</ref>, we labeled papers as having a woman in a high-status author position if either the first or last author was labeled "female" using the method described above. We excluded papers with unknown high-status</p></div>
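<div xmlns="http://www.tei-c.org/ns/1.0"><p>The network measures described under Network Analysis (edge density, the odds ratio between two densities, and the average component size) can be illustrated with a minimal Python sketch. This is not the authors' code, which is available at the GitHub repository cited in Methods, and it does not reproduce the Fisher's exact test or the Gephi analysis. It assumes the standard definition of edge density for an undirected simple network, 2m / (n(n - 1)); the function names and the two toy networks are hypothetical and chosen only for illustration.</p>
<p><![CDATA[
```python
from collections import Counter


def edge_density(n_nodes, n_edges):
    """Share of all possible undirected edges that are present (assumed
    standard definition: 2m / (n(n - 1)))."""
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


def odds_ratio(rho_1, rho_2):
    """Odds of an edge in network 1 relative to the odds in network 2."""
    return (rho_1 / (1 - rho_1)) / (rho_2 / (1 - rho_2))


def component_sizes(n_nodes, edges):
    """Sizes of the connected components, largest first (union-find)."""
    parent = list(range(n_nodes))

    def find(x):
        # Follow parent links to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)
    counts = Counter(find(i) for i in range(n_nodes))
    return sorted(counts.values(), reverse=True)


def average_component_size(n_nodes, edges):
    """Network size divided by the number of connected components."""
    return n_nodes / len(component_sizes(n_nodes, edges))


# Hypothetical toy networks with 6 papers each; an edge joins two papers
# that share at least one author.
toy_a_edges = [(0, 1), (1, 2), (3, 4)]
toy_b_edges = [(0, 1)]

rho_a = edge_density(6, len(toy_a_edges))
rho_b = edge_density(6, len(toy_b_edges))
print("densities:", rho_a, rho_b)
print("odds ratio:", odds_ratio(rho_a, rho_b))
print("component sizes (A):", component_sizes(6, toy_a_edges))
print("ACS (A):", average_component_size(6, toy_a_edges))
```
]]></p><p>In toy network A, three of the 15 possible edges are present, so the density is 0.2, and the six papers fall into three components, so the average component size is 2. A one-sided test on the odds ratio, as described in Methods, would additionally require the counts of present and absent edges in each network.</p></div>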
		</body>
		</text>
</TEI>
