Title: Using Explainable AI to Understand Team Formation and Team Impact
ABSTRACT Citation counts are considered a simple and direct indicator of a scientific paper's impact. This paper predicts papers' citations from two groups of team-related variables: team composition and team structure. Team composition includes team size, male/female dominance, academia/industry collaboration, the number of distinct races, and the number of distinct countries. Team structure comprises team power level and team power hierarchy. Team members' previous citation counts, H-index, number of previous collaborators, career age, and number of previous papers serve as proxies for team power. We calculated the mean value of each proxy to represent team power level (the collective team capability) and the Gini coefficient to represent team power hierarchy (the vertical difference in power distribution within a team). Using 1,675,035 CS teams from the DBLP dataset, we trained an XGBoost model to classify papers as high or low citation. The model reached an AUC of 0.71 and an accuracy of 70.45%. Using the Explainable AI method SHAP to evaluate features' relative importance in predicting team citation categories, we found that team structure plays a more critical role than team composition in predicting team citation. High team power level, flat team power structure, diverse racial background, large team size, collaboration with industry, and male-dominated teams can bring higher team citations. Our project can provide insights into how to form the best scientific teams and maximize team impact through team composition and team structure.
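The two team-structure measures described in the abstract can be illustrated with a minimal sketch. The abstract does not state which Gini formula the authors used, so the rank-weighted form below is an assumption, and the function names `gini` and `team_power_features` and the sample H-index values are hypothetical, chosen only for illustration.

```python
from statistics import mean


def gini(values):
    """Gini coefficient of non-negative values.

    0 means every member holds the same amount (a flat team);
    values closer to 1 mean power is concentrated in a few members.
    Assumption: the standard rank-weighted formula, not necessarily
    the exact variant used in the paper.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with ranks i = 1..n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def team_power_features(proxy_values):
    """Summarize one power proxy (e.g. members' H-index) for one team."""
    return {
        "power_level": mean(proxy_values),      # collective capability
        "power_hierarchy": gini(proxy_values),  # inequality within the team
    }


if __name__ == "__main__":
    # Hypothetical H-index values for two four-person teams.
    flat_team = [5, 5, 5, 5]
    steep_team = [0, 0, 0, 10]
    print(team_power_features(flat_team))
    print(team_power_features(steep_team))
```

In this sketch a team whose members all have the same H-index gets a hierarchy of 0, while a team where one member holds all of it gets a much higher value. One such pair of features would be computed per proxy and per team before being passed to a classifier.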
Award ID(s):
2331366
PAR ID:
10632790
Publisher / Repository:
Wiley
Journal Name:
Proceedings of the Association for Information Science and Technology
Volume:
60
Issue:
1
ISSN:
2373-9231
Page Range / eLocation ID:
469 to 478
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. With teams growing in all areas of scientific and scholarly research, we explore the relationship between team structure and the character of knowledge they produce. Drawing on 89,575 self-reports of team member research activity underlying scientific publications, we show how individual activities cohere into broad roles of 1) leadership through the direction and presentation of research and 2) support through data collection, analysis, and discussion. The hidden hierarchy of a scientific team is characterized by its lead (or L) ratio of members playing leadership roles to total team size. The L ratio is validated through correlation with imputed contributions to the specific paper and to science as a whole, which we use to effectively extrapolate the L ratio for 16,397,750 papers where roles are not explicit. We find that, relative to flat, egalitarian teams, tall, hierarchical teams produce less novelty and more often develop existing ideas, increase productivity for those on top and decrease it for those beneath, and increase short-term citations but decrease long-term influence. These effects hold within person—the same person on the same-sized team produces science much more likely to disruptively innovate if they work on a flat, high-L-ratio team. These results suggest the critical role flat teams play for sustainable scientific advance and the training and advancement of scientists. 
  2. Accurate prediction of scientific impact is important for scientists, academic recommender systems, and granting organizations alike. Existing approaches rely on many years of leading citation values to predict a scientific paper’s citations (a proxy for impact), even though most papers make their largest contributions in the first few years after they are published. In this paper, we tackle a new problem: predicting a new paper’s citation time series from the date of publication (i.e., without leading values). We propose HINTS, a novel end-to-end deep learning framework that converts citation signals from dynamic heterogeneous information networks (DHIN) into citation time series. HINTS imputes pseudo-leading values for a paper in the years before it is published from DHIN embeddings, and then transforms these embeddings into the parameters of a formal model that can predict citation counts immediately after publication. Empirical analysis on two real-world datasets from Computer Science and Physics shows that HINTS is competitive with baseline citation prediction models. While we focus on citations, our approach generalizes to other “cold start” time series prediction tasks where relational data is available and accurate prediction in early timestamps is crucial.
  3. Improving team interactions in engineering to model gender inclusivity has been at the forefront of many initiatives in both academia and industry. However, there has been limited evidence on the impact of gender-diverse teams on psychological safety. This is important because psychological safety has been shown to be a key facet for the development of innovative ideas, and has also been shown to be a cornerstone of effective teamwork. But how does the gender diversity of a team impact the development of psychological safety? The current study explored this question through an empirical study with 38 engineering design student teams over the course of an 8-week design project. Half of the teams were heterogeneous (either half-male and half-female, or majority male) and the other half were homogeneous (all male). We captured psychological safety at five time points for the homogeneous and heterogeneous teams and also explored individual dichotomous (peer-review) ratings of psychological safety at the end of the project. Results indicated that there was no difference in psychological safety between gender-homogeneous and gender-heterogeneous teams. However, females perceived themselves as more psychologically safe with other female team members compared to their ratings of male team members. Females also perceived themselves to be less psychologically safe with male team members compared to male ratings of female team members, indicating a discrepancy in perceptions between genders. These results point to the need to further explore the role of minoritized groups in psychological safety research and to explore how this effect presents itself (or is covered up) at the team level.
  4. Citations have long been used to characterize the state of a scientific field and to identify influential works. However, writers use citations for different purposes, and this varied purpose influences uptake by future scholars. Unfortunately, our understanding of how scholars use and frame citations has been limited to small-scale manual citation analysis of individual papers. We perform the largest behavioral study of citations to date, analyzing how scientific works frame their contributions through different types of citations and how this framing affects the field as a whole. We introduce a new dataset of nearly 2,000 citations annotated for their function, and use it to develop a state-of-the-art classifier and label the papers of an entire field: Natural Language Processing. We then show how differences in framing affect scientific uptake and reveal the evolution of the publication venues and the field as a whole. We demonstrate that authors are sensitive to discourse structure and publication venue when citing, and that how a paper frames its work through citations is predictive of the citation count it will receive. Finally, we use changes in citation framing to show that the field of NLP is undergoing a significant increase in consensus. 
  5. This article introduces and applies a methodology to analyze the effect of team diversity on team design cognition. We explore team diversity in relation to team members’ gender. We studied two types of teams: heterogeneous teams composed of one female and one male mechanical engineering student, and homogeneous teams of two male mechanical engineering students. We analyzed 28 design protocols, using the Function-Behavior-Structure ontology to code the protocols and measure team cognitive design behavior. We found that male design students in the mixed teams tended to dominate the design activity. We also found that mixed teams showed significantly more co-design activity than male-only teams.