- Award ID(s):
- 1829681
- NSF-PAR ID:
- 10125766
- Date Published:
- Journal Name:
Advances in Neural Information Processing Systems
- ISSN:
- 1049-5258
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Pollard, Tom J. (Ed.) Modern predictive models require large amounts of data for training and evaluation, the absence of which may result in models that are specific to particular locations, their populations, and local clinical practices. Yet, best practices for clinical risk prediction models have not yet considered such challenges to generalizability. Here we ask whether population- and group-level performance of mortality prediction models varies significantly when applied to hospitals or geographies different from the ones in which they were developed. Further, what characteristics of the datasets explain the performance variation? In this multi-center cross-sectional study, we analyzed electronic health records from 179 hospitals across the US with 70,126 hospitalizations from 2014 to 2015. The generalization gap, defined as the difference in model performance metrics across hospitals, was computed for the area under the receiver operating characteristic curve (AUC) and the calibration slope. To assess model performance by the race variable, we report differences in false negative rates across groups. Data were also analyzed using the causal discovery algorithm "Fast Causal Inference," which infers paths of causal influence while identifying potential influences associated with unmeasured variables. When transferring models across hospitals, the AUC at the test hospital ranged from 0.777 to 0.832 (1st-3rd quartile or IQR; median 0.801); the calibration slope from 0.725 to 0.983 (IQR; median 0.853); and the disparity in false negative rates from 0.046 to 0.168 (IQR; median 0.092). The distributions of all variable types (demography, vitals, and labs) differed significantly across hospitals and regions. The race variable also mediated differences in the relationship between clinical variables and mortality across hospitals and regions. In conclusion, group-level performance should be assessed during generalizability checks to identify potential harms to specific groups. Moreover, to develop methods that improve model performance in new environments, a better understanding and documentation of data provenance and health processes are needed to identify and mitigate sources of variation.
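As a concrete illustration of the transfer metrics above, here is a minimal sketch assuming scikit-learn and synthetic stand-ins for held-out labels and predicted risks at a source and a test hospital; the variable names (y_src, p_tst, race, and so on) are hypothetical, and the paper's actual modeling pipeline is not reproduced here. The calibration slope is estimated by refitting a logistic model to the logit of the predicted risks, a common recalibration-based definition.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def calibration_slope(y_true, y_prob, eps=1e-7):
    """Slope from refitting a logistic model to the logit of predicted risk."""
    p = np.clip(y_prob, eps, 1 - eps)
    logit = np.log(p / (1 - p)).reshape(-1, 1)
    return LogisticRegression(C=1e6).fit(logit, y_true).coef_[0, 0]

def fnr(y_true, y_pred):
    """False negative rate: share of true positives predicted negative."""
    pos = y_true == 1
    return float(np.mean(y_pred[pos] == 0))

# Synthetic stand-ins for held-out data at a source and a test hospital.
rng = np.random.default_rng(0)
y_src, p_src = rng.integers(0, 2, 500), rng.uniform(size=500)
y_tst, p_tst = rng.integers(0, 2, 500), rng.uniform(size=500)
race = rng.integers(0, 2, 500)       # hypothetical binary group indicator
y_hat = (p_tst >= 0.5).astype(int)   # thresholded predictions

auc_gap = roc_auc_score(y_src, p_src) - roc_auc_score(y_tst, p_tst)
slope = calibration_slope(y_tst, p_tst)
fnr_gap = abs(fnr(y_tst[race == 0], y_hat[race == 0]) -
              fnr(y_tst[race == 1], y_hat[race == 1]))
print(f"AUC gap: {auc_gap:.3f}  slope: {slope:.3f}  FNR disparity: {fnr_gap:.3f}")
```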
-
Why, when, and how do stereotypes change? This paper develops a computational account based on the principles of structure learning: stereotypes are governed by probabilistic beliefs about the assignment of individuals to groups. Two aspects of this account are particularly important. First, groups are flexibly constructed based on the distribution of traits across individuals; groups are not fixed, nor are they assumed to map onto categories provided to the model in advance. This allows the model to explain the phenomena of group discovery and subtyping, whereby deviant individuals are segregated from a group, thus protecting the group's stereotype. Second, groups are hierarchically structured, such that groups can be nested. This allows the model to explain the phenomenon of subgrouping, whereby a collection of deviant individuals is organized into a refinement of the superordinate group. The structure learning account also sheds light on several factors that determine stereotype change, including perceived group variability, individual typicality, cognitive load, and sample size.
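The abstract's core mechanism, probabilistic assignment of individuals to a flexible number of groups, can be sketched with a standard nonparametric clustering prior. The snippet below uses a Chinese restaurant process with a Beta-Bernoulli trait likelihood as one plausible stand-in for the paper's model (which is not specified here); it is flat rather than hierarchical, so it illustrates group discovery and subtyping but not nested subgrouping.

```python
import numpy as np

def crp_gibbs(traits, alpha=1.0, a=1.0, b=1.0, sweeps=25, seed=0):
    """Collapsed Gibbs sampling for a Chinese restaurant process mixture
    with Beta(a, b)-Bernoulli likelihoods over binary trait vectors."""
    rng = np.random.default_rng(seed)
    n, _ = traits.shape
    z = np.zeros(n, dtype=int)                 # start with one group
    for _ in range(sweeps):
        for i in range(n):
            z[i] = -1                          # remove i from its group
            labels, counts = np.unique(z[z >= 0], return_counts=True)
            log_p = []
            for lab, cnt in zip(labels, counts):
                members = traits[z == lab]
                # posterior predictive P(trait = 1) within this group
                p1 = (members.sum(axis=0) + a) / (len(members) + a + b)
                ll = np.log(np.where(traits[i] == 1, p1, 1 - p1)).sum()
                log_p.append(np.log(cnt) + ll)
            # option to open a brand-new group, under the Beta prior
            p_new = a / (a + b)
            ll_new = np.log(np.where(traits[i] == 1, p_new, 1 - p_new)).sum()
            log_p.append(np.log(alpha) + ll_new)
            log_p = np.array(log_p)
            probs = np.exp(log_p - log_p.max())
            probs /= probs.sum()
            k = rng.choice(len(probs), p=probs)
            z[i] = labels[k] if k < len(labels) else z.max() + 1
        z = np.unique(z, return_inverse=True)[1]   # compact relabeling
    return z

# Two clear trait profiles plus one deviant individual that the model
# may split off into its own group ("subtyping").
traits = np.array([[1, 1, 1, 0]] * 5 + [[0, 0, 0, 1]] * 5 + [[1, 1, 0, 1]])
print(crp_gibbs(traits))
```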
-
Accurate detection of infected individuals is one of the critical steps in stopping any pandemic. When the underlying infection rate of the disease is low, testing people in groups, instead of testing each individual in the population, can be more efficient. In this work, we consider a noisy adaptive group testing design with specified test sensitivity and specificity, which selects the optimal group given previous test results according to a pre-selected utility function. As in prior studies on group testing, we model this problem as a sequential Bayesian Optimal Experimental Design (BOED) problem to adaptively design the groups for each test. We analyze the required number of group tests when using the updated posterior on the infection status, with the corresponding Mutual Information (MI) as the utility function for selecting new groups. More importantly, we study how potential bias in the assumed ground-truth noise of group tests may affect the sample complexity of group testing.
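To make the MI utility concrete, here is a toy sketch assuming a small population with independent prior infection probabilities and illustrative sensitivity/specificity values; it enumerates the full posterior over infection states and greedily scores candidate groups by expected information gain. This illustrates the utility computation only, not the paper's algorithm.

```python
import itertools
import numpy as np

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mi_of_group(post, states, group, sens=0.95, spec=0.98):
    """Mutual information between a noisy group-test outcome and the
    infection state, under the current posterior over states."""
    # Test tends to come back positive iff the group has an infected member.
    any_inf = states[:, group].any(axis=1)
    p_pos_given_s = np.where(any_inf, sens, 1 - spec)
    p_pos = np.sum(post * p_pos_given_s)
    h_prior, h_post = entropy(post), 0.0
    for outcome, p_o in ((1, p_pos), (0, 1 - p_pos)):
        if p_o == 0:
            continue
        lik = p_pos_given_s if outcome else 1 - p_pos_given_s
        h_post += p_o * entropy(post * lik / p_o)   # Bayes-updated posterior
    return h_prior - h_post

# Hypothetical toy problem: 6 people, independent prior infection rate 0.1.
n, q = 6, 0.1
states = np.array(list(itertools.product([0, 1], repeat=n)))
post = np.prod(np.where(states == 1, q, 1 - q), axis=1)

# Greedily pick the group (of size <= 3) with maximal expected info gain.
groups = [list(g) for r in (1, 2, 3) for g in itertools.combinations(range(n), r)]
best = max(groups, key=lambda g: mi_of_group(post, states, g))
print("best group:", best, "MI:", round(mi_of_group(post, states, best), 4))
```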
-
Explaining why animal groups vary in size is a fundamental problem in behavioral ecology. One hypothesis is that life-history differences among individuals lead to sorting of phenotypes into groups of different sizes where each individual does best. This hypothesis predicts that individuals should be relatively consistent in their use of particular group sizes across time. Little is known about whether animals' choice of group size is repeatable across their lives, especially in long-lived species. We studied consistency in choice of breeding-colony size in colonially nesting cliff swallows (Petrochelidon pyrrhonota) in western Nebraska, United States, over a 32-year period, following 6,296 birds for at least four breeding seasons. Formal repeatability of size choice for the population was about 0.41. About 45% of individuals were relatively consistent in their choice of colony size, while about 40% varied widely in the colony size they occupied. Birds using the smaller and larger colonies appeared more consistent in size use than birds occupying intermediate-sized colonies. Consistency in colony size was also influenced by whether a bird used the same physical colony site each year and whether the site had been fumigated to remove ectoparasites. The difference between the final and initial colony sizes for an individual, a measure of the net change in its colony size over its life, did not significantly depart from 0 for the dataset as a whole. However, different year-cohorts did show significant net change in colony size, both positive and negative, which may have reflected fluctuating selection on colony size among years based on climatic conditions. The results support phenotypic sorting as an explanation for group size variation, although cliff swallows also likely use past experience at a given site and the extent of ectoparasitism to select breeding colonies.
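The abstract does not spell out how repeatability was estimated; a common approach derives it from variance components of a one-way ANOVA on repeated measures per individual, sketched below on made-up records (the column names and values are hypothetical, not the study's data).

```python
import numpy as np
import pandas as pd

def repeatability(df, id_col="bird", val_col="colony_size"):
    """One-way ANOVA variance-components estimate of repeatability:
    among-individual variance / (among + within). A common estimator;
    the paper's exact method is not specified in the abstract."""
    groups = df.groupby(id_col)[val_col]
    k = groups.size()                       # observations per individual
    n = len(k)                              # number of individuals
    grand = df[val_col].mean()
    ss_among = np.sum(k * (groups.mean() - grand) ** 2)
    ss_within = np.sum((df[val_col] - groups.transform("mean")) ** 2)
    ms_among = ss_among / (n - 1)
    ms_within = ss_within / (len(df) - n)
    k0 = (len(df) - np.sum(k**2) / len(df)) / (n - 1)  # effective group size
    s2_among = (ms_among - ms_within) / k0
    return s2_among / (s2_among + ms_within)

# Hypothetical records: each row is one bird-year with a colony size.
df = pd.DataFrame({
    "bird": [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
    "colony_size": [10, 12, 9, 11, 200, 180, 220, 210, 50, 80, 40, 90],
})
print(round(repeatability(df), 2))
```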
-
In several author name disambiguation studies, some ethnic name groups, such as East Asian names, are reported to be more difficult to disambiguate than others. This implies that disambiguation approaches might be improved if ethnic name groups are distinguished before disambiguation. We explore the potential of ethnic name partitioning by comparing the performance of four machine learning algorithms trained and tested either on the entire data or specifically on individual name groups. Results show that ethnicity-based name partitioning can substantially improve disambiguation performance because the individual models are better suited to their respective name groups. The improvements occur across all ethnic name groups, with different magnitudes. Performance gains in predicting matched name pairs outweigh losses in predicting nonmatched pairs. Feature (e.g., coauthor name) similarities of name pairs vary across ethnic name groups. Such differences may enable the development of ethnicity-specific feature weights to improve prediction for specific ethnic name categories. These findings are observed for three labeled datasets with a natural distribution of problem sizes, as well as one in which all ethnic name groups are controlled to have the same number of ambiguous names. This study is expected to motivate scholars to group author names based on ethnicity prior to disambiguation.
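A minimal sketch of the partitioning idea, assuming scikit-learn and synthetic pairwise-similarity features: one baseline model is trained on all name pairs, and separate models are trained per name group. The group tags, feature construction, and random forest choice are placeholders, not the paper's four algorithms or its labeled datasets.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical pairwise-similarity features for name pairs (e.g., coauthor,
# venue, title similarity), a match label, and an ethnic name-group tag.
X = rng.uniform(size=(3000, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.3, 3000) > 1.0).astype(int)
group = rng.choice(["east_asian", "english", "hispanic"], size=3000)

X_tr, X_te, y_tr, y_te, g_tr, g_te = train_test_split(
    X, y, group, test_size=0.3, random_state=0)

# Baseline: one model trained on all name pairs together.
global_model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
print("global F1:", round(f1_score(y_te, global_model.predict(X_te)), 3))

# Partitioned: one model per name group, evaluated on its own group.
for g in np.unique(group):
    model = RandomForestClassifier(random_state=0).fit(
        X_tr[g_tr == g], y_tr[g_tr == g])
    f1 = f1_score(y_te[g_te == g], model.predict(X_te[g_te == g]))
    print(f"{g} F1:", round(f1, 3))
```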