A calibrated BISG for inferring race from surname and geolocation

Greengard, Philip (ORCID: 0000-0003-0090-6497); Gelman, Andrew (ORCID: 0000-0002-6975-2601)

doi:10.1093/jrsssa/qnaf003

Abstract: Bayesian Improved Surname Geocoding (BISG) is a ubiquitous tool for predicting race and ethnicity using an individual’s geolocation and surname. Here we demonstrate that statistical dependence of surname and geolocation within racial/ethnic categories in the US results in biases for minority subpopulations, and we introduce a raking-based improvement. Our method augments the data used by BISG (distributions of race by geolocation and race by surname) with the distribution of surname by geolocation obtained from state voter files. We validate our algorithm on state voter registration lists that contain self-identified race/ethnicity.
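The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy counts, and the choice to rake a full race-by-surname-by-geolocation table to all three two-way margins are assumptions made here for illustration. Standard BISG treats surname and geolocation as independent given race; raking (iterative proportional fitting) then rescales that table until it also matches the surname-by-geolocation counts.

```python
import numpy as np

def bisg_joint(race_by_surname, race_by_geo):
    """Joint counts T[r, s, g] assuming surname and geo are independent given race."""
    race_totals = race_by_surname.sum(axis=1)
    return (race_by_surname[:, :, None] * race_by_geo[:, None, :]
            / race_totals[:, None, None])

def rake(table, race_by_surname, race_by_geo, surname_by_geo, n_iter=500):
    """Iterative proportional fitting of a 3-way table to three 2-way margins."""
    t = table.copy()
    for _ in range(n_iter):
        t *= (race_by_surname / t.sum(axis=2))[:, :, None]  # match race x surname
        t *= (race_by_geo / t.sum(axis=1))[:, None, :]      # match race x geo
        t *= (surname_by_geo / t.sum(axis=0))[None, :, :]   # match surname x geo
    return t

def race_posterior(table):
    """P(race | surname, geo): normalize each (surname, geo) cell over race."""
    return table / table.sum(axis=0, keepdims=True)

# Hypothetical toy population: 2 races x 3 surnames x 2 geolocations.
# The counts are made up so that surname and geo are dependent within race.
true_counts = np.array([
    [[30.0, 5.0], [10.0, 15.0], [5.0, 20.0]],
    [[4.0, 25.0], [12.0, 6.0], [20.0, 8.0]],
])
race_by_surname = true_counts.sum(axis=2)
race_by_geo = true_counts.sum(axis=1)
surname_by_geo = true_counts.sum(axis=0)  # the extra margin (e.g. from voter files)

naive = bisg_joint(race_by_surname, race_by_geo)
raked = rake(naive, race_by_surname, race_by_geo, surname_by_geo)

print("naive BISG posterior:\n", race_posterior(naive))
print("raked posterior:\n", race_posterior(raked))
```

In this toy setup the naive table reproduces the two margins BISG uses but not the surname-by-geolocation counts, while the raked table matches all three.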

Award ID(s): 2311354

PAR ID: 10567677

Author(s) / Creator(s): Greengard, Philip; Gelman, Andrew

Publisher / Repository: Oxford University Press

Date Published: 2025-01-23

Journal Name: Journal of the Royal Statistical Society Series A: Statistics in Society

Volume: 189

Issue: 1

ISSN: 0964-1998

Format(s): Medium: X

Size(s): p. 512-543

Sponsoring Org: National Science Foundation

Journal Article:
https://doi.org/10.1093/jrsssa/qnaf003