Rediscovering the human in AI design for fairness

De Assis Nunes, Ana Carolina; Zhang, Shaozeng

This paper is an initial report on our fair AI design project, carried out by a small research team of anthropologists and computer scientists. Our collaborative project was developed in response to recent debates on the ethical and social issues of AI (Elish and boyd 2018). We share the understanding that "numbers don't speak for themselves" and that data enters research projects already "fully cooked" (D'Ignazio and Klein 2020). We therefore take an anthropological approach to observing, recording, understanding, and reflecting upon the process of machine learning algorithm design, beginning with the first steps of choosing and coding the datasets used to train and build algorithms. We tease apart the social-cultural paradigms encoded in the generation and use of datasets in algorithm design and testing. By doing so, we rediscover the human in data, first to challenge the methodological and social assumptions in data use and then to adjust the model and parameters of our algorithms. This paper centers on tracing the social trajectory of the Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) dataset. This dataset contains data on over 10,000 criminal defendants in Broward County, Florida, in the United States. Since its publication, it has become a benchmark dataset in the study of algorithmic fairness, and we also used it to design and train our algorithm for recidivism prediction. This paper presents our observation that data results from a complex set of social, political, and historical assumptions and circumstances, and it demonstrates how the social trajectory of data can be taken into account in the design of AI as automated systems become more intricately woven into our daily lives.
