skip to main content

Title: Fair Preprocessing: Towards Understanding Compositional Fairness of Data Transformers in Machine Learning Pipeline
In recent years, many incidents have been reported where machine learning models exhibited discrimination among people based on race, sex, age, etc. Research has been conducted to measure and mitigate unfairness in machine learning models. For a machine learning task, it is a common practice to build a pipeline that includes an ordered set of data preprocessing stages followed by a classifier. However, most of the research on fairness has considered a single classifier based prediction task. What are the fairness impacts of the preprocessing stages in machine learning pipeline? Furthermore, studies showed that often the root cause of unfairness is ingrained in the data itself, rather than the model. But no research has been conducted to measure the unfairness caused by a specific transformation made in the data preprocessing stage. In this paper, we introduced the causal method of fairness to reason about the fairness impact of data preprocessing stages in ML pipeline. We leveraged existing metrics to define the fairness measures of the stages. Then we conducted a detailed fairness evaluation of the preprocessing stages in 37 pipelines collected from three different sources. Our results show that certain data transformers are causing the model to exhibit unfairness. We more » identified a number of fairness patterns in several categories of data transformers. Finally, we showed how the local fairness of a preprocessing stage composes in the global fairness of the pipeline. We used the fairness composition to choose appropriate downstream transformer that mitigates unfairness in the machine learning pipeline. « less
Authors:
;
Award ID(s):
1934884 1513263
Publication Date:
NSF-PAR ID:
10287421
Journal Name:
ESEC/FSE’2021: The 29th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (Aug. 2021).
Sponsoring Org:
National Science Foundation
More Like this
  1. Machine learning models are increasingly being used in important decision-making software such as approving bank loans, recommending criminal sentencing, hiring employees, and so on. It is important to ensure the fairness of these models so that no discrimination is made based on protected attribute (e.g., race, sex, age) while decision making. Algorithms have been developed to measure unfairness and mitigate them to a certain extent. In this paper, we have focused on the empirical evaluation of fairness and mitigations on real-world machine learning models. We have created a benchmark of 40 top-rated models from Kaggle used for 5 different tasks, and then using a comprehensive set of fairness metrics, evaluated their fairness. Then, we have applied 7 mitigation techniques on these models and analyzed the fairness, mitigation results, and impacts on performance. We have found that some model optimization techniques result in inducing unfairness in the models. On the other hand, although there are some fairness control mechanisms in machine learning libraries, they are not documented. The mitigation algorithm also exhibit common patterns such as mitigation in the post-processing is often costly (in terms of performance) and mitigation in the pre-processing stage is preferred in most cases. We have alsomore »presented different trade-off choices of fairness mitigation decisions. Our study suggests future research directions to reduce the gap between theoretical fairness aware algorithms and the software engineering methods to leverage them in practice.« less
  2. Machine learning models are increasingly being used in important decision-making software such as approving bank loans, recommending criminal sentencing, hiring employees, and so on. It is important to ensure the fairness of these models so that no discrimination is made between different groups in a protected attribute (e.g., race, sex, age) while decision making. Algorithms have been developed to measure unfairness and mitigate them to a certain extent. In this paper, we have focused on the empirical evaluation of fairness and mitigations on real-world machine learning models. We have created a benchmark of 40 top-rated models from Kaggle used for 5 different tasks, and then using a comprehensive set of fairness metrics evaluated their fairness. Then, we have applied 7 mitigation techniques on these models and analyzed the fairness, mitigation results, and impacts on performance. We have found that some model optimization techniques result in inducing unfairness in the models. On the other hand, although there are some fairness control mechanisms in machine learning libraries, they are not documented. The mitigation algorithm also exhibit common patterns such as mitigation in the post-processing is often costly (in terms of performance) and mitigation in the pre-processing stage is preferred in most cases.more »We have also presented different trade-off choices of fairness mitigation decisions. Our study suggests future research directions to reduce the gap between theoretical fairness aware algorithms and the software engineering methods to leverage them in practice.« less
  3. We study fairness in supervised few-shot meta-learning models that are sensitive to discrimination (or bias) in historical data. A machine learning model trained based on biased data tends to make unfair predictions for users from minority groups. Although this problem has been studied before, existing methods mainly aim to detect and control the dependency effect of the protected variables (e.g. race, gender) on target prediction based on a large amount of training data. These approaches carry two major drawbacks that (1) lacking showing a global cause-effect visualization for all variables; (2) lacking generalization of both accuracy and fairness to unseen tasks. In this work, we first discover discrimination from data using a causal Bayesian knowledge graph which not only demonstrates the dependency of the protected variable on target but also indicates causal effects between all variables. Next, we develop a novel algorithm based on risk difference in order to quantify the discriminatory influence for each protected variable in the graph. Furthermore, to protect prediction from unfairness, a the fast-adapted bias-control approach in meta-learning is proposed, which efficiently mitigates statistical disparity for each task and it thus ensures independence of protected attributes on predictions based on biased and few-shot data samples.more »Distinct from existing meta-learning models, group unfairness of tasks are efficiently reduced by leveraging the mean difference between (un)protected groups for regression problems. Through extensive experiments on both synthetic and real-world data sets, we demonstrate that our proposed unfairness discovery and prevention approaches efficiently detect discrimination and mitigate biases on model output as well as generalize both accuracy and fairness to unseen tasks with a limited amount of training samples.« less
  4. Background : Machine learning has been used for classification of physical behavior bouts from hip-worn accelerometers; however, this research has been limited due to the challenges of directly observing and coding human behavior “in the wild.” Deep learning algorithms, such as convolutional neural networks (CNNs), may offer better representation of data than other machine learning algorithms without the need for engineered features and may be better suited to dealing with free-living data. The purpose of this study was to develop a modeling pipeline for evaluation of a CNN model on a free-living data set and compare CNN inputs and results with the commonly used machine learning random forest and logistic regression algorithms. Method : Twenty-eight free-living women wore an ActiGraph GT3X+ accelerometer on their right hip for 7 days. A concurrently worn thigh-mounted activPAL device captured ground truth activity labels. The authors evaluated logistic regression, random forest, and CNN models for classifying sitting, standing, and stepping bouts. The authors also assessed the benefit of performing feature engineering for this task. Results : The CNN classifier performed best (average balanced accuracy for bout classification of sitting, standing, and stepping was 84%) compared with the other methods (56% for logistic regression andmore »76% for random forest), even without performing any feature engineering. Conclusion : Using the recent advancements in deep neural networks, the authors showed that a CNN model can outperform other methods even without feature engineering. This has important implications for both the model’s ability to deal with the complexity of free-living data and its potential transferability to new populations.« less
  5. Searching for parking has been a problem faced by many drivers, especially in urban areas. With an increasing public demand for parking information and services, as well as the proliferation of advanced smartphones, a range of smartphone-based parking management services began to emerge. Funded by the National Science Foundation, our research aims to explore the potential of smartphone-based parking management services as a solution to parking problems, to deepen our understandings of travelers’ parking behaviors, and to further advance the analytical foundations and methodologies for modeling and assessing parking solutions. This paper summarizes progress and results from our research projects on smartphone-based parking management, including parking availability information prediction, parking searching strategy, the development of a mobile parking application, and our next steps to learn and discover new knowledge from its deployment. To predict future parking occupancy, we proposed a practical framework that integrates machine-learning techniques with a model-based core approach that explicitly models the stochastic parking process. The framework is able to predict future parking occupancy from historical occupancy data alone, and can handle complex arrival and departure patterns in real-world case studies, including special event. With the predicted probabilistic availability information, a cost-minimizing parking searching strategy is developed.more »The parking searching problem for an individual user is a stochastic Markov decision process and is formalized as a dynamic programming problem. The cost-minimizing parking searching strategy is solved by value iteration. Our simulated experiments showed that cost-minimizing strategy has the lowest expected cost but tends to direct a user to visit more parking facilities compared with two greedy strategies. Currently, we are working on implementing the predictive framework and the searching algorithm in a mobile phone application. We are working closely with Arizona State University (ASU) Parking and Transit Services to implement a three-stage pilot deployment of the prototype application around the ASU main campus. In the first stage, our application will provide real-time information and we will incorporate availability prediction and searching guidance in the second and third stages. Once the mobile application is deployed, it will provide unique opportunities to collect data on parking search behaviors, discover emerging scenarios of smartphone-based parking management services, and assess the impacts of such systems.« less