Title: In-Processing Modeling Techniques for Machine Learning Fairness: A Survey
Machine learning models are becoming pervasive in high-stakes applications. Despite their clear benefits in terms of performance, these models can discriminate against minority groups and introduce fairness issues into decision-making processes, with severe negative impacts on individuals and society. In recent years, various techniques have been developed to mitigate unfairness in machine learning models. Among them, in-processing methods have drawn increasing attention from the community: fairness is taken into consideration directly during model design to induce intrinsically fair models and fundamentally mitigate fairness issues in outputs and representations. In this survey, we review the current progress of in-processing fairness mitigation techniques. Based on where fairness is achieved in the model, we categorize them into explicit and implicit methods, where the former directly incorporate fairness metrics in training objectives, and the latter focus on refining latent representation learning. Finally, we conclude the survey with a discussion of the research challenges in this community to motivate future exploration.
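As a rough illustration of what "directly incorporating a fairness metric in the training objective" can look like, the sketch below adds a demographic parity gap as a penalty on top of a logistic loss. It is not taken from the survey: the function names, the choice of demographic parity as the metric, the binary group encoding (0/1), and the weight `lam` are all assumptions made for this example.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def demographic_parity_gap(scores, groups):
    """Absolute difference in mean predicted score between group 0 and group 1.

    Assumes `groups` holds only 0/1 labels and both groups are non-empty.
    """
    g0 = [s for s, g in zip(scores, groups) if g == 0]
    g1 = [s for s, g in zip(scores, groups) if g == 1]
    return abs(sum(g0) / len(g0) - sum(g1) / len(g1))


def fair_loss(weights, bias, X, y, groups, lam=1.0):
    """Mean log loss plus `lam` times the demographic parity gap.

    `lam` (hypothetical hyperparameter) trades accuracy against fairness:
    lam=0 recovers plain logistic regression loss.
    """
    scores = [sigmoid(sum(w * x for w, x in zip(weights, row)) + bias) for row in X]
    eps = 1e-12  # guards against log(0)
    log_loss = -sum(
        t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
        for p, t in zip(scores, y)
    ) / len(y)
    return log_loss + lam * demographic_parity_gap(scores, groups)
```

Minimizing `fair_loss` with any gradient-based or derivative-free optimizer would then push the model toward similar average scores across the two groups, at a cost in accuracy controlled by `lam`.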
Award ID(s): 1939716
PAR ID: 10397780
Author(s) / Creator(s): ; ; ;
Date Published:
Journal Name: ACM Transactions on Knowledge Discovery from Data
ISSN: 1556-4681
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Machine learning models are increasingly being used in important decision-making software for tasks such as approving bank loans, recommending criminal sentencing, and hiring employees. It is important to ensure the fairness of these models so that no discrimination is made based on a protected attribute (e.g., race, sex, age) during decision making. Algorithms have been developed to measure unfairness and mitigate it to a certain extent. In this paper, we have focused on the empirical evaluation of fairness and mitigations on real-world machine learning models. We have created a benchmark of 40 top-rated models from Kaggle used for 5 different tasks and then evaluated their fairness using a comprehensive set of fairness metrics. Then, we have applied 7 mitigation techniques on these models and analyzed the fairness, mitigation results, and impacts on performance. We have found that some model optimization techniques induce unfairness in the models. On the other hand, although there are some fairness control mechanisms in machine learning libraries, they are not documented. The mitigation algorithms also exhibit common patterns: mitigation in the post-processing stage is often costly (in terms of performance), and mitigation in the pre-processing stage is preferred in most cases. We have also presented different trade-off choices of fairness mitigation decisions. Our study suggests future research directions to reduce the gap between theoretical fairness-aware algorithms and the software engineering methods to leverage them in practice.
  2. Machine learning models are increasingly being used in important decision-making software for tasks such as approving bank loans, recommending criminal sentencing, and hiring employees. It is important to ensure the fairness of these models so that no discrimination is made between different groups in a protected attribute (e.g., race, sex, age) during decision making. Algorithms have been developed to measure unfairness and mitigate it to a certain extent. In this paper, we have focused on the empirical evaluation of fairness and mitigations on real-world machine learning models. We have created a benchmark of 40 top-rated models from Kaggle used for 5 different tasks and then evaluated their fairness using a comprehensive set of fairness metrics. Then, we have applied 7 mitigation techniques on these models and analyzed the fairness, mitigation results, and impacts on performance. We have found that some model optimization techniques induce unfairness in the models. On the other hand, although there are some fairness control mechanisms in machine learning libraries, they are not documented. The mitigation algorithms also exhibit common patterns: mitigation in the post-processing stage is often costly (in terms of performance), and mitigation in the pre-processing stage is preferred in most cases. We have also presented different trade-off choices of fairness mitigation decisions. Our study suggests future research directions to reduce the gap between theoretical fairness-aware algorithms and the software engineering methods to leverage them in practice.
  3. The spread of infectious diseases is a highly complex spatiotemporal process, difficult to understand, predict, and effectively respond to. Machine learning and artificial intelligence (AI) have achieved impressive results in other learning and prediction tasks; however, while many AI solutions are developed for disease prediction, only a few of them are adopted by decision-makers to support policy interventions. Among several issues preventing their uptake, AI methods are known to amplify the bias in the data they are trained on. This is especially problematic for infectious disease models that typically leverage large, open, and inherently biased spatiotemporal data. These biases may propagate through the modeling pipeline to decision-making, resulting in inequitable policy interventions. Therefore, there is a need to gain an understanding of how the AI disease modeling pipeline can mitigate biased input data, in-processing models, and biased outputs. Specifically, our vision is to develop a large-scale micro-simulation of individuals from which human mobility, population, and disease ground-truth data can be obtained. From this complete dataset—which may not reflect the real world—we can sample and inject different types of bias. By using the sampled data in which bias is known (as it is given as the simulation parameter), we can explore how existing solutions for fairness in AI can mitigate and correct these biases and investigate novel AI fairness solutions. Achieving this vision would result in improved trust in such models for informing fair and equitable policy interventions. 
  4. Machine learning with artificial neural networks has recently transformed many scientific fields by introducing new data analysis and information processing techniques. Despite these advancements, efficient implementation of machine learning on conventional computers remains challenging due to speed and power constraints. Optical computing schemes have quickly emerged as the leading candidate for replacing their electronic counterparts as the backbone for artificial neural networks. Some early integrated photonic neural network (IPNN) techniques have already been fast-tracked to industrial technologies. This review article focuses on the next generation of optical neural networks (ONNs), which can perform machine learning algorithms directly in free space. We have aptly named this class of neural network model the free space optical neural network (FSONN). We systematically compare FSONNs, IPNNs, and the traditional machine learning models with regard to their fundamental principles, forward propagation model, and training process. We survey several broad classes of FSONNs and categorize them based on the technology used in their hidden layers. These technologies include 3D printed layers, dielectric and plasmonic metasurface layers, and spatial light modulators. Finally, we summarize the current state of FSONN research and provide a roadmap for its future development. 
  5. Graphs are a ubiquitous type of data that appears in many real-world applications, including social network analysis, recommendations, and financial security. Given their importance, decades of research have produced plentiful computational models to mine graphs. Despite this progress, concerns about potential algorithmic discrimination have grown recently. Algorithmic fairness on graphs, which aims to mitigate bias introduced or amplified during the graph mining process, is an attractive yet challenging research topic. The first challenge is theoretical: the non-IID nature of graph data may not only invalidate the basic assumption behind many existing studies in fair machine learning, but also introduce new fairness definition(s) based on the inter-correlation between nodes rather than the existing fairness definition(s) in fair machine learning. The second challenge is algorithmic: understanding how to balance the trade-off between model accuracy and fairness. This tutorial aims to (1) comprehensively review the state-of-the-art techniques to enforce algorithmic fairness on graphs and (2) highlight the open challenges and future directions. We believe this tutorial could benefit researchers and practitioners from the areas of data mining, artificial intelligence, and social science.