Ensuring fairness is crucial in developing modern algorithms and tests. To address potential biases and discrimination in algorithmic decision making, researchers have drawn insights from the test fairness literature, notably the work on differential algorithmic functioning (DAF) by Suk and Han. Nevertheless, the exploration of intersectionality in fairness investigations, within both test fairness and algorithmic fairness fields, is still relatively new. In this paper, we propose an extension of the DAF framework to include the concept of intersectionality. Similar to DAF, the proposed notion for intersectionality, which we term “interactive DAF,” leverages ideas from test fairness and algorithmic fairness. We also provide methods based on the generalized Mantel–Haenszel test, generalized logistic regression, and regularized group regression to detect DAF, interactive DAF, or other subtypes of DAF. Specifically, we employ regularized group regression with three different penalties and examine their performance via a simulation study. Finally, we demonstrate our intersectional DAF framework in real-world applications on grade retention and conditional cash transfer programs in education. 
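As a rough illustration of how an interaction-based screen for "interactive DAF" can be set up (a minimal sketch, not the paper's exact procedure), the snippet below fits nested logistic regressions of a binary decision on a hypothetical "fair" variable and two protected attributes, and uses a likelihood-ratio test on the protected-attribute interaction as the flag. All column names and the simulated data are hypothetical.

```python
# Minimal sketch (not the paper's exact procedure): screen for "interactive DAF"
# with nested logistic regressions and a likelihood-ratio test on the interaction
# of two protected attributes. Column names and data are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "fair_score": rng.normal(size=n),          # the "fair" variable
    "race": rng.integers(0, 2, size=n),        # protected attribute 1
    "gender": rng.integers(0, 2, size=n),      # protected attribute 2
})
# Simulate decisions with an injected race x gender effect (the "interactive" signal).
logits = -0.5 + 1.2 * df.fair_score + 0.6 * df.race * df.gender
df["decision"] = rng.binomial(1, 1.0 / (1.0 + np.exp(-logits.to_numpy())))

# Reduced model: decision explained by the fair variable plus main effects only.
reduced = smf.logit("decision ~ fair_score + race + gender", data=df).fit(disp=0)
# Full model: adds the race:gender interaction, the candidate interactive-DAF term.
full = smf.logit("decision ~ fair_score + race * gender", data=df).fit(disp=0)

# Likelihood-ratio test for the interaction term (1 extra parameter here).
lr_stat = 2.0 * (full.llf - reduced.llf)
p_value = stats.chi2.sf(lr_stat, df=full.df_model - reduced.df_model)
print(f"LR statistic = {lr_stat:.2f}, p-value = {p_value:.4f}")
```

A penalized variant (for example, a group penalty on the protected-attribute terms) would be the natural next step toward the regularized group regression the abstract describes, but it is omitted from this sketch.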
                    
                            
                            Fairness for Robust Log Loss Classification
                        
                    
    
            Developing classification methods with high accuracy that also avoid unfair treatment of different groups has become increasingly important for data-driven decision making in social applications. Many existing methods enforce fairness constraints on a selected classifier (e.g., logistic regression) by directly forming constrained optimizations. We instead re-derive a new classifier from the first principles of distributional robustness that incorporates fairness criteria into a worst-case logarithmic loss minimization. This construction takes the form of a minimax game and produces a parametric exponential family conditional distribution that resembles truncated logistic regression. We present the theoretical benefits of our approach in terms of its convexity and asymptotic convergence. We then demonstrate the practical advantages of our approach on three benchmark fairness datasets. 
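Schematically, the kind of minimax game described above can be written as a worst-case log-loss problem (illustrative form only; the paper's exact objective, constraint set, and fairness criteria may differ):

```latex
% Illustrative schematic, not the paper's exact formulation.
\[
\min_{\hat{P}(\cdot \mid \mathbf{x})} \;
\max_{\check{P} \in \Xi} \;
\mathbb{E}_{\mathbf{x} \sim \tilde{P},\; y \sim \check{P}(\cdot \mid \mathbf{x})}
\bigl[ -\log \hat{P}(y \mid \mathbf{x}) \bigr],
\]
% \tilde{P} is the empirical input distribution; \Xi restricts the adversary's
% conditional label distribution to match training-set feature statistics and to
% satisfy the chosen fairness criteria (e.g., demographic parity or equalized odds).
```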
- Award ID(s): 1652530
- PAR ID: 10179931
- Date Published:
- Journal Name: Proceedings of the AAAI Conference on Artificial Intelligence
- Volume: 34
- Issue: 04
- ISSN: 2159-5399
- Page Range / eLocation ID: 5511 to 5518
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- As algorithmic decision making is increasingly deployed in every walk of life, many researchers have raised concerns about fairness-related bias from such algorithms. But there is little research on harnessing psychometric methods to uncover potential discriminatory bias inside decision-making algorithms. The main goal of this article is to propose a new framework for algorithmic fairness based on differential item functioning (DIF), which has been commonly used to measure item fairness in psychometrics. Our fairness notion, which we call differential algorithmic functioning (DAF), is defined based on three pieces of information: a decision variable, a “fair” variable, and a protected variable such as race or gender. Under the DAF framework, an algorithm can exhibit uniform DAF, nonuniform DAF, or neither (i.e., non-DAF). For detecting DAF, we provide modifications of well-established DIF methods: Mantel–Haenszel test, logistic regression, and residual-based DIF. We demonstrate our framework through a real dataset concerning decision-making algorithms for grade retention in K–12 education in the United States. (A minimal Mantel–Haenszel-style sketch appears after this list.)
- Cappellato, Linda; Eickhoff, Carsten; Ferro, Nicola; Névéol, Aurélie (Eds.) This paper describes the approach we took to create a machine learning model for the PAN 2020 Authorship Verification Task. For each document pair, we extracted stylometric features from the documents and used the absolute difference between the feature vectors as input to our classifier. We created two models: a logistic regression model trained on the small dataset and a neural-network-based model trained on the large dataset. These models achieved AUCs of 0.939 and 0.953 on the small and large datasets, respectively, making them the second-best submissions on both datasets in the shared task. (A minimal feature-difference sketch appears after this list.)
- Introduction: AI fairness seeks to improve the transparency and explainability of AI systems by ensuring that their outcomes genuinely reflect the best interests of users. Data augmentation, which involves generating synthetic data from existing datasets, has gained significant attention as a solution to data scarcity. In particular, diffusion models have become a powerful technique for generating synthetic data, especially in fields like computer vision. Methods: This paper explores the potential of diffusion models to generate synthetic tabular data to improve AI fairness. The Tabular Denoising Diffusion Probabilistic Model (Tab-DDPM), a diffusion model adaptable to any tabular dataset and capable of handling various feature types, was utilized with different amounts of generated data for data augmentation. Additionally, sample reweighting from AIF360 was employed to further enhance AI fairness. Five traditional machine learning models—Decision Tree (DT), Gaussian Naive Bayes (GNB), K-Nearest Neighbors (KNN), Logistic Regression (LR), and Random Forest (RF)—were used to validate the proposed approach. Results and discussion: Experimental results demonstrate that the synthetic data generated by Tab-DDPM improves fairness in binary classification. (A reweighing-style sketch appears after this list.)
- We study the problem of classifier derandomization in machine learning: given a stochastic binary classifier f:X→[0,1], sample a deterministic classifier f̂:X→{0,1} that approximates the output of f in aggregate over any data distribution. Recent work revealed how to efficiently derandomize a stochastic classifier with strong output approximation guarantees, but at the cost of individual fairness -- that is, if f treated similar inputs similarly, f̂ did not. In this paper, we initiate a systematic study of classifier derandomization with metric fairness guarantees. We show that the prior derandomization approach is almost maximally metric-unfair, and that a simple “random threshold” derandomization achieves optimal fairness preservation but with weaker output approximation. We then devise a derandomization procedure that provides an appealing tradeoff between these two: if f is α-metric fair according to a metric d with a locality-sensitive hash (LSH) family, then our derandomized f̂ is, with high probability, O(α)-metric fair and a close approximation of f. We also prove generic results applicable to all (fair and unfair) classifier derandomization procedures, including a bias-variance decomposition and reductions between various notions of metric fairness. (A minimal random-threshold sketch appears after this list.)
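For the differential algorithmic functioning (DAF) item above, the sketch below shows a Mantel–Haenszel-style screen for uniform DAF, stratifying 2×2 decision-by-group tables on levels of the "fair" variable. The counts are invented for illustration and the statistic omits the continuity correction; this is not the authors' implementation.

```python
# Minimal Mantel-Haenszel-style check for uniform DAF (illustrative only).
# Each stratum is a 2x2 table over one level of the "fair" variable:
#   rows = protected group (focal / reference), columns = decision (favourable / not).
import numpy as np

def mantel_haenszel_chi2(tables):
    """Cochran-Mantel-Haenszel chi-square (1 df), no continuity correction."""
    num, var = 0.0, 0.0
    for t in tables:
        a, b = t[0]
        c, d = t[1]
        n = a + b + c + d
        if n <= 1:
            continue
        num += a - (a + b) * (a + c) / n                       # observed minus expected
        var += (a + b) * (c + d) * (a + c) * (b + d) / (n**2 * (n - 1))
    return num**2 / var

# Hypothetical strata: decision counts within three levels of the fair variable.
strata = [np.array([[30, 20], [25, 25]]),
          np.array([[40, 10], [35, 15]]),
          np.array([[15, 35], [20, 30]])]
print(f"CMH chi-square = {mantel_haenszel_chi2(strata):.3f}")  # compare to chi2(1)
```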
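For the PAN 2020 authorship-verification item above, the sketch below mirrors the described setup in miniature: a few stylometric features per document, the absolute difference of the two feature vectors as the pair representation, and a logistic regression on top. The feature choices and toy pairs are hypothetical, not the authors' feature set.

```python
# Minimal sketch of a feature-difference pair classifier (hypothetical features).
import numpy as np
from sklearn.linear_model import LogisticRegression

def stylometric_features(text: str) -> np.ndarray:
    words = text.split()
    n_words = max(len(words), 1)
    return np.array([
        np.mean([len(w) for w in words]) if words else 0.0,   # mean word length
        len(set(words)) / n_words,                            # type-token ratio
        text.count(",") / n_words,                            # comma rate
        text.count(";") / n_words,                            # semicolon rate
    ])

def pair_vector(doc_a: str, doc_b: str) -> np.ndarray:
    return np.abs(stylometric_features(doc_a) - stylometric_features(doc_b))

# Toy pairs: label 1 = same author, 0 = different authors (made up for illustration).
pairs = [("the cat sat, quietly; watching the door", "a cat sat, calmly; watching the rain", 1),
         ("we prove the theorem in three short steps", "omg that movie was so good", 0),
         ("results indicate a strong, consistent effect", "findings indicate a robust, stable effect", 1),
         ("buy now, while the limited offer lasts", "the derivative vanishes at the origin", 0)]
X = np.array([pair_vector(a, b) for a, b, _ in pairs])
y = np.array([lbl for _, _, lbl in pairs])
clf = LogisticRegression().fit(X, y)
print(clf.predict_proba(X)[:, 1])  # estimated same-author probability per pair
```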
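For the Tab-DDPM data-augmentation item above, the sketch below hand-computes Kamiran-Calders-style reweighing weights (the kind of weights AIF360's Reweighing preprocessor produces) and passes them to a scikit-learn classifier; the Tab-DDPM augmentation step is only indicated by a comment. Column names and data are hypothetical.

```python
# Minimal reweighing-then-train sketch (hypothetical data; AIF360 and Tab-DDPM not used directly).
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def reweighing_weights(frame: pd.DataFrame, protected: str, label: str) -> np.ndarray:
    """w(s, y) = P(s) * P(y) / P(s, y): up-weights under-represented (group, label) cells."""
    n = len(frame)
    weights = np.empty(n)
    for (s, y), idx in frame.groupby([protected, label]).groups.items():
        p_s = (frame[protected] == s).mean()
        p_y = (frame[label] == y).mean()
        p_sy = len(idx) / n
        weights[frame.index.get_indexer(idx)] = p_s * p_y / p_sy
    return weights

rng = np.random.default_rng(1)
frame = pd.DataFrame({"x1": rng.normal(size=500),
                      "group": rng.integers(0, 2, size=500)})
frame["label"] = (frame.x1 + 0.8 * frame.group
                  + rng.normal(scale=0.5, size=500) > 0).astype(int)

# In the pipeline above, synthetic rows generated by Tab-DDPM would be appended to
# `frame` before this point (omitted here).
weights = reweighing_weights(frame, protected="group", label="label")
clf = LogisticRegression().fit(frame[["x1", "group"]], frame["label"], sample_weight=weights)
print("train accuracy:", clf.score(frame[["x1", "group"]], frame["label"]))
```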
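For the classifier-derandomization item above, the sketch below illustrates the "random threshold" idea: draw one threshold t uniformly at random, shared across all inputs, and output 1[f(x) ≥ t], which matches f in expectation over the threshold draw. The stochastic classifier f here is an arbitrary stand-in.

```python
# Minimal "random threshold" derandomization sketch (stand-in stochastic classifier).
import numpy as np

def f(x: np.ndarray) -> np.ndarray:
    """Stand-in stochastic classifier: returns P(label = 1 | x) in [0, 1]."""
    return 1.0 / (1.0 + np.exp(-(x @ np.array([1.5, -0.7]))))

def random_threshold_derandomize(score_fn, seed=0):
    """Sample one shared threshold t ~ U(0, 1); return f_hat(x) = 1[score_fn(x) >= t]."""
    t = np.random.default_rng(seed).uniform()
    return lambda x: (score_fn(x) >= t).astype(int)

rng = np.random.default_rng(42)
X = rng.normal(size=(5, 2))
f_hat = random_threshold_derandomize(f, seed=7)
print("stochastic scores  :", np.round(f(X), 3))
print("derandomized labels:", f_hat(X))
# Averaging over many threshold draws recovers f(x) in expectation:
avg = np.mean([random_threshold_derandomize(f, s)(X) for s in range(2000)], axis=0)
print("mean over 2000 seeds:", np.round(avg, 3))
```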