Predictive models learned from historical data are widely used to help companies and organizations make decisions. However, they may treat certain groups unfairly, raising concerns about fairness and discrimination. In this paper, we study the fairness-aware ranking problem, which aims to discover discrimination in ranked datasets and reconstruct a fair ranking. Existing methods for fairness-aware ranking are mainly based on statistical parity, which cannot measure the true discriminatory effect, since discrimination is causal in nature. On the other hand, existing methods in causal-based anti-discrimination learning focus on classification problems and cannot be directly applied to ranked data. To address these limitations, we propose to map the rank position to a continuous score variable that represents the qualification of the candidates. Then, we build a causal graph that consists of both the discrete profile attributes and the continuous score. The path-specific effect technique is extended to the mixed-variable causal graph to identify both direct and indirect discrimination. The relationship between the path-specific effects for ranked data and those for binary decisions is analyzed theoretically. Finally, algorithms for discovering and removing discrimination from a ranked dataset are developed. Experiments on a real-world dataset show the effectiveness of our approaches.
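As a rough illustration of the rank-to-score idea and of why a raw score gap is not enough, here is a minimal Python sketch; the monotone mapping and the helper names below are hypothetical, since the paper builds its own score model and measures discrimination through path-specific effects on the causal graph rather than through the naive gap shown here.

```python
import numpy as np

def rank_to_score(ranks, n):
    """Map rank positions (1 = best) to a continuous score in (0, 1].
    A hypothetical monotone mapping for illustration only; the paper
    derives its own qualification-score model from the ranked data."""
    ranks = np.asarray(ranks, dtype=float)
    return 1.0 - (ranks - 1.0) / n

def naive_score_gap(scores, protected):
    """Average score gap between protected and non-protected candidates.
    A statistical-parity-style baseline only: the abstract's point is that
    true discrimination is causal and should be measured via path-specific
    effects (direct vs. indirect paths) in the causal graph instead."""
    scores = np.asarray(scores, dtype=float)
    protected = np.asarray(protected, dtype=bool)
    return scores[protected].mean() - scores[~protected].mean()
```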
Fairness Auditing in Urban Decisions using LP-based Data Combination
Auditing for fairness often requires relying on a secondary source, e.g., Census data, to inform about protected attributes. To avoid making assumptions about an overarching model that ties such information to the primary data source, a recent line of work has suggested finding the entire range of possible fairness valuations consistent with both sources. Though attractive, the current form of this methodology relies on rigid analytical expressions and cannot handle continuous decisions, e.g., metrics of urban services. We show that, in such settings, directly adapting these expressions can lead to loose and even vacuous results, particularly regarding just how fair the audited decisions may be. If used as-is, the audit would appear more optimistic than it ought to be. We propose a linear programming formulation that handles continuous decisions by finding the empirical fairness range when statistical parity is measured through the Kolmogorov-Smirnov distance. The size of this problem is linear in the number of data points, and it is efficiently solvable. We analyze this approach and give finite-sample guarantees for the resulting fairness valuation. We then apply it to synthetic data and to 311 Chicago City Services data, and demonstrate its ability to reveal small but detectable bounds on fairness.
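To make the LP idea concrete, the following is a minimal sketch (in Python with NumPy and SciPy) of how the lower end of such an empirical fairness range could be computed in the simplified case where the secondary source pins down only the protected-group size; the paper's actual formulation combines richer information across the two sources, so the variables and constraints below are illustrative assumptions. The key point is that, for soft group memberships, each CDF difference underlying the Kolmogorov-Smirnov distance is linear in the membership weights, so minimizing the maximum absolute difference is a linear program whose size is linear in the number of data points.

```python
import numpy as np
from scipy.optimize import linprog

def min_ks_fairness(decisions, n_a):
    """Lower end of the empirical fairness range: the smallest
    Kolmogorov-Smirnov distance between group-conditional decision CDFs
    consistent with knowing only that n_a records belong to group A.
    Variables: w_1..w_N (soft group-A memberships) and z (KS epigraph)."""
    d = np.asarray(decisions, dtype=float)
    N = len(d)
    n_b = N - n_a
    A_ub, b_ub = [], []
    for t in np.unique(d):  # the KS supremum is attained at data points
        below = (d <= t).astype(float)
        # F_A(t) - F_B(t) = coeff.w - m_t / n_b, which is linear in w,
        # so the min-max problem is a linear program.
        coeff = below * (1.0 / n_a + 1.0 / n_b)
        m_t = below.sum()
        A_ub.append(np.append(coeff, -1.0))   #  (F_A - F_B) - z <= 0
        b_ub.append(m_t / n_b)
        A_ub.append(np.append(-coeff, -1.0))  # -(F_A - F_B) - z <= 0
        b_ub.append(-m_t / n_b)
    c = np.zeros(N + 1)
    c[-1] = 1.0                          # minimize the epigraph variable z
    A_eq = [np.append(np.ones(N), 0.0)]  # memberships match the known marginal
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  A_eq=np.array(A_eq), b_eq=[n_a],
                  bounds=[(0.0, 1.0)] * N + [(0.0, None)], method="highs")
    return res.fun
```

The upper end of the range follows analogously, by maximizing each per-threshold CDF difference over the feasible memberships and taking the largest value.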
- Award ID(s): 1939743
- PAR ID: 10486303
- Publisher / Repository: ACM
- Date Published:
- Journal Name: FAccT '23: Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency
- ISBN: 9798400701924
- Page Range / eLocation ID: 1817 to 1825
- Format(s): Medium: X
- Location: Chicago IL USA
- Sponsoring Org: National Science Foundation
More Like this
- Fairness in data-driven decision-making studies scenarios where individuals from certain population segments may be unfairly treated when being considered for loan or job applications, access to public resources, or other types of services. In location-based applications, decisions are based on individual whereabouts, which often correlate with sensitive attributes such as race, income, and education. While fairness has received significant attention recently, e.g., in machine learning, there is little focus on achieving fairness when dealing with location data. Due to their characteristics and the specific types of algorithms that process them, location data pose important fairness challenges. We introduce the concept of spatial data fairness to address the specific challenges of location data and spatial queries. We devise a novel building block to achieve fairness in the form of fair polynomials. Next, we propose two mechanisms based on fair polynomials that achieve individual spatial fairness, corresponding to two common location-based decision-making types: distance-based and zone-based. Extensive experimental results on real data show that the proposed mechanisms achieve spatial fairness without sacrificing utility.
- Modern datacenter infrastructures are increasingly architected as a cluster of loosely coupled services. The cluster states are typically maintained in a logically centralized, strongly consistent data store (e.g., ZooKeeper, Chubby and etcd), while the services learn about the evolving state by reading from the data store, or via a stream of notifications. However, it is challenging to ensure services are correct, even in the presence of failures, networking issues, and the inherent asynchrony of the distributed system. In this paper, we identify that partial histories can be used to effectively reason about correctness for individual services in such distributed infrastructure systems. That is, individual services make decisions based on observing only a subset of changes to the world around them. We show that partial histories, when applied to distributed infrastructures, have immense explanatory power and utility over the state of the art. We discuss the implications of partial histories and sketch tooling for reasoning about distributed infrastructure systems.
- This paper explores how individuals' privacy-related decision-making processes may be influenced by their pre-existing relationships to companies in a wider social and economic context. Through an online role-playing exercise, we explore attitudes to a range of services including home automation, Internet-of-Things and financial services. We find that individuals do not only consider the privacy-related attributes of applications, devices or services in the abstract. Rather, their decisions are heavily influenced by their pre-existing perceptions of, and relationships with, the companies behind such apps, devices and services. In particular, perceptions about a company's size, level of regulatory scrutiny, relationships with third parties, and pre-existing data exposure lead some users to choose an option which might otherwise appear worse from a privacy perspective. This finding suggests a need for tools that support users to incorporate these existing perceptions and relationships into their privacy-related decision making.
- Given an algorithmic predictor that is "fair" on some source distribution, will it still be fair on an unknown target distribution that differs from the source within some bound? In this paper, we study the transferability of statistical group fairness for machine learning predictors (i.e., classifiers or regressors) subject to bounded distribution shifts. Such shifts may be introduced by initial training data uncertainties, user adaptation to a deployed predictor, dynamic environments, or the use of pre-trained models in new settings. Herein, we develop a bound that characterizes such transferability, flagging potentially inappropriate deployments of machine learning for socially consequential tasks. We first develop a framework for bounding violations of statistical fairness subject to distribution shift, formulating a generic upper bound for transferred fairness violations as our primary result. We then develop bounds for specific worked examples, focusing on two commonly used fairness definitions (i.e., demographic parity and equalized odds) and two classes of distribution shift (i.e., covariate shift and label shift). Finally, we compare our theoretical bounds to deterministic models of distribution shift and against real-world data, finding that we are able to estimate fairness violation bounds in practice, even when simplifying assumptions are only approximately satisfied. (A minimal sketch of the demographic parity gap appears below.)
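As a point of reference for the first of the two fairness definitions above, the demographic parity gap can be estimated from a labeled sample as in the following minimal Python sketch; the transfer bound itself depends on the assumed shift class and is not reproduced here.

```python
import numpy as np

def dp_violation(y_hat, group):
    """Demographic parity gap: |P(y_hat = 1 | A = 1) - P(y_hat = 1 | A = 0)|.
    Comparing this gap between source and shifted target samples yields the
    empirical quantity that transferability bounds of this kind control."""
    y_hat = np.asarray(y_hat, dtype=float)
    group = np.asarray(group, dtype=bool)
    return abs(y_hat[group].mean() - y_hat[~group].mean())
```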