Predictive models learned from historical data are widely used to help companies and organizations make decisions. However, they may digitally unfairly treat unwanted groups, raising concerns about fairness and discrimination. In this paper, we study the fairness-aware ranking problem which aims to discover discrimination in ranked datasets and reconstruct the fair ranking. Existing methods in fairness-aware ranking are mainly based on statistical parity that cannot measure the true discriminatory effect since discrimination is causal. On the other hand, existing methods in causal-based anti-discrimination learning focus on classification problems and cannot be directly applied to handle the ranked data. To address these limitations, we propose to map the rank position to a continuous score variable that represents the qualification of the candidates. Then, we build a causal graph that consists of both the discrete profile attributes and the continuous score. The path-specific effect technique is extended to the mixed-variable causal graph to identify both direct and indirect discrimination. The relationship between the path-specific effects for the ranked data and those for the binary decision is theoretically analyzed. Finally, algorithms for discovering and removing discrimination from a ranked dataset are developed. Experiments using the real-world dataset show the effectiveness of our approaches.
more »
« less
Quantifying Causal Path-Specific Importance in Structural Causal Model
Path-specific effect analysis is a powerful tool in causal inference. This paper provides a definition of causal counterfactual path-specific importance score for the structural causal model (SCM). Different from existing path-specific effect definitions, which focus on the population level, the score defined in this paper can quantify the impact of a decision variable on an outcome variable along a specific pathway at the individual level. Moreover, the score has many desirable properties, including following the chain rule and being consistent. Finally, this paper presents an algorithm that can leverage these properties and find the k-most important paths with the highest importance scores in a causal graph effectively.
more »
« less
- Award ID(s):
- 2134901
- PAR ID:
- 10486957
- Publisher / Repository:
- MDPI
- Date Published:
- Journal Name:
- Computation
- Volume:
- 11
- Issue:
- 7
- ISSN:
- 2079-3197
- Page Range / eLocation ID:
- 133
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Abstract Propensity score weighting is a tool for causal inference to adjust for measured confounders in observational studies. In practice, data often present complex structures, such as clustering, which make propensity score modeling and estimation challenging. In addition, for clustered data, there may be unmeasured cluster-level covariates that are related to both the treatment assignment and outcome. When such unmeasured cluster-specific confounders exist and are omitted in the propensity score model, the subsequent propensity score adjustment may be biased. In this article, we propose a calibration technique for propensity score estimation under the latent ignorable treatment assignment mechanism, i. e., the treatment-outcome relationship is unconfounded given the observed covariates and the latent cluster-specific confounders. We impose novel balance constraints which imply exact balance of the observed confounders and the unobserved cluster-level confounders between the treatment groups. We show that the proposed calibrated propensity score weighting estimator is doubly robust in that it is consistent for the average treatment effect if either the propensity score model is correctly specified or the outcome follows a linear mixed effects model. Moreover, the proposed weighting method can be combined with sampling weights for an integrated solution to handle confounding and sampling designs for causal inference with clustered survey data. In simulation studies, we show that the proposed estimator is superior to other competitors. We estimate the effect of School Body Mass Index Screening on prevalence of overweight and obesity for elementary schools in Pennsylvania.more » « less
-
An important achievement in the field of causal inference was a complete characterization of when a causal effect, in a system modeled by a causal graph, can be determined uniquely from purely observational data. The identification algorithms resulting from this work produce exact symbolic expressions for causal effects, in terms of the observational probabilities. More recent work has looked at the numerical properties of these expressions, in particular using the classical notion of the condition number. In its classical interpretation, the condition number quantifies the sensitivity of the output values of the expressions to small numerical perturbations in the input observational probabilities. In the context of causal identification, the condition number has also been shown to be related to the effect of certain kinds of uncertainties in the structure of the causal graphical model. In this paper, we first give an upper bound on the condition number for the interesting case of causal graphical models with small “confounded components”. We then develop a tight characterization of the condition number of any given causal identification problem. Finally, we use our tight characterization to give a specific example where the condition number can be much lower than that obtained via generic bounds on the condition number, and to show that even “equivalent” expressions for causal identification can behave very differently with respect to their numerical stability properties.more » « less
-
de Campos, C.; Maathuis, M. H. (Ed.)An important achievement in the field of causal inference was a complete characterization of when a causal effect, in a system modeled by a causal graph, can be determined uniquely from purely observational data. The identification algorithms resulting from this work produce exact symbolic expressions for causal effects, in terms of the observational probabilities. More recent work has looked at the numerical properties of these expressions, in particular using the classical notion of the condition number. In its classical interpretation, the condition number quantifies the sensitivity of the output values of the expressions to small numerical perturbations in the input observational probabilities. In the context of causal identification, the condition number has also been shown to be related to the effect of certain kinds of uncertainties in the structure of the causal graphical model. In this paper, we first give an upper bound on the condition number for the interesting case of causal graphical models with small “confounded components”. We then develop a tight characterization of the condition number of any given causal identification problem. Finally, we use our tight characterization to give a specific example where the condition number can be much lower than that obtained via generic bounds on the condition number, and to show that even “equivalent” expressions for causal identification can behave very differently with respect to their numerical stability properties.more » « less
-
Abstract: Using synthetic examples, Aronow et al. (2025) provide a convincing case for the special status of randomized controlled trials (RCTs) in which the propensity scores are known and can be used to make causal inferences. We provide a Bayesian perspective on the propensity score, which has been a controversial topic for some time. We review the Bayesian approach for population-level causal inference with an emphasis on role of the prior. Under the assumption of prior independence of the propensity score and outcome model parameters, the propensity score plays no role in Bayesian inference for causal effects. However, recent work has shown why assuming prior independence can be problematic in high-dimensional models. We discuss how various approaches that have been proposed for incorporating the propensity score in Bayesian causal inference correspond to different ways of relaxing prior independence. Given these considerations, we illustrate how a Bayesian might approach the synthetic examples of Aronow et al. (2025) in RCTs.more » « less
An official website of the United States government

