This content will become publicly available on July 11, 2023
Measuring Fairness in Ranked Results: An Analytical and Empirical Comparison
Information access systems, such as search and recommender systems, often use ranked lists to present results believed to be relevant to the user's information need. Evaluating these lists for fairness, alongside traditional metrics, provides a more complete understanding of an information access system's behavior beyond accuracy or utility constructs. To measure the (un)fairness of rankings, particularly with respect to protected group(s) of producers or providers, several metrics have been proposed in recent years. However, an empirical and comparative analysis of these metrics, showing their applicability to specific scenarios or real data, their conceptual similarities, and their differences, is still lacking. We aim to bridge the gap between the theoretical and practical application of these metrics. In this paper, we describe several fair ranking metrics from the existing literature in a common notation, enabling direct comparison of their approaches and assumptions, and we empirically compare them on the same experimental setup and data sets in the context of three information access tasks. We also provide a sensitivity analysis to assess the impact of the design choices and parameter settings that go into these metrics, and we point to additional work needed to improve fairness measurement.
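To make the family of provider-side fairness metrics discussed above concrete, the sketch below computes each group's share of ranking exposure under a logarithmic position discount. This is a minimal illustration of the general idea, not the paper's implementation; the function names and the choice of discount are assumptions, and the surveyed metrics differ precisely in such design choices.

```python
import math

def exposure(rank):
    # Logarithmic position discount (an illustrative assumption; fair
    # ranking metrics vary in the browsing model they adopt).
    # Items near the top of the list receive more attention.
    return 1.0 / math.log2(rank + 1)

def group_exposure_share(ranking, group):
    """Fraction of total exposure received by items of `group`.

    `ranking` is an ordered list of group labels, top result first.
    """
    total = sum(exposure(i) for i in range(1, len(ranking) + 1))
    received = sum(exposure(i)
                   for i, g in enumerate(ranking, start=1) if g == group)
    return received / total

# Toy ranking in which group "B" is pushed to the bottom positions.
ranking = ["A", "A", "A", "B", "B"]
share_b = group_exposure_share(ranking, "B")
# A statistical-parity style check would compare share_b to the
# group's share of the ranked items (here 2/5 = 0.4).
```

Reordering the list changes the exposure shares even though the set of items is unchanged, which is the behavior such metrics are designed to detect.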
- Award ID(s): 1751278
- Publication Date:
- NSF-PAR ID: 10329880
- Journal Name: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
- Sponsoring Org: National Science Foundation