NSF PAR Search | NSF Public Access Repository

FairSpace: An Interactive Visualization System for Constructing Fair Consensus from Many Rankings

https://doi.org/10.1111/cgf.70132

Shrestha, H; Cachel, K; Alkhathlan, M; Rundensteiner, E; Harrison, L (June 2025, Computer Graphics Forum)

Decisions involving algorithmic rankings affect our lives in many ways, from product recommendations to receiving scholarships and securing jobs. While tools have been developed for interactively constructing fair consensus rankings from a handful of rankings, addressing the more complex real-world scenario, where diverse opinions are represented by a larger collection of rankings, remains a challenge. In this paper, we address this challenge by reformulating the exploration of rankings as a dimension reduction problem in a system called FairSpace. FairSpace provides new views, including Fair Divergence View and Cluster Views, by juxtaposing fairness metrics of different local and alternative global consensus rankings to aid ranking analysis tasks. We illustrate the effectiveness of FairSpace through a series of use cases, demonstrating via interactive workflows that users are empowered to create local consensuses by grouping rankings similar in their fairness or utility properties, followed by hierarchically aggregating local consensuses into a global consensus through direct manipulation. We discuss how FairSpace opens the possibility for advances in dimension reduction visualization to benefit the research area of supporting fair decision-making in ranking-based decision-making contexts. Code, datasets, and demo video are available at: osf.io/d7cwk
Full Text Available
Exploring “Just Noticeable” Group Fairness in Rankings

Alkhathlan, Mallak; Shrestha, Hilson; Harrison, Lane; Rundensteiner, Elke (October 2025, Association for the Advancement of Artificial Intelligence (www.aaai.org).)

The plethora of fairness metrics developed for ranking-based decision-making raises the question: which metrics align best with people’s perceptions of fairness, and why? Most prior studies examining people’s perceptions of fairness metrics tend to use ordinal rating scales (e.g., Likert scales). However, such scales can be ambiguous in their interpretation across participants, and can be influenced by interface features used to capture responses. We address this gap by exploring the use of two-alternative forced choice methodologies, used extensively outside the fairness community for comparing visual stimuli, to quantitatively compare participant perceptions across fairness metrics and ranking characteristics. We report a crowdsourced experiment with 224 participants across four conditions: two alternative rank fairness metrics, ARP and NDKL, and two ranking characteristics, lists of 20 and 100 candidates, resulting in over 170,000 individual judgments. Quantitative results show systematic differences in how people interpret these metrics, and surprising exceptions where fairness metrics disagree with people’s perceptions. Qualitative analyses of participant comments reveal an interplay between cognitive and visual strategies that affects people’s perceptions of fairness. From these results, we discuss future work in aligning fairness metrics with people’s perceptions, and highlight the need and benefits of expanding methodologies for fairness studies.
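For readers unfamiliar with NDKL, the following is a minimal sketch of the metric as it is commonly defined in the fair-ranking literature: a rank-discounted average of the KL divergence between the group distribution of each top-i prefix and a target distribution. It is not taken from the paper. The function name `ndkl`, the use of the list's own overall group proportions as the target, and the natural-log KL are all assumptions made for illustration.

```python
import math
from collections import Counter

def ndkl(ranking_groups):
    """Normalized discounted KL divergence over a ranking's group labels.

    ranking_groups: group label of each candidate, best-ranked first.
    Assumption: the target distribution is the overall group proportion
    of the ranked list itself. Lower values mean prefixes mirror it better.
    """
    n = len(ranking_groups)
    if n == 0:
        raise ValueError("ranking must contain at least one candidate")
    target = {g: c / n for g, c in Counter(ranking_groups).items()}

    total, norm = 0.0, 0.0
    prefix = Counter()
    for i, g in enumerate(ranking_groups, start=1):
        prefix[g] += 1
        # KL(prefix distribution || target); groups absent from the prefix add 0.
        kl = sum((c / i) * math.log((c / i) / target[grp])
                 for grp, c in prefix.items())
        weight = 1.0 / math.log2(i + 1)  # logarithmic rank discount
        total += weight * kl
        norm += weight
    return total / norm

alternating = ndkl(["a", "b", "a", "b"])
segregated = ndkl(["a", "a", "b", "b"])
```

Under these assumptions, a ranking that places one group entirely ahead of the other scores higher (less fair) than one that alternates groups.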
Full Text Available
Group Fair Rated Preference Aggregation: Ties Are (Mostly) All You Need

https://doi.org/10.1145/3715275.3732042

Cachel, Kathleen; Rundensteiner, Elke (June 2025, ACM (conference FAccT ’25))

Rated preference aggregation is conventionally performed by averaging ratings from multiple evaluators to create a consensus ordering of candidates from highest to lowest average rating. Ideally, the consensus is fair, meaning critical opportunities are not withheld from marginalized groups of candidates, even if group biases may be present in the to-be-combined ratings. Prior work operationalizing fairness in preference aggregation is limited to settings where evaluators provide rankings of candidates (e.g., Joe > Jack > Jill). Yet, in practice, many evaluators assign ratings such as Likert scales or categories (e.g., yes, no, maybe) to each candidate. Ratings convey different information than rankings, leading to distinct fairness issues during their aggregation. The existing literature neither characterizes these fairness concerns nor provides applicable bias-mitigation solutions. Unlike the ranked setting studied previously, two unique forms of bias arise in rating aggregation. First, biased rating stems from group disparities in to-be-aggregated evaluator ratings. Second, biased tie-breaking occurs because ties in average ratings must be resolved when aggregating ratings into a consensus ranking, and this tie-breaking act can unfairly advantage certain groups. To address this gap, we define the open fair rated preference aggregation problem and introduce the corresponding Fate methodology. Fate offers the first group fairness metric specifically for rated preference data. We propose two Fate algorithms. Fate-Break works in settings where ties need to be broken, explicitly enhancing the fairness of such processes without lowering consensus utility. Fate-Rate mitigates disparities in how groups are rated by using a Markov-chain approach to generate outcomes where groups are, as much as possible, equally represented. Our experimental study illustrates that the Fate methods provide the most bias mitigation compared to prior methods adapted to fair tie-breaking and rating aggregation.
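The conventional baseline described in this abstract, averaging ratings and ordering candidates by average, can be sketched as below to show where ties arise. This is only the unfair-by-default baseline, not the Fate algorithms. The function name `average_rating_tiers` and the assumption of integer (Likert-style) ratings are hypothetical choices for illustration.

```python
from collections import defaultdict
from fractions import Fraction

def average_rating_tiers(ratings):
    """Group candidates into tiers by exact average rating, best tier first.

    ratings: dict mapping candidate -> list of integer ratings (assumed
    non-empty). Candidates sharing a tier are tied, so some tie-breaking
    rule is still needed to obtain a strict consensus ranking.
    """
    by_average = defaultdict(list)
    for candidate, scores in ratings.items():
        # Fraction keeps averages exact, so equal averages compare equal.
        by_average[Fraction(sum(scores), len(scores))].append(candidate)
    return [sorted(by_average[avg]) for avg in sorted(by_average, reverse=True)]

tiers = average_rating_tiers({"Joe": [5, 4], "Jack": [4, 5], "Jill": [3, 3]})
```

Here Joe and Jack share the top tier, which is exactly the situation in which the choice of tie-breaking rule can advantage one group over another.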
Full Text Available
Wise Fusion: Group Fairness Enhanced Rank Fusion

https://doi.org/10.1145/3627673.3679649

Cachel, Kathleen; Rundensteiner, Elke (October 2024, ACM)

Full Text Available
FairRankTune: A Python Toolkit for Fair Ranking Tasks

https://doi.org/10.1145/3627673.3679238

Cachel, Kathleen; Rundensteiner, Elke (October 2024, ACM)

We present FairRankTune, a multi-purpose open-source Python toolkit offering three primary services: quantifying fairness-related harms, leveraging bias mitigation algorithms, and constructing custom fairness-relevant datasets. FairRankTune provides researchers and practitioners with a self-contained resource for fairness auditing, experimentation, and advancing research. The central piece of FairRankTune is a novel fairness-tunable ranked data generator, RankTune, that streamlines the creation of custom fairness-relevant ranked datasets. FairRankTune also offers numerous fair ranking metrics and fairness-aware ranking algorithms within the same plug-and-play package. We demonstrate the key innovations of FairRankTune, focusing on features that are valuable to stakeholders via use cases highlighting workflows in the end-to-end process of mitigating bias in ranking systems. FairRankTune addresses the gap of limited publicly available datasets, auditing tools, and implementations for fair ranking.
Full Text Available
Hidden or Inferred - Fair Learning-To-Rank With Unknown Demographics

https://doi.org/10.1609/aies.v7i1.31705

Olulana, Oluseun; Cachel, Kathleen; Murai, Fabricio; Rundensteiner, Elke (October 2024, Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society)

As learning-to-rank models are increasingly deployed for decision-making in areas with profound life implications, the FairML community has been developing fair learning-to-rank (LTR) models. These models rely on the availability of sensitive demographic features such as race or sex. However, in practice, regulatory obstacles and privacy concerns protect this data from collection and use. As a result, practitioners may either need to promote fairness despite the absence of these features or turn to demographic inference tools to attempt to infer them. Given that these tools are fallible, this paper aims to further understand how errors in demographic inference impact the fairness performance of popular fair LTR strategies. In which cases would it be better to keep such demographic attributes hidden from models versus infer them? We examine a spectrum of fair LTR strategies ranging from fair LTR with and without demographic features hidden versus inferred to fairness-unaware LTR followed by fair re-ranking. We conduct a controlled empirical investigation modeling different levels of inference errors by systematically perturbing the inferred sensitive attribute. We also perform three case studies with real-world datasets and popular open-source inference methods. Our findings reveal that as inference noise grows, LTR-based methods that incorporate fairness considerations into the learning process may increase bias. In contrast, fair re-ranking strategies are more robust to inference errors. All source code, data, and experimental artifacts of our experimental study are available here: https://github.com/sewen007/hoiltr.git
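One simple way to model inference errors of a chosen level is to flip each inferred group label to a different group with a fixed probability. The sketch below illustrates that idea only. The abstract does not specify the perturbation scheme used in the study, so the function name `perturb_labels`, the uniform choice of replacement group, and the `error_rate` parameter are assumptions.

```python
import random

def perturb_labels(labels, error_rate, seed=0):
    """Return a copy of labels where each entry is replaced by a different
    group with probability error_rate (hypothetical noise model)."""
    rng = random.Random(seed)  # seeded for reproducible experiments
    groups = sorted(set(labels))
    noisy = []
    for label in labels:
        if len(groups) > 1 and rng.random() < error_rate:
            # Pick uniformly among the other groups to simulate a wrong inference.
            noisy.append(rng.choice([g for g in groups if g != label]))
        else:
            noisy.append(label)
    return noisy

inferred = ["f", "m", "f", "m", "f", "m"]
clean = perturb_labels(inferred, error_rate=0.0)
fully_wrong = perturb_labels(inferred, error_rate=1.0)
```

Sweeping `error_rate` from 0 to 1 and re-running a fair ranking method on the noisy labels would give one way to trace how fairness degrades as inference noise grows.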
Full Text Available
Exploring Fairness across Many Rankings

Shrestha, Hilson; Cachel, Kathleen; Alkhathlan, Mallak; Rundensteiner, Elke; Harrison, Lane (October 2024, published as a poster (and 2-page short paper) in: IEEE Visualization and Visual Analytics (VIS), 2024.)

Poster.
Full Text Available
PreFAIR: Combining Partial Preferences for Fair Consensus Decision-making

https://doi.org/10.1145/3630106.3658961

Cachel, Kathleen; Rundensteiner, Elke (June 2024, ACM)

Preference aggregation mechanisms help decision-makers combine diverse preference rankings produced by multiple voters into a single consensus ranking. Prior work has developed methods for aggregating multiple rankings into a fair consensus over the same set of candidates. Yet few real-world problems present themselves as such precisely formulated aggregation tasks with each voter fully ranking all candidates. Instead, preferences are often expressed as rankings over partial and even disjoint subsets of candidates. For instance, hiring committee members typically opt to rank their top choices instead of exhaustively ordering every single job applicant. However, the existing literature does not offer a framework for characterizing or ensuring group fairness in such partial preference aggregation tasks. Unlike fully ranked settings, partial preferences imply both a selection decision of whom to rank and an ordering decision of how to rank the selected candidates. Our work fills this gap by conceptualizing the open problem of fair partial preference aggregation. We introduce an impossibility result for fair selection from partial preferences and design a computational framework showing how we can navigate this obstacle. Inspired by Single Transferable Voting, our proposed solution PreFair produces consensus rankings that are fair in the selection of candidates and also in their relative ordering. Our experimental study demonstrates that PreFair achieves the best performance in this dual fairness objective compared to state-of-the-art alternatives adapted to this new problem while still satisfying voter preferences.
Full Text Available
Balancing Act: Evaluating People’s Perceptions of Fair Ranking Metrics

https://doi.org/10.1145/3630106.3659018

Alkhathlan, Mallak; Cachel, Kathleen; Shrestha, Hilson; Harrison, Lane; Rundensteiner, Elke (June 2024, Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency)

Algorithmic decision-making using rankings, prevalent in areas from hiring and bail to university admissions, raises concerns of potential bias. In this paper, we explore the alignment between people’s perceptions of fairness and two popular fairness metrics designed for rankings. In a crowdsourced experiment with 480 participants, people rated the perceived fairness of a hypothetical scholarship distribution scenario. Results suggest a strong inclination towards relying on explicit score values. There is also evidence of people’s preference for one fairness metric, NDKL, over the other metric, ARP. Qualitative results paint a more complex picture: some participants endorse meritocratic award schemes and express concerns about fairness metrics being used to modify rankings, while other participants acknowledge socio-economic factors in score-based rankings as justification for adjusting rankings. In summary, we find that operationalizing algorithmic fairness in practice is a balancing act between mitigating harms towards marginalized groups and societal conventions of leveraging traditional performance scores such as grades in decision-making contexts.
Full Text Available
Fair&Share: Fast and Fair Multi-Criteria Selections

https://doi.org/10.1145/3583780.3614874

Cachel, Kathleen; Rundensteiner, Elke (October 2023, ACM)

Full Text Available
