NSF PAR Search | NSF Public Access Repository

Exploring “Just Noticeable” Group Fairness in Rankings

Alkhathlan, Mallak; Shrestha, Hilson; Harrison, Lane; Rundensteiner, Elke (October 2025, Association for the Advancement of Artificial Intelligence (www.aaai.org).)

The plethora of fairness metrics developed for ranking-based decision-making raises the question: which metrics align best with people’s perceptions of fairness, and why? Most prior studies examining people’s perceptions of fairness metrics tend to use ordinal rating scales (e.g., Likert scales). However, such scales can be ambiguous in their interpretation across participants, and can be influenced by interface features used to capture responses. We address this gap by exploring the use of two-alternative forced choice methodologies, used extensively outside the fairness community for comparing visual stimuli, to quantitatively compare participant perceptions across fairness metrics and ranking characteristics. We report a crowdsourced experiment with 224 participants across four conditions: two alternative rank fairness metrics, ARP and NDKL, and two ranking characteristics, lists of 20 and 100 candidates, resulting in over 170,000 individual judgments. Quantitative results show systematic differences in how people interpret these metrics, and surprising exceptions where fairness metrics disagree with people’s perceptions. Qualitative analyses of participant comments reveal an interplay between cognitive and visual strategies that affects people’s perceptions of fairness. From these results, we discuss future work in aligning fairness metrics with people’s perceptions, and highlight the need and benefits of expanding methodologies for fairness studies.
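As a rough illustration of how two-alternative forced choice (2AFC) responses can yield a "just noticeable" difference, the sketch below computes accuracy at each stimulus difference and interpolates the difference at which accuracy reaches a threshold. It is not the authors' analysis: the function names, the trial format of `(difference, correct)` pairs, and the 75% threshold (a common convention in 2AFC work) are all assumptions made for illustration.

```python
def proportion_correct(trials):
    """Map each stimulus difference to the share of correct judgments.

    `trials` is an iterable of (difference, correct) pairs; this format is
    a hypothetical one chosen for the example.
    """
    totals, hits = {}, {}
    for diff, correct in trials:
        totals[diff] = totals.get(diff, 0) + 1
        hits[diff] = hits.get(diff, 0) + (1 if correct else 0)
    return {d: hits[d] / totals[d] for d in totals}


def estimate_jnd(trials, threshold=0.75):
    """Interpolate the smallest difference at which accuracy reaches `threshold`.

    Returns None when accuracy never reaches the threshold.
    The 0.75 default is an assumed convention, not taken from the abstract.
    """
    acc = proportion_correct(trials)
    levels = sorted(acc)
    if acc[levels[0]] >= threshold:
        return levels[0]
    for lo, hi in zip(levels, levels[1:]):
        if acc[lo] < threshold <= acc[hi]:
            # Linear interpolation between the two bracketing levels.
            frac = (threshold - acc[lo]) / (acc[hi] - acc[lo])
            return lo + frac * (hi - lo)
    return None
```

Here `difference` would stand for the gap in a fairness metric's value between the two rankings shown in a trial. A fitted psychometric function is the more usual choice in practice; linear interpolation is used only to keep the sketch short.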
Free, publicly-accessible full text available October 20, 2026
recVisit: Enabling Experimentation and Evaluation in Recommender System User Interfaces

Shrestha, Hilson; Shrestha, Bijesh; Harrison, Lane (September 2025, Laboratory for Analytic Sciences)

Free, publicly-accessible full text available September 1, 2026
Exploring Contradictions with OpenTLDR

A, William; Browning, Abigail; Harrison, Lane (September 2025, Laboratory for Analytic Sciences)

Free, publicly-accessible full text available September 1, 2026
Faster, Smarter, User-Aligned: EvalOps and the Future of Integrated Evaluation for the IC

Shrestha, Bijesh; Shrestha, Hilson; Bonilla, Karen; Harrison, Lane T; Crouser, R Jordan (September 2025, Laboratory for Analytic Sciences)

Free, publicly-accessible full text available September 1, 2026
Crowdsourced Think-Aloud Studies

https://doi.org/10.1145/3706598.3714305

Cutler, Zach; Harrison, Lane; Nobre, Carolina; Lex, Alexander (April 2025, ACM)

The think-aloud (TA) protocol is a useful method for evaluating user interfaces, including data visualizations. However, TA studies are time-consuming to conduct and hence often have a small number of participants. Crowdsourcing TA studies would help alleviate these problems, but the technical overhead and the unknown quality of results have restricted TA to synchronous studies. To address this gap we introduce CrowdAloud, a system for creating and analyzing asynchronous, crowdsourced TA studies. CrowdAloud captures audio and provenance (log) data as participants interact with a stimulus. Participant audio is automatically transcribed and visualized together with events data and a full recreation of the state of the stimulus as seen by participants. To gauge the value of crowdsourced TA studies, we conducted two experiments: one to compare lab-based and crowdsourced TA studies, and one to compare crowdsourced TA studies with crowdsourced text prompts. Our results suggest that crowdsourcing is a viable approach for conducting TA studies at scale.
Free, publicly-accessible full text available April 25, 2026
Promises and Pitfalls: Using Large Language Models to Generate Visualization Items

https://doi.org/10.1109/TVCG.2024.3456309

Cui, Yuan; Ge, Lily W; Ding, Yiren; Harrison, Lane; Yang, Fumeng; Kay, Matthew (January 2025, IEEE Transactions on Visualization and Computer Graphics)

Full Text Available
Exploring Fairness across Many Rankings

Shrestha, Hilson; Cachel, Kathleen; Alkhathlan, Mallak; Rundensteiner, Elke; Harrison, Lane (October 2024, published as a poster (and 2-page short paper) in: IEEE Visualization and Visual Analytics (VIS), 2024.)

Poster.
Full Text Available
SummShaper: Empowering Analysts to Tailor Attributed Summaries

Harrison, Lane; Evans, Melissa; Crouser, R Jordan; Fathi, Razieh; Wang, Yue; Shrestha, Hilson; Shrestha, Bijesh (September 2024, Laboratory for Analytic Sciences)

Full Text Available
Balancing Act: Evaluating People’s Perceptions of Fair Ranking Metrics

https://doi.org/10.1145/3630106.3659018

Alkhathlan, Mallak; Cachel, Kathleen; Shrestha, Hilson; Harrison, Lane; Rundensteiner, Elke (June 2024, Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency)

Algorithmic decision-making using rankings, prevalent in areas from hiring and bail to university admissions, raises concerns of potential bias. In this paper, we explore the alignment between people’s perceptions of fairness and two popular fairness metrics designed for rankings. In a crowdsourced experiment with 480 participants, people rated the perceived fairness of a hypothetical scholarship distribution scenario. Results suggest a strong inclination towards relying on explicit score values. There is also evidence of people’s preference for one fairness metric, NDKL, over the other metric, ARP. Qualitative results paint a more complex picture: some participants endorse meritocratic award schemes and express concerns about fairness metrics being used to modify rankings, while other participants acknowledge socio-economic factors in score-based rankings as justification for adjusting rankings. In summary, we find that operationalizing algorithmic fairness in practice is a balancing act between mitigating harms towards marginalized groups and societal conventions of leveraging traditional performance scores such as grades in decision-making contexts.
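The abstract names NDKL without defining it. The sketch below follows the definition commonly given in the fair-ranking literature, normalized discounted KL-divergence: at each prefix of the ranking, compare the group proportions so far against a target distribution, and discount by position. The function names are hypothetical, and the choices of natural logarithm and of the list's overall group proportions as the target are assumptions; the paper may configure the metric differently.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) with natural log; terms where p is zero contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def ndkl(ranking):
    """Normalized discounted KL-divergence of a ranking of group labels.

    `ranking` lists each candidate's group label from the top position down.
    The target distribution is assumed to be the overall group proportions.
    A value of 0 means every prefix matches the overall proportions.
    """
    groups = sorted(set(ranking))
    n = len(ranking)
    overall = [ranking.count(g) / n for g in groups]
    counts = {g: 0 for g in groups}
    total, norm = 0.0, 0.0
    for i, label in enumerate(ranking, start=1):
        counts[label] += 1
        prefix = [counts[g] / i for g in groups]
        weight = 1.0 / math.log2(i + 1)  # earlier positions weigh more
        total += weight * kl_divergence(prefix, overall)
        norm += weight
    return total / norm
```

For example, `ndkl(["a", "a", "b", "b"])` is larger than `ndkl(["a", "b", "a", "b"])`, because the first ranking concentrates one group at the top.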
Full Text Available
Adaptive Assessment of Visualization Literacy

https://doi.org/10.1109/TVCG.2023.3327165

Cui, Yuan; Ge, Lily W.; Ding, Yiren; Yang, Fumeng; Harrison, Lane; Kay, Matthew (January 2024, IEEE Transactions on Visualization and Computer Graphics)

Visualization literacy is an essential skill for accurately interpreting data to inform critical decisions. Consequently, it is vital to understand the evolution of this ability and devise targeted interventions to enhance it, requiring concise and repeatable assessments of visualization literacy for individuals. However, current assessments, such as the Visualization Literacy Assessment Test (VLAT), are time-consuming due to their fixed, lengthy format. To address this limitation, we develop two streamlined computerized adaptive tests (CATs) for visualization literacy, A-VLAT and A-CALVI, which measure the same set of skills as their original versions in half the number of questions. Specifically, we (1) employ item response theory (IRT) and non-psychometric constraints to construct adaptive versions of the assessments, (2) finalize the configurations of adaptation through simulation, (3) refine the composition of test items of A-CALVI via a qualitative study, and (4) demonstrate the test-retest reliability (ICC: 0.98 and 0.98) and convergent validity (correlation: 0.81 and 0.66) of both CATs via four online studies. We discuss practical recommendations for using our CATs and opportunities for further customization to leverage the full potential of adaptive assessments. All supplemental materials are available at https://osf.io/a6258/.
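To make the adaptive-testing idea concrete, the sketch below shows one standard way a computerized adaptive test can pick its next question: score each unused item by its Fisher information at the current ability estimate and choose the most informative one. It assumes a two-parameter logistic (2PL) IRT model and omits the non-psychometric constraints the abstract mentions; the abstract does not say which IRT model or selection rule the authors use, and all names here are hypothetical.

```python
import math


def p_correct(theta, a, b):
    """2PL probability of a correct answer (assumed model).

    theta: ability estimate, a: item discrimination, b: item difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def next_item(theta, items, administered):
    """Pick the not-yet-administered item with maximum information.

    `items` maps an item name to its (a, b) parameters.
    """
    candidates = [name for name in items if name not in administered]
    return max(candidates, key=lambda name: item_information(theta, *items[name]))
```

With equal discrimination across items, this rule selects the item whose difficulty is closest to the current ability estimate, which is why an adaptive test can reach a stable estimate with fewer questions than a fixed-form test.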
Full Text Available
