Title: Balancing Fairness and Accuracy in Sentiment Detection using Multiple Black Box Models
Sentiment detection is an important building block for multiple information retrieval tasks, such as product recommendation and the detection of cyberbullying, fake news, and misinformation. Unsurprisingly, multiple commercial APIs, each with a different level of accuracy and fairness, are now publicly available for sentiment detection, and users can easily incorporate them into their applications. While combining inputs from multiple modalities or black-box models to increase accuracy is commonly studied in the multimedia computing literature, there has been little work on combining different modalities to increase the fairness of the resulting decision. In this work, we audit multiple commercial sentiment detection APIs for gender bias in a two-actor news-headline setting and report on the level of bias observed. Next, we propose a "Flexible Fair Regression" approach, which ensures satisfactory accuracy and fairness by jointly learning from multiple black-box models. The results pave the way for fair yet accurate sentiment detectors for multiple applications.
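The abstract does not spell out the form of the "Flexible Fair Regression" objective. The sketch below is one plausible reading, assuming each API exposes a numeric sentiment score per headline; the function name, the squared mean-gap fairness penalty, and the trade-off parameter `lam` are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np
from scipy.optimize import minimize

def flexible_fair_regression(api_scores, y, group, lam=1.0):
    """Learn weights over K black-box sentiment scores, trading squared
    error against a (hypothetical) fairness penalty: the squared gap
    between mean predictions for the two groups."""
    n_apis = api_scores.shape[1]

    def objective(w):
        pred = api_scores @ w
        accuracy_loss = np.mean((pred - y) ** 2)
        fairness_gap = pred[group == 0].mean() - pred[group == 1].mean()
        return accuracy_loss + lam * fairness_gap ** 2

    result = minimize(objective, x0=np.full(n_apis, 1.0 / n_apis))
    return result.x

# Toy usage: 3 APIs scoring 100 two-actor headlines, binary gender flag.
rng = np.random.default_rng(0)
scores = rng.normal(size=(100, 3))          # per-API sentiment scores
labels = scores.mean(axis=1) + rng.normal(scale=0.1, size=100)
gender = rng.integers(0, 2, size=100)       # protected attribute
weights = flexible_fair_regression(scores, labels, gender, lam=2.0)
```

Raising `lam` pushes the learned combination toward equal mean sentiment across groups at some cost in squared error, which is the accuracy-fairness trade-off the abstract describes.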
Award ID(s):
1915790
PAR ID:
10231394
Author(s) / Creator(s):
;
Date Published:
Journal Name:
In Proceedings of the 2nd International Workshop on Fairness, Accountability, Transparency and Ethics in Multimedia
Page Range / eLocation ID:
13 to 19
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Recent reports of bias in multimedia algorithms (e.g., lower accuracy of face detection for women and persons of color) have underscored the urgent need to devise approaches that work equally well for different demographic groups. Hence, we posit that ensuring fairness in multimodal cyberbullying detectors (e.g., equal performance irrespective of the gender of the victim) is an important research challenge. We propose a fairness-aware fusion framework that ensures that both fairness and accuracy remain important considerations when combining data coming from multiple modalities. In this Bayesian fusion framework, the inputs coming from different modalities are combined in a way that is cognizant of the different confidence levels associated with each feature and the interdependencies between features. Specifically, the framework assigns weights to different modalities based not just on their accuracy but also on their fairness. Results of applying the framework to a multimodal (visual + text) cyberbullying detection problem demonstrate its value in ensuring both accuracy and fairness.
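The fusion rule itself is not given in the abstract. Below is a deliberately simplified sketch of fairness-aware weighting, assuming each modality reports a bullying probability plus measured accuracy and fairness scores; the convex-mix weighting and the parameter `alpha` are assumptions standing in for the paper's Bayesian treatment of confidences and interdependencies.

```python
import numpy as np

def fairness_aware_weights(accuracies, fairness_scores, alpha=0.5):
    """Illustrative per-modality weights: a convex mix of accuracy and
    a fairness score (e.g., 1 - |performance gap across victim gender|),
    normalized to sum to one."""
    acc = np.asarray(accuracies, dtype=float)
    fair = np.asarray(fairness_scores, dtype=float)
    raw = alpha * acc + (1.0 - alpha) * fair
    return raw / raw.sum()

def fuse(probabilities, weights):
    """Weighted average of per-modality bullying probabilities."""
    return float(np.dot(weights, probabilities))

# Toy usage: the visual modality is more accurate but less fair than
# text, so its weight is discounted relative to accuracy-only fusion.
w = fairness_aware_weights(accuracies=[0.88, 0.80],
                           fairness_scores=[0.70, 0.95])
fused_probability = fuse([0.62, 0.40], w)
```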
  2. Misinformation in online spaces can stoke mistrust of established media, misinform the public, and lead to radicalization. Hence, multiple automated algorithms for misinformation detection have been proposed in the recent past. However, the fairness (e.g., performance across left- and right-leaning news articles) of these algorithms has been repeatedly questioned, leading to decreased trust in such systems. This work motivates and grounds the need for an audit of machine-learning-based misinformation detection algorithms and possible ways to mitigate bias (if found). Using a large (N > 100K) corpus of news articles, we report that multiple standard machine-learning-based misinformation detection approaches are susceptible to bias. Further, we find that an intuitive post-processing approach (Reject Option Classifier) can reduce bias while maintaining high accuracy in the above setting. The results pave the way for accurate yet fair misinformation detection algorithms.
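The Reject Option Classifier referenced above is a standard post-processing technique (Kamiran et al.): predictions inside a low-confidence band around the decision threshold are reassigned in favor of the disadvantaged group. A sketch under that reading follows; the band width `theta`, the group encoding, and which label counts as favorable are illustrative choices, not the paper's configuration.

```python
import numpy as np

def reject_option_classifier(probs, group, theta=0.6, favored_group=1):
    """Reject-option post-processing: inside the low-confidence band
    (1 - theta, theta), give the favorable label to the disadvantaged
    group and the unfavorable label to the favored group; outside the
    band, keep the usual 0.5 threshold."""
    probs = np.asarray(probs, dtype=float)
    group = np.asarray(group)
    preds = (probs >= 0.5).astype(int)
    uncertain = (probs > 1.0 - theta) & (probs < theta)  # needs theta > 0.5
    preds[uncertain & (group != favored_group)] = 1
    preds[uncertain & (group == favored_group)] = 0
    return preds

# Toy usage: only the two low-confidence predictions are reassigned.
p = np.array([0.95, 0.55, 0.45, 0.10])   # P(label = 1) per article
g = np.array([0, 0, 1, 1])               # hypothetical group encoding
print(reject_option_classifier(p, g, theta=0.6))
```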
  3. Ensuring fairness in anomaly detection models has received much attention recently, as many anomaly detection applications involve human beings. However, existing fair anomaly detection approaches mainly focus on association-based fairness notions. In this work, we target counterfactual fairness, a prevalent causation-based fairness notion. The goal of counterfactually fair anomaly detection is to ensure that the detection outcome for an individual in the factual world is the same as that in the counterfactual world where the individual had belonged to a different group. To this end, we propose a counterfactually fair anomaly detection (CFAD) framework that consists of two phases: counterfactual data generation and fair anomaly detection. Experimental results on a synthetic dataset and two real datasets show that CFAD can effectively detect anomalies as well as ensure counterfactual fairness.
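The abstract defines counterfactual fairness operationally: the detection outcome should not change between an individual's factual and counterfactual versions. That property is straightforward to audit once counterfactuals exist; the sketch below assumes a scoring function and pre-generated counterfactual features (CFAD's generation phase, which is the hard part, is not reproduced here).

```python
import numpy as np

def counterfactual_consistency(score_fn, X, X_cf, threshold):
    """Fraction of individuals whose anomaly decision is unchanged when
    factual features X are swapped for counterfactual features X_cf
    (same individual, different protected group). A counterfactually
    fair detector should score close to 1."""
    flagged = score_fn(X) > threshold
    flagged_cf = score_fn(X_cf) > threshold
    return float(np.mean(flagged == flagged_cf))

# Toy usage with a distance-from-mean anomaly score; the perturbation
# below merely stands in for properly generated counterfactuals.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
X_cf = X + rng.normal(scale=0.05, size=X.shape)
score = lambda Z: np.linalg.norm(Z - X.mean(axis=0), axis=1)
print(counterfactual_consistency(score, X, X_cf, threshold=2.5))
```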
  4. Combining the preferences of many rankers into one single consensus ranking is critical for consequential applications, from hiring and admissions to lending. While group fairness has been extensively studied for classification, group fairness in rankings, and in particular in rank aggregation, remains in its infancy. Recent work introduced the concept of fair rank aggregation for combining rankings but restricted it to the case where candidates have a single binary protected attribute, i.e., they fall into two groups only. Yet it remains an open problem how to create a consensus ranking that represents the preferences of all rankers while ensuring fair treatment for candidates with multiple protected attributes such as gender, race, and nationality. In this work, we are the first to define and solve this open Multi-attribute Fair Consensus Ranking (MFCR) problem. As a foundation, we design novel group fairness criteria for rankings, called MANI-Rank, ensuring fair treatment of groups defined by individual protected attributes and their intersection. Leveraging the MANI-Rank criteria, we develop a series of algorithms that for the first time tackle the MFCR problem. Our experimental study with a rich variety of consensus scenarios demonstrates that our MFCR methodology is the only approach to achieve both intersectional and protected-attribute fairness while also representing the preferences expressed through many base rankings. Our real-world case study on merit scholarships illustrates the effectiveness of our MFCR methods in mitigating bias across multiple protected attributes and their intersections.
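The MANI-Rank criteria are not defined in the abstract; as a stand-in, the sketch below computes a simple mean-rank parity gap per protected attribute and for the intersectional groups, which conveys the flavor of checking fairness at both levels. The function and the gap measure are illustrative, not the paper's actual criteria.

```python
import numpy as np

def rank_parity_gap(ranking, groups):
    """Illustrative parity measure: the largest gap in mean rank
    position between any two groups (0 = perfectly balanced).
    `ranking` lists candidate ids best-first; `groups` maps id -> label."""
    position = {c: i for i, c in enumerate(ranking)}
    labels = set(groups.values())
    means = [np.mean([position[c] for c in ranking if groups[c] == g])
             for g in labels]
    return max(means) - min(means)

# Toy usage: check parity per attribute and for intersectional groups.
ranking = ["a", "b", "c", "d"]
gender = {"a": "F", "b": "M", "c": "F", "d": "M"}
race = {"a": "X", "b": "X", "c": "Y", "d": "Y"}
intersection = {c: (gender[c], race[c]) for c in ranking}
for name, g in [("gender", gender), ("race", race),
                ("intersection", intersection)]:
    print(name, rank_parity_gap(ranking, g))
```

As the toy output shows, a ranking can be balanced on each attribute separately while still favoring some intersectional groups, which is why criteria covering both levels are needed.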
  5. Multiple recent efforts have used large-scale data and computational models to automatically detect misinformation in online news articles. Given the potential impact of misinformation on democracy, many of these efforts have also used the political ideology of these articles to better model misinformation and to study political bias in such algorithms. However, almost all such efforts have used source-level labels for credibility and political alignment, thereby assigning the same credibility and political-alignment label to all articles from the same source (e.g., the New York Times or Breitbart). Here, we report on applying journalistic best practices to label individual news articles for their credibility and political alignment. We found that while source-level labels are decent proxies for political-alignment labeling, they are very poor proxies for credibility ratings (almost the same as flipping a coin). Next, we study the implications of such source-level labeling for downstream processes such as the development of automated misinformation detection algorithms and political fairness audits thereof. We find that the automated misinformation detection and fairness algorithms can be suitably revised to support their intended goals, but they might require different assumptions and methods than those appropriate under source-level labeling. The results suggest caution in generalizing recent results on misinformation detection and the political bias therein. On a positive note, this work shares a new dataset of articles individually labeled following journalistic best practices, along with an approach for misinformation detection and fairness audits.
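The core audit question here, how well a source-level label predicts an article-level label, reduces to an agreement rate; for binary labels, a value near 0.5 means the proxy is no better than a coin flip. A minimal sketch follows; the data structures and outlet names are hypothetical.

```python
import numpy as np

def proxy_agreement(article_labels, article_sources, source_labels):
    """Share of articles whose individually assigned label matches the
    label inherited from their source. For binary labels, a value near
    0.5 means the source-level proxy is no better than a coin flip."""
    inherited = np.array([source_labels[s] for s in article_sources])
    return float(np.mean(np.asarray(article_labels) == inherited))

# Toy usage with hypothetical outlets: source-level credibility labels
# agree with article-level ratings only half the time here.
source_credibility = {"outletA": 1, "outletB": 0}
sources = ["outletA", "outletA", "outletB", "outletB"]
article_ratings = [1, 0, 0, 1]
print(proxy_agreement(article_ratings, sources, source_credibility))  # 0.5
```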