The widespread commercial deployment of automated facial analysis systems, such as face recognition used as an authentication method, has drawn increasing scientific attention. Current machine learning algorithms allow relatively reliable detection, recognition, and categorization of face images by age, race, and gender. However, algorithms trained on biased data are bound to produce skewed results, which leads to a significant decrease in the performance of state-of-the-art models when they are applied to images of certain gender or ethnicity groups. In this paper, we study gender bias in facial recognition with gender-balanced and gender-imbalanced training sets using five traditional machine learning algorithms. We aim to report which machine learning classifiers are inclined towards gender bias and which mitigate it. The miss rate metric is effective in revealing potential bias in predictions. Our study uses miss rates along with standard metrics such as accuracy, precision, and recall to evaluate possible gender bias effectively.
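As a rough illustration of how a miss rate can expose bias, the sketch below computes a per-group miss rate for a binary task. It assumes that "miss rate" means the false negative rate, FN / (FN + TP), which the abstract does not define; the function name `miss_rate_by_group` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def miss_rate_by_group(y_true, y_pred, groups, positive=1):
    """Return {group: FN / (FN + TP)} for every group with positive samples.

    Assumption: miss rate = false negative rate. Groups that contain no
    positive ground-truth samples are left out of the result.
    """
    fn = defaultdict(int)  # positives the classifier missed, per group
    tp = defaultdict(int)  # positives the classifier found, per group
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth != positive:
            continue  # negatives do not enter the miss rate
        if pred == positive:
            tp[group] += 1
        else:
            fn[group] += 1
    seen = set(fn) | set(tp)
    return {g: fn[g] / (fn[g] + tp[g]) for g in seen}


# Toy data (hypothetical): three positive samples per group.
y_true = [1, 1, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 1]
groups = ["male", "male", "male", "female", "female", "female", "male", "female"]

rates = miss_rate_by_group(y_true, y_pred, groups)
gap = abs(rates["male"] - rates["female"])  # a large gap hints at bias
```

Two classifiers with the same overall accuracy can still differ in this gap, which is one reason to report per-group miss rates next to an aggregate metric.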
Evaluating Impact of Race in Facial Recognition across Machine Learning and Deep Learning Algorithms
The research aims to evaluate the impact of race in facial recognition across two types of algorithms: machine learning and deep learning. We give a general insight into facial recognition and discuss four related problems. We review our system design, development, and architectures, and give an in-depth evaluation plan for each type of algorithm and dataset, along with a look into the software and its architecture. We thoroughly explain the results and findings of our experiments and provide analysis for the machine learning and deep learning algorithms. To conclude the investigation, we compare the two kinds of algorithms on accuracy, other metrics, miss rates, and performance to observe which algorithms mitigate racial bias the most. We evaluate racial bias across five machine learning algorithms and three deep learning algorithms using racially imbalanced and balanced datasets. Comparing accuracy and miss rates across all tested algorithms, we report that SVC is the best-performing machine learning algorithm and VGG16 the best-performing deep learning algorithm in our experimental study. We find that VGG16 mitigates bias the most, and that all of our deep learning algorithms outperformed their machine learning counterparts.
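The abstract does not say how the balanced datasets were built. One common option, shown here purely as an assumption, is to downsample every group to the size of the smallest one. The helper `balance_by_group` and the sample tuples are hypothetical.

```python
import random
from collections import defaultdict


def balance_by_group(samples, group_of, seed=0):
    """Downsample each group to the size of the smallest group.

    samples  -- list of arbitrary records (here: (image_id, group) tuples)
    group_of -- function mapping a record to its group label
    seed     -- fixed seed so the subset is reproducible
    """
    rng = random.Random(seed)
    buckets = defaultdict(list)
    for sample in samples:
        buckets[group_of(sample)].append(sample)
    smallest = min(len(bucket) for bucket in buckets.values())
    balanced = []
    for group in sorted(buckets):  # sorted for a deterministic order
        balanced.extend(rng.sample(buckets[group], smallest))
    return balanced


# Hypothetical imbalanced set: 8 images of group "A", 3 of group "B".
imbalanced = [(f"img_a{i}", "A") for i in range(8)] + [(f"img_b{i}", "B") for i in range(3)]
balanced = balance_by_group(imbalanced, group_of=lambda s: s[1])
```

Training the same classifier once on `imbalanced` and once on `balanced`, then comparing per-group miss rates, mirrors the balanced-versus-imbalanced comparison the abstract describes. Downsampling discards data, so oversampling or reweighting are alternatives worth considering.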
- Award ID(s): 1900087
- PAR ID: 10296823
- Date Published:
- Journal Name: Computers
- Volume: 10
- Issue: 9
- ISSN: 2073-431X
- Page Range / eLocation ID: 113
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Facial attribute classification algorithms frequently manifest demographic biases by obtaining differential performance across gender and racial groups. Existing bias mitigation techniques are mostly in-processing techniques, i.e., implemented during the classifier’s training stage, that often lack generalizability, require demographically annotated training sets, and exhibit a trade-off between fairness and classification accuracy. In this paper, we propose a technique to mitigate bias at test time, i.e., during the deployment stage, by harnessing prediction uncertainty and human–machine partnership. To this end, we propose to use the small percentage of test samples identified as outliers with high prediction uncertainty. These uncertain samples are labeled by human analysts at test time for decision rendering and for subsequently retraining the deep neural network in a continual learning framework. With minimal human involvement and through iterative refinement of the network with human guidance at test time, we seek to enhance the accuracy as well as the fairness of already deployed facial attribute classification algorithms. Extensive experiments are conducted on gender and smile attribute classification tasks using four publicly available datasets, with gender and race as the protected attributes. The results consistently demonstrate accuracy improvements of up to 2% and 5% for the gender and smile attribute classification tasks, respectively, using our proposed approaches. Further, demographic bias was significantly reduced, outperforming the state-of-the-art (SOTA) bias mitigation and baseline techniques by up to 55% for both classification tasks.
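The selection step described above can be sketched as follows. The abstract does not state which uncertainty measure is used; this sketch assumes the entropy of the classifier's predicted class probabilities, and the names `predictive_entropy` and `select_for_review` are hypothetical.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_review(prob_rows, fraction=0.1):
    """Return indices of the most uncertain test samples, most uncertain first.

    prob_rows -- one list of class probabilities per test sample
    fraction  -- share of the test set to route to human analysts
    Assumption: uncertainty is measured by predictive entropy.
    """
    budget = max(1, math.ceil(fraction * len(prob_rows)))
    ranked = sorted(
        range(len(prob_rows)),
        key=lambda i: predictive_entropy(prob_rows[i]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical softmax outputs for four test images (two classes).
outputs = [[0.99, 0.01], [0.5, 0.5], [0.9, 0.1], [0.6, 0.4]]
to_label = select_for_review(outputs, fraction=0.5)
```

The returned indices would go to human analysts, and the newly labeled samples would then feed the retraining step. How that retraining is scheduled inside the continual learning framework is not covered by this sketch.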
-
Automated and contactless face recognition is a widely used machine learning technology for identifying people, which has been applied in scenarios such as secure login to electronic devices, automated border control, community surveillance, and tracking school attendance. The use of face masks has become essential due to the global spread of COVID-19, raising concerns about the performance of recognition systems. Conventional face recognition technologies were primarily designed to work with unmasked faces, and the widespread use of masked face images significantly degrades their performance. To address this understudied issue, we evaluated the performance of six deep learning models, namely VGG-16, AlexNet, GoogleNet, LeNet, ResNet-50, and FaceNet, on masked and unmasked face images. We aim to find out whether deep learning models struggle with masked face recognition and to identify the models that mitigate the impact of masked face images. In this paper, we track and report miss rates for both masked and unmasked images, along with performance metrics such as accuracy and F1 score.
-
The proliferation of online face images has heightened privacy concerns, as adversaries can exploit facial features for nefarious purposes. While adversarial perturbations have been proposed to safeguard these images, their effectiveness remains questionable. This paper introduces IVORY, a novel adversarial purification method that leverages the Diffusion Transformer-based Stable Diffusion 3 model to purify perturbed images and improve facial feature extraction. Evaluated across gender recognition, ethnicity recognition, and age group classification tasks with CNNs like VGG16, SENet, and MobileNetV3 and vision transformers like SwinFace, IVORY consistently restores classifier performance to near-clean levels in white-box settings, outperforming traditional defenses such as Adversarial Training, DiffPure, and IMPRESS. For example, it improved gender recognition accuracy from 37.8% to 96% under the PGD attack for VGG16 and age group classification accuracy from 2.1% to 52.4% under AutoAttack for MobileNetV3. In black-box scenarios, IVORY achieves a 22.8% average accuracy gain. IVORY also reduces SSIM noise by over 50% at 1x resolution and up to 80% at 2x resolution compared to DiffPure. Our analysis further reveals that adversarial perturbations alone do not fully protect against soft-biometric extraction, highlighting the need for comprehensive evaluation frameworks and robust defenses.
-
In this research study, we explore gender bias in the presence of facial masks in automated face recognition systems using various deep learning algorithms. The paper focuses on an experimental study using an imbalanced image database with a smaller percentage of female subjects than male subjects, and examines the impact of masked images in evaluating gender bias. The experiments aim to understand how different algorithms perform in mitigating gender bias in the presence of face masks and highlight the significance of gender distribution within datasets in identifying and mitigating bias. We present the methodology used to conduct the experiments and elaborate on the results obtained from male-only, female-only, and mixed-gender datasets. Overall, this research sheds light on the complexities of gender bias in masked versus unmasked face recognition technology and its implications for real-world applications.

