The proliferation of online face images has heightened privacy concerns, as adversaries can exploit facial features for nefarious purposes. While adversarial perturbations have been proposed to safeguard these images, their effectiveness remains questionable. This paper introduces IVORY, a novel adversarial purification method that leverages the Diffusion Transformer-based Stable Diffusion 3 model to purify perturbed images and improve facial feature extraction. Evaluated across gender recognition, ethnicity recognition, and age group classification tasks with CNNs (VGG16, SENet, and MobileNetV3) and vision transformers (SwinFace), IVORY consistently restores classifier performance to near-clean levels in white-box settings, outperforming traditional defenses such as Adversarial Training, DiffPure, and IMPRESS. For example, it improved gender recognition accuracy from 37.8% to 96% under the PGD attack for VGG16 and age group classification accuracy from 2.1% to 52.4% under AutoAttack for MobileNetV3. In black-box scenarios, IVORY achieves a 22.8% average accuracy gain. IVORY also reduces SSIM noise by over 50% at 1x resolution and up to 80% at 2x resolution compared to DiffPure. Our analysis further reveals that adversarial perturbations alone do not fully protect against soft-biometric extraction, highlighting the need for comprehensive evaluation frameworks and robust defenses.
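PGD, one of the attacks IVORY is evaluated against, is straightforward to sketch on a toy model. The following is a minimal, hypothetical illustration on a two-feature logistic classifier, not the paper's image-space setup; the weights, epsilon, and step size are all illustrative.

```python
import math

def pgd_attack(x, w, b, y, eps=0.1, alpha=0.02, steps=10):
    """L-infinity PGD against a toy logistic classifier p = sigmoid(w.x + b).

    Ascends the cross-entropy loss; its gradient w.r.t. x is (p - y) * w.
    """
    x0 = list(x)
    x_adv = list(x)
    for _ in range(steps):
        z = sum(wi * xi for wi, xi in zip(w, x_adv)) + b
        p = 1.0 / (1.0 + math.exp(-z))
        grad = [(p - y) * wi for wi in w]
        # Signed ascent step, then projection back into the eps-ball.
        x_adv = [xi + alpha * (1.0 if g > 0 else -1.0)
                 for xi, g in zip(x_adv, grad)]
        x_adv = [min(max(xa, xo - eps), xo + eps)
                 for xa, xo in zip(x_adv, x0)]
    return x_adv

# Hypothetical toy setup: a 2-feature classifier and one clean input.
w, b = [2.0, -1.0], 0.0
x_clean = [1.0, -1.0]
x_adv = pgd_attack(x_clean, w, b, y=1)
```

The projection step is what bounds the perturbation: the adversarial input stays within an L-infinity ball of radius `eps` around the clean input, so the change is small while the classifier's confidence in the true label drops.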
Exploring racial and gender disparities in voice biometrics
Abstract Systemic inequity in biometric systems based on racial and gender disparities has received a lot of attention recently. These disparities have been explored in existing biometric systems such as facial biometrics (identifying individuals based on facial attributes). However, such ethical issues remain largely unexplored in voice biometric systems, which are very popular and extensively used globally. Using a corpus of non-speech voice records featuring a diverse group of 300 speakers by race (75 each from the White, Black, Asian, and Latinx subgroups) and gender (150 each from the female and male subgroups), we find that racial subgroups share similar voice characteristics, whereas gender subgroups exhibit significantly different voice characteristics. Moreover, by analyzing the performance of one commercial product and five research products, we show that non-negligible racial and gender disparities exist in speaker identification accuracy. The average accuracy for Latinx speakers can be 12% lower than for White speakers (p < 0.05, 95% CI 1.58%, 14.15%), and accuracy can be significantly higher for female speakers than for male speakers (3.67% higher, p < 0.05, 95% CI 1.23%, 11.57%). We further discover that racial disparities primarily result from the neural network-based feature extraction within the voice biometric product, while gender disparities result from both inherent differences in voice characteristics and neural network-based feature extraction. Finally, we point out strategies (e.g., feature extraction optimization) to incorporate fairness and inclusivity considerations into biometric technology.
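The reported subgroup gaps (e.g., a 12% accuracy difference with a 95% confidence interval) can be reproduced in spirit with a percentile bootstrap over per-trial outcomes. This is a sketch on synthetic 0/1 outcomes, not the study's data; the subgroup sizes and accuracies below are illustrative.

```python
import random

random.seed(0)

def bootstrap_ci(acc_a, acc_b, n_boot=1000, alpha=0.05):
    """Percentile bootstrap CI for the accuracy gap mean(acc_a) - mean(acc_b).

    acc_a, acc_b: lists of per-trial 0/1 identification outcomes.
    """
    diffs = []
    for _ in range(n_boot):
        a = [random.choice(acc_a) for _ in acc_a]
        b = [random.choice(acc_b) for _ in acc_b]
        diffs.append(sum(a) / len(a) - sum(b) / len(b))
    diffs.sort()
    lo = diffs[int(alpha / 2 * n_boot)]
    hi = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Synthetic outcomes: one subgroup at ~90% accuracy, another at ~78%.
group_a = [1] * 270 + [0] * 30
group_b = [1] * 234 + [0] * 66
low, high = bootstrap_ci(group_a, group_b)
```

Because the interval excludes zero, the 12-point gap in this synthetic setup would be judged statistically significant, mirroring the form of the paper's CI-based claims.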
- Award ID(s): 2050910
- PAR ID: 10321081
- Date Published:
- Journal Name: Scientific Reports
- Volume: 12
- Issue: 1
- ISSN: 2045-2322
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
A Continuous Articulatory Gesture Based Liveness Detection for Voice Authentication on Smart Devices
Voice biometrics is drawing increasing attention to user authentication on smart devices. However, voice biometrics is vulnerable to replay attacks, where adversaries try to spoof voice authentication systems using pre-recorded voice samples collected from genuine users. To this end, we propose VoiceGesture, a liveness detection solution for voice authentication on smart devices such as smartphones and smart speakers. With audio hardware advances on smart devices, VoiceGesture leverages built-in speaker and microphone pairs on smart devices as Doppler radar to sense articulatory gestures for liveness detection during voice authentication. The experiments with 21 participants and different smart devices show that VoiceGesture achieves over 99% and around 98% detection accuracy for text-dependent and text-independent liveness detection, respectively. Moreover, VoiceGesture is robust to different device placements and low audio sampling frequency, and supports medium-range liveness detection on smart speakers in various use scenarios, including smart homes and smart vehicles.
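VoiceGesture's sensing principle rests on the Doppler shift of a reflected probe tone: an articulator moving at velocity v shifts a tone of frequency f0 by roughly 2·v·f0/c. A back-of-the-envelope sketch; the velocity and tone frequency below are illustrative, not taken from the paper.

```python
def doppler_shift(v_mps, f0_hz, c_mps=343.0):
    """Doppler shift (Hz) of a tone reflected off a moving surface.

    Factor of 2 accounts for the round trip (emit + reflect);
    c_mps is the speed of sound in air at ~20 C.
    """
    return 2.0 * v_mps * f0_hz / c_mps

# A lip or jaw moving at ~0.1 m/s reflecting a 20 kHz probe tone.
shift = doppler_shift(0.1, 20_000)
```

Even slow articulator motion produces a shift on the order of 10 Hz at ultrasonic probe frequencies, which is why commodity speaker-microphone pairs can resolve these gestures.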
-
With the advances in deep learning, speaker verification has achieved very high accuracy and is gaining popularity as a biometric authentication option in many areas of daily life, especially the growing market of web services. Compared to traditional passwords, "vocal passwords" are much more convenient as they relieve people from memorizing different passwords. However, new machine learning attacks are putting these voice authentication systems at risk. Without a strong security guarantee, attackers could access legitimate users' web accounts by fooling the deep neural network (DNN) based voice recognition models. In this article, we demonstrate an easy-to-implement data poisoning attack on voice authentication systems, which cannot be captured effectively by existing defense mechanisms. We then propose a more robust defense method called Guardian, a convolutional neural network-based discriminator. The Guardian discriminator integrates a series of novel techniques including bias reduction, input augmentation, and ensemble learning. Our approach is able to distinguish about 95% of attacked accounts from normal accounts, which is much more effective than existing approaches with only 60% accuracy.
-
Gender, Soft Skills, and Patient Experience in Online Physician Reviews: A Large-Scale Text Analysis
Background Online physician reviews are an important source of information for prospective patients. In addition, they represent an untapped resource for studying the effects of gender on the doctor-patient relationship. Understanding gender differences in online reviews is important because it may impact the value of those reviews to patients. Documenting gender differences in patient experience may also help to improve the doctor-patient relationship. This is the first large-scale study of physician reviews to extensively investigate gender bias in online reviews or offer recommendations for improvements to online review systems to correct for gender bias and aid patients in selecting a physician. Objective This study examines 154,305 reviews from across the United States for all medical specialties. Our analysis includes a qualitative and quantitative examination of review content and physician rating with regard to doctor and reviewer gender. Methods A total of 154,305 reviews were sampled from Google Place reviews. Reviewer and doctor gender were inferred from names. Reviews were coded for overall patient experience (negative or positive) by collapsing a 5-star scale and coded for general categories (process, positive/negative soft skills), which were further subdivided into themes. Computational text processing methods were employed to apply this codebook to the entire data set, rendering it tractable to quantitative methods. Specifically, we estimated binary regression models to examine relationships between physician rating, patient experience themes, physician gender, and reviewer gender. Results Female reviewers wrote 60% more reviews than men. Male reviewers were more likely to give negative reviews (odds ratio [OR] 1.15, 95% CI 1.10-1.19; P<.001). Reviews of female physicians were considerably more negative than those of male physicians (OR 1.99, 95% CI 1.94-2.14; P<.001). Soft skills were more likely to be mentioned in reviews written by female reviewers and about female physicians. Negative reviews of female doctors were more likely to mention candor (OR 1.61, 95% CI 1.42-1.82; P<.001) and amicability (OR 1.63, 95% CI 1.47-1.90; P<.001). Disrespect was associated with both female physicians (OR 1.42, 95% CI 1.35-1.51; P<.001) and female reviewers (OR 1.27, 95% CI 1.19-1.35; P<.001). Female patients were less likely to report disrespect from female doctors than expected from the base ORs (OR 1.19, 95% CI 1.04-1.32; P=.008), but this effect offset only the effect for female reviewers. Conclusions This work reinforces findings in the extensive literature on gender differences and gender bias in patient-physician interaction. Its novel contribution lies in highlighting gender differences in online reviews. These reviews inform patients' choice of doctor and thus affect both patients and physicians. The evidence of gender bias documented here suggests review sites may be improved by providing information about gender differences, controlling for gender when presenting composite ratings for physicians, and helping users write less biased reviews.
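The odds ratios above come from binary regression models; for a single binary predictor, an OR and its Wald confidence interval can be computed directly from a 2x2 contingency table. The counts below are hypothetical, not the study's data.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """OR and 95% Wald CI from a 2x2 table.

    a = exposed & outcome, b = exposed & no outcome,
    c = unexposed & outcome, d = unexposed & no outcome.
    SE of log(OR) is sqrt(1/a + 1/b + 1/c + 1/d).
    """
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lo = math.exp(math.log(or_) - z * se)
    hi = math.exp(math.log(or_) + z * se)
    return or_, lo, hi

# Hypothetical counts: negative/positive reviews by two reviewer groups.
or_, lo, hi = odds_ratio_ci(300, 700, 250, 750)
```

A CI that excludes 1.0 indicates the association would be judged statistically significant, which is the form of the claims reported in the abstract.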
-
Metric learning is a valuable technique for enabling the ongoing enrollment of new users within biometric systems. While this approach has been heavily employed for other biometric modalities such as facial recognition, applications to eye movements have only recently been explored. This manuscript further investigates the application of metric learning to eye movement biometrics. A set of three multilayer perceptron networks are trained for embedding feature vectors describing three classes of eye movements: fixations, saccades, and post-saccadic oscillations. The network is validated on a dataset containing eye movement traces of 269 subjects recorded during a reading task. The proposed algorithm is benchmarked against a previously introduced statistical biometric approach. While mean equal error rate (EER) was increased versus the benchmark method, the proposed technique demonstrated lower dispersion in EER across the four test folds considered herein.
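Metric-learning embedding networks of this kind are typically trained with a contrastive objective; a common choice is the triplet loss, which pulls an anchor toward a same-subject sample and pushes it away from a different-subject sample. The abstract does not specify the exact loss, so this plain-Python sketch of a triplet loss is illustrative.

```python
import math

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet loss: max(0, d(a, p) - d(a, n) + margin).

    anchor/positive share a subject; negative is a different subject.
    """
    def dist(u, v):
        return math.sqrt(sum((ui - vi) ** 2 for ui, vi in zip(u, v)))
    return max(0.0, dist(anchor, positive) - dist(anchor, negative) + margin)

# Genuine sample already closer than the impostor: zero loss.
loss_easy = triplet_loss([0.0, 0.0], [0.1, 0.0], [1.0, 1.0])
# Impostor closer than the genuine sample: positive loss to minimize.
loss_hard = triplet_loss([0.0, 0.0], [1.0, 1.0], [0.1, 0.0])
```

Training drives same-subject embeddings together and different-subject embeddings apart by at least the margin, which is what makes enrolling new users possible without retraining the classifier.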