Extractive Adversarial Networks: High-Recall Explanations for Identifying Personal Attacks in Social Media Posts

Carton, Samuel; Mei, Qiaozhu; Resnick, Paul

doi:10.18653/v1/D18-1386

Citation Details

We introduce an adversarial method for producing high-recall explanations of neural text classifier decisions. Building on an existing architecture for extractive explanations via hard attention, we add an adversarial layer which scans the residual of the attention for remaining predictive signal. Motivated by the important domain of detecting personal attacks in social media comments, we additionally demonstrate the importance of manually setting a semantically appropriate “default” behavior for the model by explicitly manipulating its bias term. We develop a validation set of human-annotated personal attacks to evaluate the impact of these changes.
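The idea of checking the residual of a hard-attention mask for leftover signal can be illustrated with a toy sketch. This is not the authors' implementation: the lexicon `ATTACK_WORDS`, the scoring function `toy_score`, and the helper names are hypothetical stand-ins for trained neural components, and the way the bias is set here is only an assumption about what a "default" behavior might look like.

```python
from typing import List, Tuple

# Hypothetical lexicon standing in for whatever a trained classifier has learned.
ATTACK_WORDS = {"idiot", "stupid", "moron"}


def split_by_mask(tokens: List[str], mask: List[int]) -> Tuple[List[str], List[str]]:
    """Split tokens into the extracted rationale (mask == 1) and the residual (mask == 0)."""
    if len(tokens) != len(mask):
        raise ValueError("tokens and mask must have the same length")
    rationale = [t for t, m in zip(tokens, mask) if m == 1]
    residual = [t for t, m in zip(tokens, mask) if m == 0]
    return rationale, residual


def toy_score(tokens: List[str], bias: float = 0.0) -> float:
    """Toy predictor: bias plus the number of lexicon hits (assumed, not a neural model)."""
    return bias + sum(1.0 for t in tokens if t.lower() in ATTACK_WORDS)


def residual_signal(tokens: List[str], mask: List[int]) -> float:
    """Toy adversary: score only the tokens the mask left out.

    A positive value means the extraction missed predictive tokens,
    i.e. the explanation has low recall.
    """
    _, residual = split_by_mask(tokens, mask)
    return toy_score(residual)


if __name__ == "__main__":
    tokens = ["you", "are", "an", "idiot", "and", "a", "moron"]
    low_recall_mask = [0, 0, 0, 1, 0, 0, 0]   # extracts only "idiot"
    high_recall_mask = [0, 0, 0, 1, 0, 0, 1]  # extracts "idiot" and "moron"
    print(residual_signal(tokens, low_recall_mask))
    print(residual_signal(tokens, high_recall_mask))
    # With a negative bias, an empty extraction scores below zero,
    # so the toy model's default reading is "no attack".
    print(toy_score([], bias=-0.5))
```

In this sketch, a mask that leaves an attack word in the residual gives the toy adversary something to detect, while a mask that covers every attack word leaves it with nothing. That is the intuition behind using an adversary to push the extractor toward higher recall.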

Award ID(s): 1717688

PAR ID: 10107870

Author(s) / Creator(s): Carton, Samuel; Mei, Qiaozhu; Resnick, Paul

Date Published: 2018-10-01

Journal Name: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Page Range / eLocation ID: 3497 to 3507

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
https://doi.org/10.18653/v1/D18-1386
