Finding a Choice in a Haystack: Automatic Extraction of Opt-Out Statements from Privacy Policy Text

Bannihatti Kumar, Vinayshekhar; Iyengar, Roger; Nisal, Namita; Feng, Yuanyuan; Habib, Hana; Story, Peter; Cherivirala, Sushain; Hagan, Margaret; Cranor, Lorrie; Wilson, Shomir; Schaub, Florian; Sadeh, Norman

doi:10.1145/3366423.3380262



Website privacy policies sometimes provide users the option to opt out of certain collections and uses of their personal data. Unfortunately, many privacy policies bury these instructions deep in their text, and few web users have the time or skill necessary to discover them. We describe a method for the automated detection of opt-out choices in privacy policy text and their presentation to users through a web browser extension. We also describe the creation of two corpora of opt-out choices, which enable the training of classifiers to identify opt-outs in privacy policies. Our overall approach for extracting and classifying opt-out choices combines heuristics to identify commonly found opt-out hyperlinks with supervised machine learning to automatically identify less conspicuous instances. Our approach achieves a precision of 0.93 and a recall of 0.9. We introduce Opt-Out Easy, a web browser extension designed to present available opt-out choices to users as they browse the web. We evaluate the usability of our browser extension with a user study. We also present results of a large-scale analysis of opt-outs found in the text of thousands of the most popular websites.

Award ID(s): 1914486

PAR ID: 10169862

Author(s) / Creator(s): Bannihatti Kumar, Vinayshekhar; Iyengar, Roger; Nisal, Namita; Feng, Yuanyuan; Habib, Hana; Story, Peter; Cherivirala, Sushain; Hagan, Margaret; Cranor, Lorrie; Wilson, Shomir; Schaub, Florian; Sadeh, Norman

Date Published: 2020-04-19

Journal Name: WWW '20: Proceedings of the Web Conference 2020

Page Range / eLocation ID: 1943 to 1954

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
https://doi.org/10.1145/3366423.3380262
