Abstract: Personally Identifiable Information (PII) is any information that permits the identity of an individual to be directly or indirectly inferred, and it should be protected against unauthorized access. This paper studies the extent of PII exposure on the Internet. We hope that the results of this study can help raise Internet users’ awareness of privacy protection.
Linking Personally Identifiable Information from the Dark Web to the Surface Web: A Deep Entity Resolution Approach
The information privacy of Internet users
has become a major societal concern. The rapid growth of
online services increases the risk of unauthorized access to
Personally Identifiable Information (PII) of at-risk
populations, who are unaware of their PII exposure. To
proactively identify online at-risk populations and increase
their privacy awareness, it is crucial to conduct a holistic
privacy risk assessment across the Internet. Current privacy
risk assessment studies are limited to a single platform within
either the surface web or the dark web. A comprehensive
privacy risk assessment requires matching exposed PII on
heterogeneous online platforms across the surface web and
the dark web. However, due to the incompleteness and
inaccuracy of PII records in each platform, linking the
exposed PII to users is a non-trivial task. While Entity
Resolution (ER) techniques can be used to facilitate this task,
they often require ad-hoc, manual rule development and
feature engineering. Recently, Deep Learning (DL)-based ER
has outperformed manual entity matching rules by
automatically extracting prominent features from incomplete
or inaccurate records. In this study, we enhance the existing
privacy risk assessment with a DL-based ER method, namely
Multi-Context Attention (MCA), to comprehensively evaluate
individuals’ PII exposure across the different online
platforms in the dark web and surface web. Evaluation
against benchmark ER models indicates the efficacy of MCA.
Using MCA on a random sample of data breach victims in the
dark web, we are able to identify 4.3% of the victims on the
surface web platforms and calculate their privacy risk scores.
- PAR ID: 10218323
- Date Published:
- Journal Name: International Conference on Data Mining Workshops (ICDMW)
- Page Range / eLocation ID: 488 to 495
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Dark patterns are user interface elements that can influence a person's behavior against their intentions or best interests. Prior work identified these patterns in websites and mobile apps, but little is known about how the design of platforms might impact dark pattern manifestations and related human vulnerabilities. In this paper, we conduct a comparative study of mobile application, mobile browser, and web browser versions of 105 popular services to investigate variations in dark patterns across modalities. We perform manual tests, identify dark patterns in each service, and examine how they persist or differ by modality. Our findings show that while services can employ some dark patterns equally across modalities, many dark patterns vary between platforms, and that these differences saddle people with inconsistent experiences of autonomy, privacy, and control. We conclude by discussing broader implications for policymakers and practitioners, and provide suggestions for furthering dark patterns research.
- Privacy policies, despite the important information they provide about the collection and use of one's data, tend to be skipped over by most Internet users. In this paper, we seek to make privacy policies more accessible by automatically classifying web privacy policies. We apply natural language processing techniques and multiple machine learning models, and compare the effectiveness of each method on the classification task. We also explore how effectively these methods classify the privacy policies of Internet of Things (IoT) devices.
- Personally Identifiable Information (PII) leakage can lead to identity theft, financial loss, reputation damage, and anxiety. However, individuals remain largely unaware of their PII exposure on the Internet, and whether informing individuals about the extent of their PII exposure can trigger privacy protection actions requires further investigation. In this pilot study, grounded in Protection Motivation Theory (PMT), we examine whether receiving privacy alerts in the form of threat and countermeasure information will trigger senior citizens to engage in protective behaviors. We also examine whether providing personalized information moderates the relationship between information and individuals' perceptions. We contribute to the literature by shedding light on the determinants of and barriers to adopting privacy protection behaviors.
- This report analyzes issues related to web browser security and privacy. The web browser applications examined are Google Chrome, Bing, Mozilla Firefox, Internet Explorer, Microsoft Edge, Safari, and Opera. In recent months, the number of daily web browser users has increased. Because many of these users may not be well versed in data security and privacy, the increase in users brings an increase in attacks. This study discusses the pros and cons of each web browser, how many have been hacked, how often and why they have been hacked, their security flaws, and more. The study draws on research and a user survey to analyze the topic and provide recommendations.