Poster: Video Fingerprinting in Tor
Over 8 million users rely on the Tor network each day to protect their anonymity online. Unfortunately, Tor has been shown to be vulnerable to the website fingerprinting attack, which allows an attacker to deduce the website a user is visiting based on patterns in their traffic. The state-of-the-art attacks leverage deep learning to achieve high classification accuracy using raw packet information. Work thus far, however, has examined only one type of media delivered over the Tor network: web pages, and mostly just the home pages of sites. In this work, we instead investigate the fingerprintability of video content served over Tor. We collected a large new dataset of network traces for 50 YouTube videos of similar length. Our preliminary experiments, which use a convolutional neural network model proposed in prior work, have yielded promising classification results, achieving up to 55% accuracy. This shows the potential to unmask the individual videos that users are viewing over Tor, creating further privacy challenges to consider when defending against website fingerprinting attacks.
- Journal Name: CCS '19: Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security
- Page Range or eLocation-ID: 2629 to 2631
- Sponsoring Org: National Science Foundation
More Like this
Most privacy-conscious users utilize HTTPS and an anonymity network such as Tor to mask source and destination IP addresses. It has been shown that encrypted and anonymized network traffic traces can still leak information through a type of attack called a website fingerprinting (WF) attack. The adversary records the network traffic and is only able to observe the number of incoming and outgoing messages, the size of each message, and the time difference between messages. Previous work has shown website fingerprinting to achieve an accuracy of over 90% when Tor is used as the anonymity network. Thus, an Internet Service Provider can successfully identify the websites its users are visiting. One main concern about website fingerprinting is its practicality. The common assumption in most previous work is that a victim visits one website at a time and that the attacker has access to the complete network trace of that website. However, this is not realistic. We propose two new algorithms to deal with situations when the victim visits one website after another (continuous visits) and visits another website in the middle of visiting one website (overlapping visits). We show that our algorithm gives an accuracy of 80% (compared …
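The three observables named in this abstract (message counts per direction, message sizes, and time differences between messages) can be illustrated with a minimal sketch. The trace format and the `observable_features` helper are assumptions made here for illustration; they are not taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical trace format (an assumption for this sketch):
# (timestamp in seconds, signed size in bytes), where a positive size
# means outgoing (client to network) and a negative size means incoming.
Packet = Tuple[float, int]


def observable_features(trace: List[Packet]) -> Dict[str, object]:
    """Summarize what a passive observer of encrypted traffic can still see."""
    outgoing = [size for _, size in trace if size > 0]
    incoming = [-size for _, size in trace if size < 0]
    # Time difference between each pair of consecutive messages.
    gaps = [later[0] - earlier[0] for earlier, later in zip(trace, trace[1:])]
    return {
        "n_outgoing": len(outgoing),
        "n_incoming": len(incoming),
        "bytes_outgoing": sum(outgoing),
        "bytes_incoming": sum(incoming),
        "inter_arrival": gaps,
    }


example = [(0.00, 512), (0.05, -1024), (0.07, -1024), (0.20, 512)]
features = observable_features(example)
```

None of these values requires decrypting payloads, which is why encryption alone does not prevent this kind of leakage.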
The Tor anonymity system is vulnerable to website fingerprinting attacks that can reveal users' Internet browsing behavior. The state-of-the-art website fingerprinting attacks use convolutional neural networks to automatically extract features from packet traces. One such attack undermines an efficient fingerprinting defense previously considered a candidate for implementation in Tor. In this work, we study the use of neural network attribution techniques to visualize activity in the attack's model. These visualizations, essentially heatmaps of the network, can be used to identify regions of particular sensitivity and provide insight into the features that the model has learned. We then examine how these heatmaps may be used to create a new website fingerprinting defense that applies random padding to the website trace with an emphasis on highly fingerprintable regions. This defense reduces the attacker's accuracy from 98% to below 70% with a packet overhead of approximately 80%.
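The idea of concentrating random padding on highly fingerprintable regions can be sketched as weighted sampling of insertion points. This is only an illustration under stated assumptions: the `pad_trace` function, the per-position `sensitivity` scores, and the choice to insert dummies before the selected packet are all hypothetical and do not reproduce the defense described in the abstract.

```python
import random
from typing import List, Tuple


def pad_trace(
    directions: List[int],
    sensitivity: List[float],
    n_dummy: int,
    rng: random.Random,
) -> List[Tuple[int, bool]]:
    """Insert n_dummy dummy packets, favoring positions with high sensitivity.

    directions: +1 (outgoing) or -1 (incoming) per real packet.
    sensitivity: hypothetical heatmap score per position (non-negative,
        not all zero).
    Returns (direction, is_dummy) pairs with real packets in original order.
    """
    # Sample insertion positions with probability proportional to the score.
    positions = rng.choices(range(len(directions)), weights=sensitivity, k=n_dummy)
    dummies_at = [0] * len(directions)
    for position in positions:
        dummies_at[position] += 1

    padded: List[Tuple[int, bool]] = []
    for direction, count in zip(directions, dummies_at):
        # Assumption: dummies go immediately before the selected real packet.
        for _ in range(count):
            padded.append((rng.choice([1, -1]), True))
        padded.append((direction, False))
    return padded


real = [1, -1, -1, -1, 1]
scores = [0.0, 0.0, 1.0, 0.0, 0.0]  # only position 2 is marked as sensitive
padded = pad_trace(real, scores, n_dummy=round(0.8 * len(real)), rng=random.Random(0))
```

Here `n_dummy` is set to 80% of the trace length to mirror the packet overhead quoted above; how the overhead is actually budgeted in the paper is not stated in the abstract.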
Website Fingerprinting (WF) attacks pose a serious threat to users' online privacy, including for users of the Tor anonymity system. By exploiting recent advances in deep learning, WF attacks like Deep Fingerprinting (DF) have reached up to 98% accuracy. The DF attack, however, requires large amounts of training data that need to be updated regularly, making it less practical for the weaker attacker model typically assumed in WF. Moreover, research on WF attacks has been criticized for not demonstrating attack effectiveness under more realistic and more challenging scenarios. Most research on WF attacks assumes that the testing and training data have similar distributions and are collected from the same type of network at about the same time. In this paper, we examine how an attacker could leverage N-shot learning (a machine learning technique requiring just a few training samples to identify a given class) to reduce the effort of gathering and training with a large WF dataset as well as mitigate the adverse effects of dealing with different network conditions. In particular, we propose a new WF attack called Triplet Fingerprinting (TF) that uses triplet networks for N-shot learning. We evaluate this attack in challenging settings such as where the training and …
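For readers unfamiliar with triplet networks, the following minimal sketch shows one common form of the triplet loss and a simple way to classify from a few labeled embeddings. It is a generic illustration, not the TF attack itself: the embedding vectors are made up, and the nearest-centroid rule in `classify` is an assumption chosen for brevity.

```python
import math
from typing import Dict, List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 1.0,
) -> float:
    """Zero once the negative is at least `margin` farther than the positive."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def classify(embedding: Sequence[float], support: Dict[str, List[List[float]]]) -> str:
    """Assumed N-shot rule: pick the class whose mean embedding is closest."""

    def centroid(vectors: List[List[float]]) -> List[float]:
        return [sum(column) / len(vectors) for column in zip(*vectors)]

    return min(support, key=lambda label: euclidean(embedding, centroid(support[label])))


# Two hypothetical sites with N = 2 example embeddings each.
support = {
    "site_a": [[0.0, 0.0], [0.2, 0.0]],
    "site_b": [[1.0, 1.0], [1.0, 1.2]],
}
predicted = classify([0.9, 1.0], support)
```

Training drives the loss toward zero so that traces of the same site cluster in embedding space, which is what lets a handful of samples per class suffice at classification time.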
We introduce Generative Adversarial Networks for Data-Limited Fingerprinting (GANDaLF), a new deep-learning-based technique to perform Website Fingerprinting (WF) on Tor traffic. In contrast to most earlier work on deep-learning for WF, GANDaLF is intended to work with few training samples, and achieves this goal through the use of a Generative Adversarial Network to generate a large set of “fake” data that helps to train a deep neural network in distinguishing between classes of actual training data. We evaluate GANDaLF in low-data scenarios including as few as 10 training instances per site, and in multiple settings, including fingerprinting of website index pages and fingerprinting of non-index pages within a site. GANDaLF achieves closed-world accuracy of 87% with just 20 instances per site (and 100 sites) in standard WF settings. In particular, GANDaLF can outperform Var-CNN and Triplet Fingerprinting (TF) across all settings in subpage fingerprinting. For example, GANDaLF outperforms TF by a 29% margin and Var-CNN by 38% for training sets using 20 instances per site.