UNDERSTANDING FEATURE DISCOVERY IN WEBSITE FINGERPRINTING ATTACKS
The Tor anonymity system is vulnerable to website fingerprinting attacks that can reveal users Internet browsing behavior. The state-of-the-art website fingerprinting attacks use convolutional neural networks to automatically extract features from packet traces. One such attack undermines an efficient fingerprinting defense previously considered a candidate for implementation in Tor. In this work, we study the use of neural network attribution techniques to visualize activity in the attack's model. These visualizations, essentially heatmaps of the network, can be used to identify regions of particular sensitivity and provide insight into the features that the model has learned. We then examine how these heatmaps may be used to create a new website fingerprinting defense that applies random padding to the website trace with an emphasis towards highly fingerprintable regions. This defense reduces the attacker's accuracy from 98% to below 70% with a packet overhead of approximately 80%.
- Award ID(s):
- Publication Date:
- NSF-PAR ID:
- Journal Name:
- Proceedings of the 2018 IEEE Western New York Image and Signal Processing Workshop
- Page Range or eLocation-ID:
- 1 to 5
- Sponsoring Org:
- National Science Foundation
More Like this
Over 8 million users rely on the Tor network each day to protect their anonymity online. Unfortunately, Tor has been shown to be vulnerable to the website fingerprinting attack, which allows an attacker to deduce the website a user is visiting based on patterns in their traffic. The state-of-the-art attacks leverage deep learning to achieve high classification accuracy using raw packet information. Work thus far, however, has examined only one type of media delivered over the Tor network: web pages, and mostly just home pages of sites. In this work, we instead investigate the fingerprintability of video content served over Tor. We collected a large new dataset of network traces for 50 YouTube videos of similar length. Our preliminary experiments utilizing a convolutional neural network model proposed in prior works has yielded promising classification results, achieving up to 55% accuracy. This shows the potential to unmask the individual videos that users are viewing over Tor, creating further privacy challenges to consider when defending against website fingerprinting attacks.
Abstract Recent advances in Deep Neural Network (DNN) architectures have received a great deal of attention due to their ability to outperform state-of-the-art machine learning techniques across a wide range of application, as well as automating the feature engineering process. In this paper, we broadly study the applicability of deep learning to website fingerprinting. First, we show that unsupervised DNNs can generate lowdimensional informative features that improve the performance of state-of-the-art website fingerprinting attacks. Second, when used as classifiers, we show that they can exceed performance of existing attacks across a range of application scenarios, including fingerprinting Tor website traces, fingerprinting search engine queries over Tor, defeating fingerprinting defenses, and fingerprinting TLS-encrypted websites. Finally, we investigate which site-level features of a website influence its fingerprintability by DNNs.
Tor provides low-latency anonymous and uncensored network access against a local or network adversary. Due to the design choice to minimize traffic overhead (and increase the pool of potential users) Tor allows some information about the client's connections to leak. Attacks using (features extracted from) this information to infer the website a user visits are called Website Fingerprinting (WF) attacks. We develop a methodology and tools to measure the amount of leaked information about a website. We apply this tool to a comprehensive set of features extracted from a large set of websites and WF defense mechanisms, allowing us to make more fine-grained observations about WF attacks and defenses.
Abstract A passive local eavesdropper can leverage Website Fingerprinting (WF) to deanonymize the web browsing activity of Tor users. The value of timing information to WF has often been discounted in recent works due to the volatility of low-level timing information. In this paper, we more carefully examine the extent to which packet timing can be used to facilitate WF attacks. We first propose a new set of timing-related features based on burst-level characteristics to further identify more ways that timing patterns could be used by classifiers to identify sites. Then we evaluate the effectiveness of both raw timing and directional timing which is a combination of raw timing and direction in a deep-learning-based WF attack. Our closed-world evaluation shows that directional timing performs best in most of the settings we explored, achieving: (i) 98.4% in undefended Tor traffic; (ii) 93.5% on WTF-PAD traffic, several points higher than when only directional information is used; and (iii) 64.7% against onion sites, 12% higher than using only direction. Further evaluations in the open-world setting show small increases in both precision (+2%) and recall (+6%) with directional-timing on WTF-PAD traffic. To further investigate the value of timing information, we perform an information leakagemore »