skip to main content


Title: FlowPrint: Semi-Supervised Mobile-App Fingerprinting on Encrypted Network Traffic
Mobile-application fingerprinting of network traffic is valuable for many security solutions as it provides insights into the apps active on a network. Unfortunately, existing techniques require prior knowledge of apps to be able to recognize them. However, mobile environments are constantly evolving, i.e., apps are regularly installed, updated, and uninstalled. Therefore, it is infeasible for existing fingerprinting approaches to cover all apps that may appear on a network. Moreover, most mobile traffic is encrypted, shows similarities with other apps, e.g., due to common libraries or the use of content delivery networks, and depends on user input, further complicating the fingerprinting process. As a solution, we propose FlowPrint, a semi-supervised approach for fingerprinting mobile apps from (encrypted) network traffic. We automatically find temporal correlations among destination-related features of network traffic and use these correlations to generate app fingerprints. Our approach is able to fingerprint previously unseen apps, something that existing techniques fail to achieve. We evaluate our approach for both Android and iOS in the setting of app recognition, where we achieve an accuracy of 89.2%, significantly outperforming state-of-the-art solutions. In addition, we show that our approach can detect previously unseen apps with a precision of 93.5%, detecting 72.3% of apps within the first five minutes of communication.  more » « less
Award ID(s):
1909020
NSF-PAR ID:
10192513
Author(s) / Creator(s):
; ; ; ; ; ; ; ;
Date Published:
Journal Name:
Network and Distributed System Security Symposium (NDSS)
Volume:
27
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract In recent years, we have seen rapid growth in the use and adoption of Internet of Things (IoT) devices. However, some loT devices are sensitive in nature, and simply knowing what devices a user owns can have security and privacy implications. Researchers have, therefore, looked at fingerprinting loT devices and their activities from encrypted network traffic. In this paper, we analyze the feasibility of fingerprinting IoT devices and evaluate the robustness of such fingerprinting approach across multiple independent datasets — collected under different settings. We show that not only is it possible to effectively fingerprint 188 loT devices (with over 97% accuracy), but also to do so even with multiple instances of the same make-and-model device. We also analyze the extent to which temporal, spatial and data-collection-methodology differences impact fingerprinting accuracy. Our analysis sheds light on features that are more robust against varying conditions. Lastly, we comprehensively analyze the performance of our approach under an open-world setting and propose ways in which an adversary can enhance their odds of inferring additional information about unseen devices (e.g., similar devices manufactured by the same company). 
    more » « less
  2. With the growth of smartphone sales and app usage, fingerprinting and identification of smartphone apps have become a considerable threat to user security and privacy. Traffic analysis is one of the most common methods for identifying apps. Traditional countermeasures towards traffic analysis includes traffic morphing and multipath routing. The basic idea of multipath routing is to increase the difficulty for adversary to eavesdrop all traffic by splitting traffic into several subflows and transmitting them through different routes. Previous works in multipath routing mainly focus on Wireless Sensor Networks (WSNs) or Mobile Ad Hoc Networks (MANETs). In this paper, we propose a multipath routing scheme for smartphones with edge network assistance to mitigate traffic analysis attack. We consider an adversary with limited capability, that is, he can only intercept the traffic of one node following certain attack probability, and try to minimize the traffic an adversary can intercept. We formulate our design as a flow routing optimization problem. Then a heuristic algorithm is proposed to solve the problem. Finally, we present the simulation results for our scheme and justify that our scheme can effectively protect smartphones from traffic analysis attack. 
    more » « less
  3. This paper proposes FingerprinTV, a fully automated methodology for extracting fingerprints from the network traffic of smart TV apps and assessing their performance. FingerprinTV (1) installs, repeatedly launches, and collects network traffic from smart TV apps; (2) extracts three different types of network fingerprints for each app, i.e., domain-based fingerprints (DBF), packet-pair-based fingerprints (PBF), and TLS-based fingerprints (TBF); and (3) analyzes the extracted fingerprints in terms of their prevalence, distinctiveness, and sizes. From applying FingerprinTV to the top-1000 apps of the three most popular smart TV platforms, we find that smart TV app network fingerprinting is feasible and effective: even the least prevalent type of fingerprint manifests itself in at least 68% of apps of each platform, and up to 89% of fingerprints uniquely identify a specific app when two fingerprinting techniques are used together. By analyzing apps that exhibit identical fingerprints, we find that these apps often stem from the same developer or “no code” app generation toolkit. Furthermore, we show that many apps that are present on all three platforms exhibit platform-specific fingerprints. 
    more » « less
  4. This paper proposes FingerprinTV, a fully automated methodology for extracting fingerprints from the network traffic of smart TV apps and assessing their performance. FingerprinTV (1) installs, repeatedly launches, and collects network traffic from smart TV apps; (2) extracts three different types of network fingerprints for each app, i.e., domain-based fingerprints (DBF), packet-pair-based fingerprints (PBF), and TLS-based fingerprints (TBF); and (3) analyzes the extracted fingerprints in terms of their prevalence, distinctiveness, and sizes. From applying FingerprinTV to the top-1000 apps of the three most popular smart TV platforms, we find that smart TV app network fingerprinting is feasible and effective: even the least prevalent type of fingerprint manifests itself in at least 68% of apps of each platform, and up to 89% of fingerprints uniquely identify a specific app when two fingerprinting techniques are used together. By analyzing apps that exhibit identical fingerprints, we find that these apps often stem from the same developer or “no code” app generation toolkit. Furthermore, we show that many apps that are present on all three platforms exhibit platformspecific fingerprints. 
    more » « less
  5. Mobile apps are widely used and often process users’ sensitive data. Many taint analysis tools have been applied to analyze sensitive information flows and report data leaks in apps. These tools require a list of sources (where sensitive data is accessed) as input, and researchers have constructed such lists within the Android platform by identifying Android API methods that allow access to sensitive data. However, app developers may also define methods or use third-party library’s methods for accessing data. It is difficult to collect such source methods because they are unique to the apps, and there are a large number of third-party libraries available on the market that evolve over time. To address this problem, we propose DAISY, a Dynamic-Analysis-Induced Source discoverY approach for identifying methods that return sensitive information from apps and third-party libraries. Trained on an automatically labeled data set of methods and their calling context, DAISY identifies sensitive methods in unseen apps. We evaluated DAISY on real-world apps and the results show that DAISY can achieve an overall precision of 77.9% when reporting the most confident results. Most of the identified sources and leaks cannot be detected by existing technologies. 
    more » « less