Title: PrivateBus: Privacy Identification and Protection in Large-Scale Bus WiFi Systems
The ubiquity of mobile devices has led to increasing demand for public network services, e.g., WiFi hot spots. As part of this trend, modern transportation systems are equipped with public WiFi devices to provide Internet access for passengers, since people spend a large amount of time on public transportation in their daily lives. However, a key issue with public WiFi spots is privacy, owing to their open-access nature. Existing work has studied either location privacy risk in human traces or privacy leakage in private networks such as cellular networks, based on data from cellular carriers. To the best of our knowledge, none of these works has focused on bus WiFi privacy based on large-scale real-world data. In this paper, to explore the privacy risk in bus WiFi systems, we focus on two key questions: how likely it is that bus WiFi users can be uniquely re-identified if partial usage information is leaked, and how we can protect users from the leaked information. To answer these questions, we conduct a case study of a large-scale bus WiFi system, with a dataset containing 20 million connection records and 78 million location records from 770 thousand bus WiFi users over a two-month period. Technically, we design two models for our uniqueness analysis and protection: a PB-FIND model to estimate the probability that a user can be uniquely re-identified from leaked information, and a PB-HIDE model to protect users from potentially leaked information. Specifically, we systematically measure user uniqueness on users' finger traces (i.e., connection URL and domain), foot traces (i.e., locations), and hybrid traces (i.e., both finger and foot traces).
Our measurement results reveal that (i) 97.8% of users can be uniquely re-identified by 4 random domain records of their finger traces and 96.2% of users can be uniquely re-identified by 5 random locations on buses; (ii) 98.1% of users can be uniquely re-identified by only 2 random records if both their connection records and locations are leaked to attackers. Moreover, the evaluation results show that our PB-HIDE algorithm protects more than 95% of users from the potentially leaked information by inserting only 1.5% synthetic records into the original dataset, preserving data utility.
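The uniqueness measurement described in the abstract can be illustrated with a minimal sketch. It is not the paper's PB-FIND model, whose details the abstract does not give; it only shows one common way to estimate the fraction of users who are uniquely re-identifiable from k randomly leaked records. The function name `uniqueness`, the dictionary layout of `traces`, and the choice to skip users with fewer than k records are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def uniqueness(traces, k, seed=0):
    """Estimate the fraction of users uniquely identified by k random records.

    traces: dict mapping a user ID to a set of hashable records
            (e.g. domains, locations, or both). Hypothetical layout.
    k:      number of records assumed to be leaked per user.
    Users with fewer than k records are skipped (an assumption).
    """
    rng = random.Random(seed)

    # Inverted index: record -> set of users whose trace contains it.
    index = defaultdict(set)
    for user, records in traces.items():
        for record in records:
            index[record].add(user)

    eligible = [u for u, records in traces.items() if len(records) >= k]
    if not eligible:
        return 0.0

    unique = 0
    for user in eligible:
        # Sort by repr so sampling is reproducible even for mixed record types.
        leaked = rng.sample(sorted(traces[user], key=repr), k)
        # Users whose traces contain every leaked record.
        candidates = set.intersection(*(index[r] for r in leaked))
        if candidates == {user}:
            unique += 1
    return unique / len(eligible)


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    toy = {
        "u1": {"a.com", "b.com", "c.com"},
        "u2": {"a.com", "b.com", "d.com"},
        "u3": {"e.com", "f.com", "g.com"},
    }
    for k in (1, 2, 3):
        print(k, uniqueness(toy, k))
```

Hybrid traces can be fed to the same function by storing finger and foot records in one set per user, for example as tagged tuples such as `("domain", "a.com")` and `("loc", "stop_17")`; those tags are likewise illustrative.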
Award ID(s):
1849238 1932223
NSF-PAR ID:
10436098
Journal Name:
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
Volume:
4
Issue:
1
ISSN:
2474-9567
Page Range / eLocation ID:
1 to 23
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In dynamic spectrum access (DSA), Environmental Sensing Capability (ESC) systems are implemented to detect the incumbent users' (IU) activities for protecting them from secondary users' (SU) interference as well as maximizing secondary spectrum usage. However, IU location information is often highly sensitive and hence it is preferable to hide its true location under the detection of ESCs. In this paper, we design novel schemes to preserve both static and moving IU's location information by adjusting IU's radiation pattern and transmit power. We first formulate IU privacy protection problem for static IU. Due to the intractable nature of this problem, we propose a heuristic approach based on sampling. We also formulate the privacy protection problem for moving IUs, in which two cases are analyzed: (1) protect IU's moving traces; (2) protect its real-time current location information. Our analysis provides insightful advice for IU to preserve its location privacy against ESCs. Simulation results show that our approach provides great protection for IU's location privacy. 
  2. Vincent Poor and Zhu Han (Ed.)
    Recently, blockchain has received much attention from the mobility-centric Internet of Things (IoT). It is deemed the key to ensuring the built-in integrity of information and security of immutability by design in the peer-to-peer network (P2P) of mobile devices. In a permissioned blockchain, the authority of the system has control over the identities of its users. Such information can allow an ill-intentioned authority to map identities with their spatiotemporal data, which undermines the location privacy of a mobile user. In this paper, we study the location privacy preservation problem in the context of permissioned blockchain-based IoT systems under three conditions. First, the authority of the blockchain holds the public and private key distribution task in the system. Second, there exists a spatiotemporal correlation between consecutive location-based transactions. Third, users communicate with each other through short-range communication technologies such that it constitutes a proof of location (PoL) on their actual locations. We show that, in a permissioned blockchain with an authority and a presence of a PoL, existing approaches cannot be applied using a plug-and-play approach to protect location privacy. In this context, we propose BlockPriv, an obfuscation technique that quantifies, both theoretically and experimentally, the relationship between privacy and utility in order to dynamically protect the privacy of sensitive locations in the permissioned blockchain. 
  3. Human mobility data may lead to privacy concerns because a resident can be re-identified from these data by malicious attacks even with anonymized user IDs. For an urban service collecting mobility data, an efficient privacy risk assessment is essential for the privacy protection of its users. Existing methods use prediction models to give service operators efficient privacy risk assessments, so that they can quickly adjust the quality of sensing data to lower privacy risk. However, most of these prediction models require massive training data, which has to be collected and stored first. Such large-scale, long-term training data collection contradicts the purpose of privacy risk prediction for new urban services, which is to ensure that the quality of high-risk human mobility data is adjusted to low privacy risk within a short time. To solve this problem, we present a privacy risk prediction model based on transfer learning, i.e., TransRisk, to predict the privacy risk for a new target urban service through (1) small-scale short-term data of its own, and (2) the knowledge learned from data from other existing urban services. We envision the application of TransRisk on the traffic camera surveillance system and evaluate it with real-world mobility datasets already collected in a Chinese city, Shenzhen, including four source datasets, i.e., (i) one call detail record dataset (CDR) with 1.2 million users; (ii) one cellphone connection dataset (CONN) with 1.2 million users; (iii) a vehicular GPS dataset (Vehicles) with 10 thousand vehicles; (iv) an electronic toll collection transaction dataset (ETC) with 156 thousand users, and a target dataset, i.e., a camera dataset (Camera) with 248 cameras. The results show that our model outperforms the state-of-the-art methods in terms of RMSE and MAE. Our work also provides valuable insights and implications on mobility data privacy risk assessment for both current and future large-scale services.
  4. Urban anomalies have a large impact on passengers' travel behavior and city infrastructures, which can cause uncertainty in travel time estimation. Understanding the impact of urban anomalies on travel time is of great value for various applications such as urban planning, human mobility studies and navigation systems. Most existing studies on travel time have focused on the total riding time between two locations on an individual transportation modality. However, passengers often take different modes of transportation, e.g., taxis, subways, buses or private vehicles, and a significant portion of the travel time is spent waiting, which is uncertain. In this paper, we study the fine-grained travel time patterns in multiple transportation systems under the impact of urban anomalies. Specifically, (i) we investigate implicit components, including waiting and riding time, in multiple transportation systems; (ii) we measure the impact of real-world anomalies on travel time components; (iii) we design a learning-based model for travel time component prediction with anomalies. Different from existing studies, we implement and evaluate our measurement framework on multiple data sources including four city-scale transportation systems, which are (i) a 14-thousand taxicab network, (ii) a 13-thousand bus network, (iii) a 10-thousand private vehicle network, and (iv) an automatic fare collection system for a public transit network (i.e., subway and bus) with 5 million smart cards.
  5. Although a great deal of research has examined interventions to help users protect their own information online, less work has examined methods for reducing interdependent privacy (IDP) violations on social media (i.e., sharing of other people's information). This study tested the effectiveness of concept-based (i.e., general information), fact-based (i.e., statistics), and narrative-based (i.e., stories) educational videos in altering IDP-relevant attitudes and multimedia sharing behaviors. Our study revealed concept and fact videos reduced sharing of social media content that portrayed people negatively. The narrative intervention backfired and increased sharing among participants who did not believe IDP violations to be especially serious; however, the narrative intervention decreased sharing for participants who rated IDP violations as more serious. Notably, our study found participants preferred narrative-based interventions with real world examples, despite other strategies more effectively reducing sharing. Implications for narrative transportation theory and advancing bottom-up (i.e., user-centered) psychosocial interventions are discussed. 