With the rapid growth of online social media and ubiquitous Internet connectivity, social sensing has emerged as a new crowdsourcing application paradigm of collecting observations (often called claims) about the physical environment from humans or devices on their behalf. A fundamental problem in social sensing applications lies in effectively ascertaining the correctness of claims and the reliability of data sources without knowing either of them a priori, which is referred to as truth discovery. While significant progress has been made to solve the truth discovery problem, some important challenges have not been well addressed yet. First, existing truth discovery solutions did not fully solve the dynamic truth discovery problem where the ground truth of claims changes over time. Second, many current solutions are not scalable to large-scale social sensing events because of the centralized nature of their truth discovery algorithms. Third, the heterogeneity and unpredictability of the social sensing data traffic pose additional challenges to the resource allocation and system responsiveness. In this paper, we developed a Scalable Streaming Truth Discovery (SSTD) solution to address the above challenges. In particular, we first developed a dynamic truth discovery scheme based on Hidden Markov Models (HMM) to effectively infer the evolving truth of reported claims. We further developed a distributed framework to implement the dynamic truth discovery scheme using Work Queue in HTCondor system. We also integrated the SSTD scheme with an optimal workload allocation mechanism to dynamically allocate the resources (e.g., cores, memories) to the truth discovery tasks based on their computation requirements. We evaluated SSTD through real world social sensing applications using Twitter data feeds. The evaluation results on three real-world data traces (i.e., Boston Bombing, Paris Shooting and College Football) show that the SSTD scheme is scalable and outperforms the state-of-the-art truth discovery methods in terms of both effectiveness and efficiency.
Robust Truth Discovery against Data Poisoning in Mobile Crowdsensing
Nowadays, most mobile devices are equipped with advanced sensors that can measure information about the surrounding environment or social setting. The ubiquity of mobile devices makes them an ideal platform for massive data collection, which has motivated the emergence of the mobile crowdsensing paradigm. However, because the sensing process is inherently noisy and low-cost commodity sensors have limited capability, crowdsensed information tends to be less reliable than results from dedicated sensing hardware, and multiple crowdsensing sources may conflict with each other. Thus, it is important to resolve conflicts in the collected data and discover the underlying truth. Traditional truth discovery approaches usually estimate the reliability of data sources and predict the truth value based on source reliability. However, recent data poisoning attacks, in which attackers aim to maximize the utility loss, greatly degrade the performance of existing truth discovery algorithms. In this paper, we investigate data poisoning attacks on truth discovery and propose a robust approach against such attacks through additional source estimation and source filtering before data aggregation. Using simulations based on real-world data, we evaluate our approach under data poisoning attacks and demonstrate its robustness.
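The general idea in the abstract above (estimate source reliability, filter suspicious sources, then aggregate) can be illustrated with a minimal sketch. This is not the paper's algorithm: the median-based filter, the log-ratio weighting, the function names `filter_sources` and `truth_discovery`, the threshold `max_mad_multiple`, and the sample data are all assumptions chosen for illustration.

```python
import math
from statistics import median


def filter_sources(obs, max_mad_multiple=3.0):
    """Return indices of sources whose reports are not extreme outliers.

    obs[s][j] is the value source s reports for object j.
    Hypothetical filter: compare each source's mean deviation from the
    per-object medians against the median absolute deviation (MAD).
    """
    n_sources, n_objects = len(obs), len(obs[0])
    medians = [median(obs[s][j] for s in range(n_sources)) for j in range(n_objects)]
    dev = [sum(abs(obs[s][j] - medians[j]) for j in range(n_objects)) / n_objects
           for s in range(n_sources)]
    center = median(dev)
    # Fall back to a tiny spread when all sources deviate equally.
    spread = median(abs(d - center) for d in dev) or 1e-9
    return [s for s in range(n_sources) if dev[s] - center <= max_mad_multiple * spread]


def truth_discovery(obs, iterations=10, eps=1e-9):
    """Alternate between estimating source weights and object truths."""
    n_sources, n_objects = len(obs), len(obs[0])
    # Start from the per-object median.
    truths = [median(obs[s][j] for s in range(n_sources)) for j in range(n_objects)]
    weights = [1.0] * n_sources
    for _ in range(iterations):
        # Source loss: total squared distance from the current truth estimate.
        losses = [sum((obs[s][j] - truths[j]) ** 2 for j in range(n_objects)) + eps
                  for s in range(n_sources)]
        total = sum(losses)
        # Lower loss gives a higher weight (log-ratio weighting, an assumption).
        weights = [math.log(total / loss) for loss in losses]
        wsum = sum(weights)
        if wsum <= 0:  # e.g. a single source: fall back to uniform weights
            weights, wsum = [1.0] * n_sources, float(n_sources)
        truths = [sum(weights[s] * obs[s][j] for s in range(n_sources)) / wsum
                  for j in range(n_objects)]
    return truths, weights


# Illustrative data: four honest sources and one poisoned source (last row).
observations = [
    [10.1, 19.8, 30.2],
    [9.9, 20.1, 29.9],
    [10.0, 20.3, 30.1],
    [10.2, 19.9, 29.8],
    [50.0, 60.0, 70.0],
]
kept = filter_sources(observations)
truths, weights = truth_discovery([observations[s] for s in kept])
```

In this toy setting the poisoned source is dropped before aggregation, so the weighted average is computed over the remaining sources only. A real attacker would likely be far less obvious than the one in this sample data, which is why the filtering criterion matters in practice.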
- Award ID(s): 1850523
- PAR ID: 10183060
- Date Published:
- Journal Name: 2019 IEEE Global Communications Conference (GLOBECOM)
- Page Range / eLocation ID: 1 to 6
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Spot-level parking availability information (the availability of each spot in a parking lot) is in great demand, as it can help reduce time and energy waste while searching for a parking spot. In this article, we propose a crowdsensing system called SpotE that can provide spot-level availability in a parking lot using drivers’ smartphone sensors. SpotE only requires the sensor data from drivers’ smartphones, which avoids the high cost of installing additional sensors and enables large-scale outdoor deployment. We propose a new model that can use the parking search trajectory and final destination (e.g., an exit of the parking lot) of a single driver in a parking lot to generate the probability profile that contains the probability of each spot being occupied in a parking lot. To deal with conflicting estimation results generated from different drivers, due to the variance in different drivers’ parking behaviors, a novel aggregation approach SpotE-TD is proposed. The proposed aggregation method is based on truth discovery techniques and can handle the variety in Quality of Information of different vehicles. We evaluate our proposed method through a real-life deployment study. Results show that SpotE-TD can efficiently provide spot-level parking availability information with a 20% higher accuracy than the state-of-the-art.
- Rapid growth of sensors and the Internet of Things is transforming society, the economy and the quality of life. Many devices at the extreme edge collect and transmit sensitive information wirelessly for remote computing. The device behavior can be monitored through side-channel emissions, including power consumption and electromagnetic (EM) emissions. This study presents a holistic self-testing approach incorporating nanoscale EM sensing devices and an energy-efficient learning module to detect security threats and malicious attacks directly at the front-end sensors. The built-in threat detection approach using the intelligent EM sensors distributed on the power lines is developed to detect abnormal data activities without degrading the performance while achieving good energy efficiency. The minimal usage of energy and space can allow the energy-constrained wireless devices to have an on-chip detection system to predict malicious attacks rapidly in the front line.
- Today’s smart-grids have seen a clear rise in new ways of energy generation, transmission, and storage. This has not only introduced a huge degree of variability, but also a continual shift away from traditionally centralized generation and storage to distributed energy resources (DERs). In addition, the distributed sensors, energy generators and storage devices, and networking have led to a huge increase in attack vectors that make the grid vulnerable to a variety of attacks. The interconnection between computational and physical components through a largely open, IP-based communication network enables an attacker to cause physical damage through remote cyber-attacks or attack on software-controlled grid operations via physical- or cyber-attacks. Transactive Energy (TE) is an emerging approach for managing increasing DERs in the smart-grids through economic and control techniques. Transactive Smart-Grids use the TE approach to improve grid reliability and efficiency. However, skepticism remains in their full-scale viability for ensuring grid reliability. In addition, different TE approaches, in specific situations, can lead to very different outcomes in grid operations. In this paper, we present a comprehensive web-based platform for evaluating resilience of smart-grids against a variety of cyber- and physical-attacks and evaluating impact of various TE approaches on grid performance. We also provide several case-studies demonstrating evaluation of TE approaches as well as grid resilience against cyber and physical attacks.
- While prior studies have designed incentive mechanisms to attract the public to share their collected data, they tend to ignore information asymmetry between data requesters and collectors. In reality, sensing cost information (time cost, battery drainage, bandwidth occupation of mobile devices, and so on) is private to the collectors and unknown to the data requester. In this article, we model the strategic interactions between health-data requesters and collectors using a bilevel optimization model. Considering that the crowdsensing market is open and the participants are equal, we propose a Walrasian equilibrium-based pricing mechanism to coordinate the interest conflicts between health-data requesters and collectors. Specifically, based on the exchange economic theory, we transform the bilevel optimization problem into a social welfare maximization problem subject to the constraint that supply and demand are balanced. Dual decomposition is then employed to divide the social welfare maximization problem into a set of subproblems that can be solved by health-data requesters and collectors. We prove that the optimal task price is equal to the marginal utility generated by the collector's health data. To avoid obtaining the collector's private information, a distributed iterative algorithm is then designed to obtain the optimal task pricing strategy. Furthermore, we conduct computational experiments to evaluate the performance of the proposed pricing mechanism and analyze the effects of intrinsic rewards and sensing costs on optimal task prices and collectors' health-data supplies.

