Title: CellSense: Human Mobility Recovery via Cellular Network Data Enhancement
Data from cellular networks have proven to be one of the most promising ways to understand large-scale human mobility for various ubiquitous computing applications, owing to the high penetration of cellphones and low collection cost. Existing mobility models driven by cellular network data suffer from sparse spatial-temporal observations because user locations are recorded only with cellphone activities, e.g., calls, texts, or internet access. In this paper, we design a human mobility recovery system called CellSense that takes sparse cellular billing data (CBR) as input and outputs dense continuous records, recovering the sensing gap that arises when cellular networks are used as sensing systems for human mobility. There is limited work on this kind of recovery system at large scale: although it is straightforward to design a recovery system based on regression models, it is very challenging to evaluate such models at large scale due to the lack of ground truth data. We explore a new opportunity based on the upgrade of cellular infrastructures to obtain cellular network signaling data as the ground truth; these data log the interaction between cellphones and cellular towers at the signal level (e.g., attaching, detaching, paging) even without billable activities. Based on the signaling data, CellSense recovers human mobility by integrating collective mobility patterns with individual mobility modeling, achieving a 35.3% improvement over state-of-the-art models. The key application of our recovery model is to take the regular sparse CBR data that a researcher already has and recover the data missing due to CBR sensing gaps, producing dense cellular data with which to train a machine learning model for use cases such as next location prediction.
Award ID(s):
1849238 2047822 1951890 1952096 2003874 1932223
PAR ID:
10436083
Journal Name:
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
Volume:
5
Issue:
3
ISSN:
2474-9567
Page Range / eLocation ID:
1 to 22
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Producing dense 3D reconstructions from biological imaging data is a challenging instance segmentation task that requires significant ground-truth training data for effective and accurate deep learning-based models. Generating training data requires intense human effort to annotate each instance of an object across serial section images. Our focus is on the especially complicated brain neuropil, comprising an extensive interdigitation of dendritic, axonal, and glial processes visualized through serial section electron microscopy. We developed a novel deep learning-based method to generate dense 3D segmentations rapidly from sparse 2D annotations of a few objects on single sections. Models trained on the rapidly generated segmentations achieved accuracy similar to that of models trained on expert dense ground-truth annotations. Human time to generate annotations was reduced by three orders of magnitude, and the annotations could be produced by non-expert annotators. This capability will democratize generation of training data for large image volumes needed to achieve brain circuits and measures of circuit strengths.
  2. Human mobility modeling has many applications in location-based services, mobile networking, city management, and epidemiology. Previous sensing approaches for human mobility fall mainly into two types: stationary sensing systems (e.g., surveillance cameras and toll booths) and mobile sensing systems (e.g., smartphone apps and vehicle tracking devices). However, stationary sensing systems only provide mobility information for people within limited coverage (e.g., camera-equipped roads), and mobile sensing systems only capture a limited number of people (e.g., people using a particular smartphone app). In this work, we design a novel system, Mohen, to model human mobility with a heterogeneous sensing system. The key novelty of Mohen is to fundamentally extend the sensing coverage of a large-scale stationary sensing system with a small-scale sensing system. Based on an evaluation on data from real-world urban sensing systems, our system outperforms them by 35% and achieves a result competitive with an Oracle method.

     
  3. Accurate and up-to-date digital road maps are the foundation of many mobile applications, such as navigation and autonomous driving. A manually created map is costly to create and maintain because the road network is constantly updated. Recently, the ubiquity of GPS devices in vehicular systems has led to an unprecedented amount of vehicle sensing data for map inference. Unfortunately, accurate map inference based on vehicle GPS is challenging for two reasons. First, it is difficult to infer complete road structures due to the sensing deviation, sparse coverage, and low GPS sampling rate of a fleet of vehicles with similar mobility patterns, e.g., taxis. Second, a road map requires various road properties, such as road categories, which are difficult to infer from vehicle GPS locations alone. In this paper, we design a map inference system called coMap by considering multiple fleets of vehicles with Complementary Mobility Features. coMap has two key components: a graph-based map sketching component and a learning-based map painting component. We implement coMap with data from four type-aware vehicular sensing systems in one city, which consist of 18 thousand taxis, 10 thousand private vehicles, 6 thousand trucks, and 14 thousand buses. We conduct a comprehensive evaluation of coMap against two state-of-the-art baselines, with ground truth based on OpenStreetMap and a commercial map provider, i.e., Baidu Maps. The results show that (i) for map sketching, our work improves performance by 15.9%; (ii) for map painting, our work achieves 74.58% average accuracy on road category classification.
  4. Wide-scale sensing of natural and human-made events is critical for protecting against environmental disasters and reducing the monetary losses associated with telecommunication service downtime. However, achieving dense sensing coverage is difficult, given the high deployment overhead of modern sensor networks. Here we offer an in-depth exploration of state-of-polarization sensing over fiber-optic networks using unmodified optical transceivers to establish a strong correlation with ground truth distributed acoustic sensing. To validate our sensing methodology, we collect 85 days of polarization and distributed acoustic sensing measurements along two colocated, 50 km fiber-optic cables in Southern California. We then examine how polarization sensing can improve network reliability by accurately modeling overall network health and preemptively detecting traffic loss. Finally, we explore the feasibility of wide-scale seismic monitoring with polarization sensing, showcasing the polarization perturbations following low-intensity earthquakes and the potential to more than double seismic monitoring coverage in Southern California alone.

     
  5. Sparse coding, which refers to modeling a signal as sparse linear combinations of the elements of a learned dictionary, has proven to be a successful (and interpretable) approach in applications such as signal processing, computer vision, and medical imaging. While this success has spurred much work on provable guarantees for dictionary recovery when the learned dictionary is the same size as the ground-truth dictionary, work on the setting where the learned dictionary is larger (or over-realized) with respect to the ground truth is comparatively nascent. Existing theoretical results in this setting have been constrained to the case of noise-less data. We show in this work that, in the presence of noise, minimizing the standard dictionary learning objective can fail to recover the elements of the ground-truth dictionary in the over-realized regime, regardless of the magnitude of the signal in the data-generating process. Furthermore, drawing from the growing body of work on self-supervised learning, we propose a novel masking objective for which recovering the ground-truth dictionary is in fact optimal as the signal increases for a large class of data-generating processes. We corroborate our theoretical results with experiments across several parameter regimes showing that our proposed objective also enjoys better empirical performance than the standard reconstruction objective. 