Human mobility modeling has many applications in location-based services, mobile networking, city management, and epidemiology. Previous sensing approaches for human mobility fall mainly into two types: stationary sensing systems (e.g., surveillance cameras and toll booths) and mobile sensing systems (e.g., smartphone apps and vehicle tracking devices). However, stationary sensing systems provide mobility information only within limited coverage (e.g., camera-equipped roads), and mobile sensing systems capture only a limited number of people (e.g., people using a particular smartphone app). In this work, we design a novel system, Mohen, to model human mobility with a heterogeneous sensing system. The key novelty of Mohen is to fundamentally extend the sensing coverage of a large-scale stationary sensing system with a small-scale sensing system. Based on an evaluation on data from real-world urban sensing systems, our system outperforms them by 35% and achieves a result competitive with an Oracle method.
                            CellSense: Human Mobility Recovery via Cellular Network Data Enhancement
                        
                    
    
            Data from cellular networks have proven to be one of the most promising ways to understand large-scale human mobility for various ubiquitous computing applications, owing to the high penetration of cellphones and low collection cost. Existing mobility models driven by cellular network data suffer from sparse spatial-temporal observations because user locations are recorded with cellphone activities, e.g., calls, texts, or internet access. In this paper, we design a human mobility recovery system called CellSense that takes sparse cellular billing data (CBR) as input and outputs dense continuous records, closing the sensing gap that arises when cellular networks are used to sense human mobility. There is limited work on this kind of recovery system at large scale: although it is straightforward to design a recovery system based on regression models, it is very challenging to evaluate these models at large scale due to the lack of ground truth data. We explore a new opportunity based on the upgrade of cellular infrastructures to obtain cellular network signaling data as the ground truth; these data log the interaction between cellphones and cellular towers at the signal level (e.g., attaching, detaching, paging) even without billable activities. Based on the signaling data, we design CellSense for human mobility recovery by integrating collective mobility patterns with individual mobility modeling, which achieves a 35.3% improvement over state-of-the-art models. The key application of our recovery model is to take the regular sparse CBR data that a researcher already has and recover the data missing due to sensing gaps, producing dense cellular data with which they can train a machine learning model for their use cases, e.g., next location prediction.
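            As a rough illustration of the recovery task described above (not the CellSense method itself), the sketch below fills the gaps between sparse records by linearly interpolating coordinates onto a regular time grid. The record layout (timestamp in seconds, latitude, longitude), the function name `recover_dense_track`, and the 60-second step are all assumptions made for this example; the abstract does not specify them.

```python
from bisect import bisect_right
from typing import List, Tuple

# Hypothetical record layout: (timestamp in seconds, latitude, longitude).
Record = Tuple[float, float, float]


def recover_dense_track(sparse: List[Record], step_s: float = 60.0) -> List[Record]:
    """Naive baseline: linearly interpolate positions between sparse records."""
    if len(sparse) < 2:
        return list(sparse)
    records = sorted(sparse)
    times = [r[0] for r in records]
    # Number of whole steps that fit between the first and last observation.
    n_steps = int((times[-1] - times[0]) // step_s)
    dense: List[Record] = []
    for k in range(n_steps + 1):
        t = times[0] + k * step_s
        # Index of the last observation at or before t, clipped so that
        # records[i + 1] always exists.
        i = min(bisect_right(times, t) - 1, len(records) - 2)
        t0, lat0, lon0 = records[i]
        t1, lat1, lon1 = records[i + 1]
        w = (t - t0) / (t1 - t0) if t1 > t0 else 0.0
        dense.append((t, lat0 + w * (lat1 - lat0), lon0 + w * (lon1 - lon0)))
    return dense


if __name__ == "__main__":
    sparse_cbr = [(0, 10.0, 20.0), (120, 12.0, 24.0)]
    for row in recover_dense_track(sparse_cbr, step_s=60.0):
        print(row)
```

            A baseline of this kind could be scored against signaling data used as ground truth, as the abstract describes, by comparing each interpolated point with the signaling record closest in time. It ignores the collective mobility patterns that the abstract says CellSense integrates.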
        
    
    
                            - PAR ID:
- 10436083
- Date Published:
- Journal Name:
- Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
- Volume:
- 5
- Issue:
- 3
- ISSN:
- 2474-9567
- Page Range / eLocation ID:
- 1 to 22
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- 
            Accurate and up-to-date digital road maps are the foundation of many mobile applications, such as navigation and autonomous driving. A manually created map is costly to create and maintain because the road network is constantly updated. Recently, the ubiquity of GPS devices in vehicular systems has led to an unprecedented amount of vehicle sensing data for map inference. Unfortunately, accurate map inference based on vehicle GPS is challenging for two reasons. First, it is challenging to infer complete road structures due to the sensing deviation, sparse coverage, and low sampling rate of GPS of a fleet of vehicles with similar mobility patterns, e.g., taxis. Second, a road map requires various road properties such as road categories, which are challenging to infer from the GPS locations of vehicles alone. In this paper, we design a map inference system called coMap by considering multiple fleets of vehicles with Complementary Mobility Features. coMap has two key components: a graph-based map sketching component and a learning-based map painting component. We implement coMap with the data from four type-aware vehicular sensing systems in one city, which consist of 18 thousand taxis, 10 thousand private vehicles, 6 thousand trucks, and 14 thousand buses. We conduct a comprehensive evaluation of coMap with two state-of-the-art baselines along with ground truth based on OpenStreetMap and a commercial map provider, i.e., Baidu Maps. The results show that (i) for map sketching, our work improves the performance by 15.9%; (ii) for map painting, our work achieves 74.58% average accuracy on road category classification.
- 
            Producing dense 3D reconstructions from biological imaging data is a challenging instance segmentation task that requires significant ground-truth training data for effective and accurate deep learning-based models. Generating training data requires intense human effort to annotate each instance of an object across serial section images. Our focus is on the especially complicated brain neuropil, comprising an extensive interdigitation of dendritic, axonal, and glial processes visualized through serial section electron microscopy. We developed a novel deep learning-based method to generate dense 3D segmentations rapidly from sparse 2D annotations of a few objects on single sections. Models trained on the rapidly generated segmentations achieved accuracy similar to that of models trained on expert dense ground-truth annotations. Human time to generate annotations was reduced by three orders of magnitude, and the annotations could be produced by non-expert annotators. This capability will democratize generation of training data for large image volumes needed to achieve brain circuits and measures of circuit strengths.
- 
            Sparse coding, which refers to modeling a signal as sparse linear combinations of the elements of a learned dictionary, has proven to be a successful (and interpretable) approach in applications such as signal processing, computer vision, and medical imaging. While this success has spurred much work on provable guarantees for dictionary recovery when the learned dictionary is the same size as the ground-truth dictionary, work on the setting where the learned dictionary is larger (or over-realized) with respect to the ground truth is comparatively nascent. Existing theoretical results in this setting have been constrained to the case of noise-less data. We show in this work that, in the presence of noise, minimizing the standard dictionary learning objective can fail to recover the elements of the ground-truth dictionary in the over-realized regime, regardless of the magnitude of the signal in the data-generating process. Furthermore, drawing from the growing body of work on self-supervised learning, we propose a novel masking objective for which recovering the ground-truth dictionary is in fact optimal as the signal increases for a large class of data-generating processes. We corroborate our theoretical results with experiments across several parameter regimes showing that our proposed objective also enjoys better empirical performance than the standard reconstruction objective.
- 
            Empirical evidence suggests that for a variety of overparameterized nonlinear models, most notably in neural network training, the growth of the loss around a minimizer strongly impacts its performance. Flat minima (those around which the loss grows slowly) appear to generalize well. This work takes a step towards understanding this phenomenon by focusing on the simplest class of overparameterized nonlinear models: those arising in low-rank matrix recovery. We analyse overparameterized matrix and bilinear sensing, robust principal component analysis, covariance matrix estimation and single hidden layer neural networks with quadratic activation functions. In all cases, we show that flat minima, measured by the trace of the Hessian, exactly recover the ground truth under standard statistical assumptions. For matrix completion, we establish weak recovery, although empirical evidence suggests exact recovery holds here as well. We complete the paper with synthetic experiments that illustrate our findings.
 An official website of the United States government