The availability of trajectory data, combined with a range of practical real-life applications, has spurred the research community to design a plethora of algorithms for trajectory analysis. However, there is an apparent lack of full-fledged systems that provide infrastructure support for trajectory analysis techniques, which hinders the applicability of most of these algorithms. Inspired by the tremendous success of the Bidirectional Encoder Representations from Transformers (BERT) deep learning model in solving various Natural Language Processing tasks, our vision is to have a BERT-like system for trajectory analysis tasks. We envision that in a few years we will have such a system, so that no one needs to worry again about each specific trajectory analysis operation. Whether the task is trajectory imputation, similarity, clustering, or any other operation, there would be one system that researchers, developers, and practitioners can deploy to get high accuracy for their trajectory operations. Our vision rests on the observation that trajectories in a space are highly analogous to statements in a language. We outline the challenges and the road to our vision. Exploratory results confirm the promise and feasibility of our vision.
-
Free, publicly-accessible full text available June 30, 2025
-
Though data cleaning systems have achieved great success and widespread adoption in both academia and industry, they fall short when trying to clean spatial data. The main reason is that state-of-the-art data cleaning systems mainly rely on functional dependency rules, which require sufficient co-occurrence of value pairs to learn that a certain value of one attribute leads to a corresponding value of another attribute. However, for spatial attributes that represent locations, there is very little chance that two records would have the same exact coordinates, and hence co-occurrence is unlikely to exist. This paper presents Sparcle (SPatially-AwaRe CLEaning), a novel framework that injects spatial awareness into the core engine of rule-based data cleaning systems through two main concepts: (1) Spatial Neighborhood, where co-occurrence is relaxed to mean lying within a certain spatial proximity rather than having the same exact value, and (2) Distance Weighting, where each record's contribution to whether a dependency rule is satisfied is weighted by its relative distance. Experimental results using a real deployment of Sparcle inside a state-of-the-art data cleaning system, with both real and synthetic datasets, show that Sparcle significantly boosts the accuracy of data cleaning systems when dealing with spatial data.
Free, publicly-accessible full text available May 1, 2025
-
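The two Sparcle concepts described above, Spatial Neighborhood and Distance Weighting, can be illustrated with a minimal sketch. This is not Sparcle's actual implementation: the function name `weighted_neighbor_vote`, the planar Euclidean distance, the fixed `radius`, and the inverse-distance weight `1 / (1 + d)` are all assumptions chosen for illustration only.

```python
import math
from collections import defaultdict


def weighted_neighbor_vote(records, target_xy, radius):
    """Score candidate attribute values for a record located at target_xy.

    records: list of ((x, y), value) pairs (hypothetical layout).
    Only records within `radius` of the target count (spatial neighborhood),
    and closer records count more (distance weighting).
    Returns {value: normalized weight}; empty if no neighbor is in range.
    """
    votes = defaultdict(float)
    for (x, y), value in records:
        d = math.hypot(x - target_xy[0], y - target_xy[1])
        if d <= radius:
            # Assumed weighting scheme: inverse distance, not taken from the paper.
            votes[value] += 1.0 / (1.0 + d)
    total = sum(votes.values())
    return {v: w / total for v, w in votes.items()} if total else {}


# Toy data: coordinates paired with a postal code attribute.
records = [
    ((0.0, 0.0), "55455"),
    ((0.1, 0.0), "55455"),
    ((0.0, 0.2), "55414"),
    ((5.0, 5.0), "10001"),  # too far away to influence the target
]
scores = weighted_neighbor_vote(records, (0.05, 0.0), radius=1.0)
best = max(scores, key=scores.get)
```

In this toy setting no two records share exact coordinates, yet nearby records still provide evidence for the most plausible attribute value, which is the relaxation of co-occurrence that the abstract describes.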
Free, publicly-accessible full text available June 24, 2025
-
Free, publicly-accessible full text available May 13, 2025
-
Numerous important applications rely on detailed trajectory data. Unfortunately, trajectory datasets are typically sparse, with large spatial and temporal gaps between consecutive points, which is a major hurdle for the accuracy of these applications. This paper presents Kamel, a scalable trajectory imputation system that inserts additional realistic trajectory points, boosting the accuracy of trajectory applications. Kamel maps the trajectory imputation problem to the problem of finding a missing word, a classical problem in the natural language processing (NLP) community. This allows employing the widely used BERT model for trajectory imputation. However, BERT, as is, does not lend itself to the special characteristics of trajectories. Hence, Kamel starts from BERT, but then adds spatial awareness to its operations, adjusts trajectory data to be closer to the nature of language data, and adds multipoint imputation ability, all encapsulated in one system. Experimental results based on real datasets show that Kamel significantly outperforms its competitors and is applicable to city-scale trajectories, large gaps, and tight accuracy thresholds.
Free, publicly-accessible full text available November 1, 2024
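The mapping from trajectory imputation to a missing-word problem, as described for Kamel above, can be sketched as a preprocessing step. This is only an illustration under stated assumptions, not Kamel's actual pipeline: the square grid, the token format `c<col>_<row>`, the `max_gap` threshold, and the function names are hypothetical, and the sketch stops before any language model is invoked.

```python
import math


def point_to_token(x, y, cell_size):
    """Map a planar point to a grid-cell token (assumed square grid)."""
    col = math.floor(x / cell_size)
    row = math.floor(y / cell_size)
    return f"c{col}_{row}"


def trajectory_to_sentence(points, cell_size, max_gap):
    """Turn a trajectory into a token "sentence" with [MASK] at large gaps.

    points: list of (x, y) tuples in travel order.
    A single [MASK] is placed wherever two consecutive points are more than
    `max_gap` apart; a masked language model could then be asked to fill it.
    """
    tokens = []
    prev = None
    for x, y in points:
        if prev is not None and math.hypot(x - prev[0], y - prev[1]) > max_gap:
            tokens.append("[MASK]")
        token = point_to_token(x, y, cell_size)
        # Skip consecutive repeats so the sentence reads like distinct "words".
        if not tokens or tokens[-1] != token:
            tokens.append(token)
        prev = (x, y)
    return " ".join(tokens)


# Toy trajectory in meters with one large gap between the 2nd and 3rd points.
points = [(10, 10), (110, 20), (620, 40), (710, 60)]
sentence = trajectory_to_sentence(points, cell_size=100, max_gap=300)
```

The resulting sentence has the same shape as a masked-language-modeling input, which is what makes a BERT-style model applicable. Filling a gap with several points rather than one would need additional logic, which this sketch does not attempt.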