This paper develops the partial trajectory method to align the views from successive fixed cameras that are used for video-based vehicle tracking across multiple camera views. The method is envisioned as a validation tool for whatever alignment has already been performed between the cameras, ensuring high fidelity with the actual vehicle movements as they cross the boundaries between cameras. The strength of the method is that it operates on the output of vehicle tracking in each camera rather than on secondary features visible in the camera view that are unrelated to the traffic dynamics (e.g., fixed fiducial points), thereby providing a direct feedback path from the tracking to ensure the quality of the alignment in the context of the traffic dynamics. The method uses vehicle trajectories within successive camera views along a freeway to deduce the presence of an overlap or a gap between those cameras and to quantify how large the overlap or gap is. The partial trajectory method can also detect scale factor errors between successive cameras. If any error is detected, ideally one would redo the original camera alignment; if that is not possible, one could use the calculations from the algorithm to correct the existing alignment post hoc. This research manually re-extracted the individual vehicle trajectories within each of the seven camera views from the NGSIM I-80 dataset. These trajectories are simply an input to the algorithm; the resulting method transcends the dataset and should be applicable to most methods that seek to extract vehicle trajectories across successive cameras. That said, the results reveal fundamental errors in the NGSIM dataset, including unaccounted-for overlap at the boundaries between successive cameras, which leads to systematic speed and acceleration errors at the six camera interfaces. The method also found scale factor errors in the original NGSIM homographies. In response to these findings, we identified a new aerial photo of the NGSIM site and generated new homographies. To evaluate the impact of the partial trajectory method on the actual trajectory data, the manually re-extracted data were projected into the new coordinate system and smoothed. The re-extracted data show much greater fidelity to the actual vehicle motion, track the vehicles over a 14% longer distance, and include 23% more vehicles compared to the original NGSIM dataset. As of publication, the re-extracted data from this paper will be released to the research community.
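For illustration, the following is a minimal sketch, under assumed data structures (per-vehicle dictionaries of frame number to longitudinal position, already projected to a common roadway coordinate), of how a per-vehicle offset at a camera seam could be measured from partial trajectories. The function names and the median statistic are illustrative; this is not the paper's implementation.

```python
# Minimal sketch (not the paper's implementation): estimating the longitudinal
# offset at the seam between two successive cameras from one vehicle's partial
# trajectories, assuming both are in a common roadway coordinate system.
import numpy as np

def boundary_offset(traj_up, traj_down):
    """traj_up / traj_down: dicts of frame -> longitudinal position (ft) for the
    same vehicle in the upstream and downstream cameras."""
    common = sorted(set(traj_up) & set(traj_down))
    if common:
        # Both cameras see the vehicle at the same instants: a systematic
        # difference in reported position points to an alignment error.
        return float(np.median([traj_down[f] - traj_up[f] for f in common]))
    # No common frames: extrapolate the upstream trajectory at its last observed
    # speed across the unseen region and compare with the first downstream point.
    fu, fd = sorted(traj_up), sorted(traj_down)
    speed = (traj_up[fu[-1]] - traj_up[fu[-2]]) / (fu[-1] - fu[-2])
    predicted = traj_up[fu[-1]] + speed * (fd[0] - fu[-1])
    return float(traj_down[fd[0]] - predicted)

def seam_offset(vehicle_pairs):
    """Median over many vehicles; a consistent nonzero value suggests an
    unaccounted-for overlap or gap (or a scale error) at the camera boundary."""
    return float(np.median([boundary_offset(u, d) for u, d in vehicle_pairs]))
```

Aggregating such per-vehicle offsets across many vehicles is what lets a consistent overlap, gap, or scale error at the seam stand out from individual tracking noise.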
Fast Vehicle Identification in Surveillance via Ranked Semantic Sampling Based Embedding
Identifying vehicles across cameras in traffic surveillance is fundamentally important for public safety purposes. However, despite some preliminary work, rapid vehicle search in large-scale datasets has not been investigated. Moreover, modelling a view-invariant similarity between vehicle images from different views remains highly challenging. To address these problems, in this paper we propose a Ranked Semantic Sampling (RSS) guided binary embedding method for fast cross-view vehicle Re-IDentification (Re-ID). The search can be conducted by efficiently computing similarities in the projected space. Unlike previous methods that use random sampling, we design tree-structured attributes to guide the mini-batch sampling. The ranked pairs of hard samples in the mini-batch improve the convergence of optimization. By minimizing a novel ranked semantic distance loss defined according to this structure, the learned Hamming distance is view-invariant, which enables cross-view Re-ID. The experimental results demonstrate that RSS outperforms the state-of-the-art approaches and that the learned embedding from one dataset can be transferred to achieve vehicle Re-ID on another dataset.
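The speed claim hinges on searching in a binary embedding space. The sketch below uses a random projection as a stand-in for the learned RSS embedding, purely to show how, once codes are binary, cross-view retrieval reduces to Hamming-distance ranking; all dimensions and names are illustrative assumptions.

```python
# Minimal sketch of retrieval with binary codes (the learned RSS embedding is
# replaced by a random projection purely for illustration).
import numpy as np

rng = np.random.default_rng(0)
D, B = 512, 128                                  # feature dim and code length (assumed)
W = rng.standard_normal((D, B))                  # stand-in for the learned projection

def encode(features):
    """Map real-valued image features to {0,1} binary codes."""
    return (features @ W > 0).astype(np.uint8)

gallery = encode(rng.standard_normal((10_000, D)))  # vehicle images from many cameras
query = encode(rng.standard_normal((1, D)))          # probe image from another view

# Hamming distance = number of differing bits; ranking the whole gallery is a
# single vectorised bit comparison rather than a float similarity computation.
dists = (gallery != query).sum(axis=1)
top_k = np.argsort(dists)[:10]                       # candidate re-identifications
```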
- Award ID(s): 1633753
- PAR ID: 10074628
- Date Published:
- Journal Name: 27th International Joint Conference on Artificial Intelligence (IJCAI 2018)
- Page Range / eLocation ID: 3697 to 3703
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
In natural language processing, most models try to learn semantic representations merely from texts. The learned representations encode the “distributional semantics” but fail to connect to any knowledge about the physical world. In contrast, humans learn language by grounding concepts in perception and action, and the brain encodes “grounded semantics” for cognition. Inspired by this notion and recent work in vision-language learning, we design a two-stream model for grounding language learning in vision. The model includes a VGG-based visual stream and a BERT-based language stream. The two streams merge into a joint representational space. Through cross-modal contrastive learning, the model first learns to align visual and language representations with the MS COCO dataset. The model further learns to retrieve visual objects with language queries through a cross-modal attention module and to infer the visual relations between the retrieved objects through a bilinear operator with the Visual Genome dataset. After training, the model’s language stream is a stand-alone language model capable of embedding concepts in a visually grounded semantic space. This semantic space manifests principal dimensions explainable with human intuition and neurobiological knowledge. Word embeddings in this semantic space are predictive of human-defined norms of semantic features and are segregated into perceptually distinctive clusters. Furthermore, the visually grounded language model also enables compositional language understanding based on visual knowledge and multimodal image search with queries based on images, texts, or their combinations.
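For context, cross-modal contrastive alignment of an image stream and a text stream is commonly implemented with a symmetric InfoNCE-style objective; the sketch below shows that generic form and is not this paper's exact loss or architecture.

```python
# Generic symmetric image-text contrastive loss (InfoNCE form); a sketch of the
# alignment objective, not the paper's implementation.
import torch
import torch.nn.functional as F

def contrastive_loss(img_emb, txt_emb, temperature=0.07):
    img = F.normalize(img_emb, dim=-1)                # outputs of the visual stream
    txt = F.normalize(txt_emb, dim=-1)                # outputs of the language stream
    logits = img @ txt.t() / temperature              # (N, N) cosine similarities
    targets = torch.arange(img.size(0), device=img.device)
    # Matched image-caption pairs sit on the diagonal; each modality is pulled
    # toward its partner and pushed away from the other items in the batch.
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))
```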
-
Retrieval and recommendation are two essential tasks in modern search tools. This paper introduces a novel retrieval-reranking framework leveraging large language models to enhance the spatiotemporally and semantically associated mining and recommendation of relevant, unusual climate and environmental events described in news articles and web posts. This framework uses advanced natural language processing techniques to address the limitations of traditional manual curation methods in terms of high labor costs and lack of scalability. Specifically, we explore an optimized solution to employ cutting-edge embedding models for semantically analyzing spatiotemporal events (news) and propose a Geo-Time Re-ranking strategy that integrates multi-faceted criteria including spatial proximity, temporal association, semantic similarity, and category-instructed similarity to rank and identify similar spatiotemporal events. We apply the proposed framework to a dataset of four thousand local environmental observer network events, achieving top performance on recommending similar events among multiple cutting-edge dense retrieval models. The search and recommendation pipeline can be applied to a wide range of similar data search tasks dealing with geospatial and temporal data. We hope that by linking relevant events, we can better aid the general public to gain enhanced understanding on climate change and its impact on different communities.
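As a rough illustration of how such multi-faceted criteria can be combined, the sketch below scores candidates with a weighted sum of semantic, spatial, temporal, and category terms; the weights, decay constants, and field names are assumptions, not the paper's Geo-Time Re-ranking function.

```python
# Sketch of a multi-criteria re-ranking step (illustrative weights and decays only).
from dataclasses import dataclass
from math import exp, radians, sin, cos, asin, sqrt

@dataclass
class Event:
    id: str
    lat: float
    lon: float
    day: int            # days since some reference date
    dense_sim: float    # semantic similarity to the query event from a dense retriever
    same_category: bool # whether the candidate shares the query's event category

def haversine_km(lat1, lon1, lat2, lon2):
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def rerank(query: Event, candidates, w_sem=0.4, w_geo=0.25, w_time=0.25, w_cat=0.1):
    def score(e: Event):
        geo = exp(-haversine_km(query.lat, query.lon, e.lat, e.lon) / 500)  # spatial proximity
        time = exp(-abs(query.day - e.day) / 30)                            # temporal association
        cat = 1.0 if e.same_category else 0.0                               # category-instructed term
        return w_sem * e.dense_sim + w_geo * geo + w_time * time + w_cat * cat
    return sorted(candidates, key=score, reverse=True)
```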
-
Table search aims to answer a query with a ranked list of tables. Unfortunately, current test corpora have focused mostly on needle-in-the-haystack tasks, where only a few tables are expected to exactly match the query intent. Instead, table search tasks often arise in response to the need for retrieving new datasets or augmenting existing ones, e.g., for data augmentation within data science or machine learning pipelines. Existing table repositories and benchmarks are limited in their ability to test retrieval methods for table search tasks. Thus, to close this gap, we introduce a novel dataset for query-by-example Semantic Table Search. This novel dataset consists of two snapshots of the large-scale Wikipedia tables collection from 2013 and 2019 with two important additions: (1) a page and topic aware ground truth relevance judgment and (2) a large-scale DBpedia entity linking annotation. Moreover, we generate a novel set of entity-centric queries that allows testing existing methods under a novel search scenario: semantic exploratory search. The resulting resource consists of 9,296 novel queries, 610,553 query-table relevance annotations, and 238,038 entity-linked tables from the 2013 snapshot. Similarly, on the 2019 snapshot, the resource consists of 2,560 queries, 958,214 relevance annotations, and 457,714 total tables. This makes our resource the largest annotated table-search corpus to date (97 times more queries and 956 times more annotated tables than any existing benchmark). We perform a user study among domain experts and prove that these annotators agree with the automatically generated relevance annotations. As a result, we can re-evaluate some basic assumptions behind existing table search approaches, identifying their shortcomings along with promising novel research directions.
-
In search applications, autonomous unmanned vehicles must be able to efficiently reacquire and localize mobile targets that can remain out of view for long periods of time in large spaces. As such, all available information sources must be actively leveraged - including imprecise but readily available semantic observations provided by humans. To achieve this, this work develops and validates a novel collaborative human-machine sensing solution for dynamic target search. Our approach uses continuous partially observable Markov decision process (CPOMDP) planning to generate vehicle trajectories that optimally exploit imperfect detection data from onboard sensors, as well as semantic natural language observations that can be specifically requested from human sensors. The key innovation is a scalable hierarchical Gaussian mixture model formulation for efficiently solving CPOMDPs with semantic observations in continuous dynamic state spaces. The approach is demonstrated and validated with a real human-robot team engaged in dynamic indoor target search and capture scenarios on a custom testbed.
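To illustrate the flavor of fusing a human's semantic report into a continuous belief, the sketch below performs a Bayes update of a flat Gaussian mixture with a Gaussian "near this landmark" likelihood. The landmark model and the non-hierarchical mixture are simplifying assumptions; the paper's hierarchical GM formulation for CPOMDP planning is considerably richer.

```python
# Sketch: Bayes update of a Gaussian mixture belief with one semantic observation
# modelled as a Gaussian likelihood centred on a named landmark (illustrative only).
import numpy as np
from scipy.stats import multivariate_normal

def fuse_semantic_obs(weights, means, covs, landmark_xy, obs_cov):
    """Update a GM belief over the target's 2D position given a report like
    'the target is near the kitchen', with the kitchen at landmark_xy."""
    new_w, new_m, new_c = [], [], []
    R_inv = np.linalg.inv(obs_cov)
    for w, m, S in zip(weights, means, covs):
        # Weight scales by the evidence of the observation under this component.
        w_new = w * multivariate_normal.pdf(landmark_xy, mean=m, cov=S + obs_cov)
        # Product of two Gaussians: information-form update per component.
        S_new = np.linalg.inv(np.linalg.inv(S) + R_inv)
        m_new = S_new @ (np.linalg.inv(S) @ m + R_inv @ landmark_xy)
        new_w.append(w_new); new_m.append(m_new); new_c.append(S_new)
    new_w = np.array(new_w)
    return new_w / new_w.sum(), new_m, new_c

# Two-component prior belief over the target's position (metres).
weights = [0.5, 0.5]
means = [np.array([2.0, 1.0]), np.array([8.0, 6.0])]
covs = [np.eye(2), np.eye(2) * 2.0]
w, m, c = fuse_semantic_obs(weights, means, covs,
                            landmark_xy=np.array([7.5, 5.5]), obs_cov=np.eye(2) * 1.5)
print(w)  # probability mass shifts toward the component nearest the reported landmark
```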