Existing methods for pedestrian trajectory prediction learn and predict trajectories in the 2D image space. In this work, we observe that it is much more efficient to learn and predict pedestrian trajectories in 3D space, since human motion occurs in the 3D physical world and pedestrian behavior patterns are better represented there. To this end, we use a stereo camera system to detect and track human pose with deep neural networks. During pose estimation, these twin deep neural networks satisfy the stereo consistency constraint. We adapt the existing SocialGAN method from the 2D to the 3D space to perform pedestrian trajectory prediction. Our extensive experimental results demonstrate that our proposed method significantly improves pedestrian trajectory prediction performance, outperforming existing state-of-the-art methods.
Learning a Pedestrian Social Behavior Dictionary
Understanding pedestrian behavior patterns is key for building autonomous agents that can navigate among humans. We seek a learned dictionary of pedestrian behavior to obtain a semantic description of pedestrian trajectories. Supervised methods for dictionary learning are often impractical since pedestrian behaviors may be unknown a priori and manually generating behavior labels is prohibitively time-consuming. We utilize a novel, unsupervised framework to create a taxonomy of pedestrian behavior observed in a specific space. First, we learn a trajectory latent space that enables unsupervised clustering to create an interpretable pedestrian behavior dictionary. Then, we show the utility of this dictionary for building pedestrian behavior maps to visualize space usage patterns and for computing distributions of behaviors in a space. We demonstrate simple but effective trajectory prediction by conditioning on these behavior labels. While many trajectory analysis methods rely on RNNs or transformers, we develop a lightweight, low-parameter approach and show results outperforming the state of the art (SOTA) on the ETH and UCY datasets.
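The general idea of clustering trajectory embeddings into a behavior dictionary, and then computing a distribution of behaviors, can be illustrated with a minimal sketch. It is not the authors' method: the `embed` function below is a hypothetical hand-crafted stand-in for the learned latent space, plain k-means is an assumed choice of clustering algorithm, and all function names are illustrative.

```python
import numpy as np

def embed(trajectory):
    """Hypothetical stand-in for a learned encoder: mean velocity plus path straightness."""
    traj = np.asarray(trajectory, dtype=float)          # shape (T, 2), x/y positions
    steps = np.diff(traj, axis=0)                       # per-step displacement
    path_len = np.linalg.norm(steps, axis=1).sum()
    net_disp = np.linalg.norm(traj[-1] - traj[0])
    straightness = net_disp / path_len if path_len > 0 else 0.0
    mean_vel = steps.mean(axis=0)
    return np.array([mean_vel[0], mean_vel[1], straightness])

def assign(points, centers):
    """Index of the nearest center for every point."""
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(points, k, iters=50, seed=0):
    """Plain k-means (an assumed clustering choice, not taken from the abstract)."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        labels = assign(points, centers)
        # Keep the old center if a cluster ends up empty.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(points, centers), centers

def behavior_distribution(labels, k):
    """Fraction of trajectories assigned to each behavior label."""
    counts = np.bincount(labels, minlength=k)
    return counts / counts.sum()

# Toy data: three pedestrians walking right, three walking left, at slightly different speeds.
t = np.arange(8, dtype=float)
speeds = [1.0, 1.1, 0.9, -1.0, -1.1, -0.9]
trajectories = [np.stack([s * t, np.zeros_like(t)], axis=1) for s in speeds]

points = np.stack([embed(tr) for tr in trajectories])
labels, centers = kmeans(points, k=2)
distribution = behavior_distribution(labels, k=2)
```

In this toy setting each cluster index plays the role of one dictionary entry, and `distribution` gives the share of trajectories per behavior. A behavior map in the sense of the abstract would additionally accumulate labels over spatial locations, which is omitted here.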
- Award ID(s):
- 2021628
- PAR ID:
- 10524561
- Publisher / Repository:
- British Machine Vision Conference 2023
- Date Published:
- Subject(s) / Keyword(s):
- computer vision, robotics, AI, social navigation, pedestrian trajectory, space usage
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Avidan, S.; Brostow, G.; Cissé, M.; Farinella, G.M.; Hassner, T. (Ed.) Predicting pedestrian movement is critical for human behavior analysis and also for safe and efficient human-agent interactions. However, despite significant advancements, it is still challenging for existing approaches to capture the uncertainty and multimodality of human navigation decision making. In this paper, we propose SocialVAE, a novel approach for human trajectory prediction. The core of SocialVAE is a timewise variational autoencoder architecture that exploits stochastic recurrent neural networks to perform prediction, combined with a social attention mechanism and a backward posterior approximation to allow for better extraction of pedestrian navigation strategies. We show that SocialVAE improves current state-of-the-art performance on several pedestrian trajectory prediction benchmarks, including the ETH/UCY benchmark, Stanford Drone Dataset, and SportVU NBA movement dataset.
-
Predicting the future trajectories of multiple interacting pedestrians within a scene has become increasingly important in fields such as autonomous driving and human–robot interaction. The problem is complicated by the social dynamics among pedestrians and their heterogeneous implicit preferences. In this paper, we present the Information Maximizing Spatial-Temporal Graph Convolutional Attention Network (InfoSTGCAN), which takes into account both pedestrian interactions and heterogeneous behavior choice modeling. To effectively capture the complex interactions among pedestrians, we integrate spatial-temporal graph convolution and spatial-temporal graph attention. To capture the heterogeneity in pedestrians’ behavior choices, our model goes a step further by learning to predict an individual-level latent code for each pedestrian. Each latent code represents a distinct pattern of movement choice. Finally, based on the observed historical trajectory and the learned latent code, the proposed method is trained to cover the ground-truth future trajectory of this pedestrian with a bivariate Gaussian distribution. We evaluate the proposed method through a comprehensive list of experiments and demonstrate that our method outperforms all baseline methods on the commonly used metrics, Average Displacement Error and Final Displacement Error. Notably, visualizations of the generated trajectories reveal our method’s capacity to handle different scenarios.
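For readers unfamiliar with the two metrics named above, the following minimal sketch shows how Average Displacement Error (ADE) and Final Displacement Error (FDE) are commonly computed for a single predicted trajectory. The function name and the toy coordinates are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def displacement_errors(pred, truth):
    """Return (ADE, FDE) for one trajectory given as arrays of shape (T, 2)."""
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    if pred.shape != truth.shape:
        raise ValueError("pred and truth must have the same shape")
    # Euclidean distance between predicted and true position at every time step.
    dists = np.linalg.norm(pred - truth, axis=-1)
    ade = dists.mean()   # average over all predicted time steps
    fde = dists[-1]      # error at the final time step only
    return float(ade), float(fde)

# Toy example: the prediction drifts upward by one unit per step.
truth = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
pred = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
ade, fde = displacement_errors(pred, truth)
```

Benchmarks typically average these values over all pedestrians in a test set, and probabilistic models often report the best of several sampled trajectories; both conventions vary between papers and are not shown here.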
-
Background: Inertial measurement units (IMUs) with high-resolution sensors such as accelerometers are now used extensively to study fine-scale behavior in a wide range of marine and terrestrial animals. Robust and practical methods are required for the computationally demanding analysis of the resulting large datasets, particularly for automating classification routines that construct behavioral time series and time-activity budgets. Magnetometers are used increasingly to study behavior, but it is not clear how these sensors contribute to the accuracy of behavioral classification methods. Development of effective classification methodology is key to understanding energetic and life-history implications of foraging and other behaviors. Methods: We deployed accelerometers and magnetometers on four species of free-ranging albatrosses and evaluated the ability of unsupervised hidden Markov models (HMMs) to identify three major modalities in their behavior: ‘flapping flight’, ‘soaring flight’, and ‘on-water’. The relative contribution of each sensor to classification accuracy was measured by comparing HMM-inferred states with expert classifications identified from stereotypic patterns observed in sensor data. Results: HMMs provided a flexible and easily interpretable means of classifying behavior from sensor data. Model accuracy was high overall (92%), but varied across behavioral states (87.6, 93.1 and 91.7% for ‘flapping flight’, ‘soaring flight’ and ‘on-water’, respectively). Models built on accelerometer data alone were as accurate as those that also included magnetometer data; however, the latter were useful for investigating slow and periodic behaviors such as dynamic soaring at a fine scale. Conclusions: The use of IMUs in behavioral studies produces large data sets, necessitating the development of computationally efficient methods to automate behavioral classification in order to synthesize and interpret underlying patterns. HMMs provide an accessible and robust framework for analyzing complex IMU datasets and comparing behavioral variation among taxa across habitats, time and space.
-
A search engine's ability to retrieve desirable datasets is important for data sharing and reuse. Existing dataset search engines typically rely on matching queries to dataset descriptions. However, a user may not have enough prior knowledge to write a query using terms that match the description text. We propose a novel schema label generation model which generates possible schema labels based on dataset table content. We incorporate the generated schema labels into a mixed ranking model which considers not only the relevance between the query and dataset metadata but also the similarity between the query and generated schema labels. To evaluate our method on real-world datasets, we create a new benchmark specifically for the dataset retrieval task. Experiments show that our approach can effectively improve the precision and NDCG scores of the dataset retrieval task compared with baseline methods. We also test on a collection of Wikipedia tables to show that the features generated from schema labels can improve the unsupervised and supervised web table retrieval task as well.
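As a brief illustration of the NDCG metric mentioned above, the sketch below computes normalized discounted cumulative gain for one ranked result list. It assumes the linear-gain form of DCG (some evaluations use an exponential gain instead), and the function names and relevance grades are illustrative rather than taken from the paper.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain with linear gain: sum of rel_i / log2(i + 1), ranks from 1."""
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(relevances, start=1))

def ndcg(relevances, k=None):
    """NDCG of a ranked list of graded relevances, optionally truncated at rank k."""
    ranked = list(relevances) if k is None else list(relevances)[:k]
    # The ideal ranking places the most relevant items first.
    ideal = sorted(relevances, reverse=True)[:len(ranked)]
    ideal_dcg = dcg(ideal)
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0

perfect = ndcg([3, 2, 1])   # already in ideal order
swapped = ndcg([1, 3])      # most relevant dataset ranked second
empty = ndcg([0, 0, 0])     # no relevant results at all
```

A ranking in ideal order scores 1.0, and placing a highly relevant dataset lower in the list reduces the score, which is why the metric suits graded-relevance retrieval tasks like the one described.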