skip to main content


Search for: All records

Editors contains: "Farinella. G.M."

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Avidan, S. ; Brostow, G. ; Cissé, M. ; Farinella. G.M. ; Hassner, T. (Ed.)
    Graph-based representations are becoming increasingly popular for representing and analyzing video data, especially in object tracking and scene understanding applications. Accordingly, an essential tool in this approach is to generate statistical inferences for graphical time series associated with videos. This paper develops a Kalman-smoothing method for estimating graphs from noisy, cluttered, and incomplete data. The main challenge here is to find and preserve the registration of nodes (salient detected objects) across time frames when the data has noise and clutter due to false and missing nodes. First, we introduce a quotient-space representation of graphs that incorporates temporal registration of nodes, and we use that metric structure to impose a dynamical model on graph evolution. Then, we derive a Kalman smoother, adapted to the quotient space geometry, to estimate dense, smooth trajectories of graphs. We demonstrate this framework using simulated data and actual video graphs extracted from the Multiview Extended Video with Activities (MEVA) dataset. This framework successfully estimates graphs despite the noise, clutter, and missed detections. 
    more » « less
  2. Avidan, S. ; Brostow, G. ; Cissé, M. ; Farinella, G.M. ; Hassner, T. (Ed.)
    Predicting pedestrian movement is critical for human behavior analysis and also for safe and efficient human-agent interactions. However, despite significant advancements, it is still challenging for existing approaches to capture the uncertainty and multimodality of human navigation decision making. In this paper, we propose SocialVAE, a novel approach for human trajectory prediction. The core of SocialVAE is a timewise variational autoencoder architecture that exploits stochastic recurrent neural networks to perform prediction, combined with a social attention mechanism and a backward posterior approximation to allow for better extraction of pedestrian navigation strategies. We show that SocialVAE improves current state-of-the-art performance on several pedestrian trajectory prediction benchmarks, including the ETH/UCY benchmark, Stanford Drone Dataset, and SportVU NBA movement dataset. 
    more » « less
  3. Avidan, S. ; Brostow, G. ; Cissé, M ; Farinella, G.M. ; Hassner, T. (Ed.)
    Event perception tasks such as recognizing and localizing actions in streaming videos are essential for scaling to real-world application contexts. We tackle the problem of learning actor-centered representations through the notion of continual hierarchical predictive learning to localize actions in streaming videos without the need for training labels and outlines for the objects in the video. We propose a framework driven by the notion of hierarchical predictive learning to construct actor-centered features by attention-based contextualization. The key idea is that predictable features or objects do not attract attention and hence do not contribute to the action of interest. Experiments on three benchmark datasets show that the approach can learn robust representations for localizing actions using only one epoch of training, i.e., a single pass through the streaming video. We show that the proposed approach outperforms unsupervised and weakly supervised baselines while offering competitive performance to fully supervised approaches. Additionally, we extend the model to multi-actor settings to recognize group activities while localizing the multiple, plausible actors. We also show that it generalizes to out-of-domain data with limited performance degradation. 
    more » « less
  4. Avidan, S. ; Brostow, G. ; Cissé, M. ; Farinella, G.M. ; Hassner, T. (Ed.)
    Wen, S., Wang, H., Metaxas, D. (2022). Social ODE: Multi-agent Trajectory Forecasting with Neural Ordinary Differential Equations. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13682. Springer, Cham. https://doi.org/10.1007/978-3-031-20047-2_13 Multi-agent trajectory forecasting has recently attracted a lot of attention due to its widespread applications including autonomous driving. Most previous methods use RNNs or Transformers to model agent dynamics in the temporal dimension and social pooling or GNNs to model interactions with other agents; these approaches usually fail to learn the underlying continuous temporal dynamics and agent interactions explicitly. To address these problems, we propose Social ODE which explicitly models temporal agent dynamics and agent interactions. Our approach leverages Neural ODEs to model continuous temporal dynamics, and incorporates distance, interaction intensity, and aggressiveness estimation into agent interaction modeling in latent space. We show in extensive experiments that our Social ODE approach compares favorably with state-of-the-art, and more importantly, can successfully avoid sudden obstacles and effectively control the motion of the agent, while previous methods often fail in such cases. 
    more » « less