Search results: All records where Award ID contains 1650994


  1. Bartoli, A.; Fusiello, A. (Eds.)
    We propose an improved discriminative model prediction method for robust long-term tracking based on a pre-trained short-term tracker. The baseline pre-trained short-term tracker is SuperDiMP, which combines the bounding-box regressor of PrDiMP with the standard DiMP classifier. Our tracker, RLT-DiMP, improves SuperDiMP in the following three aspects: (1) Uncertainty reduction using random erasing: to make the model robust, we erase random small rectangular areas from multiple images and treat the agreement among the resulting predictions as a certainty measure, then correct the tracking state accordingly. (2) Random search with spatio-temporal constraints: we propose a robust random search method with a score penalty that prevents sudden, distant detections. (3) Background augmentation for more discriminative feature learning: we augment training with various backgrounds that are not included in the search area to learn a model that is more robust to background clutter. In experiments on the VOT-LT2020 benchmark dataset, the proposed method achieves performance comparable to state-of-the-art long-term trackers. The source code is available at: https://github.com/bismex/RLT-DIMP.
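    A minimal sketch of the random-erasing agreement check described in aspect (1) above. The scoring function score_fn stands in for the DiMP classifier and is an assumption, as are the patch sizes and thresholds; this illustrates the idea, not the authors' implementation.

    import numpy as np

    def random_erase(image, max_frac=0.2, rng=None):
        """Return a copy of `image` with one small random rectangle zeroed out."""
        rng = rng or np.random.default_rng()
        h, w = image.shape[:2]
        eh = rng.integers(1, max(2, int(h * max_frac) + 1))
        ew = rng.integers(1, max(2, int(w * max_frac) + 1))
        y = rng.integers(0, h - eh + 1)
        x = rng.integers(0, w - ew + 1)
        out = image.copy()
        out[y:y + eh, x:x + ew] = 0
        return out

    def agreement_certainty(image, score_fn, n_views=5, dist_thresh=10.0):
        """Score several randomly erased copies of the search image; agreement
        among the predicted centers serves as a certainty measure that can be
        fed back to correct the tracking state."""
        preds = [score_fn(random_erase(image)) for _ in range(n_views)]  # each (x, y, conf)
        centers = np.array([(x, y) for x, y, _ in preds], dtype=float)
        spread = np.linalg.norm(centers - centers.mean(axis=0), axis=1).mean()
        mean_conf = float(np.mean([c for _, _, c in preds]))
        return spread < dist_thresh, mean_conf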
  2. Vedaldi, Andrea; Bischof, Horst; Brox, Thomas; Frahm, Jan-Michael (Eds.)
    This paper focuses on the problem of predicting future trajectories of people in unseen scenarios and camera views. We propose a method to efficiently utilize multi-view 3D simulation data for training. Our approach finds the hardest camera view to mix up with adversarial data from the original camera view during training, enabling the model to learn robust representations that generalize to unseen camera views. We refer to our method as SimAug. We show that SimAug achieves the best results on three out-of-domain real-world benchmarks and state-of-the-art results on the Stanford Drone and VIRAT/ActEV datasets with in-domain training data. We will release our models and code.
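    A minimal sketch of a SimAug-style training step under stated assumptions: model(features) returns a scalar loss, view_features is a list of per-camera-view feature tensors, and the FGSM-style perturbation with a fixed mixing weight is a simplification of the adversarial mixing described above, not the released implementation.

    import torch

    def simaug_step(model, view_features, original_view=0, eps=0.01, alpha=0.5):
        """Pick the hardest camera view (highest loss), build an adversarial copy
        of the original view's features, and mix the two for training."""
        # 1. Find the hardest view by loss, without tracking gradients.
        with torch.no_grad():
            losses = [model(f) for f in view_features]
            hardest = int(torch.stack(losses).argmax())

        # 2. FGSM-style adversarial perturbation of the original view's features.
        x = view_features[original_view].clone().requires_grad_(True)
        model(x).backward()
        x_adv = (x + eps * x.grad.sign()).detach()

        # 3. Mix the adversarial original-view features with the hardest view and
        #    return the loss to backpropagate in the actual update step
        #    (a real training loop would zero parameter gradients first).
        mixed = alpha * x_adv + (1.0 - alpha) * view_features[hardest]
        return model(mixed)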
  3.
    This paper studies the problem of predicting the distribution over multiple possible future paths of people as they move through various visual scenes. We make two main contributions. The first is a new dataset, created in a realistic 3D simulator, that is based on real-world trajectory data and extrapolated by human annotators to achieve different latent goals; it provides the first benchmark for quantitative evaluation of models that predict multi-future trajectories. The second is a new model for generating multiple plausible future trajectories, with novel designs that use multi-scale location encodings and convolutional RNNs over graphs. We refer to our model as Multiverse. We show that our model achieves the best results on our dataset as well as on the real-world VIRAT/ActEV dataset (which contains only one possible future per trajectory).
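    A minimal sketch of a multi-scale location encoding in the spirit of the Multiverse design described above; the grid resolutions and frame size are illustrative assumptions, not the paper's configuration.

    import numpy as np

    def grid_one_hot(xy, frame_size, grid_shape):
        """Encode an (x, y) pixel location as a one-hot vector over a coarse grid."""
        (x, y), (w, h), (cols, rows) = xy, frame_size, grid_shape
        col = min(int(x / w * cols), cols - 1)
        row = min(int(y / h * rows), rows - 1)
        enc = np.zeros(rows * cols, dtype=np.float32)
        enc[row * cols + col] = 1.0
        return enc

    def multi_scale_location_encoding(xy, frame_size, scales=((36, 18), (18, 9))):
        """Concatenate one-hot encodings at several grid resolutions so a model
        can reason about a person's position both coarsely and finely."""
        return np.concatenate([grid_one_hot(xy, frame_size, s) for s in scales])

    # Example: encode a person at pixel (640, 360) in a 1920x1080 frame.
    encoding = multi_scale_location_encoding((640, 360), (1920, 1080))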