Incorporating Scalability in Unsupervised Spatio-Temporal Feature Learning

Paul, S.; Roy, S.; Roy-Chowdhury, A.


Deep neural networks are efficient learning machines that rely on large amounts of manually labeled data to learn discriminative features. However, acquiring a substantial amount of labeled data, especially for videos, can be tedious across many computer vision tasks. This motivates learning visual features from videos in an unsupervised setting. In this paper, we propose a computationally simple yet effective framework to learn spatio-temporal feature embeddings from unlabeled videos. We train a Convolutional 3D Siamese network using positive and negative pairs mined from videos under certain probabilistic assumptions. Experimental results on three datasets demonstrate that the proposed framework learns weights that can be used on the same dataset as well as across datasets and tasks.
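As a rough illustration of the pair-based Siamese training mentioned in the abstract, the sketch below mines positive and negative clip pairs and scores them with a margin-based contrastive loss. The abstract does not specify the mining rule or the loss, so everything here is an assumption: the temporal-proximity heuristic, the contrastive loss itself, and the names `mine_pairs`, `contrastive_loss`, `near`, and `margin` are hypothetical and not taken from the paper.

```python
import math
import random


def mine_pairs(num_clips_per_video, near=1, seed=0):
    """Hypothetical mining rule (an assumption, not the paper's method).

    Clips from the same video that are `near` steps apart form a positive
    pair (label 1); a clip paired with a random clip from a different
    video forms a negative pair (label 0). Clips are (video, index) tuples.
    """
    rng = random.Random(seed)
    videos = range(len(num_clips_per_video))
    pairs = []
    for v in videos:
        # Other videos that have at least one clip to sample from.
        others = [u for u in videos if u != v and num_clips_per_video[u] > 0]
        for t in range(num_clips_per_video[v] - near):
            pairs.append(((v, t), (v, t + near), 1))
            if others:
                u = rng.choice(others)
                pairs.append(((v, t), (u, rng.randrange(num_clips_per_video[u])), 0))
    return pairs


def contrastive_loss(emb_a, emb_b, label, margin=1.0):
    """Margin-based contrastive loss on two embedding vectors.

    Positive pairs are pulled together; negative pairs are pushed apart
    until their distance reaches `margin`.
    """
    d = math.dist(emb_a, emb_b)  # Euclidean distance (Python 3.8+)
    if label == 1:
        return 0.5 * d * d
    return 0.5 * max(0.0, margin - d) ** 2
```

In a full pipeline, the two embeddings would come from the two weight-sharing branches of the 3D convolutional network, and this loss would be averaged over a mini-batch of mined pairs.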

Award ID(s): 1724341

PAR ID: 10067976

Author(s) / Creator(s): Paul, S.; Roy, S.; Roy-Chowdhury, A.

Date Published: 2018-04-01

Journal Name: IEEE Intl. Conf. on Acoustics, Speech and Signal Processing

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
The DOI is not currently available.
