Title: Skeleton-Based Methods for Speaker Action Classification on Lecture Videos
The volume of online lecture videos is growing at a frenetic pace. This has led to an increased focus on methods for automated lecture video analysis to make these resources more accessible. These methods consider multiple information channels, including the actions of the lecture speaker. In this work, we analyze two methods that use spatio-temporal features of the speaker skeleton for action classification in lecture videos. The first is the AM Pose model, which is based on Random Forests with motion-based features. The second is a state-of-the-art action classifier based on a two-stream adaptive graph convolutional network (2S-AGCN) that uses features of both joints and bones of the speaker skeleton. Each video is divided into fixed-length temporal segments. Then, the speaker skeleton is estimated on every frame in order to build a representation of each segment for further classification. Our experiments used the AccessMath dataset and a novel extension which will be publicly released. We compared four state-of-the-art pose estimators: OpenPose, Deep High Resolution, AlphaPose, and Detectron2. We found that AlphaPose is the most robust to the encoding noise found in online videos. We also observed that 2S-AGCN outperforms the AM Pose model when the right domain adaptations are applied.
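The segment-based pipeline described in the abstract (per-frame pose estimation, fixed-length temporal segments, one representation per segment) can be sketched roughly as follows. The segment length, the 17-joint COCO-style skeleton layout, and the function names are illustrative assumptions, not details taken from the paper:

```python
import numpy as np

def segment_video(num_frames, segment_len):
    """Split frame indices into fixed-length temporal segments (remainder dropped)."""
    n_segments = num_frames // segment_len
    return [range(i * segment_len, (i + 1) * segment_len) for i in range(n_segments)]

def build_segment_tensor(poses, segment):
    """Stack per-frame skeletons (J joints, x/y coords) into a (T, J, 2) array,
    the kind of representation a segment-level action classifier would consume."""
    return np.stack([poses[f] for f in segment], axis=0)

# Toy example: 100 frames, 17 joints (COCO-style layout), 30-frame segments.
rng = np.random.default_rng(0)
poses = rng.random((100, 17, 2))       # stand-in for pose estimator output
segments = segment_video(100, 30)
tensor = build_segment_tensor(poses, segments[0])
```

In a real system, `poses` would come from one of the estimators named above (e.g. AlphaPose), and `tensor` would feed either hand-crafted motion features or a graph convolutional classifier.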
Award ID(s):
1640867
PAR ID:
10292324
Author(s) / Creator(s):
; ; ;
Editor(s):
Del Bimbo, Alberto; Cucchiara, Rita; Sclaroff, Stan; Farinella, Giovanni M; Mei, Tao; Bertini, Marco; Escalante, Hugo J; Vezzani, Roberto.
Date Published:
Journal Name:
Lecture notes in computer science
Volume:
12664
ISSN:
0302-9743
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Online lecture videos are increasingly important e-learning materials for students. Automated content extraction from lecture videos facilitates information retrieval applications that improve access to the lecture material. A significant number of lecture videos include the speaker in the image. Speakers perform various semantically meaningful actions during the process of teaching. Among all the movements of the speaker, key actions such as writing or erasing potentially indicate important features directly related to the lecture content. In this paper, we present a methodology for lecture video content extraction using the speaker actions. Each lecture video is divided into small temporal units called action segments. Using a pose estimator, body and hands skeleton data are extracted and used to compute motion-based features describing each action segment. Then, the dominant speaker action of each of these segments is classified using Random Forests and the motion-based features. With the temporal and spatial range of these actions, we implement an alternative way to extract key-frames of handwritten content from the video. In addition, for our fixed camera videos, we also use the skeleton data to compute a mask of the speaker writing locations for the subtraction of the background noise from the binarized key-frames. Our method has been tested on a publicly available lecture video dataset, and it shows reasonable recall and precision results, with a very good compression ratio that improves on previous methods based on content analysis.
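The motion-based features mentioned above can be illustrated with a minimal sketch. The specific descriptors here (per-joint mean speed and total path length) are illustrative assumptions standing in for the paper's actual feature set; they would then feed a Random Forest classifier:

```python
import numpy as np

def motion_features(segment):
    """Simple motion descriptors for one action segment.

    segment: (T, J, 2) array of per-frame joint positions.
    Returns per-joint mean speed and total path length, concatenated
    into a single feature vector (e.g. for a Random Forest).
    """
    disp = np.diff(segment, axis=0)            # (T-1, J, 2) frame-to-frame motion
    speed = np.linalg.norm(disp, axis=-1)      # (T-1, J) per-joint step sizes
    mean_speed = speed.mean(axis=0)            # (J,) average motion per frame
    total_path = speed.sum(axis=0)             # (J,) distance traveled in segment
    return np.concatenate([mean_speed, total_path])

# Toy segment: joint 0 moves 1 unit per frame along x, all others are still.
seg = np.zeros((30, 17, 2))
seg[:, 0, 0] = np.arange(30)
feats = motion_features(seg)
```

Fast, large-displacement joints (a writing hand, an erasing arm) produce large values in this vector, which is what makes such features discriminative for actions like writing versus standing still.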
  2. Existing multimodal-based human action recognition approaches are computationally intensive, limiting their deployment in real-time applications. In this work, we present a novel and efficient pose-driven attention-guided multimodal network (EPAM-Net) for action recognition in videos. Specifically, we propose eXpand temporal Shift (X-ShiftNet) convolutional architectures for RGB and pose streams to capture spatio-temporal features from RGB videos and their skeleton sequences. The X-ShiftNet tackles the high computational cost of the 3D CNNs by integrating the Temporal Shift Module (TSM) into an efficient 2D CNN, enabling efficient spatiotemporal learning. Then skeleton features are utilized to guide the visual network stream, focusing on keyframes and their salient spatial regions using the proposed spatial–temporal attention block. Finally, the predictions of the two streams are fused for final classification. The experimental results show that our method, with a significant reduction in floating-point operations (FLOPs), outperforms and competes with the state-of-the-art methods on NTU RGB-D 60, NTU RGB-D 120, PKU-MMD, and Toyota SmartHome datasets. The proposed EPAM-Net provides up to a 72.8x reduction in FLOPs and up to a 48.6x reduction in the number of network parameters. The code will be available at https://github.com/ahmed-nady/Multimodal-ActionRecognition. 
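The Temporal Shift Module (TSM) integrated into X-ShiftNet above mixes information across frames at zero parameter cost by shifting a fraction of the channels along the time axis. A minimal NumPy sketch of that idea, with the shift fraction and tensor layout chosen for illustration:

```python
import numpy as np

def temporal_shift(x, shift_div=8):
    """Parameter-free temporal mixing in the spirit of TSM.

    x: (T, C, H, W) feature maps for T frames.
    The first C // shift_div channels are shifted one frame backward in time,
    the next C // shift_div one frame forward; remaining channels are untouched.
    Vacated positions are zero-filled.
    """
    t, c, h, w = x.shape
    fold = c // shift_div
    out = np.zeros_like(x)
    out[:-1, :fold] = x[1:, :fold]               # pull features from the next frame
    out[1:, fold:2 * fold] = x[:-1, fold:2 * fold]  # pull features from the previous frame
    out[:, 2 * fold:] = x[:, 2 * fold:]          # leave the rest in place
    return out

# Toy tensor: 4 frames, 8 channels, 1x1 spatial; value = frame * 8 + channel.
x = np.arange(4 * 8, dtype=float).reshape(4, 8, 1, 1)
y = temporal_shift(x, shift_div=4)
```

After the shift, an ordinary 2D convolution over each frame sees features from neighboring frames, which is how a 2D CNN gains temporal modeling without the cost of 3D convolutions.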
  3. Lecture videos are rapidly becoming an invaluable source of information for students across the globe. Given the large number of online courses currently available, it is important to condense the information within these videos into a compact yet representative summary that can be used for search-based applications. We propose a framework to summarize whiteboard lecture videos by finding feature representations of detected handwritten content regions to determine unique content. We investigate multi-scale histogram of gradients and embeddings from deep metric learning for feature representation. We explicitly handle occluded, growing and disappearing handwritten content. Our method is capable of producing two kinds of lecture video summaries: the unique regions themselves (so-called key content) and keyframes, which contain all unique content in a video segment. We use weighted spatio-temporal conflict minimization to segment the lecture and produce keyframes from detected regions and features. We evaluate both types of summaries and find that we obtain state-of-the-art performance in terms of number of summary keyframes, while our unique content recall and precision are comparable to the state of the art.
  4. Understanding human behavior and activity facilitates advancement of numerous real-world applications, and is critical for video analysis. Despite the progress of action recognition algorithms in trimmed videos, the majority of real-world videos are lengthy and untrimmed with sparse segments of interest. The task of temporal activity detection in untrimmed videos aims to localize the temporal boundary of actions and classify the action categories. The temporal activity detection task has been investigated in full and limited supervision settings depending on the availability of action annotations. This paper provides an extensive overview of deep learning-based algorithms to tackle temporal action detection in untrimmed videos with different supervision levels including fully-supervised, weakly-supervised, unsupervised, self-supervised, and semi-supervised. In addition, this paper reviews advances in spatio-temporal action detection where actions are localized in both temporal and spatial dimensions. Action detection in the online setting is also reviewed, where the goal is to detect actions in each frame without considering any future context in a live video stream. Moreover, the commonly used action detection benchmark datasets and evaluation metrics are described, and the performance of the state-of-the-art methods is compared. Finally, real-world applications of temporal action detection in untrimmed videos and a set of future directions are discussed.
  5. In this paper, we propose a machine learning-based multi-stream framework to recognize American Sign Language (ASL) manual signs and nonmanual gestures (face and head movements) in real time from RGB-D videos. Our approach is based on 3D Convolutional Neural Networks (3D CNNs), fusing multi-modal features including hand gestures, facial expressions, and body poses from multiple channels (RGB, Depth, Motion, and Skeleton joints). To learn the overall temporal dynamics in a video, a proxy video is generated by selecting a subset of frames for each video, which are then used to train the proposed 3D CNN model. We collected a new ASL dataset, ASL-100-RGBD, which contains 42 RGB-D videos captured by a Microsoft Kinect V2 camera. Each video consists of 100 ASL manual signs, along with RGB channel, Depth maps, Skeleton joints, Face features, and HD face. The dataset is fully annotated for each semantic region (i.e., the time duration of each sign that the human signer performs). Our proposed method achieves 92.88% accuracy for recognizing 100 ASL sign glosses in our newly collected ASL-100-RGBD dataset. The effectiveness of our framework for recognizing hand gestures from RGB-D videos is further demonstrated on a large-scale dataset, ChaLearn IsoGD, achieving state-of-the-art results.
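The proxy-video idea above (representing a long clip by a small subset of its frames) can be sketched as follows. Uniform spacing and the frame count of 16 are illustrative assumptions for the sketch, not the paper's exact sampling procedure:

```python
import numpy as np

def sample_proxy_frames(num_frames, k):
    """Pick k roughly evenly spaced frame indices to form a 'proxy video'
    that preserves the overall temporal dynamics of the full clip."""
    if num_frames <= k:
        return list(range(num_frames))
    idx = np.linspace(0, num_frames - 1, k)   # k evenly spaced positions
    return [int(round(i)) for i in idx]

# A 300-frame sign video reduced to a 16-frame proxy for the 3D CNN.
frames = sample_proxy_frames(300, 16)
```

Because the proxy has a fixed length, every video becomes a constant-size input to the 3D CNN regardless of its original duration.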