FFNet: Video Fast-Forwarding via Reinforcement Learning

Lan, S.; Panda, R.; Zhu, Q.; Roy-Chowdhury, A.

For many applications with limited computation, communication, storage and energy resources, there is an imperative need for computer vision methods that can select an informative subset of the input video for efficient processing at or near real time. In the literature, there are two relevant groups of approaches: generating a “trailer” for a video, or fast-forwarding while watching/processing the video. The first group is supported by video summarization techniques, which require processing the entire video to select an important subset to show to users. In the second group, current fast-forwarding methods depend on either manual control or automatic adaptation of playback speed, which often do not present an accurate representation and may still require processing every frame. In this paper, we introduce FastForwardNet (FFNet), a reinforcement learning agent that draws inspiration from video summarization and approaches fast-forwarding differently. It is an online framework that automatically fast-forwards a video and presents a representative subset of frames to users on the fly. It does not require processing the entire video, only the portion selected by the fast-forward agent, which makes the process very computationally efficient. The online nature of our proposed method also enables users to begin fast-forwarding at any point in the video. Experiments on two real-world datasets demonstrate that our method can provide a better representation of the input video (about 6%-20% improvement in coverage of important frames) with a much lower processing requirement (more than 80% reduction in the number of frames processed).
