Autonomous Aesthetic Views Finding in Indoor Scenes with Deep Reinforcement Learning

Xie, D.; Hu, P.; Sun, X.; Pirk, S.; Zhang, J.; Mech, R.; Kaufman, A. E.

This content will become publicly available on October 1, 2024

Placing and orienting a camera to compose aesthetically meaningful shots of a scene is a key objective not only in real-world photography and cinematography but also in virtual content creation. The framing of a camera often contributes significantly to the storytelling in movies, games, and mixed reality applications. Generating single camera poses or even contiguous trajectories requires either a significant amount of manual labor or solving high-dimensional optimization problems, which can be computationally demanding and error-prone. In this paper, we introduce GAIT, a Deep Reinforcement Learning (DRL) agent that learns to automatically control a camera to generate a sequence of aesthetically meaningful views for synthetic 3D indoor scenes. To generate sequences of frames with high aesthetic value, GAIT relies on a neural aesthetics estimator, which is trained on a crowd-sourced dataset. Additionally, we introduce regularization techniques for diversity and smoothness, which generate visually interesting trajectories for a 3D environment and constrain agent acceleration in the reward function to produce a smooth sequence of camera frames. We validated our method by comparing it to baseline algorithms, through a perceptual user study, and through ablation studies. The source code of our method will be released with the final version of our paper.
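
The abstract describes a reward that combines an aesthetics estimate with regularization for diversity and smoothness, but gives no formulas. The following is only a minimal sketch of how such a per-step reward could be assembled. It is not the authors' implementation: the function name `step_reward`, the weights `w_div` and `w_smooth`, the nearest-neighbor diversity measure, and the velocity-difference acceleration penalty are all assumptions made for illustration.

```python
import math


def step_reward(aesthetic_score, position, visited, velocity, prev_velocity,
                w_div=0.5, w_smooth=0.1):
    """Hypothetical per-step reward: aesthetics + diversity bonus - acceleration penalty.

    aesthetic_score: scalar from some aesthetics estimator (assumed given).
    position, velocity, prev_velocity: 3-tuples describing the camera state.
    visited: list of previously visited camera positions (3-tuples).
    """
    # Diversity (assumed form): distance to the nearest previously visited
    # position, so revisiting the same viewpoint earns no bonus.
    if visited:
        diversity = min(math.dist(position, p) for p in visited)
    else:
        diversity = 0.0

    # Smoothness (assumed form): penalize the magnitude of the change in
    # velocity between consecutive steps, a discrete stand-in for acceleration.
    acceleration = math.dist(velocity, prev_velocity)

    return aesthetic_score + w_div * diversity - w_smooth * acceleration


if __name__ == "__main__":
    smooth = step_reward(0.8, (1, 0, 0), [(0, 0, 0)], (1, 0, 0), (1, 0, 0))
    jerky = step_reward(0.8, (1, 0, 0), [(0, 0, 0)], (1, 0, 0), (-1, 0, 0))
    print(smooth, jerky)
```

With these assumed weights, a step that keeps the same velocity scores higher than an otherwise identical step that reverses direction, which is the qualitative behavior a smoothness regularizer is meant to encourage.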

Award ID(s): 2107224

NSF-PAR ID: 10435201

Author(s) / Creator(s): Xie, D.; Hu, P.; Sun, X.; Pirk, S.; Zhang, J.; Mech, R.; Kaufman, A. E.

Date Published: 2023-10-01

Journal Name: International Conference on Computer Vision

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Conference Paper:
The DOI is not currently available.
