Switch-a-View: View Selection Learned from Unlabeled In-the-wild Videos

Majumder, S; Nagarajan, T; Al-Halah, Z; Grauman, K

Citation Details

This content will become publicly available on April 22, 2026

Switch-a-View: View Selection Learned from Unlabeled In-the-wild Videos

We introduce SWITCH-A-VIEW, a model that learns to automatically select the viewpoint to display at each timepoint when creating a how-to video. The key insight of our approach is how to train such a model from unlabeled -- but human-edited -- video samples. We pose a pretext task that pseudo-labels segments in the training videos for their primary viewpoint (egocentric or exocentric), and then discovers the patterns between the visual and spoken content in a how-to video on the one hand and its view-switch moments on the other hand. Armed with this predictor, our model can be applied to new multi-view video settings for orchestrating which viewpoint should be displayed when, even when such settings come with limited labels. We demonstrate our idea on a variety of real-world videos from HowTo100M and Ego-Exo4D, and rigorously validate its advantages. more »

Award ID(s):: 2505865

PAR ID:: 10631522

Author(s) / Creator(s):: Majumder, S; Nagarajan, T; Al-Halah, Z; Grauman, K

Publisher / Repository:: https://doi.org/10.48550/arXiv.2412.18386

Date Published:: 2025-04-22

ISSN:: 2412.18386

Format(s):: Medium: X

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
This content will become publicly available on April 22, 2026
Conference Paper:
The DOI is not currently available.

More Like this