HOI-Swap: Swapping Objects in Videos with Hand-Object Interaction Awareness

Xue, Zihui; Luo, Mi; Chen, Changan; Grauman, Kristen

This content will become publicly available on November 8, 2025

This paper addresses the challenge of precisely swapping objects in videos, particularly those involved in hand-object interactions (HOI), using a single user-provided reference object image. While diffusion models have advanced video editing, they struggle with the complexities of HOI, often failing to generate realistic edits when object swaps involve changes in shape or functionality. To overcome this, the authors propose HOI-Swap, a novel diffusion-based video editing framework trained in a self-supervised manner. The framework operates in two stages: (1) single-frame object swapping with HOI awareness, where the model learns to adjust interaction patterns (e.g., hand grasp) based on object property changes; and (2) sequence-wide extension, where motion alignment is achieved by warping a sequence from the edited frame using sampled motion points and conditioning generation on the warped sequence. Extensive qualitative and quantitative evaluations demonstrate that HOI-Swap significantly outperforms prior methods, producing high-quality, realistic HOI video edits.

Award ID(s): 2505865

PAR ID: 10631939

Author(s) / Creator(s): Xue, Zihui; Luo, Mi; Chen, Changan; Grauman, Kristen

Publisher / Repository: https://doi.org/10.48550/arXiv.2406.07754

Date Published: 2024-11-08

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Conference Paper:
The DOI is not currently available.
