FoundHand: Large-Scale Domain-Specific Learning for Controllable Hand Image Generation

Chen, Kefan; Min, Chaerin; Zhang, Linguang; Hampali, Shreyas; Keskin, Cem; Sridhar, Srinath

Citation Details

This content will become publicly available on June 16, 2026

FoundHand is trained on our large-scale FoundHand-10M dataset, which contains automatically extracted 2D keypoint and segmentation mask annotations (top left). FoundHand is formulated as a 2D pose-conditioned image-to-image diffusion model that enables precise control of hand pose and camera viewpoint (top right). Optionally, we can condition the generation on a reference image to preserve its style (top right). Our model demonstrates exceptional in-the-wild generalization across hand-centric applications and has core capabilities such as gesture transfer, domain transfer, and novel view synthesis (middle row). This endows FoundHand with zero-shot applications to fix malformed hand images and synthesize coherent hand and hand-object videos, without explicitly giving object cues (bottom row).

Award ID(s): 2143576

PAR ID: 10577629

Author(s) / Creator(s): Chen, Kefan; Min, Chaerin; Zhang, Linguang; Hampali, Shreyas; Keskin, Cem; Sridhar, Srinath

Publisher / Repository: CVPR 2025

Date Published: 2025-06-16

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Conference Paper:
The DOI is not currently available.