Search for: All records

Creators/Authors contains: "Yang, Jimei"

Total Resources: 3

Resource Type
Conference Paper: 3
Conference Proceeding: 0
Dataset: 0
Journal Article: 0
Workshop Report: 0

Availability
Full Text / Resource Available: 2
Citation Only: 1

Normal-guided Garment UV Prediction for Human Re-texturing

https://doi.org/10.1109/CVPR52729.2023.00449

Jafarian, Yasamin; Wang, Tuanfeng Y.; Ceylan, Duygu; Yang, Jimei; Carr, Nathan; Zhou, Yi; Park, Hyun Soo (June 2023, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR))

Free, publicly-accessible full text available June 1, 2024

HuMoR: 3D Human Motion Model for Robust Pose Estimation

https://doi.org/10.1109/ICCV48922.2021.01129

Rempe, Davis; Birdal, Tolga; Hertzmann, Aaron; Yang, Jimei; Sridhar, Srinath; Guibas, Leonidas J. (October 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV))

We introduce HuMoR: a 3D Human Motion Model for Robust Estimation of temporal pose and shape. Though substantial progress has been made in estimating 3D human motion and shape from dynamic observations, recovering plausible pose sequences in the presence of noise and occlusions remains a challenge. For this purpose, we propose an expressive generative model in the form of a conditional variational autoencoder, which learns a distribution of the change in pose at each step of a motion sequence. Furthermore, we introduce a flexible optimization-based approach that leverages HuMoR as a motion prior to robustly estimate plausible pose and shape from ambiguous observations. Through extensive evaluations, we demonstrate that our model generalizes to diverse motions and body shapes after training on a large motion capture dataset, and enables motion reconstruction from multiple input modalities including 3D keypoints and RGB(-D) videos. See the project page at geometry.stanford.edu/projects/humor.
Full Text Available

MAttNet: Modular Attention Network for Referring Expression Comprehension

Yu, Licheng; Lin, Zhe; Shen, Xiaohui; Yang, Jimei; Lu, Xin; Bansal, Mohit; Berg, Tamara L. (June 2018, IEEE Conference on Computer Vision and Pattern Recognition)

In this paper, we address referring expression comprehension: localizing an image region described by a natural language expression. While most recent work treats expressions as a single unit, we propose to decompose them into three modular components related to subject appearance, location, and relationship to other objects. This allows us to flexibly adapt to expressions containing different types of information in an end-to-end framework. In our model, which we call the Modular Attention Network (MAttNet), two types of attention are utilized: language-based attention that learns the module weights as well as the word/phrase attention that each module should focus on; and visual attention that allows the subject and relationship modules to focus on relevant image components. Module weights combine scores from all three modules dynamically to output an overall score. Experiments show that MAttNet outperforms previous state-of-the-art methods by a large margin on both bounding-box-level and pixel-level comprehension tasks. Demo and code are provided.
Full Text Available