Portrait synthesis creates realistic digital avatars that enable users to interact with others in a compelling way. Recent advances in StyleGAN and its extensions have shown promising results in synthesizing photorealistic and accurate reconstructions of human faces. However, previous methods often focus on frontal face synthesis, and most cannot handle large head rotations due to the training data distribution of StyleGAN. In this work, our goal is to take a monocular video of a face as input and create an editable dynamic portrait that can handle extreme head poses. The user can create novel viewpoints, edit the appearance, and animate the face. Our method uses pivotal tuning inversion (PTI) to learn a personalized video prior from a monocular video sequence. Pose and expression coefficients are then fed to MLPs that manipulate the latent vectors to synthesize different viewpoints and expressions of the subject. We also propose novel loss functions to further disentangle pose and expression in the latent space. Our algorithm performs substantially better than previous approaches on monocular video datasets, and it runs in real-time at 54 FPS on an RTX 3080.
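A minimal PyTorch sketch of the high-level idea described above: small MLPs map pose and expression coefficients to offsets that are added to a PTI-inverted latent code before it is passed to the generator. All module names, dimensions, and the additive-offset formulation are illustrative assumptions, not the paper's exact architecture.

# Sketch: condition a StyleGAN-style latent code on pose/expression via MLPs.
# Dimensions and the additive offsets are assumptions for illustration only.
import torch
import torch.nn as nn

class LatentDeformer(nn.Module):
    def __init__(self, pose_dim=6, expr_dim=64, w_dim=512, num_ws=18, hidden=256):
        super().__init__()
        # Separate MLPs keep the pose and expression pathways disentangled.
        self.pose_mlp = nn.Sequential(
            nn.Linear(pose_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, num_ws * w_dim))
        self.expr_mlp = nn.Sequential(
            nn.Linear(expr_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, num_ws * w_dim))
        self.num_ws, self.w_dim = num_ws, w_dim

    def forward(self, w_pivot, pose, expr):
        # w_pivot: (B, num_ws, w_dim) latent obtained from PTI inversion.
        dw_pose = self.pose_mlp(pose).view(-1, self.num_ws, self.w_dim)
        dw_expr = self.expr_mlp(expr).view(-1, self.num_ws, self.w_dim)
        # The edited latent is fed to the (fine-tuned) generator.
        return w_pivot + dw_pose + dw_expr

Keeping the two coefficient streams in separate MLPs is one simple way to encourage the pose/expression disentanglement the abstract mentions; the paper's additional loss functions would act on top of such a split.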
Train Big, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers (Hal Daumé III, Ed.)
Since hardware resources are limited, the objective of training deep learning models is typically to maximize accuracy subject to the time and memory constraints of training and inference. We study the impact of model size in this setting, focusing on Transformer models for NLP tasks that are limited by compute: self-supervised pretraining and high-resource machine translation. We first show that even though smaller Transformer models execute faster per iteration, wider and deeper models converge in significantly fewer steps. Moreover, this acceleration in convergence typically outpaces the additional computational overhead of using larger models. Therefore, the most compute-efficient training strategy is, counterintuitively, to train extremely large models but stop after a small number of iterations. This leads to an apparent trade-off between the training efficiency of large Transformer models and the inference efficiency of small Transformer models. However, we show that large models are more robust to compression techniques such as quantization and pruning than small models. Consequently, one can get the best of both worlds: heavily compressed, large models achieve higher accuracy than lightly compressed, small models.
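As a rough illustration of the compression step, the sketch below applies the two techniques the abstract names, pruning and post-training quantization, to a generic Transformer encoder using standard PyTorch utilities. The sparsity level, quantization settings, and model configuration are placeholders, not the paper's recipe.

# Sketch: compress a (placeholder) Transformer with magnitude pruning and
# dynamic int8 quantization. Amounts/dtypes are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=512, nhead=8), num_layers=6)

# Magnitude pruning: zero out the 40% smallest-magnitude weights per linear layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.4)
        prune.remove(module, "weight")  # bake the pruning mask into the weights

# Post-training dynamic quantization: int8 weights for linear layers.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8)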
The MicroBooNE liquid argon time projection chamber (LArTPC) maintains a high level of liquid argon purity through the use of a filtration system that removes electronegative contaminants in continuously circulated liquid, recondensed boil-off, and externally supplied argon gas. We use the MicroBooNE LArTPC to reconstruct MeV-scale radiological decays. Using this technique, we measure the liquid argon filtration system's efficacy at removing radon. This is studied by placing a 500 kBq ²²²Rn source upstream of the filters and searching for a time-dependent increase in the number of radiological decays in the LArTPC. In the context of two models for radon mitigation via a liquid argon filtration system, a slowing mechanism and a trapping mechanism, MicroBooNE data supports a radon reduction factor of greater than 97% or 99.999%, respectively. Furthermore, a radiological survey of the filters found that the copper-based filter material was the primary medium that removed the ²²²Rn. This is the first observation of radon mitigation in liquid argon with a large-scale copper-based filter and could offer a radon mitigation solution for future large LArTPCs.
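A small illustrative calculation, not the collaboration's analysis: a reduction factor of this kind can be expressed as one minus the ratio of the radon decay rate observed downstream of the filters to the rate expected if the injected radon reached the TPC unattenuated. The rates used below are placeholders, not measured values.

# Sketch only: relates an observed vs. expected decay rate to a reduction factor.
def radon_reduction_factor(rate_observed, rate_expected_unfiltered):
    """Fraction of injected radon removed by the filtration system.

    Both arguments are decay rates in the same units; values are placeholders.
    """
    return 1.0 - rate_observed / rate_expected_unfiltered

# With these placeholder rates the result is 0.97, i.e. a 97% reduction.
print(radon_reduction_factor(0.03, 1.0))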