Given an isolated garment image in a canonical product view and a separate image of a person, the virtual try-on task aims to generate a new image of the person wearing the target garment. Prior virtual try-on works face two major challenges in achieving this goal: (a) paired (human, garment) training data is available only in limited quantities; (b) generating textures on the human that perfectly match those of the prompted garment is difficult, often resulting in distorted text and faded textures. Our work explores ways to tackle these issues through both synthetic data and model refinement. We introduce a garment extraction model that generates (human, synthetic garment) pairs from a single image of a clothed individual. The synthetic pairs can then be used to augment the training of virtual try-on models. We also propose an Error-Aware Refinement-based Schrödinger Bridge (EARSB) that surgically targets localized generation errors to correct the output of a base virtual try-on model. To identify likely errors, we propose a weakly supervised error classifier that localizes regions for refinement; its confidence heatmap then augments the Schrödinger Bridge's noise schedule. Experiments on VITON-HD and DressCode-Upper demonstrate that our synthetic data augmentation improves the performance of prior work, while EARSB improves overall image quality. In user studies, our model is preferred by users in an average of 59% of cases.
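The key mechanism described here, modulating the bridge's noise schedule per pixel with the error classifier's confidence heatmap so that refinement concentrates on likely-erroneous regions, can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's implementation; the array names and the simple multiplicative scaling are assumptions.

```python
import numpy as np

def spatially_varying_noise(x0, error_conf, beta, t, rng=None):
    """Add noise whose magnitude is modulated per pixel by an error heatmap.

    x0         : (H, W, C) base try-on output to be refined
    error_conf : (H, W) classifier confidence in [0, 1]; high = likely error
    beta       : (T,) base noise schedule
    t          : timestep index into beta
    """
    rng = np.random.default_rng() if rng is None else rng
    # Scale the scalar schedule by the heatmap so clean regions receive little
    # noise while suspect regions are pushed further back toward the prior,
    # and are therefore re-synthesized more aggressively by the bridge.
    sigma_t = np.sqrt(beta[t]) * error_conf[..., None]   # (H, W, 1)
    noise = rng.standard_normal(x0.shape)
    return x0 + sigma_t * noise
```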
One Noise to Rule Them All: Learning a Unified Model of Spatially-Varying Noise Patterns
Procedural noise is a fundamental component of computer graphics pipelines, offering a flexible way to generate textures that exhibit natural random variation. Many different types of noise exist, each produced by a separate algorithm. In this paper, we present a single generative model which can learn to generate multiple types of noise as well as blend between them. In addition, it is capable of producing spatially-varying noise blends despite not having access to such data for training. These features are enabled by training a denoising diffusion model using a novel combination of data augmentation and network conditioning techniques. Like procedural noise generators, the model's behavior is controllable via interpretable parameters plus a source of randomness. We use our model to produce a variety of visually compelling noise textures. We also present an application of our model to improving inverse procedural material design; using our model in place of fixed-type noise nodes in a procedural material graph results in higher-fidelity material reconstructions without needing to know the type of noise in advance. Open-sourced materials can be found at https://armanmaesumi.github.io/onenoise/
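A rough picture of the conditioning strategy: instead of a single global class label, the denoiser can be fed per-pixel conditioning maps (a noise-type embedding plus parameter values), which is what allows spatially-varying blends at sampling time even though each training exemplar contains only one noise type. The sketch below is a schematic PyTorch module under that assumption; the class and argument names are illustrative, and timestep conditioning is omitted for brevity.

```python
import torch
import torch.nn as nn

class ConditionedDenoiser(nn.Module):
    """Toy diffusion denoiser conditioned on per-pixel maps.

    The noise-type embedding and its parameters are broadcast (or painted)
    into H x W maps and concatenated with the noisy image, so different
    regions can be asked for different noise types at sampling time.
    """
    def __init__(self, img_ch=1, type_emb_dim=8, param_dim=4, hidden=64):
        super().__init__()
        in_ch = img_ch + type_emb_dim + param_dim
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, hidden, 3, padding=1), nn.SiLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.SiLU(),
            nn.Conv2d(hidden, img_ch, 3, padding=1),
        )

    def forward(self, noisy_img, type_map, param_map):
        # noisy_img: (B, 1, H, W); type_map: (B, 8, H, W); param_map: (B, 4, H, W)
        return self.net(torch.cat([noisy_img, type_map, param_map], dim=1))
```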
- Award ID(s): 1941808
- PAR ID: 10607398
- Publisher / Repository: Association for Computing Machinery (ACM)
- Date Published:
- Journal Name: ACM Transactions on Graphics
- Volume: 43
- Issue: 4
- ISSN: 0730-0301
- Format(s): Medium: X; Size: p. 1-21
- Sponsoring Org: National Science Foundation
More Like this
- Procedural modeling is now the de facto standard of material modeling in industry. Procedural models can be edited and are easily extended, unlike pixel-based representations of captured materials. In this article, we present a semi-automatic pipeline for general material proceduralization. Given Spatially Varying Bidirectional Reflectance Distribution Functions (SVBRDFs) represented as sets of pixel maps, our pipeline decomposes them into a tree of sub-materials whose spatial distributions are encoded by their associated mask maps. This semi-automatic decomposition of material maps progresses hierarchically, driven by our new spectrum-aware material matting and instance-based decomposition methods. Each decomposed sub-material is proceduralized by a novel multi-layer noise model to capture local variations at different scales. Spatial distributions of these sub-materials are modeled either by a by-example inverse synthesis method recovering Point Process Texture Basis Functions (PPTBF) [30] or via random sampling. To reconstruct procedural material maps, we propose a differentiable rendering-based optimization that recomposes all generated procedures together to maximize the similarity between our procedural models and the input material pixel maps. We evaluate our pipeline on a variety of synthetic and real materials. We demonstrate our method's capacity to process a wide range of material types, eliminating the need for artist-designed material graphs required in previous work [38, 53]. As fully procedural models, our results expand to arbitrary resolution and enable high-level user control of appearance. (A generic sketch of the recomposition optimization appears after this list.)
- Classification is one of the fundamental tasks in machine learning, and data quality is central to building any machine learning model with good prediction performance. Real-world data often suffer from noise, usually meaning errors, irregularities, and corruptions in a dataset, and we have no control over the quality of the data used in classification tasks. The presence of noise in a dataset has three major negative consequences: (i) a decrease in classification accuracy, (ii) an increase in the complexity of the induced classifier, and (iii) an increase in training time. It is therefore important to systematically explore the effects of noise on classification performance. Although studies have been published on the effect of noise for particular learners or particular noise types, the impact of different noise types on different learners has not been investigated together. In this work, we consider both dimensions, various learners and various noise types, and provide a detailed analysis of their effects on prediction performance. We use five different classifiers (J48, Naive Bayes, Support Vector Machine, k-Nearest Neighbor, Random Forest), 10 benchmark datasets from the UCI machine learning repository, and three publicly available image datasets. Our results can be used to guide the development of noise-handling mechanisms. (A small sketch of this kind of noise-injection protocol appears after this list.)
- Pulsar timing arrays (PTAs) are galactic-scale gravitational wave (GW) detectors. Each individual arm, composed of a millisecond pulsar, a radio telescope, and a kiloparsecs-long path, differs in its properties but, in aggregate, can be used to extract low-frequency GW signals. We present a noise and sensitivity analysis to accompany the NANOGrav 15 yr data release and associated papers, along with an in-depth introduction to PTA noise models. As a first step in our analysis, we characterize each individual pulsar data set with three types of white-noise parameters and two red-noise parameters. These parameters, along with the timing model and, particularly, a piecewise-constant model for the time-variable dispersion measure, determine the sensitivity curve over the low-frequency GW band we are searching. We tabulate information for all of the pulsars in this data release and present some representative sensitivity curves. We then combine the individual pulsar sensitivities using a signal-to-noise ratio statistic to calculate the global sensitivity of the PTA to a stochastic background of GWs, obtaining a minimum noise characteristic strain of 7 × 10⁻¹⁵ at 5 nHz. A power-law-integrated analysis shows rough agreement with the amplitudes recovered in NANOGrav's 15 yr GW background analysis. While our phenomenological noise model does not model all known physical effects explicitly, it provides an accurate characterization of the noise in the data while preserving sensitivity to multiple classes of GW signals. (A short sketch of the underlying power-law strain model appears after this list.)
- Recent work has shown that machine learning (ML) models can be trained to accurately forecast the dynamics of unknown chaotic dynamical systems. Short-term predictions of the state evolution and long-term predictions of the statistical patterns of the dynamics ("climate") can be produced by employing a feedback loop, whereby the model is trained to predict forward one time step, then the model output is used as input for multiple time steps. In the absence of mitigating techniques, however, this approach can result in artificially rapid error growth. In this article, we systematically examine the technique of adding noise to the ML model input during training to promote stability and improve prediction accuracy. Furthermore, we introduce Linearized Multi-Noise Training (LMNT), a regularization technique that deterministically approximates the effect of many small, independent noise realizations added to the model input during training. Our case study uses reservoir computing, a machine-learning method based on recurrent neural networks, to predict the spatiotemporally chaotic Kuramoto-Sivashinsky equation. We find that reservoir computers trained with noise or with LMNT produce climate predictions that appear to be indefinitely stable and have a climate very similar to the true system, while reservoir computers trained without regularization are unstable. Compared with other regularization techniques that yield stability in some cases, we find that both short-term and climate predictions from reservoir computers trained with noise or with LMNT are substantially more accurate. Finally, we show that the deterministic aspect of our LMNT regularization facilitates fast hyperparameter tuning when compared to training with noise. (A toy sketch of the noisy-input training step appears after this list.)
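For the material-proceduralization item above, the final recomposition step is a gradient-based fit: the parameters of all sub-material procedures are optimized so that the recomposed maps match the captured SVBRDF maps. Below is a generic sketch of such a loop, assuming a differentiable `recompose` function and a plain L1 image loss; both are stand-ins for the paper's actual components.

```python
import torch

def fit_procedural_params(target_maps, recompose, init_params, steps=500, lr=1e-2):
    """Gradient-based fit of procedural parameters to captured material maps.

    target_maps : dict of (H, W, C) tensors (albedo, normal, roughness, ...)
    recompose   : differentiable function mapping params -> same dict of maps
    init_params : dict of initial parameter tensors for all sub-materials
    """
    params = {k: v.clone().requires_grad_(True) for k, v in init_params.items()}
    opt = torch.optim.Adam(params.values(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        pred = recompose(params)
        # Simple per-map L1 loss; a perceptual or rendering-space similarity
        # term would slot in here instead.
        loss = sum((pred[k] - target_maps[k]).abs().mean() for k in target_maps)
        loss.backward()
        opt.step()
    return {k: v.detach() for k, v in params.items()}
```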
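For the noise-and-classification item, the experimental protocol amounts to injecting controlled noise and re-measuring cross-validated accuracy across learners. A small scikit-learn sketch of label-noise injection follows; the dataset, classifiers, and noise rates are chosen purely for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier

def flip_labels(y, rate, rng):
    """Replace a fraction `rate` of labels with a different random class."""
    y = y.copy()
    idx = rng.choice(len(y), size=int(rate * len(y)), replace=False)
    classes = np.unique(y)
    for i in idx:
        y[i] = rng.choice(classes[classes != y[i]])
    return y

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)
learners = {"NB": GaussianNB(), "kNN": KNeighborsClassifier(),
            "RF": RandomForestClassifier()}
for rate in (0.0, 0.1, 0.3):
    y_noisy = flip_labels(y, rate, rng)
    for name, clf in learners.items():
        acc = cross_val_score(clf, X, y_noisy, cv=5).mean()
        print(f"label noise={rate:.1f}  {name}: accuracy={acc:.3f}")
```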
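For the pulsar-timing item, the sensitivity curves and the quoted characteristic strain rest on the standard power-law background model and its conversion to a timing-residual power spectral density. The sketch below shows those two textbook relations only; it is not NANOGrav's analysis code.

```python
import numpy as np

F_YR = 1.0 / (365.25 * 86400.0)   # reference frequency 1/yr in Hz

def hc_powerlaw(f, log10_amp, gamma=13.0 / 3.0):
    """Characteristic strain of a power-law GW background.

    log10_amp : log10 amplitude at the reference frequency 1/yr
    gamma     : timing-residual spectral index (13/3 for circular,
                GW-driven supermassive black hole binaries); the strain
                slope is alpha = (3 - gamma) / 2.
    """
    alpha = (3.0 - gamma) / 2.0
    return 10.0 ** log10_amp * (f / F_YR) ** alpha

def residual_psd(f, hc):
    """Timing-residual power spectral density implied by a characteristic
    strain spectrum: S(f) = hc(f)^2 / (12 * pi^2 * f^3)."""
    return hc ** 2 / (12.0 * np.pi ** 2 * f ** 3)
```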
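For the reservoir-computing item, the stabilizing trick is to perturb the model input with small Gaussian noise during training before fitting the linear readout; LMNT then deterministically approximates the effect of many such perturbations. A toy NumPy echo-state-network sketch of the noisy-input variant is given below; the reservoir construction and hyperparameters are arbitrary.

```python
import numpy as np

def train_reservoir_with_noise(u, d_res=500, noise_std=1e-3, ridge=1e-6, seed=0):
    """Fit a linear readout for a random reservoir, adding Gaussian noise to
    the reservoir input during training to promote closed-loop stability.

    u : (T, d_in) training time series; the target is the next step u[t+1].
    """
    rng = np.random.default_rng(seed)
    T, d_in = u.shape
    W_in = rng.uniform(-0.1, 0.1, size=(d_res, d_in))
    W = rng.normal(scale=1.0 / np.sqrt(d_res), size=(d_res, d_res))

    r = np.zeros(d_res)
    states = np.empty((T - 1, d_res))
    for t in range(T - 1):
        u_noisy = u[t] + noise_std * rng.standard_normal(d_in)  # the key step
        r = np.tanh(W @ r + W_in @ u_noisy)
        states[t] = r

    # Ridge-regression readout mapping reservoir state -> next input.
    Y = u[1:]
    W_out = np.linalg.solve(states.T @ states + ridge * np.eye(d_res),
                            states.T @ Y).T
    return W_in, W, W_out
```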