NSF-PAR ID:
 10040280
 Date Published:
 Journal Name:
 IEEE International Conference on Computer Vision
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation

An unsupervised image-to-image translation (UI2I) task deals with learning a mapping between two domains without paired images. While existing UI2I methods usually require numerous unpaired images from different domains for training, there are many scenarios where training data is quite limited. In this paper, we argue that even if each domain contains only a single image, UI2I can still be achieved. To this end, we propose TuiGAN, a generative model that is trained on only two unpaired images and amounts to one-shot unsupervised learning. With TuiGAN, an image is translated in a coarse-to-fine manner, where the generated image is gradually refined from global structures to local details. We conduct extensive experiments to verify that our versatile method can outperform strong baselines on a wide variety of UI2I tasks. Moreover, TuiGAN is capable of achieving comparable performance with state-of-the-art UI2I models trained with sufficient data.
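The coarse-to-fine scheme described above can be sketched schematically: translate at the coarsest scale, then repeatedly upsample and refine toward full resolution. The scale factors and the identity `refine` placeholder below are illustrative stand-ins, not TuiGAN's actual per-scale GANs.

```python
import numpy as np

def downsample(img, factor):
    return img[::factor, ::factor]                   # naive strided downsampling

def upsample(img, factor):
    return np.kron(img, np.ones((factor, factor)))   # nearest-neighbor blow-up

def coarse_to_fine(src, scales=(4, 2, 1), refine=lambda out, src: out):
    out = downsample(src, scales[0])                 # coarsest-scale translation input
    for prev, cur in zip(scales, scales[1:]):
        out = upsample(out, prev // cur)             # grow one scale toward full resolution
        out = refine(out, downsample(src, cur))      # local refinement at this scale
    return out

result = coarse_to_fine(np.ones((8, 8)))
```

With a real model, `refine` would be the scale-specific generator conditioned on the source image at that resolution.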

This paper introduces a novel generative encoder (GE) framework for generative imaging and image processing tasks like image reconstruction, compression, denoising, inpainting, deblurring, and super-resolution. GE unifies the generative capacity of GANs and the stability of AEs in an optimization framework instead of stacking GANs and AEs into a single network or combining their loss functions as in existing literature. GE provides a novel approach to visualizing relationships between latent spaces and the data space. The GE framework is made up of a pre-training phase and a solving phase. In the former, a GAN with generator $G$ capturing the data distribution of a given image set, and an AE network with encoder $E$ that compresses images following the distribution estimated by $G$, are trained separately, resulting in two latent representations of the data, denoted as the generative and encoding latent space respectively. In the solving phase, given a noisy image $x = \mathcal{P}(x^*)$, where $x^*$ is the target unknown image and $\mathcal{P}$ is an operator adding an additive, or multiplicative, or convolutional noise, or equivalently given such an image $x$ in the compressed domain, i.e., given $m = E(x)$, the two latent spaces are unified via solving the optimization problem $z^* = \arg\min_{z} \| E(G(z)) - m \|_2^2 + \lambda \| z \|_2^2$, where $\lambda>0$ is a hyperparameter, and the image $x^*$ is recovered in a generative way via $\hat{x} := G(z^*)\approx x^*$. The unification of the two spaces allows improved performance against corresponding GAN and AE networks while visualizing interesting properties in each latent space.
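The solving phase can be illustrated under a deliberately simplified linear model: below, random linear maps stand in for the trained generator $G$ and encoder $E$, so everything is a toy sketch, not the paper's implementation. For linear maps the regularized objective has a closed-form ridge solution; with real networks one would run gradient descent on the same objective.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 4))       # toy generator: G(z) = A @ z
B = rng.standard_normal((4, 8))       # toy encoder:   E(x) = B @ x

def G(z): return A @ z
def E(x): return B @ x

def solve(m, lam=0.1):
    """Minimize ||E(G(z)) - m||^2 + lam * ||z||^2 in closed form."""
    M = B @ A                          # composition E . G as a matrix
    return np.linalg.solve(M.T @ M + lam * np.eye(M.shape[1]), M.T @ m)

x_true = G(rng.standard_normal(4))    # an image on the toy "generative manifold"
m = E(x_true)                         # its compressed observation m = E(x)
z_star = solve(m)
x_hat = G(z_star)                     # generative recovery x_hat = G(z*)
```

The closed-form step solves the first-order optimality condition $(M^\top M + \lambda I)z^* = M^\top m$ exactly, which is what an iterative solver would approximate for nonlinear $G$ and $E$.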
A function f: {0,1}^n → {0,1} is called an approximate AND-homomorphism if, choosing x, y ∈ {0,1}^n uniformly at random, we have that f(x ∧ y) = f(x) ∧ f(y) with probability at least 1 − ε, where x ∧ y = (x_1 ∧ y_1, …, x_n ∧ y_n). We prove that if f: {0,1}^n → {0,1} is an approximate AND-homomorphism, then f is δ-close to either a constant function or an AND function, where δ(ε) → 0 as ε → 0. This improves on a result of Nehama, who proved a similar statement in which δ depends on n. Our theorem implies a strong result on judgement aggregation in computational social choice. In the language of social choice, our result shows that if f is ε-close to satisfying judgement aggregation, then it is δ(ε)-close to an oligarchy (the name for the AND function in social choice theory). This improves on Nehama's result, in which δ decays polynomially with n. Our result follows from a more general one, in which we characterize approximate solutions to the eigenvalue equation Tf = λg, where T is the downwards noise operator Tf(x) = E_y[f(x ∧ y)], f is [0,1]-valued, and g is {0,1}-valued. We identify all exact solutions to this equation, and show that any approximate solution in which Tf and λg are close is close to an exact solution.
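The exact case behind the statement above can be checked by brute force for small n: constant functions and the AND function satisfy f(x ∧ y) = f(x) ∧ f(y) with zero error, while, for instance, majority does not. The dimension n = 3 and the functions chosen are arbitrary illustrative examples.

```python
from itertools import product

n = 3

def violation_prob(f):
    """P over uniform x, y in {0,1}^n that f(x ∧ y) != f(x) ∧ f(y), by enumeration."""
    bad = 0
    for x in product((0, 1), repeat=n):
        for y in product((0, 1), repeat=n):
            xy = tuple(a & b for a, b in zip(x, y))
            bad += f(xy) != (f(x) & f(y))
    return bad / 4 ** n

and_err = violation_prob(lambda x: int(all(x)))        # AND of all coordinates: exact
const_err = violation_prob(lambda x: 0)                # constant function: exact
maj_err = violation_prob(lambda x: int(sum(x) >= 2))   # majority: some violations
```

For majority, x = (1,1,0) and y = (1,0,1) already give a violation: x ∧ y = (1,0,0) has majority 0, while f(x) ∧ f(y) = 1.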

Unpaired image-to-image translation (I2I) is an ill-posed problem, as an infinite number of translation functions can map the source domain distribution to the target distribution. Therefore, much effort has been put into designing suitable constraints, e.g., cycle consistency (CycleGAN), geometry consistency (GCGAN), and contrastive learning-based constraints (CUTGAN), that help better pose the problem. However, these well-known constraints have limitations: (1) they are either too restrictive or too weak for specific I2I tasks; (2) these methods result in content distortion when there is a significant spatial variation between the source and target domains. This paper proposes a universal regularization technique called maximum spatial perturbation consistency (MSPC), which enforces a spatial perturbation function (T) and the translation operator (G) to be commutative (i.e., T \circ G = G \circ T). In addition, we introduce two adversarial training components for learning the spatial perturbation function. The first one lets T compete with G to achieve maximum perturbation. The second one lets G and T compete with discriminators to align the spatial variations caused by the change of object size, object distortion, background interruptions, etc. Our method outperforms the state-of-the-art methods on most I2I benchmarks. We also introduce a new benchmark, namely the front face to profile face dataset, to emphasize the underlying challenges of I2I for real-world applications. We finally perform ablation experiments to study the sensitivity of our method to the severity of spatial perturbation and its effectiveness for distribution alignment.
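The commutativity constraint T ∘ G = G ∘ T can be written directly as a loss, illustrated here with toy stand-ins: a fixed circular shift for T and simple array maps for G. In the paper both T and G are learned networks, so this is only a minimal sketch of the constraint itself.

```python
import numpy as np

def T(img):
    """Toy spatial perturbation: horizontal circular shift by 2 pixels."""
    return np.roll(img, shift=2, axis=1)

def mspc_loss(G, img):
    """Mean absolute deviation from commutativity, |T(G(x)) - G(T(x))|."""
    return float(np.mean(np.abs(T(G(img)) - G(T(img)))))

img = np.random.default_rng(1).standard_normal((8, 8))

# A pointwise G commutes with any purely spatial T, so its loss vanishes;
# a G that treats image columns differently violates the constraint.
pointwise_loss = mspc_loss(np.tanh, img)
columnwise_loss = mspc_loss(lambda im: im * np.arange(8), img)
```

Minimizing this loss over G (while an adversarial T seeks the perturbation that maximizes it) is the max-min structure the abstract describes.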

Abstract Let $$V_*\otimes V\rightarrow {\mathbb {C}}$$ be a non-degenerate pairing of countable-dimensional complex vector spaces $$V$$ and $$V_*$$. The Mackey Lie algebra $${\mathfrak {g}}=\mathfrak {gl}^M(V,V_*)$$ corresponding to this pairing consists of all endomorphisms $$\varphi $$ of $$V$$ for which the space $$V_*$$ is stable under the dual endomorphism $$\varphi ^*: V^*\rightarrow V^*$$. We study the tensor Grothendieck category $${\mathbb {T}}$$ generated by the $${\mathfrak {g}}$$-modules $$V$$, $$V_*$$ and their algebraic duals $$V^*$$ and $$V^*_*$$. The category $${\mathbb {T}}$$ is an analogue of categories considered in prior literature, the main difference being that the trivial module $${\mathbb {C}}$$ is no longer injective in $${\mathbb {T}}$$. We describe the injective hull $$I$$ of $${\mathbb {C}}$$ in $${\mathbb {T}}$$, and show that the category $${\mathbb {T}}$$ is Koszul. In addition, we prove that $$I$$ is endowed with a natural structure of commutative algebra. We then define another category $${}_I{\mathbb {T}}$$ of objects in $${\mathbb {T}}$$ which are free as $$I$$-modules. Our main result is that the category $${}_I{\mathbb {T}}$$ is also Koszul, and moreover that $${}_I{\mathbb {T}}$$ is universal among abelian $${\mathbb {C}}$$-linear tensor categories generated by two objects $$X$$, $$Y$$ with fixed subobjects $$X'\hookrightarrow X$$, $$Y'\hookrightarrow Y$$ and a pairing $$X\otimes Y\rightarrow {\mathbf{1 }}$$, where $$\mathbf{1 }$$ is the monoidal unit. We conclude the paper by discussing the orthogonal and symplectic analogues of the categories $${\mathbb {T}}$$ and $${}_I{\mathbb {T}}$$.