NSF PAR Search | NSF Public Access Repository


In-Context Learning Unlocked for Diffusion Models

Wang, Zhendong; Jiang, Yifan; Lu, Yadong; Shen, Yelong; He, Pengcheng; Chen, Weizhu; Wang, Zhangyang; Zhou, Mingyuan (December 2023, Neural Information Processing Systems)

We present Prompt Diffusion, a framework for enabling in-context learning in diffusion-based generative models. Given a pair of task-specific example images, such as depth from/to image and scribble from/to image, and a text guidance, our model automatically understands the underlying task and performs the same task on a new query image following the text guidance. To achieve this, we propose a vision-language prompt that can model a wide range of vision-language tasks and a diffusion model that takes it as input. The diffusion model is trained jointly on six different tasks using these prompts. The resulting Prompt Diffusion model becomes the first diffusion-based vision-language foundation model capable of in-context learning. It demonstrates high-quality in-context generation for the trained tasks and effectively generalizes to new, unseen vision tasks using their respective prompts. Our model also shows compelling text-guided image editing results. Our framework aims to facilitate research into in-context learning for computer vision. We share our code and pre-trained models at https://github.com/Zhendong-Wang/Prompt-Diffusion.
Full Text Available
Patch Diffusion: Faster and More Data-Efficient Training of Diffusion Models

Wang, Zhendong; Jiang, Yifan; Zheng, Huangjie; Wang, Peihao; He, Pengcheng; Wang, Zhangyang; Chen, Weizhu; Zhou, Mingyuan (December 2023, Neural Information Processing Systems)

Diffusion models are powerful, but they require a lot of time and data to train. We propose Patch Diffusion, a generic patch-wise training framework, to significantly reduce the training time costs while improving data efficiency, which thus helps democratize diffusion model training to broader users. At the core of our innovations is a new conditional score function at the patch level, where the patch location in the original image is included as additional coordinate channels, while the patch size is randomized and diversified throughout training to encode the cross-region dependency at multiple scales. Sampling with our method is as easy as in the original diffusion model. Through Patch Diffusion, we could achieve ≥2× faster training, while maintaining comparable or better generation quality. Patch Diffusion meanwhile improves the performance of diffusion models trained on relatively small datasets, e.g., as few as 5,000 images to train from scratch. We achieve outstanding FID scores in line with state-of-the-art benchmarks: 1.77 on CelebA-64×64, 1.93 on AFHQv2-Wild-64×64, and 2.72 on ImageNet-256×256. We share our code and pre-trained models in GitHub.
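The patch-level conditioning described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' released code: the function name `random_patch_with_coords`, the candidate `patch_sizes`, and the choice to normalize coordinates to [-1, 1] are assumptions made for illustration only. The sketch crops a randomly sized square patch and appends two channels that record where each patch pixel sits in the original image.

```python
import numpy as np

def random_patch_with_coords(image, patch_sizes, rng):
    """Crop a random square patch and append its x/y positions as two extra channels.

    image: array of shape (C, H, W); every entry of patch_sizes must be <= min(H, W).
    Returns an array of shape (C + 2, size, size).
    """
    _, h, w = image.shape
    # Randomize the patch size so that training would see several spatial scales.
    size = int(rng.choice(patch_sizes))
    top = int(rng.integers(0, h - size + 1))
    left = int(rng.integers(0, w - size + 1))
    patch = image[:, top:top + size, left:left + size]

    # Coordinates are measured in the frame of the full image (assumed range [-1, 1]),
    # so the patch carries its location with it.
    ys = np.linspace(-1.0, 1.0, h)[top:top + size]
    xs = np.linspace(-1.0, 1.0, w)[left:left + size]
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    coords = np.stack([xx, yy]).astype(image.dtype)

    return np.concatenate([patch, coords], axis=0)

rng = np.random.default_rng(0)
image = rng.standard_normal((3, 64, 64)).astype(np.float32)
sample = random_patch_with_coords(image, patch_sizes=(16, 32, 64), rng=rng)
```

In this sketch a full-size patch (64 here) reduces to the whole image plus a fixed coordinate grid, which is one way to see why sampling can proceed as in an ordinary diffusion model.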
Full Text Available
Truncated Diffusion Probabilistic Models and Diffusion-based Adversarial Auto-Encoders

Zheng, Huangjie; He, Pengcheng; Chen, Weizhu; Zhou, Mingyuan (May 2023, International Conference on Learning Representations)
LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation

Li, Yixiao; Yu, Yifan; Zhang, Qingru; Liang, Chen; He, Pengcheng; Chen, Weizhu; Zhao, Tuo (July 2023, International Conference on Machine Learning)

Full Text Available
Diffusion-GAN: Training GANs with Diffusion

Wang, Zhendong; Zheng, Huangjie; He, Pengcheng; Chen, Weizhu; Zhou, Mingyuan (May 2023, International Conference on Learning Representations)
Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning

Zhang, Qingru; Chen, Minshuo; Bukharin, Alexander; He, Pengcheng; Cheng, Yu; Chen, Weizhu; Zhao, Tuo (May 2023, International Conference on Learning Representations)

Full Text Available
MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation

https://doi.org/10.18653/v1/2022.naacl-main.116

Zuo, Simiao; Zhang, Qingru; Liang, Chen; He, Pengcheng; Zhao, Tuo; Chen, Weizhu (January 2022, Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies)

Full Text Available
Super Tickets in Pre-Trained Language Models: From Model Compression to Improving Generalization

https://doi.org/10.18653/v1/2021.acl-long.510

Liang, Chen; Zuo, Simiao; Chen, Minshuo; Jiang, Haoming; Liu, Xiaodong; He, Pengcheng; Zhao, Tuo; Chen, Weizhu (July 2021, Annual Meeting of the Association for Computational Linguistics)

Full Text Available
SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization

https://doi.org/10.18653/v1/2020.acl-main.197

Jiang, Haoming; He, Pengcheng; Chen, Weizhu; Liu, Xiaodong; Gao, Jianfeng; Zhao, Tuo (July 2020, Annual Meeting of the Association for Computational Linguistics)

Transfer learning has fundamentally changed the landscape of natural language processing (NLP). Many state-of-the-art models are first pre-trained on a large text corpus and then fine-tuned on downstream tasks. However, due to limited data resources from downstream tasks and the extremely high complexity of pre-trained models, aggressive fine-tuning often causes the fine-tuned model to overfit the training data of downstream tasks and fail to generalize to unseen data. To address such an issue in a principled manner, we propose a new learning framework for robust and efficient fine-tuning for pre-trained models to attain better generalization performance. The proposed framework contains two important ingredients: 1. Smoothness-inducing regularization, which effectively manages the complexity of the model; 2. Bregman proximal point optimization, which is an instance of trust-region methods and can prevent aggressive updating. Our experiments show that the proposed framework achieves new state-of-the-art performance on a number of NLP tasks including GLUE, SNLI, SciTail and ANLI. Moreover, it also outperforms the state-of-the-art T5 model, which is the largest pre-trained model containing 11 billion parameters, on GLUE.
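The idea behind a smoothness-inducing penalty can be sketched in a few lines of NumPy. This is a simplified illustration and not the SMART implementation: it uses a toy linear softmax classifier, and it replaces the adversarial search for the worst-case perturbation with random sampling inside a small box around each input. The names `smoothness_penalty`, `radius`, and `num_samples` are hypothetical. The penalty measures how much the predicted distribution moves when the input is nudged slightly, here via a symmetric KL divergence.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def symmetric_kl(p, q, eps=1e-12):
    """KL(p || q) + KL(q || p), computed row by row."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return np.sum(p * np.log(p / q) + q * np.log(q / p), axis=-1)

def smoothness_penalty(weights, inputs, radius, num_samples, rng):
    """Largest prediction change found among random perturbations, averaged over inputs.

    Random sampling is an assumed stand-in for the inner maximization; it only
    gives a lower bound on the true worst case within the perturbation box.
    """
    clean = softmax(inputs @ weights)
    worst = np.zeros(inputs.shape[0])
    for _ in range(num_samples):
        noise = rng.uniform(-radius, radius, size=inputs.shape)
        perturbed = softmax((inputs + noise) @ weights)
        worst = np.maximum(worst, symmetric_kl(clean, perturbed))
    return worst.mean()

rng = np.random.default_rng(0)
weights = rng.standard_normal((8, 3))   # toy classifier: 8 features, 3 classes
inputs = rng.standard_normal((4, 8))    # batch of 4 examples
penalty = smoothness_penalty(weights, inputs, radius=0.1, num_samples=5, rng=rng)
zero_penalty = smoothness_penalty(weights, inputs, radius=0.0, num_samples=5, rng=rng)
```

In a fine-tuning loop, a term like `penalty` would be added to the task loss with a weighting coefficient, so that the model is discouraged from changing its predictions sharply under small input changes.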
Full Text Available
