NSF PAR Search | NSF Public Access Repository

Think While You Generate: Discrete Diffusion with Planned Denoising

Liu, Sulin; Nam, Juno; Campbell, Andrew; Stärk, Hannes; Xu, Yilun; Jaakkola, Tommi; Gómez-Bombarelli, Rafael (April 2025, ICLR 2025)

Discrete diffusion has achieved state-of-the-art performance, outperforming or approaching autoregressive models on standard benchmarks. In this work, we introduce Discrete Diffusion with Planned Denoising (DDPD), a novel framework that separates the generation process into two models: a planner and a denoiser. At inference time, the planner selects which positions to denoise next by identifying the most corrupted positions in need of denoising, including both initially corrupted and those requiring additional refinement. This plan-and-denoise approach enables more efficient reconstruction during generation by iteratively identifying and denoising corruptions in the optimal order. DDPD outperforms traditional denoiser-only mask diffusion methods, achieving superior results on language modeling benchmarks such as text8, OpenWebText, and token-based image generation on ImageNet 256×256. Notably, in language modeling, DDPD significantly reduces the performance gap between diffusion-based and autoregressive methods in terms of generative perplexity.
Free, publicly-accessible full text available April 24, 2026
Think While You Generate: Discrete Diffusion with Planned Denoising

Liu, Sulin; Nam, Juno; Campbell, Andrew; Stärk, Hannes; Xu, Yilun; Jaakkola, Tommi; Gómez-Bombarelli, Rafael (October 2024, ArXiv)

Discrete diffusion has achieved state-of-the-art performance, outperforming or approaching autoregressive models on standard benchmarks. In this work, we introduce Discrete Diffusion with Planned Denoising (DDPD), a novel framework that separates the generation process into two models: a planner and a denoiser. At inference time, the planner selects which positions to denoise next by identifying the most corrupted positions in need of denoising, including both initially corrupted and those requiring additional refinement. This plan-and-denoise approach enables more efficient reconstruction during generation by iteratively identifying and denoising corruptions in the optimal order. DDPD outperforms traditional denoiser-only mask diffusion methods, achieving superior results on language modeling benchmarks such as text8, OpenWebText, and token-based image generation on ImageNet 256×256. Notably, in language modeling, DDPD significantly reduces the performance gap between diffusion-based and autoregressive methods in terms of generative perplexity.
Full Text Available
Boltz-1: Democratizing Biomolecular Interaction Modeling

https://doi.org/10.1101/2024.11.19.624167

Wohlwend, Jeremy; Corso, Gabriele; Passaro, Saro; Reveiz, Mateo; Leidal, Ken; Swiderski, Wojtek; Portnoi, Tally; Chinn, Itamar; Silterra, Jacob; Jaakkola, Tommi; et al (November 2024, bioRxiv)

Understanding biomolecular interactions is fundamental to advancing fields like drug discovery and protein design. In this paper, we introduce Boltz-1, an open-source deep learning model incorporating innovations in model architecture, speed optimization, and data processing, achieving AlphaFold3-level accuracy in predicting the 3D structures of biomolecular complexes. Boltz-1 demonstrates performance on par with state-of-the-art commercial models on a range of diverse benchmarks, setting a new benchmark for commercially accessible tools in structural biology. By releasing the training and inference code, model weights, datasets, and benchmarks under the MIT open license, we aim to foster global collaboration, accelerate discoveries, and provide a robust platform for advancing biomolecular modeling.
Free, publicly-accessible full text available November 20, 2025
Revisiting Who’s Harry Potter: Towards Targeted Unlearning from a Causal Intervention Perspective

https://doi.org/10.18653/v1/2024.emnlp-main.495

Liu, Yujian; Zhang, Yang; Jaakkola, Tommi; Chang, Shiyu (January 2024, Association for Computational Linguistics)

Full Text Available
Virtual node graph neural network for full phonon prediction

https://doi.org/10.1038/s43588-024-00661-0

Okabe, Ryotaro; Chotrattanapituk, Abhijatmedhi; Boonkird, Artittaya; Andrejevic, Nina; Fu, Xiang; Jaakkola, Tommi S; Song, Qichen; Nguyen, Thanh; Drucker, Nathan; Mu, Sai; et al (July 2024, Nature Computational Science)

Full Text Available
Blind protein-ligand docking with diffusion-based deep generative models

https://doi.org/10.1016/j.bpj.2022.11.937

Corso, Gabriele; Jing, Bowen; Stark, Hannes; Barzilay, Regina; Jaakkola, Tommi (February 2023, Biophysical Journal)

Full Text Available
Fundamental Limits and Tradeoffs in Invariant Representation Learning

Zhao, Han; Dan, Chen; Aragam, Bryon; Jaakkola, Tommi S.; Gordon, Geoffrey J.; Ravikumar, Pradeep (November 2022, Journal of machine learning research)
Adrian Weller (Ed.)
A wide range of machine learning applications such as privacy-preserving learning, algorithmic fairness, and domain adaptation/generalization among others, involve learning invariant representations of the data that aim to achieve two competing goals: (a) maximize information or accuracy with respect to a target response, and (b) maximize invariance or independence with respect to a set of protected features (e.g. for fairness, privacy, etc). Despite their wide applicability, theoretical understanding of the optimal tradeoffs — with respect to accuracy, and invariance — achievable by invariant representations is still severely lacking. In this paper, we provide an information theoretic analysis of such tradeoffs under both classification and regression settings. More precisely, we provide a geometric characterization of the accuracy and invariance achievable by any representation of the data; we term this feasible region the information plane. We provide an inner bound for this feasible region for the classification case, and an exact characterization for the regression case, which allows us to either bound or exactly characterize the Pareto optimal frontier between accuracy and invariance. Although our contributions are mainly theoretical, a key practical application of our results is in certifying the potential sub-optimality of any given representation learning algorithm for either classification or regression tasks. Our results shed new light on the fundamental interplay between accuracy and invariance, and may be useful in guiding the design of future representation learning algorithms.
Full Text Available
Artificial Intelligence for Science in Quantum, Atomistic, and Continuum Systems

https://doi.org/10.1561/2200000115

Zhang, Xuan; Wang, Limei; Helwig, Jacob; Luo, Youzhi; Fu, Cong; Xie, Yaochen; Liu, Meng; Lin, Yuchao; Xu, Zhao; Yan, Keqiang; et al (January 2025, Foundations and Trends® in Machine Learning)

Free, publicly-accessible full text available January 1, 2026
Learning Task Informed Abstractions

Fu, Xiang; Yang, Ge; Agrawal, Pulkit; Jaakkola, Tommi (July 2021, Proceedings of the 38th International Conference on Machine Learning)

Full Text Available
Independent SE(3)-Equivariant Models for End-to-End Rigid Protein Docking

Ganea, Octavian-Eugen; Huang, Xinyuan; Bunne, Charlotte; Bian, Yatao; Barzilay, Regina; Jaakkola, Tommi; Krause, Andreas (January 2022, International Conference on Learning Representations)

Protein complex formation is a central problem in biology, being involved in most of the cell's processes, and essential for applications, e.g. drug design or protein engineering. We tackle rigid body protein-protein docking, i.e., computationally predicting the 3D structure of a protein-protein complex from the individual unbound structures, assuming no conformational change within the proteins happens during binding. We design a novel pairwise-independent SE(3)-equivariant graph matching network to predict the rotation and translation to place one of the proteins at the right docked position relative to the second protein. We mathematically guarantee a basic principle: the predicted complex is always identical regardless of the initial locations and orientations of the two structures. Our model, named EquiDock, approximates the binding pockets and predicts the docking poses using keypoint matching and alignment, achieved through optimal transport and a differentiable Kabsch algorithm. Empirically, we achieve significant running time improvements and often outperform existing docking software despite not relying on heavy candidate sampling, structure refinement, or templates.
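The abstract above names the Kabsch algorithm as the alignment step. As a rough illustration only, the sketch below shows the classical (non-differentiable) Kabsch procedure in NumPy: given two sets of matched 3D keypoints, it recovers the rotation and translation that best superimpose one on the other in the least-squares sense. The function name `kabsch` and the row-vector point layout are assumptions made for this example. It is not EquiDock's implementation, which the abstract describes as differentiable and combined with optimal transport.

```python
import numpy as np


def kabsch(P, Q):
    """Least-squares rigid alignment of matched point sets.

    P, Q: arrays of shape (n, 3), where row i of P corresponds to row i of Q.
    Returns (R, t) such that R @ p_i + t is as close as possible to q_i.
    Hypothetical helper for illustration, not taken from the paper's code.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)

    # Center both point clouds on their centroids.
    p_mean = P.mean(axis=0)
    q_mean = Q.mean(axis=0)

    # 3x3 cross-covariance matrix between the centered sets.
    H = (P - p_mean).T @ (Q - q_mean)

    # SVD of the cross-covariance gives the optimal orthogonal map.
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so R is a proper rotation (det = +1),
    # not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Translation that maps the rotated centroid of P onto the centroid of Q.
    t = q_mean - R @ p_mean
    return R, t
```

With points stored as rows, the aligned coordinates are `P @ R.T + t`. A differentiable variant would express the same steps in an autodiff framework so that gradients can flow through the SVD.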
Full Text Available
