-
We introduce Quantized Language-Image Pretraining (QLIP), a visual tokenization method that combines state-of-the-art reconstruction quality with state-of-the-art zero-shot image understanding. QLIP trains a binary-spherical-quantization-based autoencoder with reconstruction and language-image alignment objectives. We are the first to show that the two objectives do not need to be at odds. We balance the two loss terms dynamically during training and show that a two-stage training pipeline effectively mixes the large-batch requirements of image-language pre-training with the memory bottleneck imposed by the reconstruction objective. We validate the effectiveness of QLIP for multimodal understanding and text-conditioned image generation with a single model. Specifically, QLIP serves as a drop-in replacement for the visual encoder for LLaVA and the image tokenizer for LlamaGen with comparable or even better performance. Finally, we demonstrate that QLIP enables a unified mixed-modality auto-regressive model for understanding and generation.
Free, publicly accessible full text available February 7, 2026.
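To make the quantization step concrete, here is a minimal sketch of binary spherical quantization as it is generally described: a latent vector is normalized onto the unit sphere and each coordinate is then snapped to ±1/√d, so the sign pattern acts as a discrete token. This is an illustration under stated assumptions, not QLIP's implementation; the function names `bsq_quantize` and `code_to_index` are hypothetical, and the learned projection layers, the loss terms, and their dynamic balancing are all omitted.

```python
import math


def bsq_quantize(v):
    """Normalize v onto the unit sphere, then snap each coordinate to +-1/sqrt(d).

    Returns (u, q): the unit-norm vector and its binary spherical code.
    Illustrative sketch only; names and tie-breaking are assumptions.
    """
    d = len(v)
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize the zero vector")
    u = [x / norm for x in v]
    scale = 1.0 / math.sqrt(d)
    # Assumption: a coordinate of exactly 0 is mapped to the positive corner.
    q = [scale if x >= 0 else -scale for x in u]
    return u, q


def code_to_index(q):
    """Pack the sign bits of q into an integer token id.

    The first coordinate is the least significant bit (an arbitrary choice).
    """
    return sum(1 << i for i, x in enumerate(q) if x > 0)


u, q = bsq_quantize([3.0, -4.0, 0.5, -0.1])
token_id = code_to_index(q)
```

Because every coordinate of the code has magnitude 1/√d, the quantized vector also lies on the unit sphere, and a d-dimensional latent yields an implicit codebook of 2^d entries without storing any codebook vectors.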
-
A fundamental goal of photochemistry is to understand how structural features of a chromophore can make specific bonds within a molecule prone to cleavage by light, or photolabile. The meta effect is an example of a regiochemical explanation for photolability, in which electron-donating groups on an aromatic ring cause photolability selectively at the meta position. Here, we show, using a chromophore containing one ring with a meta-methoxy group and one ring with a para-methoxy group, that two stereoisomers of the same compound can react with light differently, based simply on the three-dimensional positioning of a meta anisyl ring. The result is that the stereoisomers of the compound with the same configuration at both stereogenic centers are photolabile, while the stereoisomers with opposite configurations do not react with light. Furthermore, time-dependent density functional theory (TD-DFT) calculations show distinct excitation pathways for each stereoisomer.
