- 
            Free, publicly-accessible full text available December 1, 2026
- 
            The trajectory of a molecular system undergoing a reversible reaction A ⇌ B and crossing and recrossing a transition state separating the reactant and product consists of loops, i.e., excursions from the transition state to either side and back to the transition state. Motivated by recent experimental observations of loops, here, we discuss some of their statistical properties. In particular, we highlight that the transition-state rate is not only an upper bound on the true reaction rate but also a physical property of the loops. We further find that loops can be unambiguously divided into two sub-ensembles. Those consist of short loops, which are brief excursions from the transition state, and long loops that get trapped in the reactant or product wells before eventually returning to the barrier. Finally, we show that the loop time distribution contains information about both the reaction rate coefficients and their transition-state-theory counterparts.
            Free, publicly-accessible full text available August 7, 2026
- 
            Data selection can reduce the amount of training data needed to finetune LLMs; however, the efficacy of data selection scales directly with its compute. Motivated by the practical challenge of compute-constrained finetuning, we consider the setting in which both the cost of selecting data and training are budgeted for. We first formalize the problem of data selection with a cost-aware utility function, and model the data selection problem as trading off initial-selection cost for training gain. We run a comprehensive sweep of experiments across multiple tasks, varying compute budget by scaling finetuning tokens, model sizes, and data selection compute. Interestingly, we find that many powerful data selection methods are almost never compute-optimal, and that cheaper data selection alternatives dominate both from a theoretical and empirical perspective. For compute-optimal training, we find that perplexity and gradient data selection require training-to-selection model size ratios of 5x and 10x, respectively.
            Free, publicly-accessible full text available April 24, 2026
- 
            Dense document embeddings are central to neural retrieval. The dominant paradigm is to train and construct embeddings by running encoders directly on individual documents. In this work, we argue that these embeddings, while effective, are implicitly out-of-context for targeted use cases of retrieval, and that a contextualized document embedding should take into account both the document and neighboring documents in context - analogous to contextualized word embeddings. We propose two complementary methods for contextualized document embeddings: first, an alternative contrastive learning objective that explicitly incorporates the document neighbors into the intra-batch contextual loss; second, a new contextual architecture that explicitly encodes neighbor document information into the encoded representation. Results show that both methods achieve better performance than biencoders in several settings, with differences especially pronounced out-of-domain. We achieve state-of-the-art results on the MTEB benchmark with no hard negative mining, score distillation, dataset-specific instructions, intra-GPU example-sharing, or extremely large batch sizes. Our method can be applied to improve performance on any contrastive learning dataset and any biencoder.
            Free, publicly-accessible full text available April 24, 2026
- 
            Free, publicly-accessible full text available April 29, 2026
- 
            Free, publicly-accessible full text available April 29, 2026
- 
            Free, publicly-accessible full text available April 30, 2026
- 
            Free, publicly-accessible full text available December 6, 2025
- 
            In silico examination of 13 P,N-ligated Au(III) OACs determined the key mechanistic factors governing Au(III)-mediated S-arylation. Three complexes were synthesized which exhibited bimolecular coordination rate constants as high as 20 200 M⁻¹ s⁻¹.
            Free, publicly-accessible full text available February 26, 2026
- 
            Free, publicly-accessible full text available February 26, 2026
 An official website of the United States government
