Title: Extracting structured data from organic synthesis procedures using a fine-tuned large language model
An open-source fine-tuned large language model can extract reaction information from organic synthesis procedure text into structured data that follows the Open Reaction Database (ORD) schema.
Award ID(s):
2308979
PAR ID:
10549991
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
RSC
Date Published:
Journal Name:
Digital Discovery
Volume:
3
Issue:
9
ISSN:
2635-098X
Page Range / eLocation ID:
1822 to 1831
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract An N‐heterocyclic‐carbene‐ligated 3‐benzoborepin with a bridged structure has been synthesized by double radical trans‐hydroboration of benzo[3,4]cycloundec‐3‐ene‐1,5‐diyne with an N‐heterocyclic carbene borane. The thermal reaction of the NHC‐ligated borepin at 150 °C gives an isolable NHC‐boranorcaradiene. Experiments and density functional theory calculations support a mechanism whereby the borepin initially rearranges to a boranorcaradiene by a thermal 6π‐electrocyclic reaction. This is followed by a 1,5‐boron shift to give a rearranged boranorcaradiene. This shift occurs with stereoinversion at boron through a transition state with open‐shell diradical character. This is the first example of the isolation of a boranorcaradiene from a thermal reaction of a borepin.
  2. Full control of molecular interactions, including reactive losses, would open new frontiers in quantum science. We demonstrate extreme tunability of ultracold chemical reaction rates by inducing resonant dipolar interactions by means of an external electric field. We prepared fermionic potassium-rubidium molecules in their first excited rotational state and observed a modulation of the chemical reaction rate by three orders of magnitude as we tuned the electric field strength by a few percent across resonance. In a quasi–two-dimensional geometry, we accurately determined the contributions from the three dominant angular momentum projections of the collisions. Using the resonant features, we shielded the molecules from loss and suppressed the reaction rate by an order of magnitude below the background value, thereby realizing a long-lived sample of polar molecules in large electric fields. 
  3. Hoffman, C (Ed.)
    Abstract A simple, broadly applicable method was developed using an in vitro transposition reaction followed by transformation into Escherichia coli and screening plates for fluorescent colonies. The transposition reaction catalyzes the random insertion of a fluorescent protein open reading frame into a target gene on a plasmid. The transposition reaction is employed directly in an E. coli transformation with no further procedures. Plating at high colony density yields fluorescent colonies. Plasmids purified from fluorescent colonies contain random, in-frame fusions into the target gene. The plate screen also results in expressed, stable proteins. A large library of chimeric proteins was produced, which was useful for downstream research. The effect of using different fluorescent proteins was investigated, as well as the dependence on the linker sequence between the target and fluorescent protein open reading frames. The utility and simplicity of the method were demonstrated by the fact that it has been employed in an undergraduate biology laboratory class without failure over dozens of class sections. This suggests that the method will be useful in high-impact research at small liberal arts colleges with limited resources. However, in-frame fusion proteins were obtained from 8 different targets, suggesting that the method is broadly applicable in any research setting.
  4. Regulatory networks depict promoting or inhibiting interactions between molecules in a biochemical system. We introduce a category-theoretic formalism for regulatory networks, using signed graphs to model the networks and signed functors to describe occurrences of one network in another, especially occurrences of network motifs. With this foundation, we establish functorial mappings between regulatory networks and other mathematical models in biochemistry. We construct a functor from reaction networks, modeled as Petri nets with signed links, to regulatory networks, enabling us to precisely define when a reaction network could be a physical mechanism underlying a regulatory network. Turning to quantitative models, we associate a regulatory network with a Lotka-Volterra system of differential equations, defining a functor from the category of signed graphs to a category of parameterized dynamical systems. We extend this result from closed to open systems, demonstrating that Lotka-Volterra dynamics respects not only inclusions and collapsings of regulatory networks, but also the process of building up complex regulatory networks by gluing together simpler pieces. Formally, we use the theory of structured cospans to produce a lax double functor from the double category of open signed graphs to that of open parameterized dynamical systems. Throughout the paper, we ground the categorical formalism in examples inspired by systems biology. 
  5. Abstract In this study, the occurrence of the Diels–Alder reaction of cyclopentadiene yielding dicyclopentadiene within a confined closed space provided by octa acid (OA) in water at room temperature is established. The Diels–Alder reaction within the OA capsule occurs at least 2000 times faster than in water. Catalysis of the Diels–Alder reaction by hosts such as cyclodextrin, cucurbituril, and Fujita's Pd nano–host occurs in water. Despite their similarity, these three hosts provide an open environment where the reactant molecules are exposed to the aqueous environment. The only fully closed host known to catalyze the Diels–Alder reaction in water is OA. Although Rebek's host is established to catalyze the Diels–Alder reaction, it does so in an organic solvent. The closed environment explored in this presentation provides an opportunity to better understand the origin of non–covalent catalysis in a restricted space and in water. Because the product binds more strongly than the reactant, disappointingly, the capsule cannot be recycled. We recognize that this aspect needs to be addressed for the OA capsule to become synthetically useful. We are in the process of understanding the origin of catalysis and finding ways to make the reaction recyclable.