Search for: All records

Award ID contains: 2202693


  1. Abstract

    A key challenge in synthetic chemistry is the selection of high‐performing ligands for cross‐coupling reactions. To address this challenge, this work presents a classification workflow to identify physicochemical descriptors that bin monophosphine ligands as active or inactive in Ni‐catalyzed Suzuki‐Miyaura coupling reactions. Using five previously published high‐throughput experimentation datasets for training, we found that a binary classifier using a phosphine's minimum buried volume and Boltzmann‐averaged minimum electrostatic potential is most effective at distinguishing high‐ and low‐yielding ligands. Experimental validations are also presented. Using the two physicochemical descriptors from the binary classifier to represent the chemical space of monophosphine ligands leads to a more predictive guide for structure‐reactivity relationships compared with classic chemical space representations.

     
    Free, publicly-accessible full text available September 11, 2025
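The two-descriptor binary classifier described in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the helper names (`best_threshold`, `predict_active`), the simple accuracy-maximizing threshold search, and the toy descriptor values are hypothetical and are not taken from the paper, which may use a different fitting procedure and different cut directions.

```python
from typing import Sequence, Tuple

Cut = Tuple[float, int, float]  # (threshold, direction, training accuracy)


def best_threshold(values: Sequence[float], labels: Sequence[bool]) -> Cut:
    """Find the single-descriptor cut that best separates active from inactive.

    direction=+1 means "active if value <= threshold";
    direction=-1 means "active if value >= threshold".
    """
    best: Cut = (values[0], 1, 0.0)
    for t in sorted(set(values)):
        for direction in (1, -1):
            preds = [direction * v <= direction * t for v in values]
            acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
            if acc > best[2]:
                best = (t, direction, acc)
    return best


def predict_active(vbur: float, esp: float, vbur_cut: Cut, esp_cut: Cut) -> bool:
    """Predict 'active' only if both descriptors pass their fitted cuts."""
    (t1, d1, _), (t2, d2, _) = vbur_cut, esp_cut
    return d1 * vbur <= d1 * t1 and d2 * esp <= d2 * t2


# Made-up toy data for five hypothetical ligands (NOT values from the paper).
vbur_min = [28.0, 30.0, 31.0, 35.0, 40.0]      # minimum buried volume
esp_min = [-45.0, -42.0, -40.0, -30.0, -25.0]  # Boltzmann-averaged minimum ESP
active = [True, True, True, False, False]

vbur_cut = best_threshold(vbur_min, active)
esp_cut = best_threshold(esp_min, active)
```

With the fitted cuts, `predict_active(29.0, -41.0, vbur_cut, esp_cut)` classifies a new hypothetical ligand from its two descriptor values alone. The directions of the cuts here follow only from the toy labels and say nothing about the chemistry reported in the paper.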
  2. Abstract

    Skeletal modifications enable elegant and rapid access to various derivatives of a compound that would otherwise be difficult to prepare. They are therefore a powerful tool, especially in natural product synthesis and drug discovery, for exploring different natural products or improving the properties of a drug candidate starting from a common intermediate. Inspired by the biosynthesis of the cephalotane natural products, we report here a single-atom insertion into the framework of the benzenoid subfamily, providing access to the troponoid congeners. This represents the reverse of the proposed biosynthesis (i.e., a contra-biosynthesis approach). Computational evaluation of our designed transformation prompted us to investigate a Büchner–Curtius–Schlotterbeck reaction of a p-quinol methyl ether, which ultimately results in the synthesis of harringtonolide in two steps from cephanolide A, which we had previously prepared. Additional computational studies reveal that unconventional selectivity outcomes are driven by the choice of Lewis acid and nucleophile, which should inform further developments of these types of reactions.

     
  3. Abstract

    “How strong is this Lewis acid?” is a question researchers often approach by calculating its fluoride ion affinity (FIA) with quantum chemistry. Here, we present FIA49k, an extensive FIA dataset with 48,986 data points calculated at the RI‐DSD‐BLYP‐D3(BJ)/def2‐QZVPP//PBEh‐3c level of theory, including 13 different p‐block atoms as the fluoride accepting site. The FIA49k dataset was used to train FIA‐GNN, two message‐passing graph neural networks, which predict gas and solution phase FIA values of molecules excluded from training with a mean absolute error of 14 kJ mol⁻¹ (r² = 0.93) from the SMILES string of the Lewis acid as the only input. The level of accuracy is notable, given the wide energetic range of 750 kJ mol⁻¹ spanned by FIA49k. The model's value was demonstrated with four case studies, including predictions for molecules extracted from the Cambridge Structural Database and by reproducing results from catalysis research available in the literature. Weaknesses of the model are evaluated and interpreted chemically. FIA‐GNN and the FIA49k dataset can be reached via a free web app (www.grebgroup.de/fia-gnn).

     
    Free, publicly-accessible full text available March 19, 2025
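The two accuracy figures quoted in the abstract above, a mean absolute error and an r² value, can be computed as in the following sketch. It assumes that r² denotes the coefficient of determination; the abstract does not say whether that or the squared Pearson correlation is meant. The function names and the three FIA-like numbers are hypothetical and are not data from FIA49k.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute deviation between reference and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up reference and predicted values in kJ/mol (NOT from FIA49k).
reference = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]

mae = mean_absolute_error(reference, predicted)
r2 = r_squared(reference, predicted)
```

Because the mean absolute error carries the units of the target, it is most informative when read against the spread of the data, which is why the abstract sets it next to the energetic range of the dataset.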
  4. Abstract

    Transformer-based large language models are making significant strides in various fields, such as natural language processing [1–5], biology [6,7], chemistry [8–10] and computer programming [11,12]. Here, we show the development and capabilities of Coscientist, an artificial intelligence system driven by GPT-4 that autonomously designs, plans and performs complex experiments by incorporating large language models empowered by tools such as internet and documentation search, code execution and experimental automation. Coscientist showcases its potential for accelerating research across six diverse tasks, including the successful reaction optimization of palladium-catalysed cross-couplings, while exhibiting advanced capabilities for (semi-)autonomous experimental design and execution. Our findings demonstrate the versatility, efficacy and explainability of artificial intelligence systems like Coscientist in advancing research.

     
    Free, publicly-accessible full text available December 20, 2024
  5. Abstract

    Molecular quantum mechanical modeling, accelerated by machine learning, has opened the door to high‐throughput screening campaigns of complex properties, such as the activation energies of chemical reactions and absorption/emission spectra of materials and molecules, in silico. Here, we present an overview of the main principles, concepts, and design considerations involved in such hybrid computational quantum chemistry/machine learning screening workflows, with a special emphasis on some recent examples of their successful application. We end with a brief outlook of further advances that will benefit the field.

     
  6. Abstract

    Chemical reaction data has long existed, and still largely exists, in unstructured forms. Curating such information into datasets suitable for tasks such as yield and reaction outcome prediction is impractical to do manually and cannot be automated through programmatic means alone. Large language models (LLMs) have emerged as potent tools with remarkable capabilities in processing textual information and could therefore be extremely useful in automating this process. To address the challenge of unstructured data, we manually curated a dataset of structured chemical reaction data to fine-tune and evaluate LLMs. We propose a paradigm that leverages prompt tuning, fine-tuning, and a verifier to check the extracted information. We evaluate the capabilities of various LLMs, including LLAMA-2 and GPT models with different parameter counts, on the data extraction task. Our results show that prompt tuning of GPT-4 yields the best accuracy and evaluation results. Fine-tuning LLAMA-2 models with hundreds of samples, however, does enable them to organize scientific material according to user-defined schemas more effectively. This workflow demonstrates an adaptable approach to chemical reaction data extraction but also highlights the challenges associated with nuance in chemical information. We open-sourced our code on GitHub.
    Free, publicly-accessible full text available October 21, 2025
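The abstract above mentions a verifier that checks extracted information against a user-defined schema. A minimal sketch of what such a check could look like follows. The schema, its field names, and the `verify` function are hypothetical illustrations; the authors' actual verifier and schema are not described in the abstract and may differ substantially.

```python
import json
from typing import Dict, List

# Hypothetical user-defined schema: field name -> accepted Python type(s).
SCHEMA: Dict[str, object] = {
    "reactants": list,
    "products": list,
    "solvent": str,
    "yield_percent": (int, float),
}


def verify(raw_output: str, schema: Dict[str, object] = SCHEMA) -> List[str]:
    """Return a list of problems found in a model's JSON output (empty if none)."""
    try:
        record = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(record, dict):
        return ["top-level value is not an object"]

    problems: List[str] = []
    for field, expected in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            problems.append(f"wrong type for {field}")

    # Simple plausibility check on top of the structural checks.
    y = record.get("yield_percent")
    if isinstance(y, (int, float)) and not 0.0 <= y <= 100.0:
        problems.append("yield_percent outside 0-100")
    return problems


good = '{"reactants": ["PhBr"], "products": ["Ph-Ph"], "solvent": "THF", "yield_percent": 87.0}'
bad = '{"reactants": ["PhBr"], "solvent": "THF", "yield_percent": 120}'
```

Calling `verify(good)` returns an empty list, while `verify(bad)` reports the missing `products` field and the implausible yield. In a workflow like the one described, such a report could be fed back to the model or used to flag the record for manual review.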
  7. Free, publicly-accessible full text available October 15, 2025
  8. Free, publicly-accessible full text available October 2, 2025
  9. Free, publicly-accessible full text available September 9, 2025