skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Computer-aided key step generation in alkaloid total synthesis
Efficient chemical synthesis is critical to satisfying future demands for medicines, materials, and agrochemicals. Retrosynthetic analysis of modestly complex molecules has been automated over the course of decades, but the combinatorial explosion of route possibilities has challenged computer hardware and software until only recently. Here, we explore a computational strategy that merges computer-aided synthesis planning with molecular graph editing to minimize the number of synthetic steps required to produce alkaloids. Our study culminated in an enantioselective three-step synthesis of (–)-stemoamide by leveraging high-impact key steps, which could be identified in computer-generated retrosynthesis plans using graph edit distances.  more » « less
Award ID(s):
2236215
PAR ID:
10489489
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
American Association for the Advancement of Science
Date Published:
Journal Name:
Science
Volume:
379
Issue:
6631
ISSN:
0036-8075
Page Range / eLocation ID:
453 to 457
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The arXiv has collected 1.5 million pre-print articles over 28 years, hosting literature from scientific fields including Physics, Mathematics, and Computer Science. Each pre-print features text, figures, authors, citations, categories, and other metadata. These rich, multi-modal features, combined with the natural graph structure—created by citation, affiliation, and co-authorship—makes the arXiv an exciting candidate for benchmarking next-generation models. Here we take the first necessary steps toward this goal, by providing a pipeline which standardizes and simplifies access to the arXiv’s publicly available data. We use this pipeline to extract and analyze a 6.7 million edge citation graph, with an 11 billion word cor- pus of full-text research articles. We present some baseline classification results, and motivate application of more exciting generative graph models. 
    more » « less
  2. Syntax-guided synthesis has been a prevalent theme in various computer-aided programming systems. However, the domain of bit-vector synthesis poses several unique challenges that have not yet been sufficiently addressed and resolved. In this paper, we propose a novel synthesis approach that incorporates a distinct enumeration strategy based on various factors. Technically, this approach weighs in subexpression recurrence by term-graph-based enumeration, avoids useless candidates by example-guided filtration, prioritizes valuable components identified by large language models. This approach also incorporates a bottom-up deduction step to enhance the enumeration algorithm by considering subproblems that contribute to the deductive resolution. We implement all the enhanced enumeration techniques in our SyGuS solver DryadSynth, which outperforms state-of-the-art solvers in terms of the number of solved problems, execution time, and solution size. Notably, DryadSynthsuccessfully solved 31 synthesis problems for the first time, including 5 renowned Hacker’s Delight problems. 
    more » « less
  3. We consider single-source shortest path algorithms that perform a sequence of relaxation steps whose ordering depends only on the input graph structure and not on its weights or the results of prior steps. Each step examines one edge of the graph, and replaces the tentative distance to the endpoint of the edge by its minimum with the tentative distance to the start of the edge, plus the edge length. As we prove, among such algorithms, the Bellman-Ford algorithm has optimal complexity for dense graphs and near-optimal complexity for sparse graphs, as a function of the number of edges and vertices in the given graph. Our analysis holds both for deterministic algorithms and for randomized algorithms that find shortest path distances with high probability. 
    more » « less
  4. Cabello, Sergio; Chen, Danny Z. (Ed.)
    In this paper, we consider the Visibility Graph Recognition and Reconstruction problems in the context of terrains. Here, we are given a graph G with labeled vertices v₀, v₁, …, v_{n-1} such that the labeling corresponds with a Hamiltonian path H. G also may contain other edges. We are interested in determining if there is a terrain T with vertices p₀, p₁, …, p_{n-1} such that G is the visibility graph of T and the boundary of T corresponds with H. G is said to be persistent if and only if it satisfies the so-called X-property and Bar-property. It is known that every "pseudo-terrain" has a persistent visibility graph and that every persistent graph is the visibility graph for some pseudo-terrain. The connection is not as clear for (geometric) terrains. It is known that the visibility graph of any terrain T is persistent, but it has been unclear whether every persistent graph G has a terrain T such that G is the visibility graph of T. There actually have been several papers that claim this to be the case (although no formal proof has ever been published), and recent works made steps towards building a terrain reconstruction algorithm for any persistent graph. In this paper, we show that there exists a persistent graph G that is not the visibility graph for any terrain T. This means persistence is not enough by itself to characterize the visibility graphs of terrains, and implies that pseudo-terrains are not stretchable. 
    more » « less
  5. Large language models (LLMs) have recently taken the world by storm. They can generate coherent text, hold meaningful conversations, and be taught concepts and basic sets of instructions—such as the steps of an algorithm. In this context, we are interested in exploring the application of LLMs to graph drawing algorithms by performing experiments on ChatGPT. These algorithms are used to improve the readability of graph visualizations. The probabilistic nature of LLMs presents challenges to implementing algorithms correctly, but we believe that LLMs’ ability to learn from vast amounts of data and apply complex operations may lead to interesting graph drawing results. For example, we could enable users with limited coding backgrounds to use simple natural language to create effective graph visualizations. Natural language specification would make data visualization more accessible and user-friendly for a wider range of users. Exploring LLMs’ capabilities for graph drawing can also help us better understand how to formulate complex algorithms for LLMs; a type of knowledge that could transfer to other areas of computer science. Overall, our goal is to shed light on the exciting possibilities of using LLMs for graph drawing while providing a balanced assessment of the challenges and opportunities they present. A free copy of this paper with all supplemental materials to reproduce our results is available at https://osf.io/n5rxd/. 
    more » « less