NSF PAR Search | NSF Public Access Repository

Ligand engineering of tetra N-heterocyclic carbenes for boosting catalytic aziridination

https://doi.org/10.1039/D4DT01084A

Smith, Brett A; Hakimov, Somon; Jenkins, David M; Vogiatzis, Konstantinos D (September 2024, Dalton Transactions)

A comprehensive computational study on the underlying reactivity of iron tetra-NHC complexes for C₂ + N₁ aziridination catalysis is presented.
Full Text Available
Disparate reactivity of a chiral iron(II) tetracarbene complex with organic azides

https://doi.org/10.1039/D4DT01422G

Russell, Jerred J; DeJesus, Joseph F; Smith, Brett A; Nalaoh, Phattananawee; Vogiatzis, Konstantinos D; Jenkins, David M (July 2024, Dalton Transactions)

A chiral tetra-NHC iron(II) complex and its disparate reactivity with multiple organic azides are reported.
Full Text Available
Biomedical knowledge graph-optimized prompt generation for large language models

https://doi.org/10.1093/bioinformatics/btae560

Soman, Karthik; Rose, Peter W; Morris, John H; Akbas, Rabia E; Smith, Brett; Peetoom, Braian; Villouta-Reyes, Catalina; Cerono, Gabriel; Shi, Yongmei; Rizk-Jackson, Angela; et al (September 2024, Bioinformatics)

Abstract
Motivation: Large language models (LLMs) are being adopted at an unprecedented rate, yet still face challenges in knowledge-intensive domains such as biomedicine. Solutions such as pretraining and domain-specific fine-tuning add substantial computational overhead, requiring further domain expertise. Here, we introduce a token-optimized and robust Knowledge Graph-based Retrieval Augmented Generation (KG-RAG) framework by leveraging a massive biomedical KG (SPOKE) with LLMs such as Llama-2-13b, GPT-3.5-Turbo, and GPT-4, to generate meaningful biomedical text rooted in established knowledge.
Results: Compared to the existing RAG technique for Knowledge Graphs, the proposed method utilizes minimal graph schema for context extraction and uses embedding methods for context pruning. This optimization in context extraction results in more than 50% reduction in token consumption without compromising the accuracy, making a cost-effective and robust RAG implementation on proprietary LLMs. KG-RAG consistently enhanced the performance of LLMs across diverse biomedical prompts by generating responses rooted in established knowledge, accompanied by accurate provenance and statistical evidence (if available) to substantiate the claims. Further benchmarking on human-curated datasets, such as biomedical true/false and multiple-choice questions (MCQ), showed a remarkable 71% boost in the performance of the Llama-2 model on the challenging MCQ dataset, demonstrating the framework’s capacity to empower open-source models with fewer parameters for domain-specific questions. Furthermore, KG-RAG enhanced the performance of proprietary GPT models, such as GPT-3.5 and GPT-4. In summary, the proposed framework combines explicit and implicit knowledge of KG and LLM in a token-optimized fashion, thus enhancing the adaptability of general-purpose LLMs to tackle domain-specific questions in a cost-effective fashion.
Availability and implementation: SPOKE KG can be accessed at https://spoke.rbvi.ucsf.edu/neighborhood.html. It can also be accessed using REST-API (https://spoke.rbvi.ucsf.edu/swagger/). KG-RAG code is made available at https://github.com/BaranziniLab/KG_RAG. Biomedical benchmark datasets used in this study are made available to the research community in the same GitHub repository.
Data-driven ligand field exploration of Fe(IV)–oxo sites for C–H activation

https://doi.org/10.1039/D2QI01961B

Jones, Grier M.; Smith, Brett A.; Kirkland, Justin K.; Vogiatzis, Konstantinos D. (February 2023, Inorganic Chemistry Frontiers)

We have explored the ligand topology of high-valent Fe(IV)–oxo complexes for screening a large molecular database with machine learning.
Full Text Available
Staring at the Sun with the Keck Planet Finder: An Autonomous Solar Calibrator for High Signal-to-noise Sun-as-a-star Spectra

https://doi.org/10.1088/1538-3873/ad0b30

Rubenzahl, Ryan A; Halverson, Samuel; Walawender, Josh; Hill, Grant M; Howard, Andrew W; Brown, Matthew; Ida, Evan; Tehero, Jerez; Fulton, Benjamin J; Gibson, Steven R; et al (December 2023, Publications of the Astronomical Society of the Pacific)

Abstract Extreme precision radial velocity (EPRV) measurements contend with internal noise (instrumental systematics) and external noise (intrinsic stellar variability) on the road to 10 cm s⁻¹ “exo-Earth” sensitivity. Both of these noise sources are well-probed using “Sun-as-a-star” RVs and cross-instrument comparisons. We built the Solar Calibrator (SoCal), an autonomous system that feeds stable, disk-integrated sunlight to the recently commissioned Keck Planet Finder (KPF) at the W. M. Keck Observatory. With SoCal, KPF acquires signal-to-noise ratio (S/N) ∼ 1200, R = 98,000 optical (445–870 nm) spectra of the Sun in 5 s exposures at unprecedented cadence for an EPRV facility using KPF’s fast readout mode (<16 s between exposures). Daily autonomous operation is achieved by defining an operations loop using state machine logic. Data affected by clouds are automatically flagged using a reliable quality control metric derived from simultaneous irradiance measurements. Comparing solar data across the growing global network of EPRV spectrographs with solar feeds will allow EPRV teams to disentangle internal and external noise sources and benchmark spectrograph performance. To facilitate this, all SoCal data products are immediately available to the public on the Keck Observatory Archive. We compared SoCal RVs to contemporaneous RVs from NEID, the only other immediately public EPRV solar data set. We find agreement at the 30–40 cm s⁻¹ level on timescales of several hours, which is comparable to the combined photon-limited precision. Data from SoCal were also used to assess a detector problem and wavelength calibration inaccuracies associated with KPF during early operations. Long-term SoCal operations will collect upwards of 1000 solar spectra per six-hour day using KPF’s fast readout mode, enabling stellar activity studies at high S/N on our nearest solar-type star.
Full Text Available
σ-Donation and π-Backdonation Effects in Dative Bonds of Main-Group Elements

https://doi.org/10.1021/acs.jpca.1c05956

Smith, Brett A.; Vogiatzis, Konstantinos D. (September 2021, The Journal of Physical Chemistry A)

Full Text Available
Nature of the Short Rh–Li Contact between Lithium and the Rhodium ω-Alkenyl Complex [Rh(CH₂CMe₂CH₂CH═CH₂)₂]⁻

https://doi.org/10.1021/acs.inorgchem.1c00737

Liu, Sumeng; Smith, Brett A.; Kirkland, Justin K.; Vogiatzis, Konstantinos D.; Girolami, Gregory S. (June 2021, Inorganic Chemistry)
Full Text Available
System design of the Keck Planet Finder

https://doi.org/10.1117/12.3017841

Gibson, Steven R; Howard, Andrew W; Rider, Kodi; Halverson, Samuel P; Roy, Arpita; Baker, Ashley D; Edelstein, Jerry; Smith, Christopher; Fulton, Benjamin; Walawender, Josh; et al (July 2024, SPIE)
Vernet, Joël R; Bryant, Julia J; Motohara, Kentaro (Ed.)
The Keck Planet Finder (KPF) is a fiber-fed, high-resolution, echelle spectrometer that specializes in the discovery and characterization of exoplanets using Doppler spectroscopy. In designing KPF, the guiding principles were high throughput to promote survey speed and access to faint targets, and high stability to keep uncalibrated systematic Doppler measurement errors below 30 cm s⁻¹. KPF achieves optical illumination stability with a tip-tilt injection system, octagonal cross-section optical fibers, a double scrambler, and active fiber agitation. The optical bench and optics with integral mounts are made of Zerodur to provide thermo-mechanical stability. The spectrometer includes a slicer to reformat the optical input, green and red channels (445-600 nm and 600-870 nm), and achieves a resolving power of ∼97,000. Additional subsystems include a separate, medium-resolution UV spectrometer (383-402 nm) to record the Ca II H & K lines, an exposure meter for real-time flux monitoring, a solar feed for sunlight injection, and a calibration system with a laser frequency comb and etalon for wavelength calibration. KPF was installed and commissioned at the W. M. Keck Observatory in late 2022 and early 2023 and is now in regular use for scientific observations. This paper presents an overview of the as-built KPF instrument and its subsystems, design considerations, and initial on-sky performance.
Full Text Available
Synthesis and Characterization of Cu(II) and Mixed-Valence Cu(I)Cu(II) Clusters Supported by Pyridylamide Ligands

https://doi.org/10.1021/acs.inorgchem.0c00008

Schneider, Joseph D.; Smith, Brett A.; Williams, Grant A.; Powell, Douglas R.; Perez, Felio; Rowe, Gerard T.; Yang, Lei (April 2020, Inorganic Chemistry)

Full Text Available
A biomedical open knowledge network harnesses the power of AI to understand deep human biology

https://doi.org/10.1002/aaai.12037

Baranzini, Sergio E.; Börner, Katy; Morris, John; Nelson, Charlotte A.; Soman, Karthik; Schleimer, Erica; Keiser, Michael; Musen, Mark; Pearce, Roger; Reza, Tahsin; et al (March 2022, AI Magazine)

Abstract Knowledge representation and reasoning (KR&R) has been successfully implemented in many fields to enable computers to solve complex problems with AI methods. However, its application to biomedicine has been lagging in part due to the daunting complexity of molecular and cellular pathways that govern human physiology and pathology. In this article, we describe concrete uses of Scalable PrecisiOn Medicine Knowledge Engine (SPOKE), an open knowledge network that connects curated information from thirty‐seven specialized and human‐curated databases into a single property graph, with 3 million nodes and 15 million edges to date. Applications discussed in this article include drug discovery, COVID‐19 research and chronic disease diagnosis, and management.
