Title: Progress toward a universal biomedical data translator
Abstract: Clinical, biomedical, and translational science has reached an inflection point in the breadth and diversity of available data and the potential impact of such data to improve human health and well‐being. However, the data are often siloed, disorganized, and not broadly accessible due to discipline‐specific differences in terminology and representation. To address these challenges, the Biomedical Data Translator Consortium has developed and tested a pilot knowledge graph‐based “Translator” system capable of integrating existing biomedical data sets and “translating” those data into insights intended to augment human reasoning and accelerate translational science. Having demonstrated feasibility of the Translator system, the Translator program has since moved into development, and the Translator Consortium has made significant progress in the research, design, and implementation of an operational system. Herein, we describe the current system’s architecture, performance, and quality of results. We apply Translator to several real‐world use cases developed in collaboration with subject‐matter experts. Finally, we discuss the scientific and technical features of Translator and compare those features to other state‐of‐the‐art, biomedical graph‐based question‐answering systems.
Award ID(s):
2033569
PAR ID:
10477228
Publisher / Repository:
Wiley
Journal Name:
Clinical and Translational Science
Volume:
15
Issue:
8
ISSN:
1752-8054
Page Range / eLocation ID:
1838 to 1847
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper examines the effect of translational research on knowledge production and biomedical entrepreneurship across U.S. regions. Researchers have previously investigated the outputs of translational research by focusing on academic publications. Little attention has been paid to linking translational research to biomedical entrepreneurship. We construct an analytical model based on the knowledge spillover theory of entrepreneurship and the entrepreneurial ecosystem approach to examine the relationship between translational research, biomedical patents, clinical trials, and biomedical entrepreneurship. We test the model across 381 U.S. metropolitan statistical areas using 10 years of panel data related to the NIH Clinical and Translational Science Awards (CTSA) program. CTSA appears to increase the number of biomedical patents and biomedical entrepreneurship as proxied by the NIH Small Business Innovation Research (SBIR) grants. However, the magnitudes of the effects are relatively small. Path analysis shows that the effect of translational research on regional biomedical entrepreneurship is not strongly conveyed through biomedical patents or clinical trials.
  2. Foundation Models (FMs) are gaining increasing attention in the biomedical artificial intelligence (AI) ecosystem due to their ability to represent and contextualize multimodal biomedical data. These capabilities make FMs a valuable tool for a variety of tasks, including biomedical reasoning, hypothesis generation, and interpreting complex imaging data. In this review paper, we address the unique challenges associated with establishing an ethical and trustworthy biomedical AI ecosystem, with a particular focus on the development of FMs and their downstream applications. We explore strategies that can be implemented throughout the biomedical AI pipeline to effectively tackle these challenges, ensuring that these FMs are translated responsibly into clinical and translational settings. Additionally, we emphasize the importance of key stewardship and co-design principles that not only ensure robust regulation but also guarantee that the interests of all stakeholders—especially those involved in or affected by these clinical and translational applications—are adequately represented. We aim to empower the biomedical AI community to harness these models responsibly and effectively. As we navigate this exciting frontier, our collective commitment to ethical stewardship, co-design, and responsible translation will be instrumental in ensuring that the evolution of FMs truly enhances patient care and medical decision-making, ultimately leading to a more equitable and trustworthy biomedical AI ecosystem. 
  3. Objective: Early identification of chronic diseases is a pillar of precision medicine as it can lead to improved outcomes, reduction of disease burden, and lower healthcare costs. Predictions of a patient’s health trajectory have been improved through the application of machine learning approaches to electronic health records (EHRs). However, these methods have traditionally relied on “black box” algorithms that can process large amounts of data but are unable to incorporate domain knowledge, thus limiting their predictive and explanatory power. Here, we present a method for incorporating domain knowledge into clinical classifications by embedding individual patient data into a biomedical knowledge graph. Materials and Methods: A modified version of the PageRank algorithm was implemented to embed millions of deidentified EHRs into a biomedical knowledge graph (SPOKE). This resulted in high-dimensional, knowledge-guided patient health signatures (ie, SPOKEsigs) that were subsequently used as features in a random forest environment to classify patients at risk of developing a chronic disease. Results: Our model predicted disease status of 5752 subjects 3 years before being diagnosed with multiple sclerosis (MS) (AUC = 0.83). SPOKEsigs outperformed predictions using EHRs alone, and the biological drivers of the classifiers provided insight into the underpinnings of prodromal MS. Conclusion: Using data from EHR as input, SPOKEsigs describe patients at both the clinical and biological levels. We provide a clinical use case for detecting MS up to 5 years prior to their documented diagnosis in the clinic and illustrate the biological features that distinguish the prodromal MS state.
  4. Biomedical terminologies play a vital role in managing biomedical data. Missing IS-A relations in a biomedical terminology could be detrimental to its downstream usages. In this paper, we investigate an approach combining logical definitions and lexical features to discover missing IS-A relations in two biomedical terminologies: SNOMED CT and the National Cancer Institute (NCI) thesaurus. The method is applied to unrelated concept-pairs within non-lattice subgraphs: graph fragments within a terminology likely to contain various inconsistencies. Our approach first compares whether the logical definition of a concept is more general than that of the other concept. Then, we check whether the lexical features of the concept are contained in those of the other concept. If both constraints are satisfied, we suggest a potentially missing IS-A relation between the two concepts. The method identified 982 potential missing IS-A relations for SNOMED CT and 100 for NCI thesaurus. In order to assess the efficacy of our approach, a random sample of results belonging to the “Clinical Findings” and “Procedure” subhierarchies of SNOMED CT and results belonging to the “Drug, Food, Chemical or Biomedical Material” subhierarchy of the NCI thesaurus were evaluated by domain experts. The evaluation results revealed that 118 out of 150 suggestions are valid for SNOMED CT and 17 out of 20 are valid for NCI thesaurus.
  5. Physics laboratory courses (PLC) have recently been the topic of several research studies examining their effectiveness at reaching their goals. As a result, a discussion about the effectiveness of traditional PLC for students’ content knowledge, skills, and “expert-thinking” acquisition has developed. Critical for the investigation of students’ learning in those settings has been the development of research-based assessment tools. An example of those is the Colorado Learning Attitudes about Science Survey for Experimental Physics (E-CLASS). Recently, we translated the E-CLASS into German and set up a centralized survey administration system for instructors, allowing data acquisition and automated data analysis. Previously, we described this process and presented the preliminary results of the study of the introductory PLC at the University of Potsdam (UP). Here, we present an extended study that allows us to make stronger conclusions about students’ views about experimental physics at the UP. Overall, we find that students at US institutions have a higher level of “expert-like” views than students at the UP.
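The two-step check described in item 4 (a logical-definition comparison followed by a lexical-containment comparison) could be sketched roughly as follows. This is only an illustration, not the authors' implementation: the `Concept` type, the representation of a logical definition as a set of attribute-value pairs, the strict-subset test standing in for "more general", and whitespace tokenization for lexical features are all assumptions made for this sketch, and the two example concepts are hypothetical.

```python
from typing import FrozenSet, NamedTuple, Tuple


class Concept(NamedTuple):
    """Hypothetical, simplified view of a terminology concept."""
    name: str
    # Assumption: a logical definition is modeled as a set of
    # (attribute, value) pairs. Real terminologies use richer logic.
    definition: FrozenSet[Tuple[str, str]]


def lexical_features(name: str) -> FrozenSet[str]:
    """Assumption: lexical features are the lowercase words of the name."""
    return frozenset(name.lower().split())


def is_more_general(a: Concept, b: Concept) -> bool:
    """Assumption: a is more general than b when a's definition is a
    strict subset of b's (fewer constraints, all of them shared)."""
    return a.definition < b.definition


def suggest_is_a(child: Concept, parent: Concept) -> bool:
    """Suggest 'child IS-A parent' only when both constraints hold:
    1. parent's logical definition is more general than child's, and
    2. parent's lexical features are contained in child's.
    """
    return (
        is_more_general(parent, child)
        and lexical_features(parent.name) <= lexical_features(child.name)
    )


# Hypothetical example concepts, for illustration only.
fracture = Concept(
    name="Fracture of femur",
    definition=frozenset({("morphology", "fracture"), ("site", "femur")}),
)
open_fracture = Concept(
    name="Open fracture of femur",
    definition=frozenset(
        {("morphology", "fracture"), ("site", "femur"), ("type", "open")}
    ),
)

if __name__ == "__main__":
    print(suggest_is_a(open_fracture, fracture))  # candidate missing relation?
    print(suggest_is_a(fracture, open_fracture))  # reverse direction
```

In this sketch both constraints must hold before a relation is suggested, mirroring the "if both constraints are satisfied" wording of the abstract; any such suggestion would still need expert review, as in the evaluation the abstract reports.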