NSF PAR Search | NSF Public Access Repository

Knowledge Graph-Empowered Materials Discovery

https://doi.org/10.1109/BigData52589.2021.9671503

Zhao, Xintong; Greenberg, Jane; McClellan, Scott; Hu, Yong-Jie; Lopez, Steven; Saikin, Semion K.; Hu, Xiaohua; An, Yuan (December 2021, 2021 IEEE International Conference on Big Data (Big Data))

In this position paper, we describe research on knowledge graph-empowered materials science prediction and discovery. The research consists of several key components, including ontology mapping, materials data annotation, and information extraction from unstructured scholarly articles. We argue that although big data generated by simulations and experiments have motivated and accelerated data-driven science, the distribution and heterogeneity of materials science-related big data hinder major advancements in the field. Knowledge graphs, as semantic hubs, integrate disparate data and provide a feasible solution to this challenge. We design a knowledge graph-based approach for data discovery, extraction, and integration in materials science.
Full Text Available
Exploring Pre-Trained Language Models to Build Knowledge Graph for Metal-Organic Frameworks (MOFs)

https://doi.org/10.1109/BigData55660.2022.10020568

An, Yuan; Greenberg, Jane; Hu, Xiaohua; Kalinowski, Alex; Fang, Xiao; Zhao, Xintong; McClellan, Scott; Uribe-Romo, Fernando J.; Langlois, Kyle; Furst, Jacob; et al (December 2022, 2022 IEEE International Conference on Big Data (Big Data))

Building a knowledge graph is a time-consuming and costly process that often applies complex natural language processing (NLP) methods to extract knowledge graph triples from text corpora. Pre-trained large language models (PLMs) have emerged as a crucial type of approach that provides readily available knowledge for a range of AI applications. However, it is unclear whether it is feasible to construct domain-specific knowledge graphs from PLMs. Motivated by the capacity of knowledge graphs to accelerate data-driven materials discovery, we explored a set of state-of-the-art pre-trained general-purpose and domain-specific language models to extract knowledge triples for metal-organic frameworks (MOFs). We created a knowledge graph benchmark with 7 relations for 1248 published MOF synonyms. Our experimental results showed that domain-specific PLMs consistently outperformed the general-purpose PLMs for predicting MOF-related triples. The overall benchmarking results, however, show that using the present PLMs to create domain-specific knowledge graphs is still far from being practical, motivating the need to develop more capable and knowledgeable pre-trained language models for particular applications in materials science.
Full Text Available
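As a rough illustration of how predicted triples might be scored against a benchmark like the one in the abstract above, the following sketch computes a hits@k score. The abstract does not state which metric the authors used, so hits@k is an assumption here. The function name `hits_at_k`, the relation name `has_metal`, and the two toy triples are hypothetical and are not taken from the paper's benchmark.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def hits_at_k(
    gold: List[Triple],
    predictions: Dict[Tuple[str, str], List[str]],
    k: int = 1,
) -> float:
    """Fraction of gold triples whose object is among the top-k ranked predictions."""
    if not gold:
        return 0.0
    hits = 0
    for subj, rel, obj in gold:
        # Ranked candidate objects a model proposed for this (subject, relation) pair.
        ranked = predictions.get((subj, rel), [])
        top_k = [candidate.lower() for candidate in ranked[:k]]
        if obj.lower() in top_k:
            hits += 1
    return hits / len(gold)


# Toy data for illustration only (hypothetical relation name, not benchmark data).
gold = [
    ("MOF-5", "has_metal", "Zn"),
    ("HKUST-1", "has_metal", "Cu"),
]
predictions = {
    ("MOF-5", "has_metal"): ["Zn", "Cu"],
    ("HKUST-1", "has_metal"): ["Zn", "Cu"],
}

score_at_1 = hits_at_k(gold, predictions, k=1)
score_at_2 = hits_at_k(gold, predictions, k=2)
```

Comparing scores like these across general-purpose and domain-specific models is one plausible way to frame the comparison the abstract reports.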
Molecular Emission near Metal Interfaces: The Polaritonic Regime

https://doi.org/10.1021/acs.jpclett.8b02980

Yuen-Zhou, Joel; Saikin, Semion K.; Menon, Vinod M. (October 2018, The Journal of Physical Chemistry Letters)

Full Text Available
Autonomous experimentation systems for materials development: A community perspective

https://doi.org/10.1016/j.matt.2021.06.036

Stach, Eric; DeCost, Brian; Kusne, A. Gilad; Hattrick-Simpers, Jason; Brown, Keith A.; Reyes, Kristofer G.; Schrier, Joshua; Billinge, Simon; Buonassisi, Tonio; Foster, Ian; et al (July 2021, Matter)
Full Text Available
Text to Insight: Accelerating Organic Materials Knowledge Extraction via Deep Learning

https://doi.org/10.1002/pra2.497

Zhao, Xintong; Lopez, Steven; Saikin, Semion; Hu, Xiaohua; Greenberg, Jane (October 2021, Proceedings of the Association for Information Science and Technology)

Scientific literature is one of the most significant resources for sharing knowledge. Researchers turn to scientific literature as a first step in designing an experiment. Given the extensive and growing volume of literature, the common approach of reading and manually extracting knowledge is too time consuming, creating a bottleneck in the research cycle. This challenge spans nearly every scientific domain. For materials science, experimental data distributed across millions of publications are extremely helpful for predicting materials properties and designing novel materials. However, only recently have researchers explored computational approaches for knowledge extraction, primarily for inorganic materials. This study aims to explore knowledge extraction for organic materials. We built a research dataset composed of 855 annotated and 708,376 unannotated sentences drawn from 92,667 abstracts. We used named‐entity recognition (NER) with a BiLSTM‐CNN‐CRF deep learning model to automatically extract key knowledge from literature. Early‐phase results show a high potential for automated knowledge extraction. The paper presents our findings and a framework for supervised knowledge extraction that can be adapted to other scientific domains.
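A sequence tagger such as the BiLSTM-CNN-CRF model mentioned above typically emits one label per token, which then has to be grouped into entity mentions. The sketch below shows that post-processing step only, assuming the common BIO labeling scheme. The abstract does not specify the tag set, so the labels `MAT` and `PROP`, the function `bio_to_spans`, and the example sentence are hypothetical.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity text, entity type) pairs."""
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    def flush() -> None:
        # Close the entity that is currently open, if any.
        nonlocal current_tokens, current_type
        if current_tokens and current_type is not None:
            spans.append((" ".join(current_tokens), current_type))
        current_tokens, current_type = [], None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_type:
            current_tokens.append(token)
        else:
            # "O" tags, and "I-" tags that do not continue an open entity,
            # end the current span and are otherwise ignored.
            flush()
    flush()
    return spans


# Made-up example sentence and labels, for illustration only.
tokens = ["Pentacene", "shows", "high", "hole", "mobility", "."]
tags = ["B-MAT", "O", "O", "B-PROP", "I-PROP", "O"]
spans = bio_to_spans(tokens, tags)
```

Here `spans` holds the grouped mentions, which could then feed a downstream store of extracted materials knowledge.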
Mapping Forbidden Emission to Structure in Self-Assembled Organic Nanoparticles

https://doi.org/10.1021/jacs.8b09149

Hinton, Daniel A.; Ng, James D.; Sun, Jian; Lee, Stephen; Saikin, Semion K.; Logsdon, Jenna; White, David S.; Marquard, Angela N.; Cavell, Andrew C.; Krasecki, Veronica K.; et al (November 2018, Journal of the American Chemical Society)

Full Text Available