In this position paper, we describe research on knowledge graph-empowered materials science prediction and discovery. The research consists of several key components, including ontology mapping, materials data annotation, and information extraction from unstructured scholarly articles. We argue that although big data generated by simulations and experiments have motivated and accelerated data-driven science, the distribution and heterogeneity of materials science-related big data hinder major advancements in the field. Knowledge graphs, as semantic hubs, integrate disparate data and provide a feasible solution to this challenge. We design a knowledge graph-based approach for data discovery, extraction, and integration in materials science.
Data-centric science for materials innovation
With the development of high-speed computers, networks, and large-scale storage, researchers can utilize a large volume and wide variety of materials data generated by experimental facilities and computations. The emergence of these big data and advanced analytical techniques has opened unprecedented opportunities for materials research. The discovery of many kinds of materials, such as energy-harvesting materials, structural materials, catalysts, optoelectronic materials, and magnetic materials, has been greatly accelerated through high-throughput screening. The utility of data-centric science for materials research is likely to grow significantly in the future. Unraveling the complexities inherent in big data could lead to novel design rules as well as new materials and functionalities.
- Award ID(s): 1640867
- PAR ID: 10113239
- Date Published:
- Journal Name: MRS Bulletin
- Volume: 43
- Issue: 9
- ISSN: 0883-7694
- Page Range / eLocation ID: 659 to 663
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- "Knowledge is power" is an old adage that has been found to be true in today's information age. Knowledge is derived from having access to information. The ability to gather information from large volumes of data has become an issue of relative importance. Big Data Analytics (BDA) is the term coined by researchers to describe the art of processing, storing, and gathering large amounts of data for future examination. Data is being produced at an alarming rate. The rapid growth of the Internet, the Internet of Things (IoT), and other technological advances are the main culprits behind this sustained growth. The data generated reflects the environment that produces it, so the data collected from a system can be used to understand that system's inner workings. This has become an important feature in cybersecurity, where the goal is to protect assets. Furthermore, the growing value of data has made big data a high-value target. In this paper, we explore recent research works in cybersecurity in relation to big data. We highlight how big data is protected and how big data can also be used as a tool for cybersecurity. We summarize recent works in the form of tables and present trends, open research challenges, and problems. With this paper, readers can gain a more thorough understanding of cybersecurity in the big data era, as well as research trends and open challenges in this active research area.
- This NSF Research Experience for Teachers (RET), "Research Experience for Teachers in Big Data and Data Science" (award number: 1801513), engaged four middle/high school science teachers in summer 2022 with research related to big data and data science, with follow-up school-year implementation of related curriculum. These teachers developed curriculum related to their summer research experience in big data and data science that spanned a range of student ages and topics: middle school science, 9th grade biology, 9th grade health, and 11th grade chemistry. Despite the wide range of student ages, curricular content, and instructional goals, all teachers found rich and varied curriculum applications that fit within their existing curriculum constraints.
- The Data-Enabled Advanced Computational Training Program for Cybersecurity Research and Education (DeapSECURE) is a non-degree training program consisting of six modules covering a broad range of cyberinfrastructure techniques, including high-performance computing, big data, machine learning, and advanced cryptography, aimed at reducing the gap between current cybersecurity curricula and the requirements of advanced research and industrial projects. Since 2020, these lesson modules have been updated and retooled to suit fully online delivery. Hands-on activities were reformatted to accommodate self-paced learning. In this paper, we summarize the four years of the project, comparing in-person and online-only instruction methods and outlining lessons learned. The module content and hands-on materials are being released as open-source educational resources. We also indicate our future direction to scale up and increase adoption of the DeapSECURE training program to benefit cybersecurity research everywhere.
- Driven by big data science, materials informatics has recently attracted enormous research interest, along with many recognized achievements. To acquire knowledge of materials from previous experience, both feature descriptors and databases are essential for training machine learning (ML) models with high accuracy. In this regard, the electronic charge density ρ(r), which in principle determines the properties of materials at their ground state, can be considered one of the most appropriate descriptors. However, a systematic electronic charge density ρ(r) database of inorganic materials is still in its infancy, owing to the difficulty of collecting raw data experimentally and the high computational cost of first-principles calculations. Herein, a real-space electronic charge density ρ(r) database of 17,418 cubic inorganic materials is constructed by performing high-throughput density functional theory calculations. The displayed ρ(r) patterns show good agreement with those reported in previous studies, which validates our computations. Further statistical analysis reveals that the database possesses abundant and diverse data, which could accelerate ρ(r)-related machine learning studies. Moreover, the electronic charge density database will also assist chemical bonding identification and promote new crystal discovery in experiments.

