skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Materials characterization: Can artificial intelligence be used to address reproducibility challenges?
Material characterization techniques are widely used to characterize the physical and chemical properties of materials at the nanoscale and, thus, play central roles in material scientific discoveries. However, the large and complex datasets generated by these techniques often require significant human effort to interpret and extract meaningful physicochemical insights. Artificial intelligence (AI) techniques such as machine learning (ML) have the potential to improve the efficiency and accuracy of surface analysis by automating data analysis and interpretation. In this perspective paper, we review the current role of AI in surface analysis and discuss its future potential to accelerate discoveries in surface science, materials science, and interface science. We highlight several applications where AI has already been used to analyze surface analysis data, including the identification of crystal structures from XRD data, analysis of XPS spectra for surface composition, and the interpretation of TEM and SEM images for particle morphology and size. We also discuss the challenges and opportunities associated with the integration of AI into surface analysis workflows. These include the need for large and diverse datasets for training ML models, the importance of feature selection and representation, and the potential for ML to enable new insights and discoveries by identifying patterns and relationships in complex datasets. Most importantly, AI analyzed data must not just find the best mathematical description of the data, but it must find the most physical and chemically meaningful results. In addition, the need for reproducibility in scientific research has become increasingly important in recent years. The advancement of AI, including both conventional and the increasing popular deep learning, is showing promise in addressing those challenges by enabling the execution and verification of scientific progress. By training models on large experimental datasets and providing automated analysis and data interpretation, AI can help to ensure that scientific results are reproducible and reliable. Although integration of knowledge and AI models must be considered for the transparency and interpretability of models, the incorporation of AI into the data collection and processing workflow will significantly enhance the efficiency and accuracy of various surface analysis techniques and deepen our understanding at an accelerated pace.  more » « less
Award ID(s):
2213494
PAR ID:
10621556
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
Journal of Vacuum Science and Technology
Date Published:
Journal Name:
Journal of Vacuum Science & Technology A
Volume:
41
Issue:
6
ISSN:
0734-2101
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. New technologies have led to vast troves of large and complex data sets across many scientific domains and industries. People routinely use machine learning techniques not only to process, visualize, and make predictions from these big data, but also to make data-driven discoveries. These discoveries are often made using interpretable machine learning, or machine learning models and techniques that yield human-understandable insights. In this article, we discuss and review the field of interpretable machine learning, focusing especially on the techniques, as they are often employed to generate new knowledge or make discoveries from large data sets. We outline the types of discoveries that can be made using interpretable machine learning in both supervised and unsupervised settings. Additionally, we focus on the grand challenge of how to validate these discoveries in a data-driven manner, which promotes trust in machine learning systems and reproducibility in science. We discuss validation both from a practical perspective, reviewing approaches based on data-splitting and stability, as well as from a theoretical perspective, reviewing statistical results on model selection consistency and uncertainty quantification via statistical inference. Finally, we conclude byhighlighting open challenges in using interpretable machine learning techniques to make discoveries, including gaps between theory and practice for validating data-driven discoveries. 
    more » « less
  2. Digital pathology has transformed the traditional pathology practice of analyzing tissue under a microscope into a computer vision workflow. Whole-slide imaging allows pathologists to view and analyze microscopic images on a computer monitor, enabling computational pathology. By leveraging artificial intelligence (AI) and machine learning (ML), computational pathology has emerged as a promising field in recent years. Recently, task-specific AI/ML (eg, convolutional neural networks) has risen to the forefront, achieving above-human performance in many image-processing and computer vision tasks. The performance of task-specific AI/ML models depends on the availability of many annotated training datasets, which presents a rate-limiting factor for AI/ML development in pathology. Task-specific AI/ML models cannot benefit from multimodal data and lack generalization, eg, the AI models often struggle to generalize to new datasets or unseen variations in image acquisition, staining techniques, or tissue types. The 2020s are witnessing the rise of foundation models and generative AI. A foundation model is a large AI model trained using sizable data, which is later adapted (or fine-tuned) to perform different tasks using a modest amount of task-specific annotated data. These AI models provide in-context learning, can self-correct mistakes, and promptly adjust to user feedback. In this review, we provide a brief overview of recent advances in computational pathology enabled by task-specific AI, their challenges and limitations, and then introduce various foundation models. We propose to create a pathology-specific generative AI based on multimodal foundation models and present its potentially transformative role in digital pathology. We describe different use cases, delineating how it could serve as an expert companion of pathologists and help them efficiently and objectively perform routine laboratory tasks, including quantifying image analysis, generating pathology reports, diagnosis, and prognosis. We also outline the potential role that foundation models and generative AI can play in standardizing the pathology laboratory workflow, education, and training. 
    more » « less
  3. Most attention in K-12 artificial intelligence and machine learning (AI/ML) education has been given to having youths train models, with much less attention to the equally important testing of models when creating machine learning applications. Testing ML applications allows for the evaluation of models against predictions and can help creators of applications identify and address failure and edge cases that could negatively impact user experiences. We investigate how testing each other's projects supported youths to take perspective about functionality, performance, and potential issues in their own projects. We analyzed testing worksheets, audio and video recordings collected during a two week workshop in which 11 high school youths created physical computing projects that included (audio, pose, and image) ML classifiers. We found that through peer-testing youths reflected on the size of their training datasets, the diversity of their training data, the design of their classes and the contexts in which they produced training data. We discuss future directions for research on peer-testing in AI/ML education and current limitations for these kinds of activities. 
    more » « less
  4. In this community review report, we discuss applications and techniques for fast machine learning (ML) in science—the concept of integrating powerful ML methods into the real-time experimental data processing loop to accelerate scientific discovery. The material for the report builds on two workshops held by the Fast ML for Science community and covers three main areas: applications for fast ML across a number of scientific domains; techniques for training and implementing performant and resource-efficient ML algorithms; and computing architectures, platforms, and technologies for deploying these algorithms. We also present overlapping challenges across the multiple scientific domains where common solutions can be found. This community report is intended to give plenty of examples and inspiration for scientific discovery through integrated and accelerated ML solutions. This is followed by a high-level overview and organization of technical advances, including an abundance of pointers to source material, which can enable these breakthroughs. 
    more » « less
  5. null (Ed.)
    Abstract Machine learning and artificial intelligence (ML/AI) methods have been used successfully in recent years to solve problems in many areas, including image recognition, unsupervised and supervised classification, game-playing, system identification and prediction, and autonomous vehicle control. Data-driven machine learning methods have also been applied to fusion energy research for over 2 decades, including significant advances in the areas of disruption prediction, surrogate model generation, and experimental planning. The advent of powerful and dedicated computers specialized for large-scale parallel computation, as well as advances in statistical inference algorithms, have greatly enhanced the capabilities of these computational approaches to extract scientific knowledge and bridge gaps between theoretical models and practical implementations. Large-scale commercial success of various ML/AI applications in recent years, including robotics, industrial processes, online image recognition, financial system prediction, and autonomous vehicles, have further demonstrated the potential for data-driven methods to produce dramatic transformations in many fields. These advances, along with the urgency of need to bridge key gaps in knowledge for design and operation of reactors such as ITER, have driven planned expansion of efforts in ML/AI within the US government and around the world. The Department of Energy (DOE) Office of Science programs in Fusion Energy Sciences (FES) and Advanced Scientific Computing Research (ASCR) have organized several activities to identify best strategies and approaches for applying ML/AI methods to fusion energy research. This paper describes the results of a joint FES/ASCR DOE-sponsored Research Needs Workshop on Advancing Fusion with Machine Learning, held April 30–May 2, 2019, in Gaithersburg, MD (full report available at https://science.osti.gov/-/media/fes/pdf/workshop-reports/FES_ASCR_Machine_Learning_Report.pdf ). The workshop drew on broad representation from both FES and ASCR scientific communities, and identified seven Priority Research Opportunities (PRO’s) with high potential for advancing fusion energy. In addition to the PRO topics themselves, the workshop identified research guidelines to maximize the effectiveness of ML/AI methods in fusion energy science, which include focusing on uncertainty quantification, methods for quantifying regions of validity of models and algorithms, and applying highly integrated teams of ML/AI mathematicians, computer scientists, and fusion energy scientists with domain expertise in the relevant areas. 
    more » « less