- 
            Abstract: One compelling vision of the future of materials discovery and design involves the use of machine learning (ML) models to predict materials properties and then rapidly find materials tailored for specific applications. However, realizing this vision requires both providing detailed uncertainty quantification (model prediction errors and domain of applicability) and making models readily usable. At present, it is common practice in the community to assess ML model performance only in terms of prediction accuracy (e.g. mean absolute error), while neglecting detailed uncertainty quantification and robust model accessibility and usability. Here, we demonstrate a practical method for realizing both uncertainty and accessibility features with a large set of models. We develop random forest ML models for 33 materials properties spanning an array of data sources (computational and experimental) and property types (electrical, mechanical, thermodynamic, etc.). All models have calibrated ensemble error bars to quantify prediction uncertainty and domain of applicability guidance enabled by kernel-density-estimate-based feature distance measures. All data and models are publicly hosted on the Garden-AI infrastructure, which provides an easy-to-use, persistent interface for model dissemination that permits models to be invoked with only a few lines of Python code. We demonstrate the power of this approach by using our models to conduct a fully ML-based materials discovery exercise to search for new stable, highly active perovskite oxide catalyst materials.
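The abstract above mentions domain-of-applicability guidance based on kernel density estimates in feature space. The following is a minimal sketch of that general idea only, not the authors' implementation: it scores a query point by its Gaussian kernel density under the training features and flags it as out of domain when the density falls below a cutoff. The bandwidth, the 5th-percentile cutoff, the synthetic two-dimensional features, and the names `kde_log_density` and `in_domain` are all assumptions made for illustration.

```python
import math
import random


def kde_log_density(point, train, bandwidth):
    """Log of an isotropic Gaussian kernel density estimate at `point`."""
    d = len(point)
    norm = len(train) * (bandwidth * math.sqrt(2 * math.pi)) ** d
    total = sum(
        math.exp(-sum((p - t) ** 2 for p, t in zip(point, row)) / (2 * bandwidth ** 2))
        for row in train
    )
    return math.log(total / norm) if total > 0 else float("-inf")


# Synthetic stand-in for a training feature matrix (assumption: 2 features).
rng = random.Random(0)
train = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(300)]
bandwidth = 0.5  # hypothetical choice; would normally be tuned

# Cutoff: the 5th percentile of the training points' own log densities
# (an illustrative rule, not taken from the source).
train_scores = sorted(kde_log_density(row, train, bandwidth) for row in train)
threshold = train_scores[int(0.05 * len(train_scores))]


def in_domain(point):
    """True when the point is at least as dense as the cutoff."""
    return kde_log_density(point, train, bandwidth) >= threshold


print(in_domain((0.0, 0.0)))  # near the centre of the training data
print(in_domain((8.0, 8.0)))  # far from every training point
```

A density score of this kind is cheap to compute alongside a prediction, so it can be reported together with the error bar to warn when a query lies far from the training data.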
- 
            Abstract: Electron counting can be performed algorithmically for monolithic active pixel sensor direct electron detectors to eliminate readout noise and Landau noise arising from the variability in the amount of deposited energy for each electron. Errors in existing counting algorithms include mistakenly counting a multielectron strike as a single electron event, and inaccurately locating the incident position of the electron due to lateral spread of deposited energy and dark noise. Here, we report a supervised deep learning (DL) approach based on Faster region-based convolutional neural network (R-CNN) to recognize single electron events at varying electron doses and voltages. The DL approach shows high accuracy according to the near-ideal modulation transfer function (MTF) and detector quantum efficiency for sparse images. It predicts, on average, 0.47 pixel deviation from the incident positions for 200 kV electrons versus 0.59 pixel using the conventional counting method. The DL approach also shows better robustness against coincidence loss as the electron dose increases, maintaining the MTF at half Nyquist frequency above 0.83 as the electron density increases to 0.06 e−/pixel. Thus, the DL model extends the advantages of counting analysis to higher dose rates than conventional methods.
- 
            Abstract: A concise and measurable set of FAIR (Findable, Accessible, Interoperable and Reusable) principles for scientific data is transforming the state-of-practice for data management and stewardship, supporting and enabling discovery and innovation. Learning from this initiative, and acknowledging the impact of artificial intelligence (AI) in the practice of science and engineering, we introduce a set of practical, concise, and measurable FAIR principles for AI models. We showcase how to create and share FAIR data and AI models within a unified computational framework combining the following elements: the Advanced Photon Source at Argonne National Laboratory, the Materials Data Facility, the Data and Learning Hub for Science, funcX, and the Argonne Leadership Computing Facility (ALCF), in particular the ThetaGPU supercomputer and the SambaNova DataScale® system at the ALCF AI Testbed. We describe how this domain-agnostic computational framework may be harnessed to enable autonomous AI-driven discovery.
- 
            Abstract: Transition metal dichalcogenides (TMDs), especially in two-dimensional (2D) form, exhibit many properties desirable for device applications. However, device performance can be hindered by the presence of defects. Here, we combine state-of-the-art experimental and computational approaches to determine formation energies and charge transition levels of defects in bulk and 2D MX2 (M = Mo or W; X = S, Se, or Te). We perform deep level transient spectroscopy (DLTS) measurements of bulk TMDs. Simultaneously, we calculate formation energies and defect levels of all native point defects, which enable identification of levels observed in DLTS, and extend our calculations to vacancies in 2D TMDs, for which DLTS is challenging. We find that reduction of dimensionality of TMDs to 2D has a significant impact on defect properties. This finding may explain differences in optical properties of 2D TMDs synthesized with different methods and lays the foundation for future developments of more efficient TMD-based devices.
- 
            Abstract: Obtaining accurate estimates of machine learning model uncertainties on newly predicted data is essential for understanding the accuracy of the model and whether its predictions can be trusted. A common approach to such uncertainty quantification is to estimate the variance from an ensemble of models, which are often generated by the generally applicable bootstrap method. In this work, we demonstrate that the direct bootstrap ensemble standard deviation is not an accurate estimate of uncertainty but that it can be simply calibrated to dramatically improve its accuracy. We demonstrate the effectiveness of this calibration method for both synthetic data and numerous physical datasets from the field of Materials Science and Engineering. The approach is motivated by applications in physical and biological science but is quite general and should be applicable for uncertainty quantification in a wide range of machine learning regression models.
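To make the idea of calibrating a bootstrap ensemble standard deviation concrete, here is a small sketch on synthetic data. It is an illustration under stated assumptions rather than the method of the paper: the abstract does not give the calibration form, so this example assumes a single multiplicative factor `a`, chosen to minimise the Gaussian negative log-likelihood of held-out residuals, which has the closed form a² = mean(r² / σ²). The linear toy model and the helper names `fit_line`, `bootstrap_predict`, and `calibration_scale` are hypothetical.

```python
import math
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares fit y = m*x + c; returns (m, c)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:  # degenerate resample: fall back to a constant model
        return 0.0, my
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    m = sxy / sxx
    return m, my - m * mx


def bootstrap_predict(xs, ys, x_new, rng, n_models=50):
    """Mean and standard deviation of ensemble predictions at each x_new."""
    n = len(xs)
    preds = [[] for _ in x_new]
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        m, c = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        for j, x in enumerate(x_new):
            preds[j].append(m * x + c)
    means = [statistics.fmean(p) for p in preds]
    sigmas = [statistics.stdev(p) for p in preds]
    return means, sigmas


def calibration_scale(residuals, sigmas):
    """Factor a minimising the Gaussian NLL of residuals under N(0, (a*sigma)^2)."""
    return math.sqrt(statistics.fmean([(r / s) ** 2 for r, s in zip(residuals, sigmas)]))


rng = random.Random(0)


def make_data(n):
    """Synthetic data: y = 2x + 1 plus unit-variance Gaussian noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2 * x + 1 + rng.gauss(0, 1) for x in xs]
    return xs, ys


x_train, y_train = make_data(200)
x_cal, y_cal = make_data(200)  # held-out set used only for calibration

means, sigmas = bootstrap_predict(x_train, y_train, x_cal, rng)
residuals = [y - m for y, m in zip(y_cal, means)]
a = calibration_scale(residuals, sigmas)
z2 = statistics.fmean([(r / (a * s)) ** 2 for r, s in zip(residuals, sigmas)])
print(f"scale factor a = {a:.2f}; mean squared z-score after calibration = {z2:.3f}")
```

In this toy setting the raw ensemble spread reflects only the variability of the fitted line, so it is much smaller than the actual residuals and the fitted factor comes out well above 1. In practice the factor would be fitted on one held-out set and checked on another.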
- 
            Free, publicly-accessible full text available December 1, 2026
- 
            Free, publicly-accessible full text available July 1, 2026
 An official website of the United States government
