Machine learning (ML) has been applied to space weather problems with increasing frequency in recent years, driven by an influx of in-situ measurements and a desire to improve modeling and forecasting capabilities throughout the field. Space weather originates from solar perturbations and comprises the complex variations they cause within the numerous systems between the Sun and Earth. These systems are often tightly coupled and not well understood. This creates a need for skillful models that also quantify the confidence of their predictions. One example of a dynamical system highly impacted by space weather is the thermosphere, the neutral region of Earth’s upper atmosphere. Our inability to forecast it has severe repercussions for satellite drag and for computing the probability of collision between two space objects in low Earth orbit (LEO), which informs decision making in space operations. Even with an (assumed) perfect forecast of model drivers, our incomplete knowledge of the system often results in inaccurate thermospheric neutral mass density predictions. Continuing efforts are being made to improve model accuracy, but density models rarely provide estimates of confidence in predictions. In this work, we propose two techniques to develop nonlinear ML regression models to predict…
Carrier concentration optimization has been an enduring challenge when developing newly discovered semiconductors for applications (e.g., thermoelectrics, transparent conductors, photovoltaics). This barrier has been particularly pernicious in the realm of high-throughput property prediction, where the carrier concentration is often assumed to be a free parameter and the limits are not predicted due to the high computational cost. In this work, we explore the application of machine learning for high-throughput carrier concentration range prediction. Bounding the model within diamond-like semiconductors, the learning set was developed from experimental carrier concentration data on 127 compounds ranging from unary to quaternary. The data were analyzed using various statistical and machine learning methods. Accurate predictions of carrier concentration ranges in diamond-like semiconductors are made within approximately one order of magnitude on average across both
- Publication Date:
- NSF-PAR ID:
- Journal Name: npj Computational Materials
- Publisher: Nature Publishing Group
- Sponsoring Org: National Science Foundation
More Like this
Uncertainty quantification techniques for data-driven space weather modeling: thermospheric density application
Controlling thermoelectric transport via native defects in the diamond-like semiconductors Cu₂HgGeTe₄ and Hg₂GeTe₄
Diamond-like semiconductors (DLS) have emerged as candidates for thermoelectric energy conversion. Towards understanding and optimizing performance, we present a comprehensive investigation of the electronic properties of two DLS phases, quaternary Cu₂HgGeTe₄ and the related ordered vacancy compound Hg₂GeTe₄, including thermodynamic stability, defect chemistry, and transport properties. To establish the thermodynamic link between the related but distinct phases, the stability region for both is visualized in chemical potential space. In spite of their similar structure and bonding, we show that the two materials exhibit reciprocal behaviors for dopability. Cu₂HgGeTe₄ is degenerately p-type in all environments despite its wide stability region, due to the presence of low-energy acceptor defects V_Cu and Cu_Hg, and is resistant to extrinsic n-type doping. Meanwhile, Hg₂GeTe₄ has a narrow stability region and intrinsic behavior due to the relatively high formation energy of native defects, but presents an opportunity for bipolar doping. While these two compounds have similar structure, bonding, and chemical constituents, the reciprocal nature of their dopability emerges from significant differences in band edge positions. A Brouwer band diagram approach is utilized to visualize the role of native defects on carrier…
SCOUR: a stepwise machine learning framework for predicting metabolite-dependent regulatory interactions
The topology of metabolic networks is both well-studied and remarkably well-conserved across many species. The regulation of these networks, however, is much more poorly characterized, though it is known to be divergent across organisms—two characteristics that make it difficult to model metabolic networks accurately. While many computational methods have been built to unravel transcriptional regulation, there have been few approaches developed for systems-scale analysis and study of metabolic regulation. Here, we present a stepwise machine learning framework that applies established algorithms to identify regulatory interactions in metabolic systems based on metabolic data: stepwise classification of unknown regulation, or SCOUR.
We evaluated our framework on both noiseless and noisy data, using several models of varying sizes and topologies to show that our approach is generalizable. We found that, when testing on data under the most realistic conditions (low sampling frequency and high noise), SCOUR could identify reaction fluxes controlled only by the concentration of a single metabolite (its primary substrate) with high accuracy. The positive predictive value (PPV) for identifying reactions controlled by the concentration of two metabolites ranged from 32 to 88% for noiseless data, 9.2 to 49% for either low sampling frequency/low noise or high sampling frequency/high noise…
SCOUR uses a novel approach to synthetically generate the training data needed to identify regulators of reaction fluxes in a given metabolic system, enabling metabolomics and fluxomics data to be leveraged for regulatory structure inference. By identifying and triaging the most likely candidate regulatory interactions, SCOUR can drastically reduce the amount of time needed to identify and experimentally validate metabolic regulatory interactions. As high-throughput experimental methods for testing these interactions are further developed, SCOUR will provide critical impact in the development of predictive metabolic models in new organisms and pathways.
The discovery and development of ultra-wide bandgap (UWBG) semiconductors is crucial to accelerate the adoption of renewable power sources. This necessitates a UWBG semiconductor that exhibits robust doping with high carrier mobility over a wide range of carrier concentrations. Here we demonstrate that epitaxial thin films of the perovskite oxide NdₓSr₁₋ₓSnO₃ (SSO) do exactly this. Nd is used as a donor to successfully modulate the carrier concentration over nearly two orders of magnitude, from 3.7 × 10¹⁸ cm⁻³ to 2.0 × 10²⁰ cm⁻³. Despite being grown on lattice-mismatched substrates and thus having relatively high structural disorder, SSO films exhibited the highest room-temperature mobility, ~70 cm² V⁻¹ s⁻¹, among all known UWBG semiconductors in the range of carrier concentrations studied. The phonon-limited mobility is calculated from first principles and supplemented with a model to treat ionized impurity and Kondo scattering. This produces excellent agreement with experiment over a wide range of temperatures and carrier concentrations, and predicts the room-temperature phonon-limited mobility to be 76–99 cm² V⁻¹ s⁻¹ depending on carrier concentration. This work establishes a perovskite oxide as an emerging UWBG semiconductor candidate with potential for applications in power electronics.
An integrated cyberGIS and machine learning framework for fine-scale prediction of Urban Heat Island using satellite remote sensing and urban sensor network data
Due to climate change and rapid urbanization, the Urban Heat Island (UHI) effect, in which metropolitan areas are significantly warmer than their surroundings, has caused negative impacts on urban communities. Temporal granularity is often limited in UHI studies based on satellite remote sensing data, which typically cover a particular urban area only once every several days. This low temporal frequency has restricted the development of models for predicting UHI. To resolve this limitation, this study has developed a cyber-based geographic information science and systems (cyberGIS) framework encompassing multiple machine learning models for predicting UHI with high-frequency urban sensor network data combined with remote sensing data, focused on Chicago, Illinois, from 2018 to 2020. Enabled by rapid advances in urban sensor network technologies and high-performance computing, this framework is designed to predict UHI in Chicago with fine spatiotemporal granularity based on environmental data collected with the Array of Things (AoT) urban sensor network and Landsat-8 remote sensing imagery. Our computational experiments revealed that a random forest regression (RFR) model outperforms the other models, with a mean absolute error of 0.45 degrees Celsius in 2020 and 0.8 degrees Celsius in 2018 and 2019. Humidity, distance to geographic center, and PM2.5 concentration…