Title: Residual Neural Network Architectures to Improve Prediction Accuracy of Properties of Materials
The properties that arise from material composition and crystal structure have been explored through density functional theory (DFT) calculations, using databases such as the Open Quantum Materials Database (OQMD). Such databases are now used to train advanced machine learning and deep neural network models, with the latter performing better at predicting properties of materials. However, current approaches lose accuracy as the number of layers in their architecture increases (the over-fitting problem). To address this problem, we implemented residual neural network architectures based on Merge and Run Networks, IRNet, and UNet to improve performance while relaxing the observed limit on network depth. The proposed architectures are evaluated with a 9:1 train-to-test split as well as 10-fold cross validation. In our experiments, the proposed architectures based on IRNet and UNet obtain a lower Mean Absolute Error (MAE) than current strategies. The full implementation (Python, TensorFlow, and Keras) and the trained networks will be made available online for community validation and to advance the state of the art from our findings.
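The evaluation protocol and the residual idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' TensorFlow/Keras implementation: it uses only the Python standard library, all function names are hypothetical, and the residual block is the generic form x + relu(Wx + b) rather than the specific Merge and Run, IRNet, or UNet variants.

```python
import random


def mean_absolute_error(y_true, y_pred):
    """MAE: the average of |y_true - y_pred| over all samples."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def holdout_split(n_samples, test_fraction=0.1, seed=0):
    """Shuffle sample indices and split them into train and test (9:1 by default)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, validation) index lists for k-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Every k-th index goes to the same fold, so folds differ in size by at most 1.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


def residual_block(x, weights, bias):
    """Generic residual block (an assumption, not the paper's exact design).

    Computes x + relu(W x + b); W must be square so the skip connection
    can add the input to the layer output element by element.
    """
    hidden = [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]
    return [xi + hi for xi, hi in zip(x, hidden)]
```

In this sketch, a residual block whose weights are all zero returns its input unchanged. That identity path is the usual argument for why adding residual layers should not, by itself, make a deeper network worse.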
Award ID(s):
1750970
NSF-PAR ID:
10324031
Journal Name:
IEEE International Conference on Big Data
ISSN:
2639-1589
Page Range / eLocation ID:
2915-2918
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Modern machine learning (ML) and deep learning (DL) techniques using high-dimensional data representations have helped accelerate the materials discovery process by efficiently detecting hidden patterns in existing datasets and linking input representations to output properties for a better understanding of the scientific phenomenon. While deep neural networks composed of fully connected layers have been widely used for materials property prediction, simply creating a deeper model with a large number of layers often runs into the vanishing gradient problem, which degrades performance and thereby limits usage. In this paper, we study and propose architectural principles to address the question of improving the performance of model training and inference under fixed parametric constraints. Here, we present a general deep-learning framework based on branched residual learning (BRNet) with fully connected layers that can work with any numerical vector-based representation as input to build accurate models to predict materials properties. We perform model training for materials properties using numerical vectors representing different composition-based attributes of the respective materials and compare the performance of the proposed models against traditional ML and existing DL architectures. We find that the proposed models are significantly more accurate than the ML/DL models for all data sizes when using different composition-based attributes as input. Further, branched learning requires fewer parameters and results in faster model training, due to better convergence during the training phase, than existing neural networks, thereby efficiently building accurate models for predicting materials properties.
  2. Global warming is one of the world’s most pressing issues. The study of its effects on the polar ice caps and other arctic environments, however, can be hindered by the often dangerous and difficult-to-navigate terrain found there. Multi-terrain autonomous vehicles can assist researchers by providing a mobile platform on which to collect data in these harsh environments while avoiding any risk to human life and speeding up the research process. The mechanical design and ultimate efficacy of these autonomous robotic vehicles depend largely on the specific missions they are deployed for, but terrain conditions can vary widely both geographically and seasonally, making mission planning for these unmanned vehicles more difficult. This paper proposes the use of various UNet-based neural network architectures to generate digital elevation maps from satellite images, and explores and compares their efficacy on a single set of training and validation datasets generated from satellite imagery. The digital elevation maps generated by the model could be used by researchers not only to track the change in arctic topography over time, but also to quickly provide autonomous exploratory research rovers with the topographical information necessary to decide on optimal paths during the mission. This paper analyzes different model architectures and training schemes: a traditional UNet, a traditional UNet with data augmentation, a UNet with a single active skip-layer vision transformer (ViT), and a UNet with multiple active skip-layer ViTs. Each model was trained on a dataset of satellite images and corresponding digital elevation maps of Ellesmere Island, Canada. Utilizing ViTs did not demonstrate a significant improvement in UNet performance, though this could change with longer training.
This paper proposes opportunities to improve performance for these neural networks, as well as next steps for further research, including improving the diversity of images in the dataset, generating a testing dataset from a completely different geographic location, and allowing the models more time to train. 
  3. Data-driven generative deep learning models have recently emerged as one of the most promising approaches for new materials discovery. While generator models can generate millions of candidates, it is critical to train fast and accurate machine learning models to filter out stable, synthesizable materials with the desired properties. However, such efforts to build supervised regression or classification screening models have been severely hindered by the lack of unstable or unsynthesizable samples, which usually are not collected and deposited in materials databases such as ICSD and Materials Project (MP). At the same time, there is a significant amount of unlabeled data available in these databases. Here we propose a semi-supervised deep neural network (TSDNN) model for high-performance formation energy and synthesizability prediction, which is achieved via its unique teacher-student dual network architecture and its effective exploitation of the large amount of unlabeled data. For formation-energy-based stability screening, our semi-supervised classifier achieves an absolute 10.3% accuracy improvement compared to the baseline CGCNN regression model. For synthesizability prediction, our model significantly increases the baseline PU learning's true positive rate from 87.9% to 92.9% using 1/49 of the model parameters. To further prove the effectiveness of our models, we combined our TSDNN-energy and TSDNN-synthesizability models with our CubicGAN generator to discover novel stable cubic structures. Of the 1000 candidate samples recommended by our models, 512 have negative formation energies as validated by our DFT formation energy calculations. Our experimental results show that our semi-supervised deep neural networks can significantly improve the screening accuracy in large-scale generative materials design. Our source code can be accessed at https://github.com/usccolumbia/tsdnn.
  4. Abstract

    Exploring new techniques to improve the prediction of tropical cyclone (TC) formation is essential for operational practice. Using convolutional neural networks, this study shows that deep learning can provide a promising capability for predicting TC formation from a given set of large-scale environments at certain forecast lead times. Specifically, two common deep-learning architectures including the residual net (ResNet) and UNet are used to examine TC formation in the Pacific Ocean. With a set of large-scale environments extracted from the NCEP–NCAR reanalysis during 2008–21 as input and the TC labels obtained from the best track data, we show that both ResNet and UNet reach their maximum forecast skill at the 12–18-h forecast lead time. Moreover, both architectures perform best when using a large domain covering most of the Pacific Ocean for input data, as compared to a smaller subdomain in the western Pacific. Given its ability to provide additional information about TC formation location, UNet performs generally worse than ResNet across the accuracy metrics. The deep learning approach in this study presents an alternative way to predict TC formation beyond the traditional vortex-tracking methods in the current numerical weather prediction.

    Significance Statement

    This study presents a new approach for predicting tropical cyclone (TC) formation based on deep learning (DL). Using two common DL architectures in visualization research and a set of large-scale environments in the Pacific Ocean extracted from the reanalysis data, we show that DL has an optimal capability of predicting TC formation at the 12–18-h lead time. Examining the DL performance for different domain sizes shows that the use of a large domain size for input data can help capture some far-field information needed for predicting TC formation. The DL approach in this study demonstrates an alternative way to predict or detect TC formation beyond the traditional vortex-tracking methods used in the current numerical weather prediction.

     
  5. In recent years, deep neural networks have achieved state-of-the-art performance in a variety of recognition and segmentation tasks in medical imaging, including brain tumor segmentation. We observe that brain tumor segmentation faces an imbalanced data problem, where the number of pixels belonging to the background class (non-tumor pixels) is much larger than the number of pixels belonging to the foreground class (tumor pixels). To address this problem, we propose a multitask network formed as a cascaded structure. Our model has two objectives: (i) effectively differentiating the brain tumor regions and (ii) estimating the brain tumor mask. The first objective is performed by our proposed contextual brain tumor detection network, which acts as an attention gate and focuses only on the region around the brain tumor while ignoring the distant background, which is less correlated with the tumor. Unlike existing object detection networks, which process every pixel, our contextual brain tumor detection network processes only contextual regions around ground-truth instances, a strategy that aims at producing meaningful region proposals. The second objective is built upon a 3D atrous residual network under an encoder-decoder network in order to effectively segment both large and small objects (brain tumors). Our 3D atrous residual network is designed with a skip connection to enable the gradient from the deep layers to be propagated directly to shallow layers; thus, features of different depths are preserved and used to refine each other. To incorporate larger contextual information from volumetric MRI data, our network uses 3D atrous convolution with various kernel sizes, which enlarges the receptive field of the filters. Our proposed network has been evaluated on various datasets, including BRATS2015, BRATS2017, and BRATS2018, with both validation and testing sets.
Our performance has been benchmarked with both region-based and surface-based metrics. We have also conducted comparisons against state-of-the-art approaches.