Abstract Modern machine learning (ML) and deep learning (DL) techniques using high-dimensional data representations have helped accelerate the materials discovery process by efficiently detecting hidden patterns in existing datasets and linking input representations to output properties for a better understanding of the underlying scientific phenomena. While deep neural networks composed of fully connected layers have been widely used for materials property prediction, simply building a deeper model with a large number of layers often runs into the vanishing gradient problem, which degrades performance and limits usage. In this paper, we study and propose architectural principles to address the question of improving the performance of model training and inference under fixed parametric constraints. Here, we present a general deep-learning framework based on branched residual learning (BRNet) with fully connected layers that can work with any numerical vector-based representation as input to build accurate models to predict materials properties. We train models for materials properties using numerical vectors representing different composition-based attributes of the respective materials and compare the performance of the proposed models against traditional ML and existing DL architectures. We find that the proposed models are significantly more accurate than the ML/DL models for all data sizes when using different composition-based attributes as input. Further, branched learning requires fewer parameters and results in faster model training, due to better convergence during the training phase, than existing neural networks, thereby efficiently building accurate models for predicting materials properties.
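As a rough illustration of the residual idea mentioned in the abstract, the sketch below stacks fully connected layers with identity shortcuts, so each block computes relu(Wx + b) + x. This is a generic residual block, not the BRNet architecture itself; the branching scheme is not described here, and the function names, layer width, and weight initialization are all assumptions made for the example.

```python
import random

def dense(x, weights, bias):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, a) for a in v]

def residual_block(x, weights, bias):
    """Return relu(dense(x)) + x. The identity shortcut needs equal input and output width."""
    fx = relu(dense(x, weights, bias))
    return [f + xi for f, xi in zip(fx, x)]

def init_square_layer(width, rng, scale=0.1):
    """Hypothetical initializer: small uniform weights, zero bias."""
    weights = [[rng.uniform(-scale, scale) for _ in range(width)]
               for _ in range(width)]
    return weights, [0.0] * width

# Pass a 4-dimensional input vector through three stacked residual blocks.
rng = random.Random(0)
x = [0.5, -1.0, 2.0, 0.0]
out = x
for _ in range(3):
    w, b = init_square_layer(len(x), rng)
    out = residual_block(out, w, b)
```

With all-zero weights a block returns its input unchanged, which is one common intuition for why shortcuts ease the training of deeper stacks: the signal always has a direct path through each block.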
Residual Neural Network Architectures to Improve Prediction Accuracy of Properties of Materials
Properties related to material composition and crystal structure have been explored through density functional theory (DFT) calculations, using databases such as the Open Quantum Materials Database (OQMD). Such databases are currently used to train advanced machine learning and deep neural network models, the latter providing higher performance when predicting properties of materials. However, current approaches show a deterioration in accuracy as the number of layers in their architecture increases (over-fitting problem). As an alternative method to address this problem, we have implemented residual neural network architectures based on Merge and Run Networks, IRNet and UNet to improve performance while relaxing the observed network depth limitation. The evaluation of the proposed architectures includes a 9:1 train/test split as well as 10-fold cross-validation. In the experiments, we found that our proposed architectures based on IRNet and UNet obtain a lower Mean Absolute Error (MAE) than current strategies. The full implementation (Python, TensorFlow and Keras) and the trained networks will be available online for community validation and advancing the state of the art from our findings.
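The evaluation protocol named in the abstract (a 9:1 train/test split, 10-fold cross-validation, and MAE as the metric) can be sketched in plain Python as follows. This is a minimal illustration under assumed details, not the authors' implementation: the shuffling, the seed, the sample count of 100, and all function names are hypothetical.

```python
import random

def mean_absolute_error(y_true, y_pred):
    """MAE: the average of |target - prediction| over all samples."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def train_test_split_indices(n, test_fraction=0.1, seed=0):
    """Shuffle indices 0..n-1 and hold out `test_fraction` of them (9:1 by default)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = max(1, round(n * test_fraction))
    return idx[n_test:], idx[:n_test]

def k_fold_indices(n, k=10, seed=0):
    """Yield (train, validation) index lists; each sample is validated exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]

# Hypothetical dataset of 100 samples.
train_idx, test_idx = train_test_split_indices(100)
folds = list(k_fold_indices(100, k=10))
```

In a cross-validation run, a model would be trained once per fold and the ten validation MAEs averaged to give a single score.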
- Award ID(s): 1750970
- PAR ID: 10324031
- Date Published:
- Journal Name: IEEE International Conference on Big Data
- ISSN: 2639-1589
- Page Range / eLocation ID: 2915-2918
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Global warming is one of the world’s most pressing issues. The study of its effects on the polar ice caps and other arctic environments, however, can be hindered by the often dangerous and difficult-to-navigate terrain found there. Multi-terrain autonomous vehicles can assist researchers by providing a mobile platform on which to collect data in these harsh environments while avoiding any risk to human life and speeding up the research process. The mechanical design and ultimate efficacy of these autonomous robotic vehicles depend largely on the specific missions they are deployed for, but terrain conditions can vary wildly geographically as well as seasonally, making mission planning for these unmanned vehicles more difficult. This paper proposes the use of various UNet-based neural network architectures to generate digital elevation maps from satellite images, and explores and compares their efficacy on a single set of training and validation datasets generated from satellite imagery. These digital elevation maps generated by the model could be used by researchers not only to track the change in arctic topography over time, but also to quickly provide autonomous exploratory research rovers with the topographical information necessary to decide on optimal paths during the mission. This paper analyzes different model architectures and training schemes: a traditional UNet, a traditional UNet with data augmentation, a UNet with a single active skip-layer vision transformer (ViT), and a UNet with multiple active skip-layer ViTs. Each model was trained on a dataset of satellite images and corresponding digital elevation maps of Ellesmere Island, Canada. Utilizing ViTs did not demonstrate a significant improvement in UNet performance, though this could change with longer training. This paper proposes opportunities to improve performance for these neural networks, as well as next steps for further research, including improving the diversity of images in the dataset, generating a testing dataset from a completely different geographic location, and allowing the models more time to train.
-
Deep learning algorithms have been successfully adopted to extract meaningful information from digital images, yet many of them remain untapped in the semantic segmentation of histopathology images. In this paper, we propose a deep convolutional neural network model that strengthens atrous separable convolutions with a high rate within spatial pyramid pooling for histopathology image segmentation. A well-known model called DeepLabV3Plus was used for the encoder and decoder process. ResNet50 was adopted for the encoder block of the model, which uses skip connections to attenuate the problems that come with increased network depth. Three atrous separable convolutions with higher rates were added to the existing atrous separable convolutions. We conducted a performance evaluation on three tissue types (tumor, tumor-infiltrating lymphocytes, and stroma) to compare the proposed model with eight state-of-the-art deep learning models: DeepLabV3, DeepLabV3Plus, LinkNet, MANet, PAN, PSPnet, UNet, and UNet++. The performance results show that the proposed model outperforms the eight models on mIOU (0.8058/0.7792) and FSCR (0.8525/0.8328) for both tumor and tumor-infiltrating lymphocytes.
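To make the role of the atrous rate concrete, the sketch below implements a one-dimensional atrous (dilated) convolution in plain Python. It is a simplified illustration only: the model described above uses two-dimensional atrous separable convolutions inside DeepLabV3Plus, and the function name, signal, and kernel here are assumptions chosen for the example.

```python
def atrous_conv1d(signal, kernel, rate):
    """Valid 1D atrous convolution (cross-correlation form).

    Kernel taps are spaced `rate` samples apart, so a kernel of length k
    covers (k - 1) * rate + 1 input samples without adding parameters.
    """
    span = (len(kernel) - 1) * rate + 1  # receptive field of one output
    return [sum(k * signal[start + i * rate] for i, k in enumerate(kernel))
            for start in range(len(signal) - span + 1)]

signal = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
kernel = [1, 0, -1]

dense_out = atrous_conv1d(signal, kernel, rate=1)  # each output sees 3 samples
wide_out = atrous_conv1d(signal, kernel, rate=3)   # each output sees 7 samples
```

The same three-tap kernel spans three input samples at rate 1 and seven at rate 3, which is why higher rates let a network gather wider context at the same parameter count.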
-
State-of-the-art neural network architectures continue to scale in size and deliver impressive generalization results, although this comes at the expense of limited interpretability. In particular, a key challenge is to determine when to stop training the model, as this has a significant impact on generalization. Convolutional neural networks (ConvNets) comprise high-dimensional feature spaces formed by the aggregation of multiple channels, where analyzing intermediate data representations and the model's evolution can be challenging owing to the curse of dimensionality. We present channel-wise DeepNNK (CW-DeepNNK), a novel channel-wise generalization estimate based on non-negative kernel regression (NNK) graphs with which we perform local polytope interpolation on low-dimensional channels. This method leads to instance-based interpretability of both the learned data representations and the relationship between channels. Motivated by our observations, we use CW-DeepNNK to propose a novel early stopping criterion that (i) does not require a validation set, (ii) is based on a task performance metric, and (iii) allows stopping to be reached at different points for each channel. Our experiments demonstrate that our proposed method has advantages over the standard criterion based on validation set performance.
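For context, the standard validation-based criterion that the abstract compares against is often implemented as patience-based early stopping. The sketch below shows that baseline only, not the CW-DeepNNK criterion; the function name, the `patience` and `min_delta` parameters, and the loss values are assumptions made for illustration.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the epoch index at which patience-based early stopping halts.

    Training stops once the validation loss has failed to improve on its
    best value by more than `min_delta` for `patience` consecutive epochs.
    If that never happens, the last epoch index is returned.
    """
    best = float("inf")
    stale = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return len(val_losses) - 1

# Hypothetical validation losses, one per epoch.
losses = [1.0, 0.8, 0.7, 0.72, 0.71, 0.73, 0.6]
stop = early_stopping_epoch(losses, patience=3)
```

In this made-up sequence the rule halts before the later improvement to 0.6 is ever seen, which illustrates how sensitive the baseline is to the choice of patience and to the held-out validation data it depends on.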
-
The accelerated warming conditions of the high Arctic have intensified the extensive thawing of permafrost. Retrogressive thaw slumps (RTSs) are considered the most active landforms in the Arctic permafrost. An increase in RTSs has been observed in the Arctic in recent decades. Continuous monitoring of RTSs is important to understand climate change-driven disturbances in the region. Manual detection of these landforms is extremely difficult as they occur over exceptionally large areas. Very few studies have explored the utility of very high spatial resolution (VHSR) commercial satellite imagery in the automated mapping of RTSs. We have developed a deep learning (DL) convolutional neural net (CNN) based workflow to automatically detect RTSs from VHSR satellite imagery. This study systematically compared the performance of different DLCNN model architectures and varying backbones. Our candidate CNN models include DeepLabV3+, UNet, UNet++, Multi-scale Attention Net (MA-Net), and Pyramid Attention Network (PAN) with ResNet50, ResNet101 and ResNet152 backbones. The RTS modeling experiment was conducted on Banks Island and Ellesmere Island in Canada. The UNet++ model demonstrated the highest accuracy (F1 score of 87%) with the ResNet50 backbone at the expense of training and inference time. The PAN, DeepLabV3, MA-Net, and UNet models reported mediocre F1 scores of 72%, 75%, 80%, and 81%, respectively. Our findings shed light on the performance of different DLCNNs in imagery-enabled RTS mapping and provide useful insights on operationalizing the mapping application across the Arctic.
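The F1 scores quoted above combine precision and recall into a single number. A minimal sketch of the computation follows; the true-positive, false-positive, and false-negative counts are invented for illustration and are not taken from the study.

```python
def f1_score(tp, fp, fn):
    """F1 = harmonic mean of precision and recall, from raw detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts: 87 correct detections, 13 false alarms, 13 misses.
score = f1_score(tp=87, fp=13, fn=13)
```

Because F1 is a harmonic mean, a model cannot score well by trading many missed landforms for few false alarms or the reverse; both error types pull the score down.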

