
Title: A Machine Learning Tutorial for Operational Meteorology. Part II: Neural Networks and Deep Learning

Over the past decade, the use of machine learning in meteorology has grown rapidly. Specifically, neural networks and deep learning have been used at an unprecedented rate. To fill the dearth of resources covering neural networks through a meteorological lens, this paper discusses machine learning methods in a plain-language format targeted at the operational meteorological community. This is the second paper in a pair that aims to serve as a machine learning resource for meteorologists. While the first paper focused on traditional machine learning methods (e.g., random forest), here a broad spectrum of neural network and deep learning methods is discussed. Specifically, this paper covers perceptrons, artificial neural networks, convolutional neural networks, and U-networks. Like the Part I paper, this manuscript discusses the terms associated with neural networks and their training. The manuscript then provides some intuition behind every method and concludes by showing each method used in a meteorological example of diagnosing thunderstorms (e.g., lightning flashes) from satellite images. This paper is accompanied by an open-source code repository that allows readers to explore neural networks either with the dataset provided (which is used in the paper) or as a template for alternate datasets.

Publisher / Repository:
American Meteorological Society
Journal Name:
Weather and Forecasting
Page Range / eLocation ID:
p. 1271-1293
Sponsoring Org:
National Science Foundation
More Like this
  1. An efficient feature selection method can significantly boost results in classification problems. Despite ongoing improvement, hand-designed methods often fail to extract features capturing high- and mid-level representations at effective levels. In machine learning (Deep Learning), recent developments have improved upon these hand-designed methods by utilizing automatic extraction of features. Specifically, Convolutional Neural Networks (CNNs) are a highly successful technique for image classification that can automatically extract features, with ongoing learning and classification of these features. The purpose of this study is to detect hydraulic structures (i.e., bridges and culverts) that are important to overland flow modeling and environmental applications. The dataset used in this work is relatively small and is derived from 1-m LiDAR-derived Digital Elevation Models (DEMs) and National Agriculture Imagery Program (NAIP) aerial imagery. The classes for our experiment consist of two groups: samples with a bridge/culvert present are considered "True", and those without a bridge/culvert are considered "False". In this paper, we use advanced CNN techniques, including Siamese Neural Networks (SNNs), Capsule Networks (CapsNets), and Graph Convolutional Networks (GCNs), to classify samples with similar topographic and spectral characteristics, an objective that is challenging for traditional machine learning techniques such as Support Vector Machine (SVM), Gaussian Classifier (GC), and Gaussian Mixture Model (GMM). The advanced CNN-based approaches combined with data pre-processing techniques (e.g., data augmentation) produced superior results. These approaches provide efficient, cost-effective, and innovative solutions to the identification of hydraulic structures. 
  2. Abstract

    Deep neural networks (DNNs) are widely used to handle many difficult tasks, such as image classification and malware detection, and achieve outstanding performance. However, recent studies on adversarial examples, which have maliciously undetectable perturbations added to their original samples that are indistinguishable by human eyes but mislead the machine learning approaches, show that machine learning models are vulnerable to security attacks. Though various adversarial retraining techniques have been developed in the past few years, none of them is scalable. In this paper, we propose a new iterative adversarial retraining approach to robustify the model and to reduce the effectiveness of adversarial inputs on DNN models. The proposed method retrains the model with both Gaussian noise augmentation and adversarial generation techniques for better generalization. Furthermore, the ensemble model is utilized during the testing phase in order to increase the robust test accuracy. The results from our extensive experiments demonstrate that the proposed approach increases the robustness of the DNN model against various adversarial attacks, specifically, fast gradient sign attack, Carlini and Wagner (C&W) attack, Projected Gradient Descent (PGD) attack, and DeepFool attack. To be precise, the robust classifier obtained by our proposed approach can maintain a performance accuracy of 99% on average on the standard test set. Moreover, we empirically evaluate the runtime of two of the most effective adversarial attacks, i.e., C&W attack and BIM attack, to find that the C&W attack can utilize GPU for faster adversarial example generation than the BIM attack can. For this reason, we further develop a parallel implementation of the proposed approach. This parallel implementation makes the proposed approach scalable for large datasets and complex models.

  3. Abstract

    Noncoding RNAs (ncRNAs) have recently attracted considerable attention due to their key roles in biology. The ncRNA–protein interaction (NPI) is often explored to reveal biological activities that ncRNAs may affect, such as biological traits and diseases. Traditional experimental methods can accomplish this work but are often labor-intensive and expensive. Machine learning and deep learning methods have achieved great success by exploiting sufficient sequence or structure information. Graph Neural Network (GNN)-based methods consider the topology in ncRNA–protein graphs and perform well on tasks like NPI prediction. Based on GNNs, some pairwise constraint methods have been developed for homogeneous networks, but they have not been used for NPI prediction on heterogeneous networks. In this paper, we construct a pairwise constrained NPI predictor based on a dual Graph Convolutional Network (GCN), called NPI-DGCN. To our knowledge, our method is the first to train a heterogeneous graph-based model using a pairwise learning strategy. Instead of binary classification, we use a rank layer to calculate the score of an ncRNA–protein pair. Moreover, our model is the first to predict NPIs on the ncRNA–protein bipartite graph rather than the homogeneous graph. We transform the original ncRNA–protein bipartite graph into two homogeneous graphs on which to explore second-order implicit relationships. At the same time, we model direct interactions between the two homogeneous graphs to explore explicit relationships. Experimental results on the four standard datasets indicate that our method achieves competitive performance with other state-of-the-art methods. And the model is available at

  4. Abstract

    Submarine groundwater discharge (SGD) is an important driver of coastal biogeochemical budgets worldwide. Radon (²²²Rn) has been widely used as a natural geochemical tracer to quantify SGD, but field measurements are time consuming and costly. Here, we use deep learning to predict coastal seawater radon in SGD‐impacted regions. We hypothesize that deep learning could resolve radon trends and enable preliminary insights with limited field observations of groundwater tracers. Two deep learning models were trained on global coastal seawater radon observations (n = 39,238) with widely available inputs (e.g., salinity, temperature, water depth). The first model used a one‐dimensional convolutional neural network (1D‐CNN‐RNN) framework for site‐specific gap filling and producing short‐term future predictions. A second model applied a fully connected neural network (FCNN) framework to predict radon across geographically and hydrologically diverse settings. Both models can predict observed radon concentrations with r² > 0.76. Specifically, the FCNN model offers a compelling development because synthetic radon tracer data sets can be obtained using only basic water quality and meteorological parameters. This opens opportunities to attain radon data from regions with large data gaps, such as the Global South and other remote locations, allowing for insights that can be used to predict SGD and plan field experiments. Overall, we demonstrate how field‐based measurements combined with big‐data approaches such as deep learning can be utilized to assess radon and potentially SGD beyond local scales.

  5. Cancer diagnostics is an important field for cancer recovery and survival, with many expensive procedures needed to administer the correct treatment. Machine Learning (ML) approaches can help with diagnostic prediction from circulating tumor cells in liquid biopsy or from a primary tumor in solid biopsy. After predicting the metastatic potential with a deep learning model, doctors in a clinical setting can administer a safe and correct treatment for a specific patient. This paper investigates the use of deep convolutional neural networks for predicting a specific cancer cell line as a tool for label-free identification. Specifically, deep learning strategies for weight initialization and performance metrics are described, with transfer learning and the accuracy metric utilized in this work. The equipment used for prediction involves brightfield microscopy without the use of chemical labels, advanced instruments, or time-consuming biological techniques, giving an advantage over current diagnostic methods. In the procedure, three different binary datasets of well-known cancer cell lines were collected, each having a difference in metastatic potential. Two different classification models were adopted (EfficientNetV2 and ResNet-50), with the analysis given for each stage in the ML architecture. The training results for each model and dataset are provided and systematically compared. We found that the test set accuracy showed favorable performance for both ML models, with EfficientNetV2 accuracy reaching up to 99%. These test results allowed EfficientNetV2 to outperform ResNet-50 at an average percent increase of 3.5% for each dataset. The high accuracy obtained from the predictions demonstrates that the system can be retrained on a large-scale clinical dataset.
