This study developed a hybrid model for predicting dissolved oxygen (DO) using real-time sensor data for thirteen parameters. This novel hybrid model integrated one-dimensional convolutional neural networks (CNN) and long short-term memory (LSTM) to improve the accuracy of prediction for DO in water. The hybrid CNN-LSTM model predicted DO concentration in water using soft sensor data. The primary input parameters to the model were temperature, pH, specific conductivity, salinity, density, chlorophyll, and blue-green algae. The model used 38,681 water quality data points for training and testing the hybrid deep learning network. The training procedure for the model was successful. The training and test losses were both nearly zero and within a similar range. With a coefficient of determination (R^2) of 0.94 and a mean squared error (MSE) of 0.12, the hybrid model showed higher performance than the classical models. The normal distribution of residual errors confirmed the reliability of the DO predictions by the hybrid CNN-LSTM model. Feature importance analysis indicated pH as the most significant predictor and temperature as the second most important predictor. The feature importance scores based on extreme gradient boosting (XGBoost) for pH and temperature were 0.76 and 0.12, respectively. This study indicated that the hybrid model can outperform the classical machine learning models in the real-time prediction of DO concentration.
This content will become publicly available on December 1, 2025
Deep learning models for predicting plant uptake of emerging contaminants by including the role of plant macromolecular compositions
Accurate prediction of the uptake and translocation of emerging contaminants in plants has serious implications for assessing impacts on ecosystems and human health. However, traditional modeling approaches are not reliable in the prediction of transpiration stream concentration factor (TSCF) and root concentration factor (RCF). This study applied deep neural networks (DNN), recurrent neural networks (RNN), and long short-term memory (LSTM) to enhance the accuracy of predictive models for TSCF and RCF. The predictions and feature importance analysis were based on nine chemical properties and two plant root macromolecular compositions. The results indicated that deep learning models predict TSCF and RCF with improved accuracy compared to mechanistic models. The coefficient of determination (R^2) for the DNN, RNN, and LSTM models in predicting TSCF was 0.62, 0.67, and 0.56, respectively. The corresponding mean squared error (MSE) on the test set for the models was 0.055, 0.035, and 0.06, respectively. The R^2 for the DNN, RNN, and LSTM models in predicting RCF was 0.90, 0.91, and 0.84, respectively. The corresponding MSE for the models was 0.124, 0.071, and 0.126, respectively. The results of feature extraction using extreme gradient boosting underlined the importance of lipophilicity and root lipid fraction.
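The abstracts on this page compare models by coefficient of determination (R^2) and mean squared error (MSE). As a minimal sketch of how these two metrics are commonly computed, the example below uses plain Python; the function names and the sample values are hypothetical and are not taken from any of the studies listed here.

```python
def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when the observed values are constant")
    return 1 - ss_res / ss_tot


# Hypothetical observed and predicted values, for illustration only.
observed = [0.2, 0.5, 0.8, 1.1]
predicted = [0.25, 0.45, 0.9, 1.0]

print(f"MSE = {mse(observed, predicted):.5f}")
print(f"R^2 = {r_squared(observed, predicted):.3f}")
```

MSE is expressed in the squared units of the target, so values for TSCF and RCF are not directly comparable with each other, whereas R^2 is unitless.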
- PAR ID:
- 10575306
- Publisher / Repository:
- Elsevier
- Date Published:
- Journal Name:
- Journal of Hazardous Materials
- Volume:
- 480
- Issue:
- C
- ISSN:
- 0304-3894
- Page Range / eLocation ID:
- 135921
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Predicting the transpiration stream concentration factor (TSCF) and other concentration factors is essential in understanding the plant uptake of organic contaminants. Traditional mechanistic and numerical modeling methods often fail to reliably predict the TSCF. This study developed a hybrid deep model to predict TSCF by integrating convolutional neural networks (CNNs) and long short-term memory (LSTM) networks. This hybrid CNN-LSTM model used eight physicochemical properties of organic contaminants to predict TSCF. The training procedure for this hybrid model was successful. The results indicated the training and test losses for predicting TSCF were both of the same order of magnitude and close to zero. This study showed that the hybrid CNN-LSTM model can outperform mechanistic models and achieve higher performance than classical machine learning models. Feature importance analysis using extreme gradient boosting highlighted the role and importance of lipophilicity in predicting uptake and translocation of organic contaminants.
-
In directed energy deposition (DED), accurately controlling and predicting melt pool characteristics is essential for ensuring desired material qualities and geometric accuracies. This paper introduces a robust surrogate model based on recurrent neural network (RNN) architectures—Long Short-Term Memory (LSTM), Bidirectional LSTM (Bi-LSTM), and Gated Recurrent Unit (GRU). Leveraging a time series dataset from multi-physics simulations and a three-factor, three-level experimental design, the model accurately predicts melt pool peak temperatures, lengths, widths, and depths under varying conditions. RNN algorithms, particularly Bi-LSTM, demonstrate high predictive accuracy, with an R-square of 0.983 for melt pool peak temperatures. For melt pool geometry, the GRU-based model excels, achieving R-square values above 0.88 and reducing computation time by at least 29%, showcasing its accuracy and efficiency. The RNN-based surrogate model built in this research enhances understanding of melt pool dynamics and supports precise DED system setups.
-
Many coastal cities are facing frequent flooding from storm events that are made worse by sea level rise and climate change. The groundwater table level in these low relief coastal cities is an important, but often overlooked, factor in the recurrent flooding these locations face. Infiltration of stormwater and water intrusion due to tidal forcing can cause already shallow groundwater tables to quickly rise toward the land surface. This decreases available storage which increases runoff, stormwater system loads, and flooding. Groundwater table forecasts, which could help inform the modeling and management of coastal flooding, are generally unavailable. This study explores two machine learning models, Long Short-term Memory (LSTM) networks and Recurrent Neural Networks (RNN), to model and forecast groundwater table response to storm events in the flood prone coastal city of Norfolk, Virginia. To determine the effect of training data type on model accuracy, two types of datasets (i) the continuous time series and (ii) a dataset of only storm events, created from observed groundwater table, rainfall, and sea level data from 2010–2018 are used to train and test the models. Additionally, a real-time groundwater table forecasting scenario was carried out to compare the models’ abilities to predict groundwater table levels given forecast rainfall and sea level as input data. When modeling the groundwater table with observed data, LSTM networks were found to have more predictive skill than RNNs (root mean squared error (RMSE) of 0.09 m versus 0.14 m, respectively). The real-time forecast scenario showed that models trained only on storm event data outperformed models trained on the continuous time series data (RMSE of 0.07 m versus 0.66 m, respectively) and that LSTM outperformed RNN models. 
Because models trained with the continuous time series data had much higher RMSE values, they were not suitable for predicting the groundwater table in the real-time scenario when using forecast input data. These results demonstrate the first use of LSTM networks to create hourly forecasts of groundwater table in a coastal city and show they are well suited for creating operational forecasts in real-time. As groundwater table levels increase due to sea level rise, forecasts of groundwater table will become an increasingly valuable part of coastal flood modeling and management.
-
Machine and deep learning-based algorithms are emerging approaches for addressing prediction problems in time series. These techniques have been shown to produce more accurate results than conventional regression-based modeling. It has been reported that artificial Recurrent Neural Networks (RNN) with memory, such as Long Short-Term Memory (LSTM), outperform Autoregressive Integrated Moving Average (ARIMA) by a large margin. The LSTM-based models incorporate additional “gates” for the purpose of memorizing longer sequences of input data. The major question is whether the gates incorporated in the LSTM architecture already offer a good prediction and whether additional training of data would be necessary to further improve the prediction. Bidirectional LSTMs (BiLSTMs) enable additional training by traversing the input data twice: (1) left-to-right and (2) right-to-left. The research question of interest is then whether BiLSTM, with additional training capability, outperforms regular unidirectional LSTM. This paper reports a behavioral analysis and comparison of BiLSTM and LSTM models. The objective is to explore to what extent additional layers of training of data would be beneficial to tune the involved parameters. The results show that additional training of data, and thus BiLSTM-based modeling, offers better predictions than regular LSTM-based models. More specifically, it was observed that BiLSTM models provide better predictions compared to ARIMA and LSTM models. It was also observed that BiLSTM models reach the equilibrium much more slowly than LSTM-based models.
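The two-pass traversal described in the abstract above can be illustrated with a deliberately simplified sketch. It assumes a single-unit tanh recurrent cell in place of a full LSTM cell (no gates), and the weights and input series are hypothetical; it shows only how a forward and a backward pass are aligned per time step, not the method used in the paper.

```python
from math import tanh


def rnn_pass(sequence, w_in, w_rec):
    """Run a one-unit tanh recurrent cell over the sequence.

    Returns the hidden state after each time step. This is a plain
    recurrent cell, not an LSTM: it has no input, forget, or output gates.
    """
    h = 0.0
    states = []
    for x in sequence:
        h = tanh(w_in * x + w_rec * h)
        states.append(h)
    return states


def bidirectional_pass(sequence, fwd_weights, bwd_weights):
    """Traverse the sequence left-to-right and right-to-left.

    The backward states are reversed again so that index t in both lists
    refers to the same time step; each step then yields a
    (forward_state, backward_state) pair.
    """
    forward = rnn_pass(sequence, *fwd_weights)
    backward = rnn_pass(sequence[::-1], *bwd_weights)[::-1]
    return list(zip(forward, backward))


# Hypothetical input series and (input weight, recurrent weight) pairs.
series = [0.1, 0.4, -0.2, 0.3]
features = bidirectional_pass(series, (0.5, 0.8), (0.5, 0.8))

for t, (f_state, b_state) in enumerate(features):
    print(f"t={t}: forward={f_state:+.4f}, backward={b_state:+.4f}")
```

At each time step the forward state summarizes the inputs seen so far and the backward state summarizes the inputs still to come, which is why a bidirectional model needs the whole input window before it can produce an output.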
