The objective of this study is to develop data-driven predictive models for permanent settlement of rocking shallow foundations during seismic loading using multiple machine learning algorithms and a supervised learning technique. Data from a rocking foundation database consisting of dynamic base shaking experiments conducted on centrifuges and shaking tables have been used for the development of k-nearest neighbors regression, support vector regression, and random forest regression models. Based on repeated k-fold cross-validation tests of models and mean absolute percentage errors in their predictions, it is found that all three models perform better than a baseline multivariate linear regression model in terms of accuracy and variance in predictions. The average mean absolute errors in predictions of all three models are around 0.005 to 0.006, indicating that the rocking-induced permanent settlement can be predicted within an average error limit of 0.5% to 0.6% of the width of the footing.
Estimating Compressional Velocity and Bulk Density Logs in Marine Gas Hydrates Using Machine Learning
Compressional velocity (Vp) and bulk density (ρb) logs are essential for characterizing gas hydrates and near-seafloor sediments; however, it is sometimes difficult to acquire these logs due to poor borehole conditions, safety concerns, or cost-related issues. We present a machine learning approach to predict either Vp or ρb logs with high accuracy and low error in near-seafloor sediments within water-saturated intervals, in intervals where hydrate fills fractures, and in intervals where hydrate occupies the primary pore space. We use scientific-quality logging-while-drilling (LWD) well logs (gamma ray, ρb, Vp, and resistivity) to train the machine learning model to predict Vp or ρb logs. Of the six machine learning algorithms tested (multilinear regression, polynomial regression, polynomial regression with ridge regularization, K nearest neighbors, random forest, and multilayer perceptron), we find that the random forest and K nearest neighbors algorithms are best suited to predicting Vp and ρb logs, based on coefficients of determination (R²) greater than 70% and mean absolute percentage errors less than 4%. Given the high accuracy and low error results for Vp and ρb prediction in both hydrate and water-saturated sediments, we argue that our model can be applied in most LWD wells to predict Vp or ρb logs in near-seafloor siliciclastic sediments on continental slopes irrespective of the presence or absence of gas hydrate.
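The two selection criteria named in the abstract, the coefficient of determination (R²) and the mean absolute percentage error (MAPE), can be illustrated with a minimal sketch. This is not the authors' code: the function names `r_squared` and `mape` are hypothetical, and the velocity values are made up purely for illustration.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction (0.04 means 4%).

    Assumes no true value is zero.
    """
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up Vp values in m/s (illustrative only, not data from the study).
vp_measured = [1500.0, 1600.0, 1700.0, 1800.0]
vp_predicted = [1520.0, 1580.0, 1730.0, 1790.0]

print("R^2 :", r_squared(vp_measured, vp_predicted))
print("MAPE:", mape(vp_measured, vp_predicted))
```

Under the thresholds quoted in the abstract, a model would qualify when `r_squared` exceeds 0.70 and `mape` stays below 0.04.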
- Award ID(s):
- 1752882
- PAR ID:
- 10543783
- Publisher / Repository:
- MDPI
- Date Published:
- Journal Name:
- Energies
- Volume:
- 16
- Issue:
- 23
- ISSN:
- 1996-1073
- Page Range / eLocation ID:
- 7709
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Machine learning models can streamline the filtering of YouTube video comments into legitimate comments (ham) and spam. To integrate machine learning models into regular use on media-sharing platforms, recent approaches have aimed to develop models trained on YouTube comments; these have emerged as valuable classification tools that identify spam content and enhance user experience. In this paper, eight machine learning approaches are applied to spam detection for YouTube comments: Gaussian Naive Bayes, logistic regression, K-nearest neighbors (KNN) classifier, multi-layer perceptron (MLP), support vector machine (SVM) classifier, random forest classifier, decision tree classifier, and voting classifier. All eight models perform very well; in particular, the random forest approach achieves almost perfect performance, with an average precision of 100% and an AUC-ROC of 0.9841. The computational complexity of the eight approaches is also compared.
-
The objective of this study is to develop data-driven predictive models for peak rotation and factor of safety for tipping-over failure of rocking shallow foundations during earthquake loading using multiple nonlinear machine learning (ML) algorithms and a supervised learning technique. Centrifuge and shaking table experimental results on rocking foundations have been used for the development of k-nearest neighbors regression (KNN), support vector regression (SVR), and random forest regression (RFR) models. The input features to ML models include critical contact area ratio of foundation; slenderness ratio and rocking coefficient of rocking system; peak ground acceleration and Arias intensity of earthquake motion; and a categorical binary feature that separates sandy soil foundations from clayey soil foundations. Based on repeated k-fold cross validation tests of models, we found that the overall average mean absolute percentage errors (MAPE) in predictions of all three nonlinear ML models varied between 0.46 and 0.60, outperforming a baseline multivariate linear regression ML model with corresponding MAPE of 0.68 to 0.75. The input feature importance analysis reveals that the peak rotation and tipping-over stability of rocking foundations are more sensitive to ground motion demand parameters than to rocking foundation capacity parameters or type of soil.
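The evaluation procedure described above, repeated k-fold cross validation scored by MAPE, can be sketched as follows for a k-nearest neighbors regressor. This is a minimal illustration under stated assumptions, not the study's implementation: the function names, the choice of 5 folds and 3 repeats, the value k=3, and the synthetic two-feature data are all hypothetical.

```python
import random


def knn_predict(train_X, train_y, x, k=3):
    """Predict by averaging the targets of the k nearest training points."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), target)
        for row, target in zip(train_X, train_y)
    )
    nearest = dists[:k]
    return sum(target for _, target in nearest) / len(nearest)


def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction; assumes nonzero targets."""
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def repeated_kfold_mape(X, y, n_splits=5, n_repeats=3, k=3, seed=0):
    """Return (mean MAPE, per-fold MAPE list) over repeated k-fold splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_repeats):
        idx = list(range(len(X)))
        rng.shuffle(idx)  # a fresh random partition for every repeat
        folds = [idx[i::n_splits] for i in range(n_splits)]
        for test_idx in folds:
            held_out = set(test_idx)
            train_idx = [i for i in idx if i not in held_out]
            train_X = [X[i] for i in train_idx]
            train_y = [y[i] for i in train_idx]
            preds = [knn_predict(train_X, train_y, X[i], k) for i in test_idx]
            truth = [y[i] for i in test_idx]
            scores.append(mape(truth, preds))
    return sum(scores) / len(scores), scores


# Synthetic data (illustrative only): two features and a positive target.
X = [[i / 10, (i % 5) / 5] for i in range(30)]
y = [1 + 2 * x0 + x1 for x0, x1 in X]

mean_score, fold_scores = repeated_kfold_mape(X, y)
print("folds evaluated:", len(fold_scores))
print("average MAPE   :", mean_score)
```

Repeating the split with different shuffles gives both an average error and a spread across folds, which is how accuracy and variance in predictions can be compared between models.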
-
A distributed denial-of-service (DDoS) attack is a malicious cybersecurity attack that has become a global threat. Machine learning (ML) has proven to be an effective defense against DDoS attacks. Feature selection is a crucial step in ML, and researchers have put considerable effort into mitigating the “Curse of Dimensionality”. However, feature selection can also cause problems for ML models, such as a decrease in prediction accuracy. Four supervised classification techniques, namely, Decision Tree (DT), k-Nearest Neighbors (KNN), Logistic Regression (LR), and Random Forest (RF), are tested using mutual information score ranking to study the necessity of feature selection in DDoS detection.
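Mutual information score ranking, the feature selection criterion mentioned above, can be sketched for discrete features as follows. This is a minimal illustration, not the paper's pipeline: the function names and the toy feature names (`syn_flag`, `pkt_parity`) are hypothetical, and continuous features would first need to be binned or handled with a different estimator.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    total = 0.0
    for (x, y), c in count_xy.items():
        p_xy = c / n
        total += p_xy * log(p_xy / ((count_x[x] / n) * (count_y[y] / n)))
    return total


def rank_features(columns, labels):
    """Return (feature name, score) pairs sorted from most to least informative."""
    scores = {name: mutual_information(col, labels) for name, col in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data (illustrative only): 1 = attack traffic, 0 = benign traffic.
labels = [0, 0, 1, 1]
columns = {
    "pkt_parity": [0, 1, 0, 1],  # independent of the label
    "syn_flag": [0, 0, 1, 1],    # perfectly aligned with the label
}

for name, score in rank_features(columns, labels):
    print(name, score)
```

Features at the bottom of such a ranking are the candidates for removal; whether removing them helps or hurts the classifier is the question the study examines.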
-
When forest conditions are mapped from empirical models, uncertainty in remotely sensed predictor variables can cause the systematic overestimation of low values, underestimation of high values, and suppression of variability. This regression dilution or attenuation bias is a well-recognized problem in remote sensing applications, with few practical solutions. Attenuation is of particular concern for applications that are responsive to prediction patterns at the high end of observed data ranges, where systematic error is typically greatest. We addressed attenuation bias in models of tree species relative abundance (percent of total aboveground live biomass) based on multitemporal Landsat and topoclimatic predictor data. We developed a multi-objective support vector regression (MOSVR) algorithm that simultaneously minimizes total prediction error and systematic error caused by attenuation bias. Applied to 13 tree species in the Acadian Forest Region of the northeastern U.S., MOSVR performed well compared to other prediction methods including single-objective SVR (SOSVR) minimizing total error, Random Forest (RF), gradient nearest neighbor (GNN), and Random Forest nearest neighbor (RFNN) algorithms. SOSVR and RF yielded the lowest total prediction error but produced the greatest systematic error, consistent with strong attenuation bias. Underestimation at high relative abundance caused strong deviations between predicted patterns of species dominance/codominance and those observed at field plots. In contrast, GNN and RFNN produced dominance/codominance patterns that deviated little from observed patterns, but predicted species relative abundance with lower accuracy and substantial systematic error. MOSVR produced the least systematic error for all species with total error often comparable to SOSVR or RF. Predicted patterns of dominance/codominance matched observations well, though not quite as well as GNN or RFNN. Overall, MOSVR provides an effective machine learning approach to the reduction of systematic prediction error and should be fully generalizable to other remote sensing applications and prediction problems.

