Title: Estimating Compressional Velocity and Bulk Density Logs in Marine Gas Hydrates Using Machine Learning

Compressional velocity (Vp) and bulk density (ρb) logs are essential for characterizing gas hydrates and near-seafloor sediments; however, these logs are sometimes difficult to acquire because of poor borehole conditions, safety concerns, or cost. We present a machine learning approach that predicts either Vp or ρb logs with high accuracy and low error in near-seafloor sediments: within water-saturated intervals, in intervals where hydrate fills fractures, and in intervals where hydrate occupies the primary pore space. We use scientific-quality logging-while-drilling (LWD) well logs (gamma ray, ρb, Vp, and resistivity) to train the machine learning model to predict Vp or ρb logs. Of the six machine learning algorithms tested (multilinear regression, polynomial regression, polynomial regression with ridge regularization, K nearest neighbors, random forest, and multilayer perceptron), we find that the random forest and K nearest neighbors algorithms are best suited to predicting Vp and ρb logs, based on coefficients of determination (R²) greater than 70% and mean absolute percentage errors less than 4%. Given the high accuracy and low error of Vp and ρb prediction in both hydrate-bearing and water-saturated sediments, we argue that our model can be applied in most LWD wells to predict Vp or ρb logs in near-seafloor siliciclastic sediments on continental slopes, irrespective of the presence or absence of gas hydrate.
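The two evaluation metrics named in the abstract, the coefficient of determination (R²) and the mean absolute percentage error (MAPE), can be illustrated with a minimal Python sketch. The function names and the sample Vp values below are hypothetical and chosen only for illustration; they are not data or code from the paper.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent.

    Assumes no measured value is zero, which holds for Vp and bulk density.
    """
    n = len(y_true)
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n


if __name__ == "__main__":
    # Hypothetical measured and predicted Vp values in m/s (illustration only).
    measured = [1600.0, 1650.0, 1700.0, 1800.0, 1900.0]
    predicted = [1620.0, 1640.0, 1690.0, 1830.0, 1880.0]
    print(f"R2   = {r_squared(measured, predicted):.3f}")
    print(f"MAPE = {mape(measured, predicted):.2f}%")
```

An R² reported as "greater than 70%" corresponds to a value above 0.70 from a function like `r_squared`, and a MAPE "less than 4%" corresponds to a value below 4.0 from `mape`.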

 
Award ID(s):
1752882
PAR ID:
10543783
Author(s) / Creator(s):
; ;
Publisher / Repository:
MDPI
Date Published:
Journal Name:
Energies
Volume:
16
Issue:
23
ISSN:
1996-1073
Page Range / eLocation ID:
7709
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Machine learning techniques were used to predict tensile properties of material extrusion-based additively manufactured parts made with Technomelt PA 6910, a hot melt adhesive. An adaptive data generation technique, specifically an active learning process based on the Gaussian process regression algorithm, was employed to enable prediction with limited training data. After three rounds of data collection, machine learning models based on linear regression, ridge regression, Gaussian process regression, and K-nearest neighbors were tasked with predicting properties for the test dataset, which consisted of parts fabricated with five processing parameters chosen using a random number generator. Overall, linear regression and ridge regression successfully predicted output parameters, with < 10% error for 56% of predictions. K-nearest neighbors performed worse than linear regression and ridge regression, with < 10% error for 32% of predictions and 10–20% error for 60% of predictions. While Gaussian process regression performed with the lowest accuracy (< 10% error for 32% of predictions and 10–20% error for 40% of predictions), it benefited most from the adaptive data generation technique. This work demonstrates that machine learning models using adaptive data generation techniques can efficiently predict properties of additively manufactured structures with limited training data.

     
  2. The objective of this study is to develop data-driven predictive models for permanent settlement of rocking shallow foundations during seismic loading using multiple machine learning algorithms and a supervised learning technique. Data from a rocking foundation database consisting of dynamic base shaking experiments conducted on centrifuges and shaking tables have been used to develop k-nearest neighbors regression, support vector regression, and random forest regression models. Based on repeated k-fold cross validation tests of the models and the mean absolute percentage errors in their predictions, all three models perform better than a baseline multivariate linear regression model in terms of accuracy and variance in predictions. The average mean absolute errors in the predictions of all three models are around 0.005 to 0.006, indicating that the rocking-induced permanent settlement can be predicted within an average error limit of 0.5% to 0.6% of the width of the footing.
  3. Machine learning models can streamline the process by which YouTube video comments are filtered into legitimate comments (ham) and spam. To integrate machine learning models into regular usage on media-sharing platforms, recent approaches have aimed to develop models trained on YouTube comments; these models have emerged as valuable classification tools that identify spam content and enhance user experience. In this paper, eight machine learning approaches are applied to spam detection for YouTube comments: Gaussian Naive Bayes, logistic regression, K-nearest neighbors (KNN) classifier, multi-layer perceptron (MLP), support vector machine (SVM) classifier, random forest classifier, decision tree classifier, and voting classifier. All eight models perform very well; in particular, the random forest approach achieves almost perfect performance, with an average precision of 100% and an AUC-ROC of 0.9841. The computational complexity of the eight machine learning approaches is also compared.
  4. The objective of this study is to develop data-driven predictive models for peak rotation and factor of safety for tipping-over failure of rocking shallow foundations during earthquake loading using multiple nonlinear machine learning (ML) algorithms and a supervised learning technique. Centrifuge and shaking table experimental results on rocking foundations have been used for the development of k-nearest neighbors regression (KNN), support vector regression (SVR), and random forest regression (RFR) models. The input features to ML models include critical contact area ratio of foundation; slenderness ratio and rocking coefficient of rocking system; peak ground acceleration and Arias intensity of earthquake motion; and a categorical binary feature that separates sandy soil foundations from clayey soil foundations. Based on repeated k-fold cross validation tests of models, we found that the overall average mean absolute percentage errors (MAPE) in predictions of all three nonlinear ML models varied between 0.46 and 0.60, outperforming a baseline multivariate linear regression ML model with corresponding MAPE of 0.68 to 0.75. The input feature importance analysis reveals that the peak rotation and tipping-over stability of rocking foundations are more sensitive to ground motion demand parameters than to rocking foundation capacity parameters or type of soil. 
  5. A distributed denial-of-service (DDoS) attack is a malicious cybersecurity attack that has become a global threat. Machine learning (ML) has proven to be an effective defense against DDoS attacks. Feature selection is a crucial step in ML, and researchers have made extensive efforts to mitigate the “Curse of Dimensionality”. However, feature selection can also cause problems for ML models, such as a decrease in prediction accuracy. Four supervised classification techniques, namely Decision Tree (DT), k-Nearest Neighbors (KNN), Logistic Regression (LR), and Random Forest (RF), are tested using mutual information score ranking to study the necessity of feature selection in DDoS detection.