Title: An unsupervised machine learning approach for ground‐motion spectra clustering and selection
Abstract: Clustering analysis of sequence data continues to address many applications in engineering design, aided by the rapid growth of machine learning in applied science. This paper presents an unsupervised machine learning algorithm to extract defining characteristics of earthquake ground‐motion spectra, also called latent features, to aid in ground‐motion selection (GMS). In this context, a latent feature is a low‐dimensional, machine‐discovered spectral characteristic learned through the nonlinear relationships of a neural network autoencoder. Machine‐discovered latent features can be combined with traditionally defined intensity measures, and clustering can then be performed to select a representative subgroup from a large ground‐motion suite. The objective of efficient GMS is to choose characteristic records representative of what the structure will probabilistically experience in its lifetime. Three examples are presented to validate this approach, including the use of synthetic and field‐recorded ground‐motion datasets. The presented deep embedding clustering of ground‐motion spectra has three main advantages: (1) defining characteristics that represent the sparse spectral content of ground motions are discovered efficiently through training of the autoencoder, (2) domain knowledge is incorporated into the machine learning framework with conditional variables in the deep embedding scheme, and (3) the method results in a ground‐motion subgroup that is more representative of the original ground‐motion suite than those produced by traditional GMS techniques.
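The selection step described in the abstract (combine latent features with intensity measures, cluster, pick a representative subgroup) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the latent features are taken as already computed by some autoencoder, the synthetic data are made up, plain k-means stands in for the paper's deep embedding clustering, and the function names (`standardize`, `farthest_point_init`, `select_representatives`) are hypothetical. Farthest-point seeding is used only to make the sketch deterministic.

```python
import numpy as np


def standardize(columns):
    """Z-score each column so latent features and intensity measures share a scale."""
    scale = columns.std(axis=0)
    scale[scale == 0] = 1.0  # leave constant columns unscaled
    return (columns - columns.mean(axis=0)) / scale


def assign(features, centroids):
    """Label each record with the index of its nearest centroid."""
    dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


def farthest_point_init(features, k):
    """Deterministic seeding: start at row 0, then repeatedly add the farthest row."""
    chosen = [0]
    min_dist = ((features - features[0]) ** 2).sum(axis=1)
    for _ in range(1, k):
        nxt = int(min_dist.argmax())
        chosen.append(nxt)
        min_dist = np.minimum(min_dist, ((features - features[nxt]) ** 2).sum(axis=1))
    return features[chosen].copy()


def kmeans(features, k, n_iter=100):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    centroids = farthest_point_init(features, k)
    for _ in range(n_iter):
        labels = assign(features, centroids)
        updated = centroids.copy()
        for j in range(k):
            members = features[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                updated[j] = members.mean(axis=0)
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return centroids, assign(features, centroids)


def select_representatives(features, k):
    """Return the index of the record closest to each cluster centroid, plus labels."""
    centroids, labels = kmeans(features, k)
    chosen = []
    for j in range(k):
        idx = np.flatnonzero(labels == j)
        if idx.size:
            d = ((features[idx] - centroids[j]) ** 2).sum(axis=1)
            chosen.append(int(idx[d.argmin()]))
    return chosen, labels


# Synthetic stand-ins (assumptions): 60 records, 2 latent features, 1 intensity measure.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]])
latent = np.vstack([c + 0.3 * rng.standard_normal((20, 2)) for c in centers])
intensity = np.repeat([0.2, 0.6, 1.0], 20)[:, None] + 0.02 * rng.standard_normal((60, 1))

features = standardize(np.hstack([latent, intensity]))
records, labels = select_representatives(features, k=3)
print("representative record indices:", records)
```

Picking the record nearest each centroid is one simple way to turn clusters into a subgroup; other rules (for example, sampling in proportion to cluster size) would fit the same structure.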
Award ID(s):
2013067
PAR ID:
10479520
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Journal Name:
Earthquake Engineering & Structural Dynamics
Volume:
53
Issue:
3
ISSN:
0098-8847
Format(s):
Medium: X
Size(s):
p. 1107-1124
Sponsoring Org:
National Science Foundation
More Like this
  1. Ground motion selection has become increasingly central to the assessment of earthquake resilience. The selection of ground motion records for use in nonlinear dynamic analysis significantly affects structural response, which in turn affects the outcomes of earthquake resilience analysis. This paper presents a new ground motion clustering algorithm that can be embedded in current ground motion selection methods to select representative ground motion records that a structure of interest will probabilistically experience. The proposed clustering-based ground motion selection method includes four main steps: 1) leveraging domain-specific knowledge to pre-select candidate ground motions; 2) using a convolutional autoencoder to learn low-dimensional underlying characteristics (i.e., latent features) of the candidate ground motions' response spectra; 3) performing k-means clustering to group the learned latent features, which is equivalent to clustering the response spectra of the candidate ground motions; and 4) embedding the clusters in conditional spectra-based ground motion selection. The selected ground motions can represent a given hazard level well (by matching conditional spectra) and fully describe the complete set of candidate ground motions. Three case studies for modified, pulse-type, and non-pulse-type ground motions are designed to evaluate the performance of the proposed ground motion clustering algorithm (convolutional autoencoder + k-means). Because the number of pre-selected candidate ground motions in the last two case studies is limited, response spectra simulation and transfer learning are used to improve the stability and reproducibility of the proposed algorithm. The results of the three case studies demonstrate that the convolutional autoencoder + k-means can 1) achieve 100% accuracy in classifying ground motion response spectra, 2) correctly determine the optimal number of clusters, and 3) outperform established clustering algorithms (i.e., autoencoder + k-means, time series k-means, spectral clustering, and k-means on ground motion influence factors). Using the proposed clustering-based ground motion selection method, an application is performed to select ground motions for a structure in San Francisco, California. The developed user-friendly code is published for practical use.
  2. Abstract: Advances in machine learning (ML) techniques and computational capacity have yielded state‐of‐the‐art methodologies for processing, sorting, and analyzing large seismic data sets. In this study, we consider an application of ML for automatically identifying dominant types of impulsive seismicity contained in observations from a 34‐station broadband seismic array deployed on the Ross Ice Shelf (RIS), Antarctica from 2014 to 2017. The RIS seismic data contain signals and noise generated by many glaciological processes that are useful for monitoring the integrity and dynamics of ice shelves. Deep clustering was employed to efficiently investigate these signals. Deep clustering automatically groups signals into hypothetical classes without the need for manual labeling, allowing for the comparison of their signal characteristics and spatial and temporal distribution with potential source mechanisms. The method uses spectrograms as input and encodes their salient features into a lower‐dimensional latent representation using an autoencoder, a type of deep neural network. For comparison, two clustering methods are applied to the latent data: a Gaussian mixture model (GMM) and deep embedded clustering (DEC). Eight classes of dominant seismic signals were identified and compared with environmental data such as temperature, wind speed, tides, and sea ice concentration. The greatest seismicity levels occurred at the RIS front during the 2016 El Niño summer, and at grounding zones near the front throughout the deployment. We demonstrate the spatial and temporal association of certain classes of seismicity with seasonal changes at the RIS front, and with tidally driven seismicity at Roosevelt Island.
  3. In this work, we use machine learning algorithms to explore the performance of plasmonic biosensor designs that integrate metamaterials. The meta-plasmonic biosensors were designed for optimized detection of DNA with a layer of double negative metamaterial modeled by an effective medium. An iterative transfer matrix approach was employed to generate training and test sets of resonance characteristics in the parameter space for machine learning. A multilayer perceptron and an autoencoder (AE) were used to predict the optical characteristics of a meta-plasmonic biosensor, while the clustering algorithm combined dimensionality reduction based on the AE and t-distributed Stochastic Neighbor Embedding (t-SNE) with k-means clustering. The machine learning-based analysis of the meta-plasmonic structure indicates that an enhancement of detection sensitivity by more than 13 times over conventional detection should be achievable with excellent reflectance curves. Further enhancement may be attained by expanding the parameter space.
  4. Ensemble clustering generally integrates basic partitions into a consensus partition through a graph partitioning method, which, however, has two limitations: 1) it neglects to reuse the original features; 2) obtaining a consensus partition with learnable graph representations is still under-explored. In this paper, we propose a novel Adversarial Graph Auto-Encoders (AGAE) model to incorporate ensemble clustering into a deep graph embedding process. Specifically, a graph convolutional network is adopted as a probabilistic encoder to jointly integrate the information from feature content and the consensus graph, and a simple inner product layer is used as a decoder to reconstruct the graph from the encoded latent variables (i.e., embedding representations). Moreover, we develop an adversarial regularizer to guide the network training with an adaptive partition-dependent prior. Experiments on eight real-world datasets are presented to show the effectiveness of AGAE over several state-of-the-art deep embedding and ensemble clustering methods.
  5. Abstract: Numerous single‐cell transcriptomic datasets from identical tissues or cell lines are generated by different laboratories or with different single‐cell RNA sequencing (scRNA‐seq) protocols. Denoising these datasets to eliminate batch effects is crucial for data integration, ensuring accurate interpretation and comprehensive analysis of biological questions. Although many scRNA‐seq data integration methods exist, most are inefficient and/or not conducive to downstream analysis. Here, DeepBID is introduced, a novel deep learning‐based method that concurrently performs batch effect correction, non‐linear dimensionality reduction, embedding, and cell clustering. DeepBID utilizes a negative binomial‐based autoencoder with dual Kullback–Leibler divergence loss functions, aligning cell points from different batches within a consistent low‐dimensional latent space and progressively mitigating batch effects through iterative clustering. Extensive validation on multiple‐batch scRNA‐seq datasets demonstrates that DeepBID surpasses existing tools in removing batch effects and achieving superior clustering accuracy. When integrating multiple scRNA‐seq datasets from patients with Alzheimer's disease, DeepBID significantly improves cell clustering, effectively annotates unidentified cells, and detects cell‐specific differentially expressed genes.
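Several of the abstracts above cluster autoencoder latent features, and item 2 names deep embedded clustering (DEC) explicitly. As a rough sketch of the quantities DEC is commonly built on (a Student's t soft assignment of latent points to centroids, a sharpened target distribution, and a Kullback–Leibler loss between the two), the following NumPy functions may help. This follows the widely used DEC formulation and is an assumption about what the listed papers implement; they may use variations. The encoder and the gradient updates are omitted, and the latent points and centroids are made-up values.

```python
import numpy as np


def soft_assign(z, centroids, alpha=1.0):
    """Student's t soft assignment q[i, j] of latent point i to centroid j."""
    sq_dist = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    q = (1.0 + sq_dist / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)


def target_distribution(q):
    """Sharpened targets p[i, j]: square q, divide by soft cluster size, renormalise."""
    weight = q ** 2 / q.sum(axis=0)
    return weight / weight.sum(axis=1, keepdims=True)


def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) summed over all points and clusters; eps guards against log(0)."""
    return float(np.sum(p * np.log((p + eps) / (q + eps))))


# Made-up latent points and centroids, for illustration only.
z = np.array([[0.0, 0.0], [0.2, 0.0], [4.0, 4.0]])
centroids = np.array([[0.0, 0.0], [4.0, 4.0]])

q = soft_assign(z, centroids)
p = target_distribution(q)
print("soft assignments:\n", q)
print("target distribution:\n", p)
print("clustering loss KL(P || Q):", kl_divergence(p, q))
```

In a full DEC training loop, this loss would be minimised with respect to both the encoder weights and the centroids, with the targets `p` refreshed periodically.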