Title: Fault Classification of Nonlinear Small Sample Data through Feature Sub-Space Neighbor Vote
Fault classification from a small sample of high-dimensional data is challenging, especially for a nonlinear and non-Gaussian manufacturing process. In this paper, a similarity-based feature selection and sub-space neighbor vote method is proposed to solve this problem. To capture the dynamics, nonlinearity, and non-Gaussianity in the irregular time series data, high-order spectral features and fractal dimension features are extracted, selected, and stacked in a regular matrix. To address the small-sample problem, all labeled fault data are used for similarity decisions for a specific fault type. The distances between the new data and all fault types are calculated in their feature subspaces. The new data are assigned to the nearest fault type by majority probability voting of the distances. Meanwhile, the selected features, from their respective measured variables, indicate the cause of the fault. The proposed method is evaluated on a publicly available benchmark of a real semiconductor etching dataset. Using the high-order spectral features and fractal dimension features, the proposed method achieves more than 84% fault recognition accuracy. The resulting feature subspace can be used to match any new fault data to the fingerprint feature subspace of each fault type, and hence can pinpoint the root cause of a fault in a manufacturing process.
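The distance-and-vote step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the root-mean-square distance, the choice of `k`, and the simple count-based vote shares are all assumptions made for illustration, and the feature extraction and feature selection steps are taken as already done.

```python
import math
from collections import Counter


def subspace_distance(x, y, idx):
    """RMS distance between x and y restricted to the feature indices in idx.

    Dividing by len(idx) is an assumption so that subspaces of different
    sizes yield comparable distances.
    """
    return math.sqrt(sum((x[i] - y[i]) ** 2 for i in idx) / len(idx))


def classify(new_sample, labeled, subspaces, k=3):
    """Assign new_sample to a fault type by a k-nearest-neighbor vote.

    labeled:   dict mapping fault type -> list of labeled feature vectors
    subspaces: dict mapping fault type -> list of selected feature indices
    Each labeled sample is compared with new_sample in the subspace of its
    own fault type. Returns (predicted_type, vote_shares).
    """
    scored = []
    for fault, samples in labeled.items():
        idx = subspaces[fault]
        for sample in samples:
            scored.append((subspace_distance(new_sample, sample, idx), fault))
    scored.sort(key=lambda pair: pair[0])

    # Hypothetical voting rule: share of the k nearest neighbors per fault type.
    votes = Counter(fault for _, fault in scored[:k])
    total = sum(votes.values())
    shares = {fault: count / total for fault, count in votes.items()}
    return max(shares, key=shares.get), shares


if __name__ == "__main__":
    # Toy data (made up): two fault types, three features each.
    labeled = {"A": [[0.0, 0.0, 5.0], [0.1, 0.0, 9.0]],
               "B": [[5.0, 5.0, 0.0], [5.0, 5.2, 1.0]]}
    subspaces = {"A": [0, 1], "B": [1, 2]}
    print(classify([0.05, 0.1, 7.0], labeled, subspaces))
```

Because each fault type keeps its own list of selected feature indices, the indices attached to the winning type also point back to the measured variables involved, which is the sense in which the abstract says the selected features indicate the cause of the fault.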
Award ID(s):
1916866
PAR ID:
10265908
Journal Name:
Electronics
Volume:
9
Issue:
11
ISSN:
2079-9292
Page Range / eLocation ID:
1952
Sponsoring Org:
National Science Foundation
More Like this
  1. Second-order optimization methods, such as cubic regularized Newton methods, are known for their rapid convergence rates; nevertheless, they become impractical in high-dimensional problems due to their substantial memory requirements and computational costs. One promising approach is to execute second-order updates within a lower-dimensional subspace, giving rise to subspace second-order methods. However, the majority of existing subspace second-order methods randomly select subspaces, consequently resulting in slower convergence rates depending on the problem's dimension $$d$$. In this paper, we introduce a novel subspace cubic regularized Newton method that achieves a dimension-independent global convergence rate of $$\mathcal{O}\left(\frac{1}{mk}+\frac{1}{k^2}\right)$$ for solving convex optimization problems. Here, $$m$$ represents the subspace dimension, which can be significantly smaller than $$d$$. Instead of adopting a random subspace, our primary innovation involves performing the cubic regularized Newton update within the Krylov subspace associated with the Hessian and the gradient of the objective function. This result marks the first instance of a dimension-independent convergence rate for a subspace second-order method. Furthermore, when specific spectral conditions of the Hessian are met, our method recovers the convergence rate of a full-dimensional cubic regularized Newton method. Numerical experiments show our method converges faster than existing random subspace methods, especially for high-dimensional problems.
  2. Discovering and clustering subspaces in high-dimensional data is a fundamental problem of machine learning with a wide range of applications in data mining, computer vision, and pattern recognition. Earlier methods divided the problem into two separate stages of finding the similarity matrix and finding clusters. Similar to some recent works, we integrate these two steps using a joint optimization approach. We make the following contributions: (i) We estimate the reliability of the cluster assignment for each point before assigning a point to a subspace. We divide the data points into two groups, “certain” and “uncertain”, with the assignment of the latter group delayed until their subspace association certainty improves. (ii) We demonstrate that delayed association is better suited for clustering subspaces that have ambiguities, i.e., when subspaces intersect or data are contaminated with outliers or noise. (iii) We demonstrate experimentally that such delayed probabilistic association leads to a more accurate self-representation and final clusters. The proposed method has higher accuracy both for points that lie exclusively in one subspace and for those that are on the intersection of subspaces. (iv) We show that delayed association leads to a large reduction in computational cost, since it allows for incremental spectral clustering.
  3. Separating an image into meaningful underlying components is a crucial first step for both editing and understanding images. We present a method capable of selecting the regions of a photograph exhibiting the same material as an artist-chosen area. Our proposed approach is robust to shading, specular highlights, and cast shadows, enabling selection in real images. As we do not rely on semantic segmentation (different woods or metals should not be selected together), we formulate the problem as a similarity-based grouping problem based on a user-provided image location. In particular, we propose to leverage the unsupervised DINO [Caron et al. 2021] features coupled with a proposed Cross-Similarity Feature Weighting module and an MLP head to extract material similarities in an image. We train our model on a new synthetic image dataset, which we release. We show that our method generalizes well to real-world images. We carefully analyze our model's behavior on varying material properties and lighting. Additionally, we evaluate it against a hand-annotated benchmark of 50 real photographs. We further demonstrate our model on a set of applications, including material editing, in-video selection, and retrieval of object photographs with similar materials.
  4. Manufacturing process signatures reflect the process stability and anomalies that potentially lead to detrimental effects on the manufactured outcomes. Sensing technologies, especially in-situ image sensors, are widely used to capture process signatures for diagnostics and prognostics. This imaging data is crucial evidence for process signature characterization and monitoring. A critical aspect of process signature analysis is identifying the unique patterns in an image that differ from the generic behavior of the manufacturing process in order to detect anomalies. It is equivalent to separating the “unique features” and process-wise (or phase-wise) “shared features” from the same image and recognizing the transient anomaly, i.e., recognizing the outlier “unique features”. In state-of-the-art literature, image-based process signature analysis relies on conventional feature extraction procedures, which limit the “view” of information to each image and cannot decouple the shared and unique features. Consequently, the features extracted are less interpretable, and the anomaly detection method cannot distinguish the abnormality in the current process signature from the process-wise evolution. Targeting this limitation, this study proposes personalized feature extraction (PFE) to decouple process-wise shared features and transient unique features from a sensor image and further develops process signature characterization and anomaly detection strategies. The PFE algorithm is designed for heterogeneous data with shared features. Supervised and unsupervised anomaly detection strategies are developed upon PFE features to remove the shared features from a process signature and examine the unique features for abnormality. 
    The proposed method is demonstrated on two datasets: (i) selected data from the 2018 AM Benchmark Test Series from the National Institute of Standards and Technology (NIST), and (ii) thermal measurements in additive manufacturing of a thin-walled structure of Ti–6Al–4V. The results highlight the power of personalized modeling in extracting features from manufacturing imaging data.
  5. Due to the growing complexity and numerous manufacturing variations in safety-critical analog and mixed-signal (AMS) circuit design, rare failure detection in the high-dimensional variational space is one of the major challenges in AMS verification. Efficient AMS failure detection is very demanding with limited samples on account of high simulation and manufacturing costs. In this work, we combine a reversible network and a gating architecture to identify essential features from datasets and reduce the feature dimension for fast failure detection. While reversible residual networks (RevNets) have been actively studied for their ability to restore the input from the output without loss of information, the gating network helps the RevNet achieve effective dimension reduction. We incorporate the proposed reversible gating architecture into a Bayesian optimization (BO) framework to reduce the dimensionality of BO, embedding the important features identified by the gating fusion weights so that the failure points can be efficiently located. Furthermore, we propose a conditional density estimation of important and non-important features to extract the high-dimensional original input features from the low-dimensional important features, improving the efficiency of the proposed methods. The improvements of our proposed approach on rare failure detection are demonstrated on AMS data under high-dimensional process variations.