

Title: NNK-Means: Data summarization using dictionary learning with non-negative kernel regression
An increasing number of systems are being designed by gathering significant amounts of data and then optimizing the system parameters directly using the obtained data. Often this is done without analyzing the dataset structure. As task complexity, data size, and parameters all increase to millions or even billions, data summarization is becoming a major challenge. In this work, we investigate data summarization via dictionary learning (DL), leveraging the properties of recently introduced non-negative kernel regression (NNK) graphs. Our proposed NNK-Means, unlike previous DL techniques, such as kSVD, learns geometric dictionaries with atoms that are representative of the input data space. Experiments show that summarization using NNK-Means can provide better class separation compared to linear and kernel versions of kMeans and kSVD. Moreover, NNK-Means is scalable, with runtime complexity similar to that of kMeans.
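As a rough illustration of the non-negative kernel regression step that NNK graphs rest on, the sketch below approximates a query point in kernel space by a non-negative combination of candidate points, minimizing 0.5 θᵀKθ − θᵀk subject to θ ≥ 0. It is not the NNK-Means algorithm and not the authors' implementation: the Gaussian kernel, the bandwidth `sigma`, the projected-gradient solver, and the names `gaussian_kernel` and `nnk_weights` are all assumptions made for this example.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel; the kernel choice and sigma are assumptions."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def nnk_weights(query, candidates, sigma=1.0, n_iter=500):
    """Minimize 0.5 * t^T K t - t^T k subject to t >= 0
    using projected gradient descent (an illustrative solver choice)."""
    n = len(candidates)
    # K: kernel matrix among candidates; k: kernel between query and candidates.
    K = [[gaussian_kernel(a, b, sigma) for b in candidates] for a in candidates]
    k = [gaussian_kernel(query, c, sigma) for c in candidates]
    # 1 / trace(K) is a safe step size because trace(K) >= largest eigenvalue of K.
    step = 1.0 / sum(K[i][i] for i in range(n))
    theta = [0.0] * n
    for _ in range(n_iter):
        grad = [sum(K[i][j] * theta[j] for j in range(n)) - k[i] for i in range(n)]
        # Gradient step followed by projection onto the non-negative orthant.
        theta = [max(0.0, t - step * g) for t, g in zip(theta, grad)]
    return theta

# Toy data: the second candidate lies directly behind the first, seen from the query.
query = (0.0, 0.0)
candidates = [(1.0, 0.0), (2.0, 0.0), (0.0, 1.0)]
theta = nnk_weights(query, candidates)
```

In this toy configuration the weight of the second candidate is driven to zero while the first and third keep equal positive weights, which is the kind of geometry-aware sparsity the abstract alludes to.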
Award ID(s):
2009032
NSF-PAR ID:
10433755
Author(s) / Creator(s):
;
Date Published:
Journal Name:
2022 30th European Signal Processing Conference (EUSIPCO)
Page Range / eLocation ID:
2161 to 2165
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Several machine learning methods leverage the idea of locality by using k-nearest neighbor (KNN) techniques to design better pattern recognition models. However, the choice of KNN parameters such as k is often made experimentally, e.g., via cross-validation, leading to local neighborhoods without a clear geometric interpretation. In this paper, we replace KNN with our recently introduced polytope neighborhood scheme, Non-Negative Kernel regression (NNK). NNK formulates neighborhood selection as a sparse signal approximation problem and is adaptive to the local distribution of samples in the neighborhood of the data point of interest. We analyze the benefits of local neighborhood construction based on NNK. In particular, we study the generalization properties of local interpolation using NNK and present data-dependent bounds in the non-asymptotic setting. The applicability of NNK in the transductive few-shot learning setting and for measuring the distance between two datasets is demonstrated. NNK exhibits robust, superior performance in comparison to standard locally weighted neighborhood methods.
  2. A key initial step in geophysical imaging is to devise an effective means of mapping the sensitivity of an observation to the model parameters, that is to compute its Fréchet derivatives or sensitivity kernel. In the absence of any simplifying assumptions and when faced with a large number of free parameters, the adjoint method can be an effective and efficient approach to calculating Fréchet derivatives and requires just two numerical simulations. In the Glacial Isostatic Adjustment problem, these consist of a forward simulation driven by changes in ice mass and an adjoint simulation driven by fictitious loads that are applied at the observation sites. The theoretical basis for this approach has seen considerable development over the last decade. Here, we present the final elements needed to image 3-D mantle viscosity using a dataset of palaeo sea-level observations. Developments include the calculation of viscosity Fréchet derivatives (i.e. sensitivity kernels) for relative sea-level observations, a modification to the numerical implementation of the forward and adjoint problem that permits application to 3-D viscosity structure, and a recalibration of initial sea level that ensures the forward simulation honours present-day topography. In the process of addressing these items, we build intuition concerning how absolute sea-level and relative sea-level observations sense Earth’s viscosity structure and the physical processes involved. We discuss examples for potential observations located in the near field (Andenes, Norway), far field (Seychelles), and edge of the forebulge of the Laurentide ice sheet (Barbados). Examination of these kernels: (1) reveals why 1-D estimates of mantle viscosity from far-field relative sea-level observations can be biased; (2) hints at why an appropriate differential relative sea-level observation can provide a better constraint on local mantle viscosity and (3) demonstrates that sea-level observations have non-negligible 3-D sensitivity to deep mantle viscosity structure, which is counter to the intuition gained from 1-D radial viscosity Fréchet derivatives. Finally, we explore the influence of lateral variations in viscosity on relative sea-level observations in the Amundsen Sea Embayment and at Barbados. These predictions are based on a new global 3-D viscosity inference derived from the shear-wave speeds of GLAD-M25 and an inverse calibration scheme that ensures compatibility with certain fundamental geophysical observations. Use of the 3-D viscosity inference leads to: (1) generally greater complexity within the kernel; (2) an increase in sensitivity and presence of shorter length-scale features within lower viscosity regions; (3) a zeroing out of the sensitivity kernel within high-viscosity regions where elastic deformation dominates and (4) shifting of sensitivity at a given depth towards distal regions of weaker viscosity. The tools and intuition built here provide the necessary framework to explore inversions for 3-D mantle viscosity based on palaeo sea-level data.

     
  3. Solar flares are characterized by sudden bursts of electromagnetic radiation from the Sun’s surface, and are caused by changes in magnetic field states in active solar regions. Earth and its surrounding space environment can suffer from various negative impacts caused by solar flares, ranging from electronic communication disruption to radiation exposure-based health risks to astronauts. In this paper, we address the solar flare prediction problem from magnetic field parameter-based multivariate time series (MVTS) data using multiple state-of-the-art machine learning classifiers that include MINImally RandOm Convolutional KErnel Transform (MiniRocket), Support Vector Machine (SVM), Canonical Interval Forest (CIF), Multiple Representations Sequence Learner (Mr-SEQL), and a Long Short-Term Memory (LSTM)-based deep learning model. Our experiment is conducted on the Space Weather Analytics for Solar Flares (SWAN-SF) benchmark data set, which is a partitioned collection of MVTS data of active region magnetic field parameters spanning over nine years of operation of the Solar Dynamics Observatory (SDO). The MVTS instances of the SWAN-SF dataset are labeled by GOES X-ray flux-based flare class labels, and exhibit extreme class imbalance because of the rarity of the major flaring events (e.g., X and M). As a performance validation metric in this class-imbalanced dataset, we used the True Skill Statistic (TSS) score. Finally, we demonstrate the advantages of the MVTS learning algorithm MiniRocket, which outperformed the aforementioned classifiers without the need for essential data preprocessing steps such as normalization, statistical summarization, and class imbalance handling heuristics.

     
  4. Deep learning (DL) has been increasingly explored in low-dose CT image denoising. DL products have also been submitted to the FDA for premarket clearance. While having the potential to improve image quality over the filtered back projection method (FBP) and produce images quickly, generalizability of DL approaches is a major concern because the performance of a DL network can depend highly on the training data. In this work we take a residual encoder-decoder convolutional neural network (REDCNN)-based CT denoising method as an example. We investigate the effect of the scan parameters associated with the training data on the performance of this DL-based CT denoising method and identify the scan parameters that may significantly impact its performance generalizability. This abstract particularly examines three parameters: reconstruction kernel, dose level and slice thickness. Our preliminary results indicate that the DL network may not generalize well between FBP reconstruction kernels, but is insensitive to slice thickness for slice-wise denoising. The results also suggest that training with mixed dose levels improves denoising performance.
  5. We present a novel framework to represent sets of time-varying signals as dynamic graphs using the non-negative kernel (NNK) graph construction. We extend the original NNK framework to allow explicit delays as part of the graph construction, so that unlike in NNK, two nodes can be connected with an edge corresponding to a non-zero time delay, if there is higher similarity between the signals after shifting one of them. We also propose to characterize the similarity between signals at different nodes using the node degree and clustering coefficients of their respective visibility graphs. With graph edges that can represent temporal delays, we provide a new perspective that enables us to see the effect of synchronization in graph construction for time-series signals. For both temperature and EEG datasets, we show that our proposed approach can achieve sparse and interpretable graph representations. Furthermore, the proposed method can be useful in characterizing different EEG experiments using sparsity.
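For readers unfamiliar with the True Skill Statistic (TSS) used as the validation metric in item 3 above, a minimal sketch follows. It uses the standard definition, recall minus false alarm rate. The function name `true_skill_statistic` and the confusion-matrix counts are hypothetical and are not taken from the SWAN-SF experiments.

```python
def true_skill_statistic(tp, fn, fp, tn):
    """TSS = TP / (TP + FN) - FP / (FP + TN); ranges from -1 to 1."""
    recall = tp / (tp + fn)            # fraction of positive events detected
    false_alarm_rate = fp / (fp + tn)  # fraction of negatives flagged as positive
    return recall - false_alarm_rate

# Hypothetical counts for a rare positive class: 10 positives, 100 negatives.
tss = true_skill_statistic(tp=8, fn=2, fp=10, tn=90)
```

Because each term is normalized within its own class, the score does not change when the ratio of positive to negative examples changes, which is why it suits class-imbalanced problems.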