Title: A Supervised Tensor Dimension Reduction-Based Prognostic Model for Applications with Incomplete Imaging Data
Imaging data-based prognostic models focus on using an asset’s degradation images to predict its time to failure (TTF). Most image-based prognostic models have two common limitations. First, they require degradation images to be complete (i.e., images are observed continuously and regularly over time). Second, they usually employ an unsupervised dimension reduction method to extract low-dimensional features and then use the features for TTF prediction. Because unsupervised dimension reduction is conducted on the degradation images without the involvement of TTFs, there is no guarantee that the extracted features are effective for failure time prediction. To address these challenges, this article develops a supervised tensor dimension reduction-based prognostic model. The model first proposes a supervised dimension reduction method for tensor data. It uses historical TTFs to guide the detection of a tensor subspace to extract low-dimensional features from high-dimensional incomplete degradation imaging data. Next, the extracted features are used to construct a prognostic model based on (log)-location-scale regression. An optimization algorithm for parameter estimation is proposed, and analytical solutions are discussed. Simulated data and a real-world data set are used to validate the performance of the proposed model.
History: Bianca Maria Colosimo served as the senior editor for this article.
Funding: This work was supported by National Science Foundation [2229245].
Data Ethics & Reproducibility Note: The code capsule is available on Code Ocean at https://github.com/czhou9/Code-and-Data-for-IJDS and in the e-Companion to this article (available at https://doi.org/10.1287/ijds.2022.x022).
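To make the two-stage idea in the abstract concrete, the following is a minimal Python sketch, not the authors' method. It assumes complete (not incomplete) images, takes the projection matrices `U1` and `U2` as given random orthonormal matrices rather than estimating them under TTF supervision, and uses the lognormal member of the log-location-scale family, for which maximum likelihood reduces to least squares on log(TTF). All function names, dimensions, and simulated data are hypothetical.

```python
import numpy as np


def project_features(images, U1, U2):
    """Multilinear projection S_i = U1^T X_i U2, flattened to one row per image."""
    cores = np.einsum("ap,nab,bq->npq", U1, images, U2)
    return cores.reshape(len(images), -1)


def fit_lognormal_regression(features, ttf):
    """MLE for log(ttf) = b0 + features @ b + sigma * eps, with eps ~ N(0, 1)."""
    design = np.column_stack([np.ones(len(features)), features])
    y = np.log(ttf)
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    sigma = np.sqrt(np.mean(resid ** 2))  # MLE of the scale parameter
    return beta, sigma


def predict_median_ttf(features, beta):
    """Median of the fitted lognormal TTF distribution for each unit."""
    design = np.column_stack([np.ones(len(features)), features])
    return np.exp(design @ beta)


# Simulated data (hypothetical sizes): n images of size d1 x d2, reduced to p1 x p2.
rng = np.random.default_rng(0)
n, d1, d2, p1, p2 = 200, 8, 8, 2, 2

# Assumption: fixed orthonormal projection matrices stand in for the learned subspace.
U1, _ = np.linalg.qr(rng.normal(size=(d1, p1)))
U2, _ = np.linalg.qr(rng.normal(size=(d2, p2)))

images = rng.normal(size=(n, d1, d2))
features = project_features(images, U1, U2)

# Generate TTFs from a known lognormal regression so the fit can be checked.
true_beta = np.array([3.0, 0.5, -0.3, 0.2, 0.1])
log_ttf = true_beta[0] + features @ true_beta[1:] + 0.1 * rng.normal(size=n)
ttf = np.exp(log_ttf)

beta_hat, sigma_hat = fit_lognormal_regression(features, ttf)
median_ttf = predict_median_ttf(features, beta_hat)
```

In this sketch the subspace is chosen without looking at the TTFs, which is exactly the limitation the abstract attributes to unsupervised approaches; the article's contribution is to let the historical TTFs guide the choice of subspace and to handle missing images, neither of which is attempted here.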
Award ID(s):
2229245
PAR ID:
10545690
Author(s) / Creator(s):
;
Publisher / Repository:
INFORMS
Date Published:
Journal Name:
INFORMS Journal on Data Science
Volume:
3
Issue:
1
ISSN:
2694-4022
Page Range / eLocation ID:
84 to 104
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Higher-order tensors have received increased attention across science and engineering. While most tensor decomposition methods are developed for a single tensor observation, scientific studies often collect side information, in the form of node features and interactions thereof, together with the tensor data. Such data problems are common in neuroimaging, network analysis, and spatial-temporal modeling. Identifying the relationship between a high-dimensional tensor and side information is important yet challenging. Here, we develop a tensor decomposition method that incorporates multiple feature matrices as side information. Unlike unsupervised tensor decomposition, our supervised decomposition captures the effective dimension reduction of the data tensor confined to the feature space of interest. An efficient alternating optimization algorithm with provable spectral initialization is further developed. Our proposal handles a broad range of data types, including continuous, count, and binary observations. We apply the method to diffusion tensor imaging data from the Human Connectome Project and multi-relational political network data. We identify the key global connectivity pattern and pinpoint the local regions that are associated with available features. The package and data used are available at https://CRAN.R-project.org/package=tensorregress. Supplementary materials for this article are available online.
  2. In the form of multidimensional arrays, tensor data have become increasingly prevalent in modern scientific studies and biomedical applications such as computational biology, brain imaging analysis, and process monitoring systems. These data are intrinsically heterogeneous with complex dependencies and structure. Therefore, ad‐hoc dimension reduction methods on tensor data may lack statistical efficiency and can obscure essential findings. Model‐based clustering is a cornerstone of multivariate statistics and unsupervised learning; however, existing methods and algorithms are not designed for tensor‐variate samples. In this article, we propose a tensor envelope mixture model (TEMM) for simultaneous clustering and multiway dimension reduction of tensor data. TEMM incorporates tensor‐structure‐preserving dimension reduction into mixture modeling and drastically reduces the number of free parameters and estimative variability. An expectation‐maximization‐type algorithm is developed to obtain likelihood‐based estimators of the cluster means and covariances, which are jointly parameterized and constrained onto a series of lower dimensional subspaces known as the tensor envelopes. We demonstrate the encouraging empirical performance of the proposed method in extensive simulation studies and a real data application in comparison with existing vector and tensor clustering methods.
  3. Theunissen, Frédéric E. (Ed.)
    Recent neuroscience studies demonstrate that a deeper understanding of brain function requires a deeper understanding of behavior. Detailed behavioral measurements are now often collected using video cameras, resulting in an increased need for computer vision algorithms that extract useful information from video data. Here we introduce a new video analysis tool that combines the output of supervised pose estimation algorithms (e.g. DeepLabCut) with unsupervised dimensionality reduction methods to produce interpretable, low-dimensional representations of behavioral videos that extract more information than pose estimates alone. We demonstrate this tool by extracting interpretable behavioral features from videos of three different head-fixed mouse preparations, as well as a freely moving mouse in an open field arena, and show how these interpretable features can facilitate downstream behavioral and neural analyses. We also show how the behavioral features produced by our model improve the precision and interpretation of these downstream analyses compared to using the outputs of either fully supervised or fully unsupervised methods alone. 
  4. We consider the semi-supervised dimension reduction problem: given a high-dimensional dataset with a small number of labeled data points and a huge number of unlabeled data points, the goal is to find the low-dimensional embedding that yields good classification results. Most of the previous algorithms for this task are linkage-based algorithms. They try to enforce the must-link and cannot-link constraints in dimension reduction, leading to a nearest neighbor classifier in low-dimensional space. In this paper, we propose a new hyperplane-based semi-supervised dimension reduction method whose main objective is to learn low-dimensional features that can both approximate the original data and form a good separating hyperplane. We formulate this as a non-convex optimization problem and propose an efficient algorithm to solve it. The algorithm can scale to problems with millions of features and can easily incorporate non-negative constraints in order to learn interpretable non-negative features. Experiments on real-world datasets demonstrate that our hyperplane-based dimension reduction method outperforms state-of-the-art linkage-based methods when very few labels are available.
  5. Prostate cancer treatment decisions rely heavily on subjective visual interpretation [assigning Gleason patterns or International Society of Urological Pathology (ISUP) grade groups] of limited numbers of two‐dimensional (2D) histology sections. Under this paradigm, interobserver variance is high, with ISUP grades not correlating well with outcome for individual patients, and this contributes to the over‐ and undertreatment of patients. Recent studies have demonstrated improved prognostication of prostate cancer outcomes based on computational analyses of glands and nuclei within 2D whole slide images. Our group has also shown that the computational analysis of three‐dimensional (3D) glandular features, extracted from 3D pathology datasets of whole intact biopsies, can allow for improved recurrence prediction compared to corresponding 2D features. Here we seek to expand on these prior studies by exploring the prognostic value of 3D shape‐based nuclear features in prostate cancer (e.g. nuclear size, sphericity). 3D pathology datasets were generated using open‐top light‐sheet (OTLS) microscopy of 102 cancer‐containing biopsies extracted ex vivo from the prostatectomy specimens of 46 patients. A deep learning‐based workflow was developed for 3D nuclear segmentation within the glandular epithelium versus stromal regions of the biopsies. 3D shape‐based nuclear features were extracted, and a nested cross‐validation scheme was used to train a supervised machine classifier based on 5‐year biochemical recurrence (BCR) outcomes. Nuclear features of the glandular epithelium were found to be more prognostic than stromal cell nuclear features (area under the ROC curve [AUC] = 0.72 versus 0.63). 3D shape‐based nuclear features of the glandular epithelium were also more strongly associated with the risk of BCR than analogous 2D features (AUC = 0.72 versus 0.62). The results of this preliminary investigation suggest that 3D shape‐based nuclear features are associated with prostate cancer aggressiveness and could be of value for the development of decision‐support tools. © 2023 The Pathological Society of Great Britain and Ireland.