skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


This content will become publicly available on March 1, 2026

Title: SSL-SurvFormer: A Self-Supervised Learning and Continuously Monotonic Transformer Network for Missing Values in Survival Analysis
Survival analysis is a crucial statistical technique used to estimate the anticipated duration until a specific event occurs. However, current methods often involve discretizing the time scale and struggle with managing absent features within the data. This becomes especially pertinent since events can transpire at any given point, rendering event analysis a continuous concern. Additionally, the presence of missing attributes within tabular data is widespread. By leveraging recent developments of Transformer and Self-Supervised Learning (SSL), we introduce SSL-SurvFormer. This entails a continuously monotonic Transformer network, empowered by SSL pre-training, that is designed to address the challenges presented by continuous events and absent features in survival prediction. Our proposed continuously monotonic Transformer model facilitates accurate estimation of survival probabilities, thereby bypassing the need for temporal discretization. Additionally, our SSL pre-training strategy incorporates data transformation to adeptly manage missing information. The SSL pre-training encompasses two tasks: mask prediction, which identifies positions of absent features, and reconstruction, which endeavors to recover absent elements based on observed ones. Our empirical evaluations conducted across a variety of datasets, including FLCHAIN, METABRIC, and SUPPORT, consistently highlight the superior performance of SSL-SurvFormer in comparison to existing methods. Additionally, SSL-SurvFormer demonstrates effectiveness in handling missing values, a critical aspect often encountered in real-world datasets.  more » « less
Award ID(s):
2223793
PAR ID:
10639061
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
MDPI
Date Published:
Journal Name:
Informatics
Volume:
12
Issue:
1
ISSN:
2227-9709
Page Range / eLocation ID:
32
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Collecting large-scale medical datasets with fully annotated samples for training of deep networks is prohibitively expensive, especially for 3D volume data. Recent breakthroughs in self-supervised learning (SSL) offer the ability to overcome the lack of labeled training samples by learning feature representations from unlabeled data. However, most current SSL techniques in the medical field have been designed for either 2D images or 3D volumes. In practice, this restricts the capability to fully leverage unlabeled data from numerous sources, which may include both 2D and 3D data. Additionally, the use of these pre-trained networks is constrained to downstream tasks with compatible data dimensions.In this paper, we propose a novel framework for unsupervised joint learning on 2D and 3D data modalities. Given a set of 2D images or 2D slices extracted from 3D volumes, we construct an SSL task based on a 2D contrastive clustering problem for distinct classes. The 3D volumes are exploited by computing vectored embedding at each slice and then assembling a holistic feature through deformable self-attention mechanisms in Transformer, allowing incorporating long-range dependencies between slices inside 3D volumes. These holistic features are further utilized to define a novel 3D clustering agreement-based SSL task and masking embedding prediction inspired by pre-trained language models. Experiments on downstream tasks, such as 3D brain segmentation, lung nodule detection, 3D heart structures segmentation, and abnormal chest X-ray detection, demonstrate the effectiveness of our joint 2D and 3D SSL approach. We improve plain 2D Deep-ClusterV2 and SwAV by a significant margin and also surpass various modern 2D and 3D SSL approaches. 
    more » « less
  2. Identifying the subset of events that influence events of interest from continuous time datasets is of great interest in various applications. Existing methods however often fail to produce accurate and interpretable results in a time-efficient manner. In this paper, we propose a neural model – Influence-Aware Attention for Multivariate Temporal Point Processes (IAA-MTPPs) – which leverages the powerful attention mechanism in transformers to capture temporal dynamics between event types, which is different from existing instance-to-instance attentions, using variational inference while maintaining interpretability. Given event sequences and a prior influence matrix, IAA-MTPP efficiently learns an approximate posterior by an Attention-to-Influence mechanism, and subsequently models the conditional likelihood of the sequences given a sampled influence through an Influence-to-Attention formulation. Both steps are completed efficiently inside a B-block multi-head self-attention layer, thus our end-to-end training with parallelizable transformer architecture enables faster training compared to sequential models such as RNNs. We demonstrate strong empirical performance compared to existing baselines on multiple synthetic and real benchmarks, including qualitative analysis for an application in decentralized finance. 
    more » « less
  3. van_der_Schaar, M; Janzing, D; Zhang, C (Ed.)
    Identifying the subset of events that influence events of interest from continuous time datasets is of great interest in various applications. Existing methods however often fail to produce accurate and interpretable results in a time-efficient manner. In this paper, we propose a neural model – Influence-Aware Attention for Multivariate Temporal Point Processes (IAA-MTPPs) – which leverages the powerful attention mechanism in transformers to capture temporal dynamics between event types, which is different from existing instance-to-instance attentions, using variational inference while maintaining interpretability. Given event sequences and a prior influence matrix, IAA-MTPP efficiently learns an approximate posterior by an Attention-to-Influence mechanism, and subsequently models the conditional likelihood of the sequences given a sampled influence through an Influence-to-Attention formulation. Both steps are completed efficiently inside a Bblock multi-head self-attention layer, thus our end-to-end training with parallelizable transformer architecture enables faster training compared to sequential models such as RNNs. We demonstrate strong empirical performance compared to existing baselines on multiple synthetic and real benchmarks, including qualitative analysis for an application in decentralized finance. 
    more » « less
  4. The past decade witnessed rapid development in the measurement and monitoring technologies for food science. Among these technologies, spectroscopy has been widely used for the analysis of food quality, safety, and nutritional properties. Due to the complexity of food systems and the lack of comprehensive predictive models, rapid and simple measurements to predict complex properties in food systems are largely missing. Machine Learning (ML) has shown great potential to improve the classification and prediction of these properties. However, the barriers to collecting large datasets for ML applications still persists. In this paper, we explore different approaches of data annotation and model training to improve data efficiency for ML applications. Specifically, we leverage Active Learning (AL) and Semi-Supervised Learning (SSL) and investigate four approaches: baseline passive learning, AL, SSL, and a hybrid of AL and SSL. To evaluate these approaches, we collect two spectroscopy datasets: predicting plasma dosage and detecting foodborne pathogen. Our experimental results show that, compared to the de facto passive learning approach, advanced approaches (AL, SSL, and the hybrid) can greatly reduce the number of labeled samples, with some cases decreasing the number of labeled samples by more than half. 
    more » « less
  5. Efthimiou, Eleni; Fotinea, Stavroula-Evita; Hanke, Thomas; Hochgesang, Julie A; Mesch, Johanna; Schulder, Marc (Ed.)
    We propose a multimodal network using skeletons and handshapes as input to recognize individual signs and detect their boundaries in American Sign Language (ASL) videos. Our method integrates a spatio-temporal Graph Convolutional Network (GCN) architecture to estimate human skeleton keypoints; it uses a late-fusion approach for both forward and backward processing of video streams. Our (core) method is designed for the extraction---and analysis of features from---ASL videos, to enhance accuracy and efficiency of recognition of individual signs. A Gating module based on per-channel multi-layer convolutions is employed to evaluate significant frames for recognition of isolated signs. Additionally, an auxiliary multimodal branch network, integrated with a transformer, is designed to estimate the linguistic start and end frames of an isolated sign within a video clip. We evaluated performance of our approach on multiple datasets that include isolated, citation-form signs and signs pre-segmented from continuous signing based on linguistic annotations of start and end points of signs within sentences. We have achieved very promising results when using both types of sign videos combined for training, with overall sign recognition accuracy of 80.8% Top-1 and 95.2% Top-5 for citation-form signs, and 80.4% Top-1 and 93.0% Top-5 for signs pre-segmented from continuous signing. 
    more » « less