

Title: Estimation of the distribution of longitudinal biomarker trajectories prior to disease progression

Most studies characterize longitudinal biomarker trajectories by looking forward at them from a commonly used time origin, such as the initial treatment time. For a better understanding of the relationship between biomarkers and disease progression, we propose to align all subjects by using their disease progression time as the origin and then looking backward at the biomarker distributions prior to that event. We demonstrate that such backward‐looking plots are much more informative than forward‐looking plots when the research goal is to understand the shape of the trajectory leading up to the event of interest. Such backward‐looking plotting is an easy task if disease progression is observed for all the subjects. However, when these events are censored for a significant proportion of subjects in the study cohort, their time origins cannot be identified, and the task of aligning them cannot be performed. We propose a new method to tackle this problem by considering the distributions of longitudinal biomarker data conditional on the failure time. We use landmark analysis models to estimate these distributions. Compared to a naïve method, our new method greatly reduces estimation bias. We apply our method to a study of patients with chronic myeloid leukemia, whose BCR‐ABL transcript expression levels after treatment are good indicators of residual disease. Our proposed method provides a good visualization tool for longitudinal biomarker studies for the early detection of disease.
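As an illustration of the backward‐looking alignment described above, the following minimal Python sketch re-expresses each measurement time as time before progression and averages the biomarker within look-back bins. It is only a sketch under stated assumptions: the function names, the data layout, and the toy values are hypothetical, and it is not the authors' estimator. It simply drops subjects whose progression is censored, so it does not address the censoring problem that the proposed landmark-based method targets.

```python
from collections import defaultdict
from statistics import mean


def align_backward(measurements, progression_time):
    """Re-express measurement times as time before progression.

    measurements: iterable of (subject_id, time, value) tuples.
    progression_time: dict mapping subject_id to the observed
        progression time, or None when progression is censored.
    Censored subjects are dropped (assumption: complete-case only).
    """
    aligned = []
    for subject, t, value in measurements:
        event = progression_time.get(subject)
        if event is None or t > event:
            # No usable time origin, or measurement taken after the event.
            continue
        aligned.append((subject, event - t, value))
    return aligned


def mean_by_lookback_bin(aligned, bin_width):
    """Average biomarker values within bins of time before progression."""
    bins = defaultdict(list)
    for _, lookback, value in aligned:
        bins[int(lookback // bin_width)].append(value)
    # Key each bin by its lower edge, in increasing look-back order.
    return {b * bin_width: mean(v) for b, v in sorted(bins.items())}


# Hypothetical toy data: times in months, values in arbitrary units.
measurements = [
    ("a", 0.0, 1.0), ("a", 6.0, 2.0), ("a", 12.0, 4.0),
    ("b", 0.0, 1.5), ("b", 6.0, 3.5),
    ("c", 0.0, 0.5), ("c", 6.0, 0.6),
]
progression_time = {"a": 12.0, "b": 6.0, "c": None}  # "c" is censored

aligned = align_backward(measurements, progression_time)
summary = mean_by_lookback_bin(aligned, bin_width=6.0)
print(summary)
```

In this toy example, subject "c" contributes nothing to the summary because its time origin is unknown, which is exactly the situation in which a complete-case summary can be misleading.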

 
NSF-PAR ID:
10460936
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Journal Name:
Statistics in Medicine
Volume:
38
Issue:
11
ISSN:
0277-6715
Page Range / eLocation ID:
p. 2030-2046
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Summary

    Predicting patient life expectancy is of great importance for clinicians in making treatment decisions. This prediction needs to be conducted in a dynamic manner, based on longitudinal biomarkers repeatedly measured during the patient's post-treatment follow-up period. The prediction is updated any time a new biomarker measurement is obtained. The heterogeneity across patients of biomarker trajectories over time requires flexible and powerful approaches to model noisy and irregularly measured longitudinal data. In this article, we use functional principal component analysis (FPCA) to extract the dominant features of the biomarker trajectory of each individual, and use these features as time-dependent predictors (covariates) in a transformed mean residual life (MRL) regression model to conduct dynamic prediction. Simulation studies demonstrate the improved performance of the transformed MRL model that includes longitudinal biomarker information in the prediction. We apply the proposed method to predict the remaining time expectancy until disease progression for patients with chronic myeloid leukemia, using the transcript levels of an oncogene, BCR-ABL.

     
  2. Background: Biomarkers for Alzheimer’s disease (AD) are crucial for early diagnosis and treatment monitoring once disease modifying therapies become available. Objective: This study aims to quantify the forward magnetization transfer rate (kfor) map from brain tissue water to macromolecular protons and use it to identify the brain regions with abnormal kfor in AD and AD progression. Methods: From the Cardiovascular Health Study (CHS) cognition study, magnetization transfer imaging (MTI) was acquired at baseline from 63 participants, including 20 normal controls (NC), 18 with mild cognitive impairment (MCI), and 25 AD subjects. Of those, 53 participants completed a follow-up MRI scan and were divided into four groups: 15 stable NC, 12 NC-to-MCI, 12 stable MCI, and 14 MCI/AD-to-AD subjects. kfor maps were compared across NC, MCI, and AD groups at baseline for the cross-sectional study and across four longitudinal groups for the longitudinal study. Results: We found a lower kfor in the frontal gray matter (GM), parietal GM, frontal corona radiata (CR) white matter (WM) tracts, frontal and parietal superior longitudinal fasciculus (SLF) WM tracts in AD relative to both NC and MCI. Further, we observed progressive decreases of kfor in the frontal GM, parietal GM, frontal and parietal CR WM tracts, and parietal SLF WM tracts in stable MCI. In the parietal GM, parietal CR WM tracts, and parietal SLF WM tracts, we found trend differences between MCI/AD-to-AD and stable NC. Conclusion: Forward magnetization transfer rate is a promising biomarker for AD diagnosis and progression. 
  3. Abstract

    In clinical research and practice, landmark models are commonly used to predict the risk of an adverse future event, using patients' longitudinal biomarker data as predictors. However, these data are often observable only at intermittent visits, making their measurement times irregularly spaced and unsynchronized across different subjects. This poses challenges to conducting dynamic prediction at any post‐baseline time. A simple solution is the last‐value‐carry‐forward method, but this may result in bias for the risk model estimation and prediction. Another option is to jointly model the longitudinal and survival processes with a shared random effects model. However, when dealing with multiple biomarkers, this approach often results in high‐dimensional integrals without a closed‐form solution, and thus the computational burden limits its software development and practical use. In this article, we propose to process the longitudinal data by functional principal component analysis techniques, and then use the processed information as predictors in a class of flexible linear transformation models to predict the distribution of residual time‐to‐event occurrence. The measurement schemes for multiple biomarkers are allowed to be different within subject and across subjects. Dynamic prediction can be performed in a real‐time fashion. The advantages of our proposed method are demonstrated by simulation studies. We apply our approach to the African American Study of Kidney Disease and Hypertension, predicting patients' risk of kidney failure or death by using four important longitudinal biomarkers for renal functions.

     
  4. The patterns of change in patients' longitudinal biomarkers are crucial factors in their disease progression. In this research, we apply functional principal component analysis techniques to extract these patterns and use them as predictors in landmark models for dynamic prediction. The time‐varying effects of risk factors along a sequence of landmark times are smoothed by a supermodel to borrow information from neighboring time intervals. This results in more stable estimation and a clearer demonstration of the time‐varying effects. Compared with the traditional landmark analysis, simulation studies show our proposed approach results in lower prediction error rates and higher area under the receiver operating characteristic curve (AUC) values, which indicate better ability to discriminate between subjects with different risk levels. We apply our method to data from the Framingham Heart Study, using longitudinal total cholesterol (TC) levels to predict future coronary heart disease (CHD) risk profiles. Our approach not only obtains the overall trend of biomarker‐related risk profiles, but also reveals different risk patterns that are not available from the traditional landmark analyses. Our results show that high cholesterol levels at younger ages are more harmful than those at older ages. This demonstrates the importance of analyzing the age‐dependent effects of TC on CHD risk.

     
  5. Objective: Modern healthcare data reflect massive multi-level and multi-scale information collected over many years. The majority of the existing phenotyping algorithms use case–control definitions of disease. This paper aims to study the time to disease onset and progression and identify the time-varying risk factors that drive them. Materials and Methods: We developed an algorithmic approach to phenotyping the incidence of diseases by consolidating data sources from the UK Biobank (UKB), including primary care electronic health records (EHRs). We focused on defining events, event dates, and their censoring time, including relevant terms and existing phenotypes, excluding generic, rare, or semantically distant terms, forward-mapping terminology terms, and expert review. We applied our approach to phenotyping diabetes complications, including a composite cardiovascular disease (CVD) outcome, diabetic kidney disease (DKD), and diabetic retinopathy (DR), in the UKB study. Results: We identified 49 049 participants with diabetes. Among them, 1023 had type 1 diabetes (T1D), and 40 193 had type 2 diabetes (T2D). A total of 23 833 diabetes subjects had linked primary care records. There were 3237, 3113, and 4922 patients with CVD, DKD, and DR events, respectively. The risk prediction performance for each outcome was assessed, and our results are consistent with the prediction area under the ROC (receiver operating characteristic) curve (AUC) of standard risk prediction models using cohort studies. Discussion and Conclusion: Our publicly available pipeline and platform enable streamlined curation of incidence events, identification of time-varying risk factors underlying disease progression, and the definition of a relevant cohort for time-to-event analyses. These important steps need to be considered simultaneously to study disease progression.