Title: Novel classification of axial spondyloarthritis to predict radiographic progression using machine learning
OBJECTIVES: Prediction and determination of drug efficacy for radiographic progression is limited by the heterogeneity inherent in axial spondyloarthritis (axSpA). We investigated whether unbiased clustering analysis of phenotypic data can lead to coherent subgroups of axSpA patients with a distinct risk of radiographic progression. METHODS: A group of 412 patients with axSpA was clustered in an unbiased way using an agglomerative hierarchical clustering method, based on their phenotype mapping. We used a generalised linear model, naïve Bayes, decision trees, k-nearest neighbours, and support vector machines to construct a consensus classification method. Radiographic progression over 2 years was assessed using the modified Stoke Ankylosing Spondylitis Spine Score (mSASSS). RESULTS: axSpA patients were classified into three subgroups with distinct clinical characteristics. Sex, smoking, HLA-B27, baseline mSASSS, uveitis, and peripheral arthritis were the key features found to stratify the phenogroups. The three phenogroups showed distinct differences in radiographic progression rate (p<0.05) and the proportion of progressors (p<0.001). Phenogroup 2, consisting of male smokers, had the worst radiographic progression, while phenogroup 3, exclusively suffering from uveitis, showed the least radiographic progression. The axSpA phenogroup classification, including its ability to stratify risk, was successfully replicated in an independent validation group. CONCLUSIONS: Phenotype mapping results in a clinically relevant classification of axSpA that is applicable for risk stratification. Novel coupling between phenotypic features and radiographic progression can provide a glimpse into the mechanisms underlying divergent and shared features of axSpA.
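As an illustration of the agglomerative hierarchical clustering step named in the METHODS, the following is a minimal sketch only. The abstract does not state the linkage criterion or distance metric, so average linkage and Euclidean distance are assumptions here, and the toy rows (male, smoker, uveitis) are synthetic values made up for the example, not study data.

```python
from itertools import combinations
from math import dist


def average_linkage(a, b, points):
    """Mean pairwise Euclidean distance between two clusters of row indices."""
    return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))


def agglomerative(points, n_clusters):
    """Start with one cluster per row and repeatedly merge the closest pair."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        # Pick the pair of clusters with the smallest average-linkage distance.
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], points),
        )
        clusters[x] = clusters[x] + clusters[y]  # x < y, so deleting y is safe
        del clusters[y]
    return [sorted(c) for c in clusters]


# Synthetic, hypothetical phenotype rows: (male, smoker, uveitis) as 0/1 flags.
toy = [(1, 1, 0), (1, 1, 0), (0, 0, 1), (0, 0, 1), (0, 0, 0), (0, 0, 0)]
groups = agglomerative(toy, n_clusters=3)
print(groups)
```

In practice a library routine would likely be used for 412 patients; the quadratic search per merge here is chosen for readability, not speed.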
Award ID(s):
1934568
PAR ID:
10281958
Journal Name:
Clinical and Experimental Rheumatology
Volume:
39
Issue:
3
ISSN:
1593-098X
Page Range / eLocation ID:
508-518
Sponsoring Org:
National Science Foundation
More Like this
  1. Background: We aimed to determine whether composite structural measures of knee osteoarthritis (KOA) progression on magnetic resonance (MR) imaging can predict the radiographic onset of accelerated knee osteoarthritis. Methods: We used data from a nested case-control study among participants from the Osteoarthritis Initiative without radiographic KOA at baseline. Participants were separated into three groups based on radiographic disease progression over 4 years: 1) accelerated (Kellgren-Lawrence grades [KL] 0/1 to 3/4), 2) typical (increase in KL, excluding accelerated osteoarthritis), or 3) no KOA (no change in KL). We assessed tibiofemoral cartilage damage (four regions: medial/lateral tibia/femur), bone marrow lesion (BML) volume (four regions: medial/lateral tibia/femur), and whole knee effusion-synovitis volume on 3 T MR images with semi-automated programs. We calculated two MR-based composite scores. Cumulative damage was the sum of standardized cartilage damage. Disease activity was the sum of standardized volumes of effusion-synovitis and BMLs. We focused on annual images from 2 years before to 2 years after radiographic onset (or a matched time for those without knee osteoarthritis). To determine between-group differences in the composite metrics at all time points, we used generalized linear mixed models with group (3 levels) and time (up to 5 levels). For our prognostic analysis, we used multinomial logistic regression models to determine whether one-year worsening in each composite metric was associated with future accelerated knee osteoarthritis (odds ratios [OR] based on units of 1 standard deviation of change). Results: Prior to disease onset, the accelerated KOA group had greater average disease activity compared to the typical and no KOA groups, and this persisted up to 2 years after disease onset. During a pre-radiographic disease period, the odds of developing accelerated KOA were greater in people with worsening disease activity [versus typical KOA OR (95% confidence interval [CI]): 1.58 (1.08 to 2.33); versus no KOA: 2.39 (1.55 to 3.71)] or cumulative damage [versus typical KOA: 1.69 (1.14 to 2.51); versus no KOA: 2.11 (1.41 to 3.16)]. Conclusions: MR-based disease activity and cumulative damage metrics may be prognostic markers to help identify people at risk for accelerated onset and progression of knee osteoarthritis.
  2. Single cell RNA-sequencing (scRNA-seq) technology enables comprehensive transcriptomic profiling of thousands of cells with distinct phenotypic and physiological states in a complex tissue. Substantial efforts have been made to characterize single cells of distinct identities from scRNA-seq data, including various cell clustering techniques. While existing approaches can handle single cells in terms of different cell (sub)types at a high resolution, identification of the functional variability within the same cell type remains unsolved. In addition, there is a lack of robust methods to handle the inter-subject variation that often brings severe confounding effects for the functional clustering of single cells. In this study, we developed a novel data denoising and cell clustering approach, namely CIBS, to provide biologically explainable functional classification for scRNA-seq data. CIBS is based on a systems biology model of transcriptional regulation that assumes a multi-modality distribution of the cells’ activation status, and it utilizes a Boolean matrix factorization approach on the discretized expression status to robustly derive functional modules. CIBS is empowered by a novel fast Boolean matrix factorization method, namely PFAST, to increase the computational feasibility on large scale scRNA-seq data. Application of CIBS to two scRNA-seq datasets collected from the cancer tumor micro-environment successfully identified subgroups of cancer cells with distinct expression patterns of epithelial-mesenchymal transition and extracellular matrix marker genes, which were not revealed by the existing cell clustering analysis tools. The identified cell groups were significantly associated with the clinically confirmed lymph-node invasion and metastasis events across different patients. Index Terms: Cell clustering analysis, Data denoising, Boolean matrix factorization, Cancer microenvironment, Metastasis.
  3. BACKGROUND: Classification of perioperative risk is important for patient care, resource allocation, and guiding shared decision-making. Using discriminative features from the electronic health record (EHR), machine-learning algorithms can create digital phenotypes among heterogeneous populations, representing distinct patient subpopulations grouped by shared characteristics, from which we can personalize care, anticipate clinical care trajectories, and explore therapies. We hypothesized that digital phenotypes in preoperative settings are associated with postoperative adverse events including in-hospital and 30-day mortality, 30-day surgical redo, intensive care unit (ICU) admission, and hospital length of stay (LOS). METHODS: We identified all laminectomies, colectomies, and thoracic surgeries performed over a 9-year period from a large hospital system. Seventy-seven readily extractable preoperative features were first selected from clinical consensus, including demographics, medical history, and lab results. Three surgery-specific datasets were built and split into derivation and validation cohorts using chronological occurrence. Consensus k-means clustering was performed independently on each derivation cohort, from which phenotypes’ characteristics were explored. Cluster assignments were used to train a random forest model to assign patient phenotypes in validation cohorts. We reconducted descriptive analyses on validation cohorts to confirm the similarity of patient characteristics with derivation cohorts, and quantified the association of each phenotype with postoperative adverse events by using the area under receiver operating characteristic curve (AUROC). We compared our approach to the American Society of Anesthesiologists (ASA) score alone and investigated a combination of our phenotypes with the ASA score. RESULTS: A total of 7251 patients met inclusion criteria, of which 2770 were held out in a validation dataset based on chronological occurrence. Using segmentation metrics and clinical consensus, 3 distinct phenotypes were created for each surgery. The main features used for segmentation included urgency of the procedure, preoperative LOS, age, and comorbidities. The most relevant characteristics varied for each of the 3 surgeries. Low-risk phenotype alpha was the most common (2039 of 2770, 74%), while high-risk phenotype gamma was the rarest (302 of 2770, 11%). Adverse outcomes progressively increased from phenotypes alpha to gamma, including 30-day mortality (0.3%, 2.1%, and 6.0%, respectively), in-hospital mortality (0.2%, 2.3%, and 7.3%), and prolonged hospital LOS (3.4%, 22.1%, and 25.8%). When combined with the ASA score, digital phenotypes achieved higher AUROC than the ASA score alone (hospital mortality: 0.91 vs 0.84; prolonged hospitalization: 0.80 vs 0.71). CONCLUSIONS: For 3 frequently performed surgeries, we identified 3 digital phenotypes. The typical profiles of each phenotype were described and could be used to anticipate adverse postoperative events.
  4. Shaharudin, Shazlin (Ed.)
    Objective: To apply biclustering, a methodology originally developed for analysis of gene expression data, to simultaneously cluster observations and clinical features to explore candidate phenotypes of knee osteoarthritis (KOA) for the first time. Methods: Data from the baseline Osteoarthritis Initiative (OAI) visit were cleaned, transformed, and standardized as indicated (leaving 6461 knees with 86 features). Biclustering produced submatrices of the overall data matrix, representing similar observations across a subset of variables. Statistical validation was determined using the novel SigClust procedure. After identifying biclusters, relationships with key outcome measures were assessed, including progression of radiographic KOA, total knee arthroplasty, loss of joint space width, and worsening Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) scores, over 96 months of follow-up. Results: The final analytic set included 6461 knees from 3330 individuals (mean age 61 years, mean body mass index 28 kg/m², 57% women and 86% White). We identified 6 mutually exclusive biclusters characterized by different feature profiles at baseline, particularly related to symptoms and function. Biclusters represented overall better (#1), similar (#2, 3, 6), and poorer (#4, 5) prognosis compared to the overall cohort of knees, respectively. In general, knees in biclusters #4 and 5 had more structural progression (based on Kellgren-Lawrence grade, total knee arthroplasty, and loss of joint space width) but tended to have an improvement in WOMAC pain scores over time. In contrast, knees in bicluster #1 had less incident and progressive KOA, fewer total knee arthroplasties, less loss of joint space width, and stable pain scores compared with the overall cohort. Significance: We identified six biclusters within the baseline OAI dataset which have varying relationships with key outcomes in KOA. Such biclusters represent potential phenotypes within the larger cohort and may suggest subgroups at greater or lesser risk of progression over time.
  5. Objective: Severe infection can lead to organ dysfunction and sepsis. Identifying subphenotypes of infected patients is essential for personalized management. It is unknown how different time series clustering algorithms compare in identifying these subphenotypes. Materials and Methods: Patients with suspected infection admitted between 2014 and 2019 to 4 hospitals in Emory Healthcare were included, split into separate training and validation cohorts. Dynamic time warping (DTW) was applied to vital signs from the first 8 h of hospitalization, and hierarchical clustering (DTW-HC) and partition around medoids (DTW-PAM) were used to cluster patients into subphenotypes. DTW-HC, DTW-PAM, and a previously published group-based trajectory model (GBTM) were evaluated for agreement in subphenotype clusters, trajectory patterns, and subphenotype associations with clinical outcomes and treatment responses. Results: There were 12 473 patients in training and 8256 patients in validation cohorts. DTW-HC, DTW-PAM, and GBTM models resulted in 4 consistent vitals trajectory patterns with significant agreement in clustering (71–80% agreement, P < .001): group A was hyperthermic, tachycardic, tachypneic, and hypotensive. Group B was hyperthermic, tachycardic, tachypneic, and hypertensive. Groups C and D had lower temperatures, heart rates, and respiratory rates, with group C normotensive and group D hypotensive. Group A had a higher odds ratio of 30-day inpatient mortality (P < .01) and group D had significant mortality benefit from balanced crystalloids compared to saline (P < .01) in all 3 models. Discussion: DTW- and GBTM-based clustering algorithms applied to vital signs in infected patients identified consistent subphenotypes with distinct clinical outcomes and treatment responses. Conclusion: Time series clustering with distinct computational approaches demonstrates similar performance and significant agreement in the resulting subphenotypes.
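For readers unfamiliar with the dynamic time warping (DTW) distance used in item 5, the following is a minimal sketch of the standard dynamic-programming recurrence. It assumes an absolute-difference local cost and no warping-window constraint; item 5 does not state which variant was used, and the heart-rate series below are hypothetical values made up for the example.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences.

    cost[i][j] is the cheapest alignment cost of a[:i] with b[:j];
    each step may advance in a, in b, or in both.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical hourly heart rates for two patients; the second has the
# same shape as the first but holds one value for an extra hour.
patient_x = [80, 90, 100]
patient_y = [80, 90, 90, 100]
print(dtw_distance(patient_x, patient_y))
```

A pairwise matrix of such distances could then be passed to a hierarchical or medoid-based clustering routine, which is the general pattern item 5 describes.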