This content will become publicly available on April 23, 2025

Title: Sex-specific cardiovascular risk factors in the UK Biobank
The lack of sex-specific cardiovascular disease criteria contributes to the underdiagnosis of women compared to men. For more than half a century, the Framingham Risk Score has been the gold standard for estimating an individual's risk of developing cardiovascular disease based on age, sex, cholesterol levels, blood pressure, diabetes status, and smoking status. Machine learning can now offer a more nuanced approach to predicting cardiovascular risk. The UK Biobank is a large database that includes traditional risk factors and tests related to the cardiovascular system: magnetic resonance imaging, pulse wave analysis, electrocardiograms, and carotid ultrasounds. Here, we leverage 20,542 datasets from the UK Biobank to build cardiovascular risk models that are more accurate than the Framingham Risk Score and to quantify the underdiagnosis of women compared to men. Strikingly, for first-degree atrioventricular block and dilated cardiomyopathy, two conditions with non-sex-specific diagnostic criteria, our study shows that women are under-diagnosed 2× and 1.4× more than men, respectively. Similarly, our results demonstrate the need for sex-specific criteria in essential primary hypertension and hypertrophic cardiomyopathy. Our feature importance analysis reveals that, of the top 10 features across three sex categories and four disease categories, traditional Framingham factors made up between 40% and 50%; electrocardiogram, 30%–33%; pulse wave analysis, 13%–23%; and magnetic resonance imaging and carotid ultrasound, 0%–10%. Improving the Framingham Risk Score by leveraging big data and machine learning allows us to incorporate a wider range of biomedical data and prediction features, enhance personalization and accuracy, and continuously integrate new data and knowledge, with the ultimate goal of improving prediction, early detection, and early intervention in cardiovascular disease management.
Our analysis pipeline and trained classifiers are freely available at https://github.com/LivingMatterLab/CardiovascularDiseaseClassification.
Award ID(s):
2320933
NSF-PAR ID:
10512687
Publisher / Repository:
Frontiers
Journal Name:
Frontiers in Physiology
Volume:
15
ISSN:
1664-042X
Page Range / eLocation ID:
1339866
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Background and Aims

    It is not clear how a polygenic risk score (PRS) can be best combined with guideline-recommended tools for cardiovascular disease (CVD) risk prediction, e.g. SCORE2.

    Methods

    A PRS for coronary artery disease (CAD) was calculated in participants of UK Biobank (n = 432 981). Within each tenth of the PRS distribution, the odds ratios (ORs)—referred to as PRS-factor—for CVD (i.e. CAD or stroke) were compared between the entire population and subgroups representing the spectrum of clinical risk. Replication was performed in the combined Framingham/Atherosclerosis Risk in Communities (ARIC) populations (n = 10 757). The clinical suitability of a multiplicative model ‘SCORE2 × PRS-factor’ was tested by risk reclassification.

    Results

    In subgroups with highly different clinical risks, CVD ORs were stable within each PRS tenth. SCORE2 and PRS showed no significant interactive effects on CVD risk, which qualified them as multiplicative factors: SCORE2 × PRS-factor = total risk. In UK Biobank, the multiplicative model moved 9.55% of the intermediate-risk group (n = 145 337) to the high-risk group, increasing the number of individuals in that category by 56.6%. Incident CVD occurred in 8.08% of individuals reclassified by the PRS-factor from intermediate to high risk, about twice the rate among those who remained at intermediate risk (4.08%). Likewise, the PRS-factor shifted 8.29% of individuals from moderate to high risk in Framingham/ARIC.

    Conclusions

    This study demonstrates that absolute CVD risk, determined by a clinical risk score, and relative genetic risk, determined by a PRS, provide independent information. The two components may form a simple multiplicative model improving precision of guideline-recommended tools in predicting incident CVD.
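
The multiplicative model described above (SCORE2 × PRS-factor = total risk) can be illustrated with a minimal sketch. The function names, the cap at 1.0, and the category thresholds of 5% and 10% are assumptions made for illustration only; they are not taken from the study, and guideline thresholds for SCORE2 depend on age.

```python
def combined_risk(score2_risk: float, prs_factor: float) -> float:
    """Multiply an absolute clinical risk by a relative genetic factor.

    score2_risk: absolute risk as a fraction between 0 and 1.
    prs_factor:  relative risk factor for the person's PRS tenth.
    The cap at 1.0 is an assumption so the result stays a valid probability.
    """
    if not 0.0 <= score2_risk <= 1.0:
        raise ValueError("score2_risk must be a fraction between 0 and 1")
    if prs_factor <= 0.0:
        raise ValueError("prs_factor must be positive")
    return min(1.0, score2_risk * prs_factor)


def risk_category(risk: float,
                  intermediate_threshold: float = 0.05,
                  high_threshold: float = 0.10) -> str:
    """Assign a risk category using hypothetical, illustrative thresholds."""
    if risk >= high_threshold:
        return "high"
    if risk >= intermediate_threshold:
        return "intermediate"
    return "low"


if __name__ == "__main__":
    clinical = 0.08   # hypothetical absolute risk of 8%
    factor = 1.5      # hypothetical PRS-factor for an upper PRS tenth
    total = combined_risk(clinical, factor)
    print(risk_category(clinical), "->", risk_category(total))
```

With these hypothetical inputs, a person at intermediate clinical risk is reclassified as high risk once the PRS-factor is applied, which is the kind of reclassification the abstract reports.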

     
  2. Scope

    A better understanding of factors contributing to interindividual variability in biomarkers of vitamin K can enhance the understanding of the equivocal role of vitamin K in cardiovascular disease. Based on the known biology of phylloquinone, the major form of vitamin K, it is hypothesized that plasma lipids contribute to the variable response of biomarkers of vitamin K metabolism to phylloquinone supplementation.

    Methods and results

    The association of plasma lipids and 27 lipid‐related genetic variants with the response of biomarkers of vitamin K metabolism is examined in a secondary analysis of data from a 3‐year phylloquinone supplementation trial in men (n = 66) and women (n = 85). Year 3 plasma triglycerides (TG), but not total cholesterol, LDL‐cholesterol, or HDL‐cholesterol, are associated with the plasma phylloquinone response (men: β = 1.01, p < 0.001, R² = 0.34; women: β = 0.61, p = 0.008, R² = 0.11; sex interaction p = 0.077). Four variants and the TG‐weighted genetic risk score are associated with the plasma phylloquinone response in men only. Plasma lipids are not associated with changes in biomarkers of vitamin K function (undercarboxylated osteocalcin and matrix gla protein) in either sex.

    Conclusion

    Plasma TG are an important determinant of the interindividual response of plasma phylloquinone to phylloquinone supplementation, but changes in biomarkers of vitamin K carboxylation are not influenced by lipids.

     
  3. Abstract

    Tourette Syndrome (TS) is a complex neurodevelopmental disorder characterized by vocal and motor tics lasting more than a year. It is highly polygenic in nature, with both rare and common variants previously associated with it. Epidemiological studies have shown TS to be correlated with other phenotypes, but large-scale phenome-wide analyses in biobank-level data have not been performed to date. In this study, we used the summary statistics from the latest meta-analysis of TS to calculate the polygenic risk score (PRS) of individuals in the UK Biobank data and applied a Phenome-Wide Association Study (PheWAS) approach to determine the association of disease risk with a wide range of phenotypes. A total of 57 traits were found to be significantly associated with TS polygenic risk, including multiple psychosocial factors and mental health conditions such as anxiety disorder and depression. Additional associations were observed with complex non-psychiatric disorders such as Type 2 diabetes, heart palpitations, and respiratory conditions. Cross-disorder comparisons of phenotypic associations with genetic risk for other childhood-onset disorders (e.g., attention deficit hyperactivity disorder [ADHD], autism spectrum disorder [ASD], and obsessive-compulsive disorder [OCD]) indicated an overlap in associations between TS and these disorders. ADHD and ASD had a similar direction of effect to TS, while OCD had an opposite direction of effect for all traits except mental health factors. Sex-specific PheWAS analysis identified differences in the associations with TS genetic risk between males and females. Type 2 diabetes and heart palpitations were significantly associated with TS risk in males but not in females, whereas diseases of the respiratory system were associated with TS risk in females but not in males. This analysis provides further evidence of shared genetic and phenotypic architecture of different complex disorders.

     
  4. Importance

    Body mass index (BMI; calculated as weight in kilograms divided by height in meters squared) is a commonly used estimate of obesity, which is a complex trait affected by genetic and lifestyle factors. Marked weight gain and loss could be associated with adverse biological processes.

    Objective

    To evaluate the association between BMI variability and incident cardiovascular disease (CVD) events in 2 distinct cohorts.

    Design, Setting, and Participants

    This cohort study used data from the Million Veteran Program (MVP) between 2011 and 2018 and participants in the UK Biobank (UKB) enrolled between 2006 and 2010. Participants were followed up for a median of 3.8 (5th-95th percentile, 3.5) years. Participants with baseline CVD or cancer were excluded. Data were analyzed from September 2022 to September 2023.

    Exposure

    BMI variability was calculated by the retrospective SD and coefficient of variation (CV) using multiple clinical BMI measurements up to the baseline.
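
The two variability measures named above, the SD and the coefficient of variation (CV = SD / mean), can be computed as in the following minimal sketch. The function name and the sample values are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption; the abstract does not state which estimator the study used.

```python
import statistics
from typing import Sequence, Tuple


def bmi_variability(bmi_values: Sequence[float]) -> Tuple[float, float]:
    """Return (SD, CV) for a series of BMI measurements.

    Uses the sample standard deviation (n - 1 denominator); this choice
    is an assumption, not something stated in the abstract.
    """
    if len(bmi_values) < 2:
        raise ValueError("at least two BMI measurements are required")
    mean_bmi = statistics.fmean(bmi_values)
    sd = statistics.stdev(bmi_values)
    cv = sd / mean_bmi  # dimensionless; multiply by 100 for a percentage
    return sd, cv


if __name__ == "__main__":
    # Hypothetical BMI measurements (kg/m^2) recorded before baseline.
    history = [24.0, 26.0, 25.0, 27.0, 23.0]
    sd, cv = bmi_variability(history)
    print(f"SD = {sd:.3f}, CV = {cv:.4f}")
```

Because the CV divides by the mean, it allows variability to be compared between people with different average BMI, which is one reason it is often reported alongside the SD.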

    Main Outcomes and Measures

    The main outcome was incident composite CVD events (incident nonfatal myocardial infarction, acute ischemic stroke, and cardiovascular death), assessed using Cox proportional hazards modeling after adjustment for CVD risk factors, including age, sex, mean BMI, systolic blood pressure, total cholesterol, high-density lipoprotein cholesterol, smoking status, diabetes status, and statin use. Secondary analysis assessed whether associations were dependent on the polygenic score of BMI.

    Results

    Among 92 363 US veterans in the MVP cohort (81 675 [88%] male; mean [SD] age, 56.7 [14.1] years), there were 9695 Hispanic participants, 22 488 non-Hispanic Black participants, and 60 180 non-Hispanic White participants. A total of 4811 composite CVD events were observed from 2011 to 2018. The CV of BMI was associated with 16% higher risk for composite CVD across all groups (hazard ratio [HR], 1.16; 95% CI, 1.13-1.19). These associations were unchanged among subgroups and after adjustment for the polygenic score of BMI. The UKB cohort included 65 047 individuals (mean [SD] age, 57.30 [7.77] years; 38 065 [59%] female) and had 6934 composite CVD events. Each 1-SD increase in BMI variability in the UKB cohort was associated with 8% increased risk of cardiovascular death (HR, 1.08; 95% CI, 1.04-1.11).

    Conclusions and Relevance

    This cohort study found that among US veterans, higher BMI variability was a significant risk marker associated with adverse cardiovascular events independent of mean BMI across major racial and ethnic groups. Results were consistent in the UKB for the cardiovascular death end point. Further studies should investigate the phenotype of high BMI variability.

     
  5. Abstract Objective

    Modern healthcare data reflect massive multi-level and multi-scale information collected over many years. The majority of the existing phenotyping algorithms use case–control definitions of disease. This paper aims to study the time to disease onset and progression and identify the time-varying risk factors that drive them.

    Materials and Methods

    We developed an algorithmic approach to phenotyping the incidence of diseases by consolidating data sources from the UK Biobank (UKB), including primary care electronic health records (EHRs). We focused on defining events, event dates, and their censoring time, including relevant terms and existing phenotypes, excluding generic, rare, or semantically distant terms, forward-mapping terminology terms, and expert review. We applied our approach to phenotyping diabetes complications, including a composite cardiovascular disease (CVD) outcome, diabetic kidney disease (DKD), and diabetic retinopathy (DR), in the UKB study.

    Results

    We identified 49 049 participants with diabetes. Among them, 1023 had type 1 diabetes (T1D), and 40 193 had type 2 diabetes (T2D). A total of 23 833 diabetes subjects had linked primary care records. There were 3237, 3113, and 4922 patients with CVD, DKD, and DR events, respectively. The risk prediction performance for each outcome was assessed, and our results are consistent with the prediction area under the ROC (receiver operating characteristic) curve (AUC) of standard risk prediction models using cohort studies.

    Discussion and Conclusion

    Our publicly available pipeline and platform enable streamlined curation of incidence events, identification of time-varying risk factors underlying disease progression, and the definition of a relevant cohort for time-to-event analyses. These important steps need to be considered simultaneously to study disease progression.