Methylation risk scores are associated with a collection of phenotypes within electronic health record systems

Thompson, Mike; Hill, Brian L.; Rakocz, Nadav; Chiang, Jeffrey N.; Geschwind, Daniel; Sankararaman, Sriram; Hofer, Ira; Cannesson, Maxime; Zaitlen, Noah; Halperin, Eran

doi:10.1038/s41525-022-00320-1

Abstract Inference of clinical phenotypes is a fundamental task in precision medicine, and has therefore been heavily investigated in recent years in the context of electronic health records (EHR) using a large arsenal of machine learning techniques, as well as in the context of genetics using polygenic risk scores (PRS). In this work, we considered the epigenetic analog of PRS, methylation risk scores (MRS), a linear combination of methylation states. We measured methylation across a large cohort (n = 831) of diverse samples in the UCLA Health biobank, for which both genetic and complete EHR data are available. We constructed MRS for 607 phenotypes spanning diagnoses, clinical lab tests, and medication prescriptions. When added to a baseline set of predictive features, MRS significantly improved the imputation of 139 outcomes, whereas the PRS improved only 22 (median improvement for methylation 10.74%, 141.52%, and 15.46% in medications, labs, and diagnosis codes, respectively, whereas genotypes only improved the labs at a median increase of 18.42%). We added significant MRS to state-of-the-art EHR imputation methods that leverage the entire set of medical records, and found that including MRS as a medical feature in the algorithm significantly improves EHR imputation in 37% of lab tests examined (median R² increase 47.6%). Finally, we replicated several MRS in multiple external studies of methylation (minimum p-value of 2.72 × 10⁻⁷) and replicated 22 of 30 tested MRS internally in two separate cohorts of different ethnicity. Our publicly available results and weights show promise for methylation risk scores as clinical and scientific tools.
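To make "a linear combination of methylation states" concrete, here is a minimal sketch of how such a score could be computed once per-site weights are available. The function name `methylation_risk_score`, the optional intercept, and the toy numbers are illustrative assumptions; the abstract does not specify how the weights are fitted or in what form they are distributed.

```python
import numpy as np


def methylation_risk_score(beta, weights, intercept=0.0):
    """Return one score per sample as intercept + beta @ weights.

    beta      : (n_samples, n_sites) array of methylation levels in [0, 1]
    weights   : (n_sites,) array of per-site weights (hypothetical values)
    intercept : optional constant offset (an assumption, not from the abstract)
    """
    beta = np.asarray(beta, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if beta.ndim != 2 or weights.ndim != 1 or beta.shape[1] != weights.shape[0]:
        raise ValueError("beta must be (n_samples, n_sites) and weights (n_sites,)")
    return intercept + beta @ weights


# Toy data: 2 samples measured at 3 methylation sites (made-up values).
beta = np.array([[0.10, 0.80, 0.50],
                 [0.90, 0.20, 0.40]])
weights = np.array([1.0, -2.0, 0.5])

scores = methylation_risk_score(beta, weights)
print(scores)
```

The resulting score can then be treated as one additional feature alongside a baseline feature set, which is the role the abstract describes for MRS in imputation.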
