Abstract: Newer technologies such as data mining, machine learning, artificial intelligence, and data analytics have revolutionized the medical sector by using existing big data to predict patterns emerging from the datasets available in healthcare repositories. Predictions based on these datasets offer several benefits, such as helping clinicians make accurate and informed decisions when managing patients’ health, leading to better management of patients’ wellbeing and healthcare coordination. Millions of people have been affected by coronary artery disease (CAD). Several machine learning approaches, including ensemble learning and deep neural network-based algorithms, have shown promising outcomes in improving prediction accuracy for early diagnosis of CAD. This paper analyses the deep neural network variant Deep Residual Network (DRN), the Rider Optimization Algorithm-Neural Network (RideNN), and the Deep Neural Network-Fuzzy Neural Network (DNFN), with an ensemble learning method applied to improve the prediction accuracy of CAD. The experimental outcomes showed that the proposed ensemble classifier achieved the highest accuracy compared with the other machine learning models. Keywords: Heart disease prediction, Deep Residual Network (DRN), Ensemble classifiers, coronary artery disease.
Multi-Objective artificial bee colony optimized hybrid deep belief network and XGBoost algorithm for heart disease prediction
The global rise in heart disease necessitates precise prediction tools to assess individual risk levels. This paper introduces a novel Multi-Objective Artificial Bee Colony Optimized Hybrid Deep Belief Network and XGBoost (HDBN-XG) algorithm, enhancing coronary heart disease prediction accuracy. Key physiological data, including Electrocardiogram (ECG) readings and blood volume measurements, are analyzed. The HDBN-XG algorithm assesses data quality, normalizes using z-score values, extracts features via the Computational Rough Set method, and constructs feature subsets using the Multi-Objective Artificial Bee Colony approach. Our findings indicate that the HDBN-XG algorithm achieves an accuracy of 99%, precision of 95%, specificity of 98%, sensitivity of 97%, and F1-measure of 96%, outperforming existing classifiers. This paper contributes to predictive analytics by offering a data-driven approach to healthcare, providing insights to mitigate the global impact of coronary heart disease.
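The z-score normalization step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard definition (subtract the mean, divide by the population standard deviation); the function name `z_score` and the sample readings are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def z_score(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no spread; map every entry to 0.0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Made-up heart-rate readings, used only to exercise the function.
heart_rates = [60.0, 72.0, 75.0, 88.0, 105.0]
normalized = z_score(heart_rates)
print([round(z, 3) for z in normalized])
```

After normalization, features measured on different scales (for example ECG amplitudes and blood volume) become comparable, which is presumably why the step precedes feature extraction.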
- PAR ID: 10478052
- Publisher / Repository: Frontiers in Digital Health
- Date Published:
- Journal Name: Frontiers in Digital Health
- Volume: 5
- ISSN: 2673-253X
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Cardiovascular disease (CVD) is the leading cause of death worldwide. Coronary artery disease (CAD), a prevalent form of CVD, is typically assessed using catheter coronary angiography (CCA), an invasive, costly procedure with associated risks. While cardiac computed tomography angiography (CTA) presents a less invasive alternative, it suffers from limited temporal resolution, often resulting in motion artifacts that degrade diagnostic quality. Traditional ECG-based gating methods for CTA inadequately capture cardiac mechanical motion. To address this, we propose a novel multimodal approach that enhances CTA imaging by predicting cardiac quiescent periods using seismocardiogram (SCG) and ECG data, integrated through a weighted fusion (WF) approach and artificial neural networks (ANNs). We developed a regression-based ANN framework (r-ANN WF) designed to improve prediction accuracy and reduce computational complexity, which was compared with a classification-based framework (c-ANN WF), ECG gating, and ultrasound (US) data. Our results demonstrate that the r-ANN WF approach improved overall diastolic and systolic cardiac quiescence prediction accuracy by 52.6% compared to ECG-based predictions, using US as the ground truth, with an average prediction time of 4.83 ms. Comparative evaluations based on reconstructed CTA images show that both r-ANN WF and c-ANN WF offer diagnostic quality comparable to US-based gating, underscoring their clinical potential. Additionally, the lower computational complexity of r-ANN WF makes it suitable for real-time applications. This approach could enhance CTA’s diagnostic quality, offering a more accurate and efficient method for CVD diagnosis and management.
-
Background Heart failure is a leading cause of mortality and morbidity worldwide. Acute heart failure, broadly defined as rapid onset of new or worsening signs and symptoms of heart failure, often requires hospitalization and admission to the intensive care unit (ICU). This acute condition is highly heterogeneous and less well-understood as compared to chronic heart failure. The ICU, through detailed and continuously monitored patient data, provides an opportunity to retrospectively analyze decompensation and heart failure to evaluate physiological states and patient outcomes. Objective The goal of this study is to examine the prevalence of cardiovascular risk factors among those admitted to ICUs and to evaluate combinations of clinical features that are predictive of decompensation events, such as the onset of acute heart failure, using machine learning techniques. To accomplish this objective, we leveraged tele-ICU data from over 200 hospitals across the United States. Methods We evaluated the feasibility of predicting decompensation soon after ICU admission for 26,534 patients admitted without a history of heart failure with specific heart failure risk factors (ie, coronary artery disease, hypertension, and myocardial infarction) and 96,350 patients admitted without risk factors using remotely monitored laboratory, vital signs, and discrete physiological measurements. Multivariate logistic regression and random forest models were applied to predict decompensation and highlight important features from combinations of model inputs from dissimilar data. Results The most prevalent risk factor in our data set was hypertension, although most patients diagnosed with heart failure were admitted to the ICU without a risk factor. The highest heart failure prediction accuracy was 0.951, and the highest area under the receiver operating characteristic curve was 0.9503 with random forest and combined vital signs, laboratory values, and discrete physiological measurements. 
Random forest feature importance also highlighted combinations of several discrete physiological features and laboratory measures as most indicative of decompensation. Timeline analysis of aggregate vital signs revealed a point of diminishing returns where additional vital signs data did not continue to improve results. Conclusions Heart failure risk factors are common in tele-ICU data, although most patients that are diagnosed with heart failure later in an ICU stay presented without risk factors making a prediction of decompensation critical. Decompensation was predicted with reasonable accuracy using tele-ICU data, and optimal data extraction for time series vital signs data was identified near a 200-minute window size. Overall, results suggest combinations of laboratory measurements and vital signs are viable for early and continuous prediction of patient decompensation.
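For readers unfamiliar with the area under the receiver operating characteristic curve reported above, the following sketch computes it from its pairwise-ranking interpretation: the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as half. The function name `roc_auc` and the labels and scores are made up for illustration and do not come from the study.

```python
def roc_auc(y_true, y_score):
    """AUROC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    # Count correctly ordered pairs; a tie contributes half a win.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = decompensation) and model scores.
labels = [0, 0, 1, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.70, 0.90]
print(round(roc_auc(labels, scores), 4))
```

This quadratic-time version is meant for clarity; on data sets of the size described above, a rank-based implementation from an established library would be the practical choice.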
-
The use of machine learning algorithms in healthcare can amplify social injustices and health inequities. While the exacerbation of biases can occur and be compounded during problem selection, data collection, and outcome definition, this research pertains to the generalizability impediments that occur during the development and post-deployment of machine learning classification algorithms. Using the Framingham coronary heart disease data as a case study, we show how to effectively select a probability cutoff to convert a regression model for a dichotomous variable into a classifier. We then compare the sampling distribution of the predictive performance of eight machine learning classification algorithms under four stratified training/testing scenarios to test their generalizability and their potential to perpetuate biases. We show that both extreme gradient boosting and support vector machine are flawed when trained on an unbalanced dataset. We then show that the double discriminant scoring of type 1 and 2 is the most generalizable with respect to the true positive and negative rates, respectively, as it consistently outperforms the other classification algorithms, regardless of the training/testing scenario. Finally, we introduce a methodology to extract an optimal variable hierarchy for a classification algorithm and illustrate it on the overall, male and female Framingham coronary heart disease data.
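The abstract does not state which criterion the authors use to select the probability cutoff. As one common possibility, the sketch below picks the cutoff that maximizes Youden's J statistic (true positive rate plus true negative rate minus one). This criterion is an assumption made for illustration, not the paper's method; the helper names `rates` and `best_cutoff` and the data are hypothetical.

```python
def rates(y_true, y_prob, cutoff):
    """Return (true positive rate, true negative rate) when p >= cutoff means positive."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= cutoff)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < cutoff)
    tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < cutoff)
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= cutoff)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    tnr = tn / (tn + fp) if tn + fp else 0.0
    return tpr, tnr


def best_cutoff(y_true, y_prob):
    """Pick the observed probability that maximizes Youden's J = TPR + TNR - 1."""
    candidates = sorted(set(y_prob))
    return max(candidates, key=lambda c: sum(rates(y_true, y_prob, c)) - 1.0)


# Made-up outcomes (1 = disease) and predicted probabilities from some regression model.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
y_prob = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.70, 0.90]

cutoff = best_cutoff(y_true, y_prob)
print(cutoff, rates(y_true, y_prob, cutoff))
```

On an unbalanced data set the default cutoff of 0.5 tends to favor the majority class, which is why a cutoff tuned on both the true positive and true negative rates is often preferred.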
-
Aims Coronary artery calcium (CAC) scoring is an established tool for cardiovascular risk stratification. However, the lack of widespread availability and concerns about radiation exposure have limited the universal clinical utilization of CAC. In this study, we sought to explore whether machine learning (ML) approaches can aid cardiovascular risk stratification by predicting guideline recommended CAC score categories from clinical features and surface electrocardiograms. Methods and results In this substudy of a prospective, multicentre trial, a total of 534 subjects referred for CAC scores and electrocardiographic data were split into 80% training and 20% testing sets. Two binary outcome ML logistic regression models were developed for prediction of CAC scores equal to 0 and ≥400. The CAC = 0 model yielded an area under the curve, sensitivity, specificity, and accuracy of 84%, 92%, 70%, and 75%, respectively; the CAC ≥400 model yielded 87%, 91%, 75%, and 81%, respectively. We further tested the CAC ≥400 model to risk stratify a cohort of 87 subjects referred for invasive coronary angiography. Using an intermediate or higher pretest probability (≥15%) to predict CAC ≥400, the model predicted the presence of significant coronary artery stenosis (P = 0.025), the need for revascularization (P < 0.001), notably bypass surgery (P = 0.021), and major adverse cardiovascular events (P = 0.023) during a median follow-up period of 2 years. Conclusion ML techniques can extract information from electrocardiographic data and clinical variables to predict CAC score categories and similarly risk-stratify patients with suspected coronary artery disease.