A Novel Paradigm in Cardiovascular Disease Risk Prediction Through Hybrid Machine Learning

Parvez Rahi, Sandeep Singh Kang

Abstract



Heart disease is the leading cause of death worldwide, claiming more than 17.9 million lives every year. Early and accurate risk prediction is essential for improving clinical outcomes and reducing the burden on healthcare systems. This paper proposes a hybrid machine learning framework that predicts heart disease with high accuracy from key medical and lifestyle factors. Clinically significant parameters such as BMI, history of diabetes, hypertension, prior stroke, chronic kidney disease, physical inactivity, and mental health disorders are established risk factors for cardiovascular pathology. The hybrid model combines XGBoost with a support vector machine (SVM) and a deep neural network (DNN) to draw on the strengths of each algorithm. Feature engineering with polynomial transformations and interaction terms captures complex nonlinear relationships between risk variables such as diabetes and obesity. SMOTE was applied to correct class imbalance, so that the model was trained on a balanced dataset, which improved prediction accuracy. The proposed method outperformed traditional prediction models, reaching an accuracy of 94%. Heart disease risk is classified into five categories: no risk, low risk, moderate risk, high risk, and severe heart disease. The four key predictors identified by the algorithm (BMI, hypertension, diabetes, and physical health) agree well with current medical knowledge. The algorithm gives physicians a tool for stratifying patients individually and, in particular, for early identification of those at high risk. Integrated into clinical practice, the model can help physicians recommend targeted treatments, which should ultimately improve patient outcomes and reduce the incidence of cardiovascular events over time.
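The abstract names two preprocessing ideas, polynomial and interaction terms and SMOTE oversampling, without giving implementation details. The following is a minimal pure-Python sketch of both, under stated assumptions: the function names, the toy rows, the degree of 2, and the neighbour count k = 2 are illustrative choices and are not taken from the paper, which would more plausibly rely on library implementations.

```python
import itertools
import random


def degree2_features(row):
    """Return the original features followed by all squares and pairwise products."""
    expanded = list(row)
    for i, j in itertools.combinations_with_replacement(range(len(row)), 2):
        expanded.append(row[i] * row[j])  # i == j gives a square, i != j an interaction
    return expanded


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic rows by interpolating between a minority row and
    one of its k nearest minority neighbours (squared Euclidean distance).
    Assumes at least two minority rows."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by distance to the chosen base row.
        others = [row for row in minority if row is not base]
        others.sort(key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical toy rows: [BMI, diabetes flag, hypertension flag]
high_risk = [[31.0, 1.0, 1.0], [34.5, 1.0, 0.0], [29.8, 0.0, 1.0]]
new_rows = smote_like(high_risk, n_new=4)
print(degree2_features(high_risk[0]))
print(len(new_rows))
```

In this sketch the product of the BMI and diabetes columns is the kind of interaction term the abstract describes for obesity and diabetes. Each synthetic row lies on a line segment between two real minority rows, so binary flags can take fractional values; categorical columns would need separate handling in practice.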


Keywords


Cardiovascular disease, BMI, diabetes, hypertension, XGBoost, deep neural networks, risk stratification, heart disease prediction, clinical decision-making.






(c) 2025 Parvez Rahi, Sandeep Singh Kang