Article

Exploratory Data Analysis and Machine Learning for Cardiometabolic Risk Prediction, Stratified by Age and Gender

Nurmeen Rafique, BUITEMS, Dept of Computer Science, Quetta, Pakistan
Sibghat Ullah Bazai, BUITEMS, Dept of Computer Engineering, Quetta, Pakistan
Muhammad Imran, BUITEMS, Dept of Information Technology, Quetta, Pakistan
Annaev Umidjon, Termez University of Economics and Service, Department of Natural Sciences, Termez, Uzbekistan
Yuldashev Bakhrom, Mamun University, Dean of the Faculty of Medicine, Khiva, Uzbekistan
Rakhimjon Rajapboyevich Rakhimov, Urgench State University, Department of Electrical Engineering and Energy, Urgench, Uzbekistan
Uzair Aslam Bhatti, School of Information and Communication Engineering, Hainan University, Hainan, China
Muhammad Aamir, College of Computer Science and Artificial Intelligence, Huanggang Normal University, Huanggang, China

2025

ABI

Abstract

Modern healthcare analytics depends on machine learning and data visualization to transform complex healthcare information into useful knowledge. This study performs Exploratory Data Analysis (EDA) and predictive modeling on a medical dataset to investigate patterns by patient age and gender and to explore the risk factors associated with cardiometabolic diseases. Using a Python-based environment, we analyze the relationship between the independent variables, gender and age, and the relationships among dependent variables such as cholesterol, diabetes, chest pain, and blood sugar. To address the limited availability of public datasets, we augmented our analysis with supplementary cardiovascular risk data from regional health surveys, increasing the effective sample size to $N = 512$. Our findings reveal significant differences in the frequency of cardiometabolic risk factors. Specifically, we observed a sharp increase in the rate of high cholesterol risk beginning in the 50 to 59 age group. For future studies, we recommend multilevel regression models that incorporate age × gender × BMI × blood pressure interactions, with Restricted Cubic Splines (RCS) for continuous variables to capture non-linear threshold effects.
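The stratified comparison described in the abstract can be illustrated with a minimal Python sketch. It assumes each record is an (age, gender, high-cholesterol flag) triple and groups records into ten-year age bands; the toy records, the function names `age_band` and `prevalence_by_age_gender`, and the band width are all hypothetical and are not taken from the study's data or code.

```python
from collections import defaultdict

# Hypothetical toy records: (age, gender, high_cholesterol flag). Not study data.
RECORDS = [
    (34, "F", 0), (45, "M", 0), (48, "F", 0), (52, "F", 1),
    (55, "M", 1), (58, "M", 1), (61, "F", 1), (67, "M", 0),
]

def age_band(age, width=10):
    """Return a label such as '50-59' for the band containing `age`."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"

def prevalence_by_age_gender(records):
    """Share of flagged records in each (age band, gender) cell."""
    counts = defaultdict(lambda: [0, 0])  # (band, gender) -> [flagged, total]
    for age, gender, flagged in records:
        cell = counts[(age_band(age), gender)]
        cell[0] += flagged
        cell[1] += 1
    return {key: flagged / total for key, (flagged, total) in counts.items()}

prevalence = prevalence_by_age_gender(RECORDS)
for (band, gender), rate in sorted(prevalence.items()):
    print(f"{band} {gender}: {rate:.2f}")
```

With real data, a jump in these per-band rates between adjacent age bands is the kind of pattern the abstract reports for the 50 to 59 group; cells with few records would need confidence intervals before any such reading.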


Topics

Artificial Intelligence in Healthcare; Machine Learning in Healthcare; Health, Environment, Cognitive Aging

Identifiers

DOI: 10.1109/iccvit67848.2025.11391452

Citations and references

Citations: 0
References used: 14
