Explainable machine learning for somatotype prediction in adolescents living under anthropogenic environmental stressors
Abstract
Long-term exposure to anthropogenic environmental stressors is associated with endocrine disturbance and developmental impairment in adolescents living in the former Aral Sea region. Nonetheless, converting diverse anthropometric and hormonal data into explainable predictors of somatotype deviation remains difficult. This study applies machine learning (ML) models together with explainable AI (XAI) to identify the factors that most influence somatotype components in adolescents from environmentally polluted areas. A dataset of 405 prepubertal boys (11–13 years) from an environmentally affected area (North, n=198) and a control area (Nukus, n=207) was analyzed. Thirty predictors, including growth indicators, anthropometric and skinfold measures, hormone markers, and a region indicator, were used to predict somatotype component scores. Five machine learning models were assessed using four evaluation metrics (MAE, MAPE, MSE, and R²). Linear regression performed best, achieving R² = 0.9836 on the test data, and was further examined with the SHAP and LIME explainable AI methods. The SHAP and LIME analyses showed that ectomorphy (Ecto) was mainly affected by height, weight, and region; mesomorphy (Meso) by skeletal breadths and circumferences, specifically elbow diameter (ED); and endomorphy (Endo) by skinfold thickness measurements (SS, SiS, TS). The interpretable linear regression model predicted somatotype scores with high accuracy, and SHAP and LIME provided clinically reliable explanations. This ML and XAI methodology supports long-term studies that use direct toxin indicators and enables transparent analysis of growth and body structure in prepubertal boys living under anthropogenic environmental stressors.
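As an illustration of the quantities named in the abstract, the following minimal Python sketch computes the four evaluation metrics and the SHAP attributions of a linear model. It is not the study's pipeline: the function names `regression_metrics` and `linear_shap_values` and all numeric values are hypothetical, and the attribution formula assumes independent features, in which case the SHAP value of feature j reduces to its coefficient times the feature's deviation from the background mean.

```python
from statistics import fmean


def regression_metrics(y_true, y_pred):
    """Return MAE, MAPE (as a fraction), MSE and R^2 for paired sequences.

    MAPE divides by the true values, so it assumes none of them is zero.
    """
    errors = [t - p for t, p in zip(y_true, y_pred)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mape = sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mse = sum(e * e for e in errors) / n
    mean_true = fmean(y_true)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - (mse * n) / ss_tot                       # 1 - SS_res / SS_tot
    return {"MAE": mae, "MAPE": mape, "MSE": mse, "R2": r2}


def linear_shap_values(coefs, x, background):
    """SHAP values of a linear model under the feature-independence assumption.

    phi_j = w_j * (x_j - mean_j), where mean_j is the mean of feature j
    over the background rows. The values sum to f(x) - f(mean).
    """
    means = [fmean(column) for column in zip(*background)]
    return [w * (xj - m) for w, xj, m in zip(coefs, x, means)]


# Toy values for illustration only; they are not taken from the study.
toy_true = [2.0, 4.0, 6.0, 8.0]
toy_pred = [2.5, 3.5, 6.5, 7.5]
toy_metrics = regression_metrics(toy_true, toy_pred)

toy_coefs = [0.5, -0.2]                      # hypothetical coefficients
toy_background = [[1.0, 10.0], [3.0, 20.0]]  # hypothetical background rows
toy_phi = linear_shap_values(toy_coefs, [3.0, 10.0], toy_background)
```

Because the attributions of a linear model are additive, the entries of `toy_phi` sum to the difference between the prediction for the sample and the prediction at the background mean, which is the kind of per-feature breakdown that SHAP reports for each somatotype component.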