Article

Explainable Machine Learning Models for Robust Clinical Biomarker Identification

I. Arivazhagi, Mohamed Sathak College of Arts and Science Sholinganallur, Department of Computer Applications, Chennai, Tamil Nadu, India, 600119

Shobha K., Ramaiah Institute of Technology MSR Nagar, Department of CSE (Cyber Security), Bengaluru, Karnataka, India

Deepak Gupta, ITM Gwalior, Department of Computer Science and Engineering, Gwalior, Madhya Pradesh, India

Choriyev Muzaffar, Termez University of Economics and Service, Department of Medical Fundamental Sciences, Termez, Uzbekistan

Muyassar Allaberganova, Urgench State University, Department of Data Transmission Networks and Systems, Urgench, Uzbekistan

Anorgul Ashirova

2025

ABI

Abstract

Real-time identification of robust clinical biomarkers is fundamental to precision medicine, yet traditional machine learning approaches often function as "black boxes," limiting their clinical adoption. This paper presents a comprehensive framework integrating explainable artificial intelligence (XAI) methods—specifically SHAP, LIME, attention mechanisms, and integrated gradients—with machine learning models for transparent biomarker discovery. We evaluate our approach across three major clinical datasets: The Cancer Genome Atlas (TCGA) for oncological biomarkers, UK Biobank for cardiovascular and metabolic markers, and MIMIC-III for critical care prognostic indicators. Our ensemble framework combining Random Forest, XGBoost, and attention-based neural networks achieves mean AUC-ROC scores of 0.94 for cancer classification, 0.89 for cardiovascular risk prediction, and 0.91 for ICU mortality prediction, while maintaining interpretability fidelity scores exceeding 0.85. Ablation studies demonstrate that explainable models incur only a 3-5% performance penalty compared to black-box alternatives while providing clinically actionable feature attributions validated by domain experts. The proposed framework addresses FDA and EU MDR regulatory requirements for algorithmic transparency, offering a pathway toward clinically deployable AI-driven biomarker identification systems.

Topics

Explainable Artificial Intelligence (XAI); Imbalanced Data Classification Techniques; Machine Learning in Healthcare

Identifiers

DOI: 10.1109/gcwcn66157.2025.11448530

Citations and references

Cited by 0; 21 references
