Modular Architectures for Interpretable Credit Scoring for Heterogeneous Borrower Data
Abstract
Modern credit scoring systems must handle increasingly complex borrower data, characterized by structural heterogeneity, while meeting regulatory demands for transparency. This study proposes a modular modeling framework that addresses both interpretability and data incompleteness in credit risk prediction. Using Weight of Evidence (WoE) binning and logistic regression, we construct domain-specific sub-models, each corresponding to a different attribute set, and integrate them through ensemble, hierarchical, and stacking-based architectures. On a real-world dataset from the American Express default prediction challenge, we demonstrate that these modular architectures maintain high predictive performance (test Gini > 0.90) while preserving model transparency. A comparative analysis across the architectural designs highlights trade-offs between generalization, computational complexity, and regulatory compliance. Our main contribution is a systematic comparison of logistic regression–based architectures that balance accuracy, robustness, and interpretability. These findings highlight the value of modular decomposition and stacking for building predictive yet interpretable credit risk models.
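For readers unfamiliar with the two quantities named above, the following is a minimal sketch of how a WoE value per bin and a Gini coefficient can be computed. It is illustrative only and not the implementation used in this study: the function names `woe_by_bin` and `gini`, the sign convention (log of the non-defaulter share over the defaulter share), the 0.5 smoothing constant, and the toy data are all assumptions.

```python
import math

def woe_by_bin(bins, defaults):
    """WoE per bin, assuming WoE = ln(share of non-defaulters / share of defaulters)."""
    total_bad = sum(defaults)
    total_good = len(defaults) - total_bad
    counts = {}
    for b, d in zip(bins, defaults):
        good, bad = counts.get(b, (0, 0))
        counts[b] = (good + (1 - d), bad + d)
    # The 0.5 smoothing term (an assumption) avoids log(0) for bins
    # that contain only defaulters or only non-defaulters.
    return {
        b: math.log(((g + 0.5) / (total_good + 0.5)) / ((bd + 0.5) / (total_bad + 0.5)))
        for b, (g, bd) in counts.items()
    }

def gini(scores, defaults):
    """Gini = 2 * AUC - 1, with AUC from pairwise comparison (ties count as 0.5)."""
    pos = [s for s, d in zip(scores, defaults) if d == 1]
    neg = [s for s, d in zip(scores, defaults) if d == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return 2 * wins / (len(pos) * len(neg)) - 1

# Toy data (hypothetical): one binned attribute and a default flag (1 = default).
bins = ["low", "low", "high", "high", "high", "low"]
defaults = [0, 0, 1, 1, 0, 1]

woe = woe_by_bin(bins, defaults)
# Under the convention above, a lower WoE means higher risk, so negate it for a risk score.
scores = [-woe[b] for b in bins]
toy_gini = gini(scores, defaults)
```

In a WoE-based logistic regression, each raw attribute value would be replaced by the WoE of its bin before fitting, which keeps every coefficient directly readable as the weight of one attribute.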