Hybrid Feature Selection and Optimiser-Assisted Ensemble Learning for Credit Card Fraud Detection
Abstract
A modular detection pipeline is presented that integrates boundary-aware resampling (SMOTE-ENN), multi-criteria feature prioritisation (TOPSIS), nonlinear compression via an undercomplete autoencoder, and a PSO-stabilised Extreme Learning Machine (ELM) within a stacking ensemble (base learners: SVM, KNN, and PSO-ELM; meta-learner: gradient boosting). The architecture is designed to balance four competing goals: minority-class recoverability, a low false-positive rate, computational efficiency during inference, and feature interpretability. TOPSIS combines complementary statistical and information-theoretic criteria to produce a compact, interpretable feature subset, which the autoencoder then maps onto a latent manifold to reduce residual noise and encode nonlinear dependencies. PSO tunes the ELM hidden-layer initialisation and hyperparameters, reducing variability in the ELM's contribution to the ensemble. Empirical testing on the standard credit-card dataset (284,807 transactions) shows high discrimination (Accuracy ≈ 99.95%, Recall ≈ 99.97%, AUC ≈ 1.00) at a low false-positive rate (FPR ≈ 0.02%); cross-domain validation on PaySim supports the robustness of the approach (Accuracy ≈ 98.84%, AUC ≈ 0.99). The study demonstrates that combining boundary-aware resampling, multi-criterion ranking, compact nonlinear representation, and optimiser-assisted ELM initialisation significantly improves minority-class detection while limiting false alarms, offering a practical path toward deployable, high-fidelity fraud detection systems.
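To make the multi-criteria ranking step concrete, a minimal sketch of the generic TOPSIS procedure is given below. It is an illustration under stated assumptions, not the implementation used in this study: the function name `topsis_rank`, the three example criteria (mutual information, absolute correlation with the label, redundancy), the weights, and the score values are all hypothetical, since the abstract does not specify them.

```python
import math

def topsis_rank(scores, weights, benefit):
    """Return TOPSIS closeness coefficients, one per feature.

    scores:  one row per feature, one value per criterion
    weights: criterion weights (assumed here to sum to 1)
    benefit: True where larger is better, False where smaller is better
    Higher closeness means nearer the ideal solution.
    """
    n_crit = len(weights)
    # Vector-normalise each criterion column; guard against an all-zero column.
    norms = [math.sqrt(sum(row[j] ** 2 for row in scores)) or 1.0
             for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
                for row in scores]
    # Ideal and anti-ideal points, respecting each criterion's direction.
    cols = list(zip(*weighted))
    best = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    closeness = []
    for row in weighted:
        d_best = math.dist(row, best)    # Euclidean distance to the ideal
        d_worst = math.dist(row, worst)  # Euclidean distance to the anti-ideal
        total = d_best + d_worst
        closeness.append(d_worst / total if total else 0.0)
    return closeness

# Hypothetical scores for three features on three assumed criteria:
# [mutual information, |correlation with label|, redundancy (lower is better)]
scores = [[0.9, 0.8, 0.1],
          [0.2, 0.1, 0.7],
          [0.5, 0.5, 0.5]]
closeness = topsis_rank(scores, weights=[0.4, 0.4, 0.2],
                        benefit=[True, True, False])
# Feature indices ordered from most to least preferred.
ranking = sorted(range(len(closeness)), key=closeness.__getitem__, reverse=True)
```

In this toy case the first feature is best on every criterion and therefore coincides with the ideal point, so it ranks first; a compact subset would then be formed by keeping the top-ranked features.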