Article

TransUNetB: An advanced Transformer–UNet framework for efficient and explainable brain tumor segmentation

Katura Gania Khushubu, Department of Computer Science and Engineering, East West University, Dhaka 1212, Bangladesh
Abdullah Al Masum, Department of Information Technology, Westcliff University, Irvine, California, CA 92614, USA
Md Habibur Rahman, Department of Management, International American University, 3440 Wilshire Blvd Ste 1000, Los Angeles, California, CA 90010, USA
Shakh Md Shakib Hasan, School of Engineering and Technology, Westcliff University, Irvine, California, CA 92614, USA
M. M. H. Bhuiyan, Department of Business Analytics, International American University, 3440 Wilshire Blvd Ste 1000, Los Angeles, California, CA 90010, USA
Mohammad Rasel Mahmud, Department of Management Information System, International American University, 3440 Wilshire Blvd Ste 1000, Los Angeles, California, CA 90010, USA
S M Masfequier Rahman Swapno, Department of CSE, Bangladesh University of Business and Technology (BUBT), Mirpur, Dhaka 1216, Bangladesh
Abhishek Appaji, Department of Medical Electronics Engineering, B.M.S. College of Engineering, Bull Temple Rd, Basavanagudi, Bangalore, Karnataka 560019, India

2025, en

ABI

Abstract

This study presents TransUNetB, a hybrid architecture that combines a Transformer with U-Net for multi-class brain tumor segmentation. The model integrates global context modeling with precise spatial localization: a lightweight Transformer encoder at the bottleneck captures long-range dependencies, while the U-Net's skip pathways preserve fine anatomical details. In addition, a multi-scale decoder fusion module consolidates features at several resolutions, sharpening tumor boundaries in heterogeneous, low-contrast conditions. Our contributions are threefold: (1) a simple, efficient design that integrates bottleneck self-attention with multi-scale fusion for robust segmentation of the ED, TC, and ET tumor subregions; (2) a comprehensive ablation of design choices (attention type, positional encoding, fusion strategy, loss formulation, and patch size) that quantifies their impact on accuracy and efficiency; and (3) an explainability analysis using Grad-CAM with quantitative focus and entropy measures to verify that salient regions align with clinical tumor substructures. Evaluated on the BraTS 2020 and BraTS 2021 datasets, TransUNetB achieves a Dice score of 98.90% and an Intersection over Union (IoU) score of 96.10%. It outperforms strong CNN and vision-transformer baselines while maintaining a competitive runtime of approximately 63 ms per image. These results suggest that combining global attention with spatially faithful decoding provides a favorable trade-off between accuracy and efficiency for clinical deployment. We also discuss the generalization of our model beyond MRI cohorts, practical constraints in resource-limited settings, and future research avenues, including attention-guided fusion and broader multi-center validation.
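For readers unfamiliar with the reported metrics, the following is a minimal sketch of how Dice and IoU are conventionally computed for one binary mask (for example, a single tumor subregion), together with one plausible entropy measure for a saliency map. The abstract does not give the authors' exact formulas, so the function names `dice_and_iou` and `normalized_entropy`, the empty-mask convention, and the entropy normalization are all assumptions made for illustration, not the paper's implementation.

```python
import math


def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two equal-length binary masks.

    Masks are flat sequences of 0/1 values (hypothetical input format).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    union = pred_area + target_area - intersection
    if union == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0, 1.0
    dice = 2.0 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


def normalized_entropy(heatmap):
    """Shannon entropy of a non-negative saliency map, scaled to [0, 1].

    Values near 0 mean the saliency is concentrated on few pixels;
    values near 1 mean it is spread almost uniformly. This is an
    assumed stand-in for the paper's entropy measure.
    """
    total = sum(heatmap)
    if total <= 0 or len(heatmap) < 2:
        return 0.0
    probs = [v / total for v in heatmap if v > 0]
    entropy = sum(p * math.log(1.0 / p) for p in probs)
    return entropy / math.log(len(heatmap))


if __name__ == "__main__":
    pred = [1, 1, 1, 0, 0, 0]
    target = [1, 1, 0, 1, 0, 0]
    print(dice_and_iou(pred, target))
    print(normalized_entropy([0.0, 0.0, 5.0, 0.0]))
```

For a single pair of masks the two overlap scores are tied by IoU = Dice / (2 - Dice), so Dice is never lower than IoU. Scores averaged over many images or classes, as in a benchmark table, need not satisfy that identity exactly.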

Identifiers

DOI: 10.1016/j.imu.2025.101706

Citations and sources

Citations: 2
Sources used: 0
