Article

A Hybrid Swin Transformer and MLP-Mixer Approach for Automated Ear Disease Diagnosis from Otoscopic Images

Furkancan Demircan, Software Engineering, Faculty of Engineering and Natural Sciences, Samsun University, Türkiye
Zafer Cömert, Software Engineering, Faculty of Engineering and Natural Sciences, Samsun University, Türkiye
Alper Talha Karadeniz, Software Engineering, Faculty of Engineering and Natural Sciences, Samsun University, Türkiye
Murat Ekinci, Computer Engineering, Faculty of Engineering, Karadeniz Technical University, Trabzon, Türkiye
2025, en
ABI

Abstract

Accurate classification of ear diseases is crucial for early diagnosis and effective treatment. Traditional diagnostic methods rely on subjective visual inspection. Recent advances in deep learning have enabled the development of automated diagnostic models. In this study, we propose a hybrid deep learning model that combines the Swin Transformer’s hierarchical feature extraction with the MLP-Mixer’s token and channel mixing. The model was trained and evaluated on the Ear Imagery dataset. Experimental results indicate that the hybrid architecture outperforms traditional CNNs and standalone Vision Transformer models. The proposed model achieved an accuracy of 99.62%, an improvement of 3.79 percentage points over the standalone Swin Transformer model, which attained 95.83%.
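
The token and channel mixing mentioned in the abstract can be illustrated with a small, dependency-free sketch of one MLP-Mixer block. This is not the authors' implementation: the abstract does not say how the Swin Transformer features are passed to the Mixer, so the sketch simply assumes the backbone yields a grid of T tokens with C channels each. The function names, the sizes in `demo`, and the omission of learned normalization parameters and biases are all illustrative assumptions; a practical model would use a tensor library rather than nested lists.

```python
import math
import random


def gelu(x):
    # Exact GELU activation, as commonly used inside Mixer MLPs.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def matmul(a, b):
    # Plain matrix product of a (n x k) and b (k x m), both lists of rows.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def add(a, b):
    # Element-wise sum, used for the residual connections.
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def layer_norm(tokens, eps=1e-6):
    # Normalize each token over its channels (no learned scale or shift here).
    out = []
    for t in tokens:
        mean = sum(t) / len(t)
        var = sum((v - mean) ** 2 for v in t) / len(t)
        out.append([(v - mean) / math.sqrt(var + eps) for v in t])
    return out


def mlp(x, w1, w2):
    # Two-layer perceptron without biases: x @ w1 -> GELU -> @ w2.
    hidden = [[gelu(v) for v in row] for row in matmul(x, w1)]
    return matmul(hidden, w2)


def mixer_block(tokens, tok_w1, tok_w2, ch_w1, ch_w2):
    # tokens: T x C features (assumed here to come from a Swin backbone).
    # Token mixing: transpose so the MLP acts across the T token positions.
    y = transpose(layer_norm(tokens))        # C x T
    y = transpose(mlp(y, tok_w1, tok_w2))    # back to T x C
    tokens = add(tokens, y)                  # residual connection
    # Channel mixing: the MLP acts across the C channels of each token.
    z = mlp(layer_norm(tokens), ch_w1, ch_w2)
    return add(tokens, z)                    # residual connection


def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def demo():
    # Hypothetical sizes: 4 tokens, 6 channels, hidden width 8.
    rng = random.Random(0)
    T, C, hidden = 4, 6, 8
    tokens = random_matrix(T, C, rng, scale=1.0)
    out = mixer_block(
        tokens,
        random_matrix(T, hidden, rng), random_matrix(hidden, T, rng),
        random_matrix(C, hidden, rng), random_matrix(hidden, C, rng),
    )
    return tokens, out
```

Calling `demo()` returns the input tokens and the mixed output, which has the same T x C shape. Because both mixing steps are residual, the block leaves its input unchanged when all weights are zero.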
