Adaptive Pronunciation Assessment Based on Acoustic Feature Profiling and Wav2Vec 2.0 for English Language Learning
Abstract
This paper presents a speaker-adaptive approach to automatic pronunciation assessment for English language learning, with a focus on Uzbek learners. The proposed methodology integrates acoustic signal processing, feature extraction, and deep learning-based modeling within a unified framework. A key contribution of the study is dynamic speaker profiling based on fundamental frequency (F0) and energy, which enables adaptive dataset selection according to speaker characteristics such as age and gender. Mel-frequency cepstral coefficients (MFCCs) are used for acoustic feature representation, while Wav2Vec 2.0 is used for deep contextual embedding and pronunciation evaluation. Experimental results demonstrate improved accuracy and efficiency compared to conventional approaches.
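To make the profiling idea concrete, the following is a minimal sketch of how a speaker profile could be derived from fundamental frequency and energy. It is not the implementation used in the study: the function names (`estimate_f0`, `rms_energy`, `speaker_profile`), the autocorrelation-based pitch estimator, the frame sizes, the voicing gate, and the 165 Hz split between profile groups are all illustrative assumptions.

```python
import numpy as np


def estimate_f0(frame, sr, fmin=60.0, fmax=400.0):
    """Estimate F0 (Hz) of one frame from its autocorrelation peak."""
    frame = frame - np.mean(frame)
    # Full autocorrelation; keep non-negative lags only.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    # Restrict the search to lags that correspond to [fmin, fmax].
    lag_min = int(sr / fmax)
    lag_max = min(int(sr / fmin), len(ac) - 1)
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    return sr / lag


def rms_energy(frame):
    """Root-mean-square energy of one frame."""
    return float(np.sqrt(np.mean(frame ** 2)))


def speaker_profile(signal, sr, frame_len=1024, hop=512, f0_split=165.0):
    """Summarize a recording as median F0, mean energy, and a coarse group.

    f0_split is a hypothetical threshold chosen for illustration only;
    it is not taken from the paper.
    """
    f0s, energies = [], []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energy = rms_energy(frame)
        energies.append(energy)
        if energy > 1e-3:  # crude voicing gate (assumption)
            f0s.append(estimate_f0(frame, sr))
    median_f0 = float(np.median(f0s)) if f0s else 0.0
    group = "low_pitch" if median_f0 < f0_split else "high_pitch"
    return {
        "median_f0": median_f0,
        "mean_energy": float(np.mean(energies)) if energies else 0.0,
        "group": group,
    }
```

In a pipeline of the kind the abstract describes, the returned `group` label could serve as the key for choosing which reference dataset a learner's recording is compared against. A production system would more likely use a robust pitch tracker and thresholds fitted to the target speaker population.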