Article

Accent Classification in Industrial Voice-Controlled IoT Systems Using i-Vector Framework

S.A. Nazarova (Bukhara State University, Uzbekistan, Candidate of Philological Sciences, Uzbek Linguistics and Journalism Department of Philology Faculty, Bukhara, Uzbekistan)
Dhanananth R (St. Joseph's Institute of Technology, OMR, Department of Management Studies, Chennai, 600 119)
Arasuraja Ganesan (St. Joseph's Institute of Technology, OMR, Department of Management Studies, Chennai, Tamil Nadu, India, 600119)
Garifullina Alsu Robert Kizi
R. Udayakumar (Kalinga University, India)

2025

ABI

Abstract

Accent classification in industrial voice-controlled IoT systems is essential for accurate speech recognition and safe operation in multilingual industrial environments. With the rise of voice-activated machinery, recognizing operators' diverse accents has become critical for improving command interpretation and operational efficiency. Existing accent classification methods often rely solely on conventional i-vector frameworks or basic acoustic features, which struggle to maintain accuracy in noisy industrial settings and under varying phonetic patterns. To address these limitations, this study proposes an Accent-Adaptive Deep i-Vector Fusion with Convolutional Bottleneck Features (CABi-Vector) framework. The method first extracts MFCCs or log-mel spectrograms from speech signals and passes them through a convolutional neural network to obtain deep bottleneck features that capture fine-grained phonetic-acoustic patterns. These embeddings are fused with traditional i-vectors to form accent-adaptive representations, which are then classified by a lightweight neural network optimized for real-time industrial deployment. The proposed CABi-Vector framework makes accent recognition more robust in challenging industrial environments and allows voice-controlled IoT systems to adapt dynamically to operator accents. Experimental results demonstrate improved classification accuracy, reduced misinterpretation of commands, and increased operational safety and efficiency compared to conventional methods.
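The fusion and classification steps described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the bottleneck embeddings and i-vectors are fused or how the classifier is built, so everything concrete here is an assumption: fusion is taken to be concatenation of length-normalized vectors, the classifier is reduced to a single untrained linear layer with softmax, and random numbers stand in for the CNN bottleneck output and the i-vector extractor. The names `fuse`, `classify`, `BOTTLENECK_DIM`, `IVECTOR_DIM`, and `NUM_ACCENTS` are hypothetical.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(bottleneck, ivector):
    """Assumed fusion: concatenate the two length-normalized embeddings."""
    return l2_normalize(bottleneck) + l2_normalize(ivector)


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(fused, weights, biases):
    """One linear layer plus softmax, standing in for the lightweight classifier."""
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(logits)


# Hypothetical sizes; the abstract gives no dimensions or accent count.
BOTTLENECK_DIM, IVECTOR_DIM, NUM_ACCENTS = 8, 4, 3

rng = random.Random(0)
# Placeholders for the CNN bottleneck features and the i-vector of one utterance.
bottleneck = [rng.gauss(0, 1) for _ in range(BOTTLENECK_DIM)]
ivector = [rng.gauss(0, 1) for _ in range(IVECTOR_DIM)]
# Untrained placeholder weights; a real system would learn these from labeled speech.
weights = [
    [rng.gauss(0, 0.1) for _ in range(BOTTLENECK_DIM + IVECTOR_DIM)]
    for _ in range(NUM_ACCENTS)
]
biases = [0.0] * NUM_ACCENTS

fused = fuse(bottleneck, ivector)
probs = classify(fused, weights, biases)
predicted = max(range(NUM_ACCENTS), key=probs.__getitem__)
print(f"fused dim = {len(fused)}, predicted accent index = {predicted}")
```

Normalizing each embedding before concatenation keeps either source from dominating the fused vector purely because of its scale; whether the paper does this is not stated.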

Topics

Speech Recognition and Synthesis; Speech and Audio Processing; Voice and Speech Disorders

Identifiers

DOI: 10.1109/itechsecom64750.2025.11307495

Citations and references

0 citations, 8 references

Metrics - AkademScholar