Accent Classification in Industrial Voice-Controlled IoT Systems Using i-Vector Framework
Abstract
Accent classification in industrial voice-controlled IoT systems is essential for accurate speech recognition and safe operation in multilingual industrial environments. As voice-activated machinery becomes more common, recognizing the diverse accents of operators has become critical for correct command interpretation and operational efficiency. Existing accent classification methods often rely solely on conventional i-vector frameworks or basic acoustic features, which struggle to maintain accuracy in noisy industrial settings and under varying phonetic patterns. To address these limitations, this study proposes an Accent-Adaptive Deep i-Vector Fusion with Convolutional Bottleneck Features (CABi-Vector) framework. The method first extracts MFCCs or log-mel spectrograms from the speech signal and passes them through a convolutional neural network to obtain deep bottleneck features that capture fine-grained phonetic-acoustic patterns. These embeddings are fused with traditional i-vectors to form accent-adaptive representations, which are then classified by a lightweight neural network optimized for real-time industrial deployment. The proposed CABi-Vector framework makes accent recognition more robust in challenging industrial environments and allows voice-controlled IoT systems to adapt dynamically to operator accents. Experimental results demonstrate improved classification accuracy, fewer misinterpreted commands, and increased operational safety and efficiency compared with conventional methods.
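The fusion and classification stages described above can be illustrated with a minimal NumPy sketch. The abstract does not specify how the bottleneck features and i-vectors are fused or how the classifier is built, so everything here is an assumption made for illustration: fusion is taken to be concatenation of length-normalized vectors, the classifier is a one-hidden-layer network with random (untrained) weights, and the dimensions and the names `fuse` and `classify` are hypothetical. Random arrays stand in for the CNN bottleneck output and the extracted i-vectors.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row of x to unit L2 norm."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def fuse(bottleneck, ivector):
    """Fuse the two embeddings per utterance.

    Assumption: fusion is plain concatenation of length-normalized
    vectors; the source does not state the actual fusion rule.
    """
    return np.concatenate(
        [l2_normalize(bottleneck), l2_normalize(ivector)], axis=-1
    )


def classify(fused, w1, b1, w2, b2):
    """One-hidden-layer classifier returning per-accent probabilities."""
    hidden = np.maximum(fused @ w1 + b1, 0.0)             # ReLU
    logits = hidden @ w2 + b2
    logits = logits - logits.max(axis=-1, keepdims=True)  # stable softmax
    exp = np.exp(logits)
    return exp / exp.sum(axis=-1, keepdims=True)


# Hypothetical sizes: 4 utterances, 64-dim bottleneck features,
# 100-dim i-vectors, 32 hidden units, 5 accent classes.
rng = np.random.default_rng(0)
batch, d_bn, d_iv, d_hidden, n_accents = 4, 64, 100, 32, 5

bottleneck = rng.standard_normal((batch, d_bn))  # stand-in for CNN output
ivector = rng.standard_normal((batch, d_iv))     # stand-in for i-vectors

# Untrained random weights, used only to show the data flow.
w1 = rng.standard_normal((d_bn + d_iv, d_hidden)) * 0.1
b1 = np.zeros(d_hidden)
w2 = rng.standard_normal((d_hidden, n_accents)) * 0.1
b2 = np.zeros(n_accents)

fused = fuse(bottleneck, ivector)
probs = classify(fused, w1, b1, w2, b2)
predicted_accent = probs.argmax(axis=-1)  # one class index per utterance
```

Length-normalizing each embedding before concatenation keeps either source from dominating the fused vector purely because of its scale; a trained system might instead learn the weighting between the two.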