Voice Command Accuracy Improvement Using Connectionist Temporal Classification in Speech-to-Text Interfaces
Abstract
Voice interfaces are an essential part of contemporary human-computer interaction, and speech-to-text (STT) systems enable hands-free operation and improve accessibility. Nevertheless, accurate voice command recognition remains an open problem, particularly in noisy settings and in languages with complex linguistic features. Current approaches often transcribe inconsistently in real time, lack contextual awareness, and handle low-resource or agglutinative languages such as Hindi poorly. To overcome these shortcomings, this study proposes Enhancing Voice Command with Connectionist Temporal Classification (EVC-CTC), a framework that uses CTC to improve the alignment between speech input and textual output. EVC-CTC combines context modeling with language-specific features to increase recognition accuracy, particularly under real-world variability. The approach is deployed as a modular, scalable STT pipeline that supports voice assistants running on web-based platforms. EVC-CTC processes continuous voice streams so as to preserve context across utterances and adapt dynamically to background noise and speaker variation. Experimental analysis indicates that EVC-CTC substantially outperforms baseline models in Word Error Rate (WER) and semantic accuracy, especially on Hindi voice commands.
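To make the two technical notions in the abstract concrete, the following is a minimal illustrative sketch, not the EVC-CTC implementation. It assumes the simplest CTC decoding rule (greedy decoding: take the best label per frame, merge consecutive repeats, then drop blanks) and the standard definition of WER as word-level edit distance divided by the number of reference words. The function names, the `_` blank symbol, and the sample inputs are hypothetical and chosen only for illustration.

```python
from typing import Sequence

BLANK = "_"  # hypothetical blank symbol; real systems usually reserve an index


def ctc_greedy_collapse(frame_labels: Sequence[str], blank: str = BLANK) -> str:
    """Collapse per-frame best labels into text using the CTC rule:
    merge consecutive repeats first, then remove blank symbols."""
    out = []
    prev = None
    for label in frame_labels:
        # Keep a label only if it starts a new run and is not the blank.
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return "".join(out)


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count,
    computed with a word-level Levenshtein distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev_row[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev_row = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        row = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            row.append(min(prev_row[j] + 1,          # deletion
                           row[j - 1] + 1,           # insertion
                           prev_row[j - 1] + cost))  # match or substitution
        prev_row = row
    return prev_row[-1] / len(ref)


if __name__ == "__main__":
    frames = list("hh_e_ll_lo")  # hypothetical per-frame argmax labels
    print(ctc_greedy_collapse(frames))
    print(word_error_rate("turn on the light", "turn off the light"))
```

The blank between the two runs of `l` is what lets CTC emit a genuine double letter; without it, the repeats would merge into one. A production pipeline would more likely use beam search with a language model than greedy decoding, which is shown here only because it is the shortest correct instance of the collapse rule.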