Article

Speaker Separation: Use Neural Networks

Mekhriddin Rakhimov, Tashkent University of Information Technology named after Muhammad al-Khwarizmi, TUIT, Tashkent, Uzbekistan
Boburkhon Turaev, Tashkent University of Information Technology named after Muhammad al-Khwarizmi, TUIT, Tashkent, Uzbekistan
Turaev Khurshid, Tashkent University of Information Technology named after Muhammad al-Khwarizmi, TUIT, Tashkent, Uzbekistan

2021 International Conference on Information Science and Communications Technologies (ICISCT). Type: conference. Year: 2021. Language: en.


Abstract

Speaker separation is the problem of separating the speakers in an audio recording. There may be any number of speakers, and the final result should state when each speaker starts and stops speaking. In this project, we analyze a given audio file with 2 channels and 2 speakers (each on a separate channel). We train a neural network to learn when a person is speaking. We use several types of neural networks, specifically a Single-Layer Perceptron (SLP), a Multi-Layer Perceptron (MLP), a Recurrent Neural Network (RNN), and a Convolutional Neural Network (CNN). On Uzbek speech commands, we achieve ~88% accuracy with the RNN.
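The abstract asks for output that states when each speaker starts and stops. As a rough illustration only, the sketch below assumes a classifier has already produced one speech/non-speech label per fixed-length frame for each channel, and it merges runs of speech frames into (start, end) times. The function name `frames_to_segments`, the frame length `frame_sec`, and the example labels are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence, Tuple


def frames_to_segments(
    labels: Sequence[int], frame_sec: float = 0.5
) -> List[Tuple[float, float]]:
    """Merge consecutive speech frames (label 1) into (start, end) times in seconds.

    `labels` is an assumed per-frame classifier output: 1 = speech, 0 = silence.
    """
    segments: List[Tuple[float, float]] = []
    start = None  # index of the first frame of the current speech run
    for i, is_speech in enumerate(labels):
        if is_speech and start is None:
            start = i  # a speech run begins
        elif not is_speech and start is not None:
            segments.append((start * frame_sec, i * frame_sec))
            start = None  # the speech run ended at frame i
    if start is not None:
        # The recording ended while the speaker was still talking.
        segments.append((start * frame_sec, len(labels) * frame_sec))
    return segments


# Hypothetical per-frame predictions for a 2-channel file, one speaker per channel.
predictions: Dict[str, List[int]] = {
    "channel_0": [0, 1, 1, 0, 0, 1],
    "channel_1": [1, 0, 0, 1, 1, 0],
}

timeline = {ch: frames_to_segments(lab) for ch, lab in predictions.items()}
for channel, segs in timeline.items():
    print(channel, segs)
```

Because each speaker is on a separate channel in the setting described, the per-channel timelines can be computed independently; any of the network types listed (SLP, MLP, RNN, CNN) could in principle supply the frame labels.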

Topics

Speech and Audio Processing; Music and Audio Processing; Speech Recognition and Synthesis

Identifiers

DOI: 10.1109/icisct52966.2021.9670322

Citations and references

Cited by 5; 6 references
