Article

Detection of Synthetic Speech Using Spectral-Cepstral Features and BiLSTM Networks

Furkat Rakhmatov, Fakhriddin Abdirazakov, Baxodir Saydullaevich Achilov, Ruslan Baydullayev, Sultanmurat Nasirov, Shakhzod Javliev
Tashkent University of Information Technologies named after Muhammad al-Khworazmi, Tashkent 100084, Uzbekistan
Informatica (journal), 2025

Abstract

Experimental results demonstrate 93.4% accuracy on the test set. Error analysis reveals that misclassifications occur predominantly between the Person and Robot classes, whereas the Emotion class is recognized more reliably. Feature comparison indicates that log-mel provides a robust baseline at minimal computational cost, LFCC better preserves the high-frequency details characteristic of synthetic artifacts, and CQCC is effective at capturing harmonic structure and modulations. Potential directions for improving generalizability and accuracy are discussed, including feature fusion (CQCC/LFCC/log-mel) and statistical pooling for temporal aggregation. The proposed configuration offers a well-balanced trade-off between performance and computational complexity and can serve as a strong baseline for anti-spoofing systems.
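
The abstract names feature fusion and statistical pooling only as possible directions, without implementation details. The sketch below is one common reading of those terms, not the authors' method: frame-aligned feature matrices are concatenated along the feature axis, and the time axis is then collapsed into per-feature means and standard deviations. The function names (`stat_pool`, `fuse_and_pool`), the feature dimensions (64, 20, 20), and the random placeholder data are all assumptions made for illustration.

```python
import numpy as np

def stat_pool(frames: np.ndarray) -> np.ndarray:
    """Collapse a (time, features) matrix into one vector: per-feature mean, then std."""
    mean = frames.mean(axis=0)
    std = frames.std(axis=0)  # population standard deviation over time
    return np.concatenate([mean, std])

def fuse_and_pool(feature_sets):
    """Concatenate feature matrices along the feature axis, then pool over time."""
    # Different front ends may yield slightly different frame counts,
    # so trim every stream to the shortest one before fusing (an assumption).
    n_frames = min(f.shape[0] for f in feature_sets)
    fused = np.concatenate([f[:n_frames] for f in feature_sets], axis=1)
    return stat_pool(fused)

# Placeholder data standing in for real features; shapes are hypothetical.
rng = np.random.default_rng(0)
log_mel = rng.normal(size=(120, 64))
lfcc = rng.normal(size=(120, 20))
cqcc = rng.normal(size=(118, 20))

embedding = fuse_and_pool([log_mel, lfcc, cqcc])
print(embedding.shape)
```

Pooling in this way yields a fixed-length vector regardless of utterance duration, which is why it is often paired with recurrent encoders; whether it helps for this particular task is left open by the abstract.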
