Article

Neuromorphic Architectures for Real-Time Speech Processing in Noisy Environments

Jyotsna Dwivedi, Kalinga University, Department of Commerce, Raipur, India

V. Mangaiyarkarasi, Prince Shri Bhavani College of Engineering and Technology, Department of ECE, New, Chennai, Tamilnadu, India, 600073

Ramy R. Hussein, College of technical engineering, The Islamic University, Department of computers Techniques engineering, Najaf

Pitta Geetha Bhavani, CSE(DS), Godavari Global University, Rajamahendravaram, Andhra Pradesh, 533296

N. V. Balaji, Karpagam Academy of Higher Education, Department of Computer Science, Coimbatore, 641021

Saodat Kambarova, Tashkent State University of Uzbek Language and Literature named after Alisher Navoi, Tashkent, Uzbekistan

Azizov Marufkhuja Ahmatkhonovich, Faculty of Humanities & Pedagogy, Turan International University, Namangan

2025

ABI

Abstract

Processing speech in acoustically cluttered environments, as in the cocktail party problem, remains extremely difficult for artificial auditory systems. Human listeners excel at isolating and parsing speech using biologically efficient neural networks, whereas conventional digital signal processing systems suffer from latency, high energy dissipation, and poor performance in multi-talker or noisy conditions. To overcome these shortcomings, neuromorphic computing has emerged as a promising approach that emulates the structure and function of the biological auditory pathway. This paper proposes the Attention-Gated Spiking-FullSubNet (AGS-FSN), a framework that integrates a biologically inspired cochlear front-end, attention-gated spiking neural units, and attention-directed preservation of temporal fidelity. The cochlear unit performs frequency decomposition with a neuromorphic filter bank and retains the phase and timing representations that are crucial to sound localization and separation. The gated spiking neurons follow an event-driven paradigm, which reduces computational cost and improves energy efficiency. A FullSubNet-like attention mechanism selectively amplifies the relevant parts of the target speech while pushing background speech interference aside. Extensive experiments on noisy real-world datasets show that AGS-FSN achieves state-of-the-art speech enhancement and recognition, surpassing conventional deep learning models by a large margin in both accuracy and power consumption. The design targets neuromorphic hardware such as field-programmable gate arrays and event-based silicon processors, offering a practical path to low-power, real-time auditory edge AI. AGS-FSN represents a milestone toward robust, scalable, and biologically inspired speech processing in diverse acoustic conditions.
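
The abstract does not give the neuron equations, so the following is only a minimal sketch of the general idea behind a gated, event-driven spiking unit, not the AGS-FSN model itself. It assumes a textbook leaky integrate-and-fire neuron whose input current is scaled by a gate value in [0, 1]; the function name `gated_lif` and the parameters `leak` and `threshold` are hypothetical and chosen for illustration.

```python
def gated_lif(currents, gates, leak=0.9, threshold=1.0):
    """Simulate one leaky integrate-and-fire neuron with a multiplicative input gate.

    currents: input current per time step (assumed values, arbitrary units)
    gates:    gate value in [0, 1] per time step; 0 blocks the input entirely
    Returns a list of 0/1 spike events, one per time step.
    """
    v = 0.0  # membrane potential
    spikes = []
    for current, gate in zip(currents, gates):
        # Leak the previous potential, then add the gated input.
        v = leak * v + gate * current
        if v >= threshold:
            spikes.append(1)  # emit a spike event
            v = 0.0           # reset after firing
        else:
            spikes.append(0)  # no event, so nothing downstream needs to run
    return spikes


if __name__ == "__main__":
    steady = [0.6] * 6
    print(gated_lif(steady, [1.0] * 6))  # gate open: neuron fires periodically
    print(gated_lif(steady, [0.0] * 6))  # gate closed: no spikes at all
```

With the gate closed the neuron emits no events, which illustrates why gating combined with event-driven computation can lower the amount of downstream work; how the paper derives its gate values from attention is not stated in the abstract.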

Topics

Advanced Memory and Neural Computing; Ferroelectric and Negative Capacitance Devices; Neural Networks and Reservoir Computing

Identifiers

DOI: 10.1109/ietacs68750.2025.11385742

Citations and references

Cited by 01 references
