Article

Model and Algorithms for Classifying Anomalous Phenomena based on the Convergence of Acoustic-Visual Signals

N. Ravshanov (Digital Technologies and Artificial Intelligence Development Research Institute); B.I. Boborakhimov (Digital Technologies and Artificial Intelligence Development Research Institute); M.I. Berdiev (National Guard of the Republic of Uzbekistan)

Problems of Computational and Applied Mathematics (journal), 2026

Abstract

This paper proposes a Context-adaptive Audio-Visual Neural Network (CAVN) model for anomaly detection in public safety systems. Existing approaches rely primarily on visual data and combine modalities with simple fusion strategies, which limits their ability to capture complex semantic relationships between audio and video. The proposed model consists of four main components: a visual feature extraction module based on the SlowFast architecture, an audio feature extraction module based on the Audio Spectrogram Transformer (AST), a fusion module based on a bidirectional cross-attention mechanism, and a temporal context aggregation module based on a Transformer encoder. The main scientific novelty of the model is its adaptive modality balancing mechanism, which dynamically adjusts the relative importance of the modalities under different conditions (dark/bright, noisy/quiet). Experimental results show that the proposed CAVN model outperforms existing methods in overall accuracy and in dark conditions. Ablation studies confirm the contribution of each module to overall performance.
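The fusion and balancing steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes single-head attention with no learned projections, mean pooling over time, and a gate computed from the pooled features themselves, whereas the abstract does not say how the model senses lighting or noise conditions. The names `cross_attention`, `fuse`, `w_gate`, and `b_gate` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query, context):
    # Scaled dot-product attention: each query token attends to all context tokens.
    d = query.shape[-1]
    scores = query @ context.T / np.sqrt(d)      # (Tq, Tc)
    return softmax(scores, axis=-1) @ context    # (Tq, d)

def fuse(visual, audio, w_gate, b_gate):
    # visual: (Tv, d), audio: (Ta, d)
    # w_gate: (2*d, 2), b_gate: (2,) are hypothetical gate parameters.
    v_att = visual + cross_attention(visual, audio)  # visual tokens enriched by audio
    a_att = audio + cross_attention(audio, visual)   # audio tokens enriched by visual
    v_vec = v_att.mean(axis=0)                       # pool over time
    a_vec = a_att.mean(axis=0)
    # Assumed balancing rule: softmax gate over the two modalities.
    gate = softmax(np.concatenate([v_vec, a_vec]) @ w_gate + b_gate)
    fused = gate[0] * v_vec + gate[1] * a_vec
    return fused, gate

rng = np.random.default_rng(0)
d = 8
visual = rng.normal(size=(4, d))   # stand-in for SlowFast features
audio = rng.normal(size=(6, d))    # stand-in for AST features
w_gate = rng.normal(size=(2 * d, 2)) * 0.1
b_gate = np.zeros(2)
fused, gate = fuse(visual, audio, w_gate, b_gate)
```

Here `gate` holds two positive weights that sum to one. A trained gate would be expected to shift weight toward audio in dark scenes and toward video in noisy ones, which is the behaviour the abstract attributes to adaptive modality balancing.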

Topics

Anomaly Detection Techniques and Applications; Music and Audio Processing; Fire Detection and Safety Systems

Identifiers

DOI: 10.71310/pcam.6_70.2025.07
