Article

Classifying environmental sounds using image recognition networks

Venkatesh Boddapati, Department of Computer Science and Engineering, Blekinge Institute of Technology, 371 79 Karlskrona, Sweden
Andrej Petef, Sony Mobile Communications AB, Mobilvägen, 221 88 Lund, Sweden
Jim Rasmusson, Sony Mobile Communications AB, Mobilvägen, 221 88 Lund, Sweden
Lars Lundberg, Department of Computer Science and Engineering, Blekinge Institute of Technology, 371 79 Karlskrona, Sweden
2017, English
ABI

Abstract

Automatic classification of environmental sounds, such as dog barking and glass breaking, is becoming increasingly interesting, especially for mobile devices. Most mobile devices contain both cameras and microphones, and companies that develop mobile devices would like to provide functionality for classifying both videos/images and sounds. To reduce development costs, one would like to use the same technology for both of these classification tasks. One way of achieving this is to represent environmental sounds as images and to use the same image classification neural network for both images and sounds. In this paper, we consider the classification accuracy for different image representations (Spectrogram, MFCC, and CRP) of environmental sounds. We evaluate the accuracy for environmental sounds in three publicly available datasets, using two well-known convolutional deep neural networks for image recognition (AlexNet and GoogLeNet). Our experiments show that we obtain good classification accuracy for the three datasets.
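
To make the idea of representing a sound as an image concrete, the following is a minimal sketch of one of the representations named in the abstract, the spectrogram. It is not the authors' pipeline: the function name `spectrogram_image`, the frame length, the hop size, the Hann window, and the 80 dB dynamic range are all assumptions chosen for illustration, and only NumPy is used.

```python
import numpy as np


def spectrogram_image(signal, frame_len=256, hop=128):
    """Turn a 1-D audio signal into an 8-bit grayscale spectrogram image.

    Hypothetical helper for illustration; all parameters are assumed values.
    Returns an array of shape (frame_len // 2 + 1, n_frames), with low
    frequencies in the bottom rows.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")

    # Slice the signal into overlapping, Hann-windowed frames.
    window = np.hanning(frame_len)
    n_frames = 1 + (signal.size - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )

    # Magnitude spectrum of each frame, converted to decibels.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    log_mag = 20.0 * np.log10(magnitude + 1e-10)

    # Keep an assumed 80 dB dynamic range below the peak, then map to 0..255.
    log_mag = np.maximum(log_mag, log_mag.max() - 80.0)
    lo, hi = log_mag.min(), log_mag.max()
    scaled = (log_mag - lo) / (hi - lo) if hi > lo else np.zeros_like(log_mag)
    image = (255.0 * scaled).astype(np.uint8)

    # Transpose so frequency runs vertically, and flip so low frequencies
    # sit at the bottom, as in a conventional spectrogram plot.
    return image.T[::-1]


if __name__ == "__main__":
    # One second of a 1 kHz tone sampled at 8 kHz (synthetic test input).
    t = np.arange(8000) / 8000.0
    img = spectrogram_image(np.sin(2 * np.pi * 1000 * t))
    print(img.shape, img.dtype)
```

An array like this could then be resized to whatever input size the chosen image network expects and, if needed, replicated across three channels. Those steps depend on the network and framework and are left out here.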

Citations and references

3 citations, 0 references used