Knowledge Agora



Scientific Article details

Title Urban noise recognition with convolutional neural network
ID_Doc 42745
Authors Cao, JW; Cao, M; Wang, JZ; Yin, C; Wang, DP; Vidal, PP
Title Urban noise recognition with convolutional neural network
Year 2019
Published Multimedia Tools And Applications, 78, 20
DOI 10.1007/s11042-018-6295-8
Abstract Urban noise recognition play a vital role in city management and safety operation, especially in the recent smart city engineering. Exiting studies on urban noise recognition are mostly based on conventional acoustic features, such as Mel-Frequency Cepstral Coefficients (MFCC) and Linear Prediction Cepstral Coefficients (LPCC), and the shallow structure based classifiers, such as support vector machine (SVM). However, the urban acoustic environment is complicated and changeable. Conventional acoustic representation and recognition methods may be insufficient in characterizing urban noises, and generally suffer from a degraded performance. In this paper, we study the recent deep neural network based urban noise recognition. The log-Mel-spectrogram, namely, the FBank feature is first derived for acoustic representation. Then, the FBank spectrum constructed with a set of FBank feature vectors from multiple acoustic signal frames is fed to a convolutional neural network (CNN) for urban noise recognition. Comprehensive studies on the dimension of FBank spectrums and the parameters in CNN, including the size of learnable kernels, the dropout rate, and the activation function, etc., are presented in the paper. An acoustic database collected in real environment covering 11 most common urban noises with more than 56,000 samples is constructed for model verification and performance evaluation. In addition, the traditional LPCC and MFCC acoustic feature combining with two popular machine learning algorithms, extreme learning machine (ELM) and support vector machine (SVM), and the FBank image feature combining with extreme learning machine (ELM), hierarchical extreme learning machine (H-ELM) and multilayer extreme learning machine (ML-ELM), have also been presented for discussions. Experimental results show that the proposed method generally outperforms conventional shallow structure based classifiers.
Author Keywords Urban noise recognition; Convolutional neural network; Log-Mel-spectrum; Dropout; FBank spectrum
Index Keywords Index Keywords
Document Type Other
Open Access Open Access
Source Science Citation Index Expanded (SCI-EXPANDED)
EID WOS:000485997700037
WoS Category Computer Science, Information Systems; Computer Science, Software Engineering; Computer Science, Theory & Methods; Engineering, Electrical & Electronic
Research Area Computer Science; Engineering
PDF
Similar atricles
Scroll