Deep Scattering Spectra with Deep Neural Networks for Acoustic Scene Classification Tasks
-
Abstract
Mel-frequency cepstral coefficients (MFCCs), although among the most commonly used features, are less discriminative at high frequencies. The deep scattering spectrum (DSS), a more recent technique, addresses this issue and aims to preserve more detail. DSS features have shown promise on both classification and recognition tasks. In this paper, we extend the use of DSS features to the acoustic scene classification task. Results on Detection and Classification of Acoustic Scenes and Events (DCASE) 2016 and 2017 show that DSS features provided 4.8% and 17.4% relative improvements in accuracy, respectively, over MFCC features within a state-of-the-art time delay neural network framework.