Comparing the Influence of Depth and Width of Deep Neural Network Based on Fixed Number of Parameters for Audio Event Detection

Jun Wang; Shengchen Li

doi:10.1109/ICASSP.2018.8461713

Beijing University of Posts and Telecommunications

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

8 Citations (Scopus)

Abstract

Deep Neural Network (DNN) is a basic method used for the rare Acoustic Event Detection (AED) in synthesised audio. The structure of DNNs including Multi-Layer Perceptron (MLP) and Recurrent Neural Network (RNN) for AED tasks has rather fewer hidden layers compared with computer vision systems. This paper tries to demonstrate that a DNN with more hidden layers does not necessarily guarantee a better performance in AED tasks. Taking the rare AED in synthesised audio with MLPs as an example and simulating a fixed budget of memory in an embedded system, various structures of MLPs are tested with a fixed number of parameters engaged. To compare the importance of the number of neurons in a hidden layer (i.e. the width of DNNs) with the importance of the number of layers (i.e. the depth of DNNs) for AED tasks, the performance of the candidate DNN systems is evaluated by the event-based error rate. The results illustrate that a shallower network may outperform a deeper network when enough parameters are engaged, and that a larger number of parameters generally gives better performance.
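
As a rough illustration of the fixed-parameter comparison described in the abstract, the Python sketch below counts the weights and biases of an MLP whose hidden layers all share one width, then finds the widest network that fits a given parameter budget at each depth. The input size (40), output size (1), budget (100,000), equal-width assumption and function names are all hypothetical choices made for illustration; none of them is taken from the paper.

```python
def mlp_param_count(n_in: int, n_out: int, width: int, depth: int) -> int:
    """Count weights and biases of a fully connected MLP.

    Assumes `depth` hidden layers that all have `width` neurons
    (an illustrative simplification, not the paper's exact setup).
    """
    if depth < 1 or width < 1:
        raise ValueError("depth and width must be at least 1")
    first = (n_in + 1) * width                  # input -> first hidden layer
    hidden = (depth - 1) * (width + 1) * width  # hidden -> hidden layers
    last = (width + 1) * n_out                  # last hidden -> output layer
    return first + hidden + last


def widest_under_budget(n_in: int, n_out: int, depth: int, budget: int) -> int:
    """Return the largest width whose parameter count fits the budget.

    Returns 0 when even a width of 1 exceeds the budget.
    """
    width = 0
    while mlp_param_count(n_in, n_out, width + 1, depth) <= budget:
        width += 1
    return width


if __name__ == "__main__":
    # Hypothetical feature size, output size and parameter budget.
    N_IN, N_OUT, BUDGET = 40, 1, 100_000
    for depth in (1, 2, 4, 8):
        width = widest_under_budget(N_IN, N_OUT, depth, BUDGET)
        used = mlp_param_count(N_IN, N_OUT, width, depth)
        print(f"depth={depth}  width={width}  parameters={used}")
```

Under a fixed budget, each extra hidden layer forces the layers to be narrower, which is the depth versus width trade-off the paper evaluates with the event-based error rate.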

Original language	English
Title of host publication	2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	2681-2685
Number of pages	5
ISBN (Print)	9781538646588
DOIs	https://doi.org/10.1109/ICASSP.2018.8461713
Publication status	Published - 10 Sept 2018
Externally published	Yes
Event	2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Calgary, Canada Duration: 15 Apr 2018 → 20 Apr 2018

Publication series

Name	ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume	2018-April
ISSN (Print)	1520-6149

Conference

Conference	2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018
Country/Territory	Canada
City	Calgary
Period	15/04/18 → 20/04/18

Keywords

Audio event detection
Deep neural network
Shallow neural network

Access to Document

10.1109/ICASSP.2018.8461713

Cite this

Wang, J., & Li, S. (2018). Comparing the Influence of Depth and Width of Deep Neural Network Based on Fixed Number of Parameters for Audio Event Detection. In 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings (pp. 2681-2685). Article 8461713 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2018-April). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICASSP.2018.8461713

Wang, Jun ; Li, Shengchen. / Comparing the Influence of Depth and Width of Deep Neural Network Based on Fixed Number of Parameters for Audio Event Detection. 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2018. pp. 2681-2685 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings).

@inproceedings{6851d93742284ff7a1a865e2614eb098,

title = "Comparing the Influence of Depth and Width of Deep Neural Network Based on Fixed Number of Parameters for Audio Event Detection",

abstract = "Deep Neural Network (DNN) is a basic method used for the rare Acoustic Event Detection (AED) in synthesised audio. The structure of DNNs including Multi-Layer Perceptron (MLP) and Recurrent Neural Network (RNN) for AED tasks has rather fewer hidden layers compared with computer vision systems. This paper tries to demonstrate that a DNN with more hidden layers does not necessarily guarantee a better performance in AED tasks. Taking the rare AED in synthesised audio with MLPs as an example and simulating a fixed budget of memory in an embedded system, various structures of MLPs are tested with fixed number of parameters engaged. Comparing the importance of neuron numbers in a hidden layer (i.e. the width of DNNs) and the importance of layer numbers in DNNs (i.e. the depth of DNNs) for AED tasks, the performance of the candidate DNN systems are evaluated by the event-based error rate. The results illustrate that a shallower network may outperform a deeper network when enough parameters are engaged and a larger number of parameters introduces a better performance in general.",

keywords = "Audio event detection, Deep neural network, Shallow neural network",

author = "Jun Wang and Shengchen Li",

note = "Publisher Copyright: {\textcopyright} 2018 IEEE.; 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 ; Conference date: 15-04-2018 Through 20-04-2018",

year = "2018",

month = sep,

day = "10",

doi = "10.1109/ICASSP.2018.8461713",

language = "English",

isbn = "9781538646588",

series = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "2681--2685",

booktitle = "2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings",

}

Wang, J & Li, S 2018, Comparing the Influence of Depth and Width of Deep Neural Network Based on Fixed Number of Parameters for Audio Event Detection. in 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings, 8461713, ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, vol. 2018-April, Institute of Electrical and Electronics Engineers Inc., pp. 2681-2685, 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018, Calgary, Canada, 15/04/18. https://doi.org/10.1109/ICASSP.2018.8461713

Comparing the Influence of Depth and Width of Deep Neural Network Based on Fixed Number of Parameters for Audio Event Detection. / Wang, Jun; Li, Shengchen.
2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2018. p. 2681-2685 8461713 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2018-April).

TY - GEN

T1 - Comparing the Influence of Depth and Width of Deep Neural Network Based on Fixed Number of Parameters for Audio Event Detection

AU - Wang, Jun

AU - Li, Shengchen

PY - 2018/9/10

Y1 - 2018/9/10

N2 - Deep Neural Network (DNN) is a basic method used for the rare Acoustic Event Detection (AED) in synthesised audio. The structure of DNNs including Multi-Layer Perceptron (MLP) and Recurrent Neural Network (RNN) for AED tasks has rather fewer hidden layers compared with computer vision systems. This paper tries to demonstrate that a DNN with more hidden layers does not necessarily guarantee a better performance in AED tasks. Taking the rare AED in synthesised audio with MLPs as an example and simulating a fixed budget of memory in an embedded system, various structures of MLPs are tested with fixed number of parameters engaged. Comparing the importance of neuron numbers in a hidden layer (i.e. the width of DNNs) and the importance of layer numbers in DNNs (i.e. the depth of DNNs) for AED tasks, the performance of the candidate DNN systems are evaluated by the event-based error rate. The results illustrate that a shallower network may outperform a deeper network when enough parameters are engaged and a larger number of parameters introduces a better performance in general.

AB - Deep Neural Network (DNN) is a basic method used for the rare Acoustic Event Detection (AED) in synthesised audio. The structure of DNNs including Multi-Layer Perceptron (MLP) and Recurrent Neural Network (RNN) for AED tasks has rather fewer hidden layers compared with computer vision systems. This paper tries to demonstrate that a DNN with more hidden layers does not necessarily guarantee a better performance in AED tasks. Taking the rare AED in synthesised audio with MLPs as an example and simulating a fixed budget of memory in an embedded system, various structures of MLPs are tested with fixed number of parameters engaged. Comparing the importance of neuron numbers in a hidden layer (i.e. the width of DNNs) and the importance of layer numbers in DNNs (i.e. the depth of DNNs) for AED tasks, the performance of the candidate DNN systems are evaluated by the event-based error rate. The results illustrate that a shallower network may outperform a deeper network when enough parameters are engaged and a larger number of parameters introduces a better performance in general.

KW - Audio event detection

KW - Deep neural network

KW - Shallow neural network

UR - http://www.scopus.com/inward/record.url?scp=85054237637&partnerID=8YFLogxK

U2 - 10.1109/ICASSP.2018.8461713

DO - 10.1109/ICASSP.2018.8461713

M3 - Conference Proceeding

AN - SCOPUS:85054237637

SN - 9781538646588

T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

SP - 2681

EP - 2685

BT - 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018

Y2 - 15 April 2018 through 20 April 2018

ER -

Wang J, Li S. Comparing the Influence of Depth and Width of Deep Neural Network Based on Fixed Number of Parameters for Audio Event Detection. In 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings. Institute of Electrical and Electronics Engineers Inc. 2018. p. 2681-2685. 8461713. (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings). doi: 10.1109/ICASSP.2018.8461713