TeeRNN: A Three-Way RNN Through Both Time and Feature for Speech Separation

Runze Ma*, Shugong Xu

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

Abstract

Recurrent neural networks (RNNs) have been widely used in speech signal processing because they are powerful at modeling sequential information. While most RNN-based networks operate only at the frame level, we propose a three-way RNN called TeeRNN, which processes the input through both time and features. As a result, TeeRNN can better explore the relationships among the features within each frame of the encoded speech. As an additional contribution, we also generate a mixture dataset based on LibriSpeech in which the recording devices are mismatched and different noises are included, making the separation task harder.
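The abstract describes processing an encoded speech representation along both the time axis and the feature axis. Below is a minimal PyTorch sketch of that general idea only; the class name, hidden sizes, residual connections, and the use of bidirectional LSTMs are illustrative assumptions and do not reproduce the paper's actual three-way TeeRNN architecture.

```python
# Hypothetical sketch, not the authors' released code: one RNN pass over
# the time axis and one over the feature axis of an encoded speech signal.
import torch
import torch.nn as nn


class TimeFeatureRNNBlock(nn.Module):
    """Runs a bidirectional LSTM across time, then another across features."""

    def __init__(self, num_features: int, hidden_size: int = 64):
        super().__init__()
        # RNN along time: each step consumes one frame's feature vector.
        self.time_rnn = nn.LSTM(num_features, hidden_size,
                                batch_first=True, bidirectional=True)
        self.time_proj = nn.Linear(2 * hidden_size, num_features)
        # RNN along features: within each frame, walk over the feature bins.
        self.feat_rnn = nn.LSTM(1, hidden_size,
                                batch_first=True, bidirectional=True)
        self.feat_proj = nn.Linear(2 * hidden_size, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, features) encoded speech representation.
        b, t, f = x.shape
        # Pass over the time axis, with a residual connection.
        time_out, _ = self.time_rnn(x)
        x = x + self.time_proj(time_out)
        # Pass over the feature axis within each frame, also residual.
        frames = x.reshape(b * t, f, 1)        # treat feature bins as a sequence
        feat_out, _ = self.feat_rnn(frames)
        frames = frames + self.feat_proj(feat_out)
        return frames.reshape(b, t, f)


if __name__ == "__main__":
    block = TimeFeatureRNNBlock(num_features=256)
    mix = torch.randn(4, 100, 256)   # batch of 4, 100 frames, 256-dim features
    out = block(mix)                 # same shape, refined along time and feature
    print(out.shape)                 # torch.Size([4, 100, 256])
```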

Original language: English
Title of host publication: Pattern Recognition and Computer Vision - 3rd Chinese Conference, PRCV 2020, Proceedings
Editors: Yuxin Peng, Hongbin Zha, Qingshan Liu, Huchuan Lu, Zhenan Sun, Chenglin Liu, Xilin Chen, Jian Yang
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 485-494
Number of pages: 10
ISBN (Print): 9783030606350
DOIs
Publication status: Published - 2020
Externally published: Yes
Event: 3rd Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2020 - Nanjing, China
Duration: 16 Oct 2020 - 18 Oct 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12307 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 3rd Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2020
Country/Territory: China
City: Nanjing
Period: 16/10/20 - 18/10/20

Keywords

  • Recurrent neural network
  • Speech processing
  • Speech separation
