TeeRNN: A Three-Way RNN Through Both Time and Feature for Speech Separation

Runze Ma*, Shugong Xu

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

Abstract

Recurrent neural networks (RNNs) have been widely used in speech signal processing because they are powerful at modeling sequential information. While most RNN-based networks operate only at the frame level, we propose a three-way RNN called TeeRNN, which processes the input through both time and features. As a result, TeeRNN can better explore the relationships among the features within each frame of the encoded speech. As an additional contribution, we also generate a mixture dataset based on LibriSpeech in which the recording devices are mismatched and different noises are included, making the separation task harder.
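The abstract describes processing an encoded speech representation along both the time axis and the feature axis. Below is a minimal PyTorch sketch of that general idea only; the class name, hidden sizes, residual connections, and the use of bidirectional LSTMs are illustrative assumptions and do not reproduce the paper's actual three-way TeeRNN architecture.

```python
# Hypothetical sketch, not the authors' released code: one RNN pass over
# the time axis and one over the feature axis of an encoded speech signal.
import torch
import torch.nn as nn


class TimeFeatureRNNBlock(nn.Module):
    """Runs a bidirectional LSTM across time, then another across features."""

    def __init__(self, num_features: int, hidden_size: int = 64):
        super().__init__()
        # RNN along time: each step consumes one frame's feature vector.
        self.time_rnn = nn.LSTM(num_features, hidden_size,
                                batch_first=True, bidirectional=True)
        self.time_proj = nn.Linear(2 * hidden_size, num_features)
        # RNN along features: within each frame, walk over the feature bins.
        self.feat_rnn = nn.LSTM(1, hidden_size,
                                batch_first=True, bidirectional=True)
        self.feat_proj = nn.Linear(2 * hidden_size, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, features) encoded speech representation.
        b, t, f = x.shape
        # Pass over the time axis, with a residual connection.
        time_out, _ = self.time_rnn(x)
        x = x + self.time_proj(time_out)
        # Pass over the feature axis within each frame, also residual.
        frames = x.reshape(b * t, f, 1)        # treat feature bins as a sequence
        feat_out, _ = self.feat_rnn(frames)
        frames = frames + self.feat_proj(feat_out)
        return frames.reshape(b, t, f)


if __name__ == "__main__":
    block = TimeFeatureRNNBlock(num_features=256)
    mix = torch.randn(4, 100, 256)   # batch of 4, 100 frames, 256-dim features
    out = block(mix)                 # same shape, refined along time and feature
    print(out.shape)                 # torch.Size([4, 100, 256])
```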

Original language: English
Title of host publication: Pattern Recognition and Computer Vision - 3rd Chinese Conference, PRCV 2020, Proceedings
Editors: Yuxin Peng, Hongbin Zha, Qingshan Liu, Huchuan Lu, Zhenan Sun, Chenglin Liu, Xilin Chen, Jian Yang
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 485-494
Number of pages: 10
ISBN (Print): 9783030606350
DOIs
Publication status: Published - 2020
Externally published: Yes
Event: 3rd Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2020 - Nanjing, China
Duration: 16 Oct 2020 - 18 Oct 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12307 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 3rd Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2020
Country/Territory: China
City: Nanjing
Period: 16/10/20 - 18/10/20

Keywords

  • Recurrent neural network
  • Speech processing
  • Speech separation
