Gabor based lipreading with a new audiovisual Mandarin corpus

Yan Xu, Yuexuan Li, Andrew Abel*

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

4 Citations (Scopus)

Abstract

Human speech processing is a multimodal and cognitive activity, with visual information playing a role. Many lipreading systems use English speech data; however, Chinese is the most spoken language in the world and is of increasing interest, as is the development of lightweight feature extraction to improve learning time. This paper presents an improved character-level Gabor-based lipreading system, using visual information for feature extraction and speech classification. We evaluate this system with a new Audiovisual Mandarin Chinese (AVMC) database composed of 4704 characters spoken by 10 volunteers. The Gabor-based lipreading system has been trained on this dataset, and utilizes the Dlib Region-of-Interest (ROI) method and Gabor filtering to extract lip features, which provides a fast and lightweight approach without any mouth modelling. A character-level Convolutional Neural Network (CNN) is used to recognize Pinyin, with 64.96% accuracy, and a Character Error Rate (CER) of 57.71%.
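
The abstract reports both a recognition accuracy and a Character Error Rate. As a minimal sketch of how such figures are commonly computed, the Python below uses the standard definition of CER (total Levenshtein edit distance divided by total reference length) alongside exact-match accuracy. The paper's own evaluation procedure is not described on this page, so the function names, the choice of Pinyin letters as the unit of comparison, and the toy data are all assumptions made for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r from the reference
                            curr[j - 1] + 1,      # insert h into the reference
                            prev[j - 1] + cost))  # substitute, or match at no cost
        prev = curr
    return prev[-1]


def character_error_rate(references, hypotheses):
    """Total edit distance over total reference length (standard CER definition)."""
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references must contain at least one character")
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_chars


# Hypothetical Pinyin labels and predictions, invented for this example only.
refs = ["zhong", "guo", "ren"]
hyps = ["zhong", "gou", "re"]

cer = character_error_rate(refs, hyps)
accuracy = sum(r == h for r, h in zip(refs, hyps)) / len(refs)
print(f"CER = {cer:.4f}, exact-match accuracy = {accuracy:.4f}")
```

Under these definitions, accuracy counts whole items that match exactly, while CER counts individual character edits, so the two measures need not sum to 100%.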

Original language: English
Title of host publication: Advances in Brain Inspired Cognitive Systems - 10th International Conference, BICS 2019, Proceedings
Editors: Jinchang Ren, Amir Hussain, Huimin Zhao, Jun Cai, Rongjun Chen, Yinyin Xiao, Kaizhu Huang, Jiangbin Zheng
Publisher: Springer
Pages: 169-179
Number of pages: 11
ISBN (Print): 9783030394301
Publication status: Published - 2020
Event: 10th International Conference on Brain Inspired Cognitive Systems, BICS 2019 - Guangzhou, China
Duration: 13 Jul 2019 to 14 Jul 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11691 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 10th International Conference on Brain Inspired Cognitive Systems, BICS 2019
Country/Territory: China
City: Guangzhou
Period: 13/07/19 to 14/07/19

Keywords

  • Audiovisual
  • Chinese
  • Gabor transform
  • Speech recognition
