Accurate and robust text detection: A step-in for text retrieval in natural scene images

Xu Cheng Yin; Xuwang Yin; Kaizhu Huang; Hong Wei Hao

doi:10.1145/2484028.2484197

Xu Cheng Yin^*, Xuwang Yin, Kaizhu Huang, Hong Wei Hao

^*Corresponding author for this work

School of Advanced Technology

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

20 Citations (Scopus)

Abstract

We propose and implement a robust text detection system, which is a prominent step-in for text retrieval in natural scene images or videos. Our system includes several key components: (1) A fast and effective pruning algorithm is designed to extract Maximally Stable Extremal Regions as character candidates using the strategy of minimizing regularized variations. (2) Character candidates are grouped into text candidates by the single-link clustering algorithm, where the distance weights and the clustering threshold are learned automatically by a novel self-training distance metric learning algorithm. (3) The posterior probabilities of text candidates corresponding to non-text are estimated with a character classifier; text candidates with high probabilities are then eliminated, and finally texts are identified with a text classifier. The proposed system is evaluated on the ICDAR 2011 Robust Reading Competition dataset and a publicly available multilingual dataset; the F-measures are over 76% and 74%, which are significantly better than the state-of-the-art performances of 71% and 65%, respectively.
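As a rough illustration of component (2), the sketch below groups character candidates by single-link clustering with a distance threshold. It is not the authors' implementation: the feature layout (x center, y center, height), the weighted L1 distance, and the numeric weights and threshold are all assumptions chosen for the example, whereas the paper learns the weights and threshold with its self-training distance metric learning algorithm.

```python
from itertools import combinations


def single_link_clusters(points, weights, threshold):
    """Cut a single-link clustering at `threshold`.

    Two points end up in the same cluster when a chain of pairwise
    weighted distances, each <= threshold, connects them.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def distance(a, b):
        # Hypothetical weighted L1 distance; the paper learns its own metric.
        return sum(w * abs(x - y) for w, x, y in zip(weights, a, b))

    for i, j in combinations(range(len(points)), 2):
        if distance(points[i], points[j]) <= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical character candidates: (x center, y center, height) in pixels.
candidates = [(10, 50, 20), (32, 51, 21), (55, 49, 19), (300, 200, 40)]
weights = (1.0, 2.0, 1.0)  # assumed values, not learned
groups = single_link_clusters(candidates, weights, threshold=30.0)
print(groups)
```

The first three candidates are chained into one text candidate even though the first and third are farther apart than the threshold, which is the defining behavior of single-link clustering; the distant fourth candidate stays on its own.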

Original language	English
Title of host publication	SIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval
Pages	1091-1092
Number of pages	2
DOIs	https://doi.org/10.1145/2484028.2484197
Publication status	Published - 2013
Event	36th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2013 - Dublin, Ireland
Duration	28 Jul 2013 → 1 Aug 2013

Publication series

Name	SIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval

Conference

Conference	36th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2013
Country/Territory	Ireland
City	Dublin
Period	28/07/13 → 1/08/13

Keywords

Distance metric learning
Maximally stable extremal regions
Scene text detection
Single-link clustering

Cite this

Yin, X. C., Yin, X., Huang, K., & Hao, H. W. (2013). Accurate and robust text detection: A step-in for text retrieval in natural scene images. In SIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 1091-1092). (SIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval). https://doi.org/10.1145/2484028.2484197

Yin, Xu Cheng ; Yin, Xuwang ; Huang, Kaizhu et al. / Accurate and robust text detection : A step-in for text retrieval in natural scene images. SIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval. 2013. pp. 1091-1092 (SIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval).

@inproceedings{8358a8da26c44309b60ab74b13a8a38d,

title = "Accurate and robust text detection: A step-in for text retrieval in natural scene images",

abstract = "We propose and implement a robust text detection system, which is a prominent step-in for text retrieval in natural scene images or videos. Our system includes several key components: (1) A fast and effective pruning algorithm is designed to extract Maximally Stable Extremal Regions as character candidates using the strategy of minimizing regularized variations. (2) Character candidates are grouped into text candidates by the single-link clustering algorithm, where the distance weights and the clustering threshold are learned automatically by a novel self-training distance metric learning algorithm. (3) The posterior probabilities of text candidates corresponding to non-text are estimated with a character classifier; text candidates with high probabilities are then eliminated, and finally texts are identified with a text classifier. The proposed system is evaluated on the ICDAR 2011 Robust Reading Competition dataset and a publicly available multilingual dataset; the F-measures are over 76% and 74%, which are significantly better than the state-of-the-art performances of 71% and 65%, respectively.",

keywords = "Distance metric learning, Maximally stable extremal regions, Scene text detection, Single-link clustering",

author = "Yin, {Xu Cheng} and Xuwang Yin and Kaizhu Huang and Hao, {Hong Wei}",

year = "2013",

doi = "10.1145/2484028.2484197",

language = "English",

isbn = "9781450320344",

series = "SIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval",

pages = "1091--1092",

booktitle = "SIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval",

note = "36th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2013 ; Conference date: 28-07-2013 Through 01-08-2013",

}

Yin, XC, Yin, X, Huang, K & Hao, HW 2013, Accurate and robust text detection: A step-in for text retrieval in natural scene images. in SIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval. SIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1091-1092, 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2013, Dublin, Ireland, 28/07/13. https://doi.org/10.1145/2484028.2484197

Accurate and robust text detection: A step-in for text retrieval in natural scene images. / Yin, Xu Cheng; Yin, Xuwang; Huang, Kaizhu et al.
SIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval. 2013. p. 1091-1092 (SIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval).

TY - GEN

T1 - Accurate and robust text detection

T2 - 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2013

AU - Yin, Xu Cheng

AU - Yin, Xuwang

AU - Huang, Kaizhu

AU - Hao, Hong Wei

PY - 2013

Y1 - 2013

AB - We propose and implement a robust text detection system, which is a prominent step-in for text retrieval in natural scene images or videos. Our system includes several key components: (1) A fast and effective pruning algorithm is designed to extract Maximally Stable Extremal Regions as character candidates using the strategy of minimizing regularized variations. (2) Character candidates are grouped into text candidates by the single-link clustering algorithm, where the distance weights and the clustering threshold are learned automatically by a novel self-training distance metric learning algorithm. (3) The posterior probabilities of text candidates corresponding to non-text are estimated with a character classifier; text candidates with high probabilities are then eliminated, and finally texts are identified with a text classifier. The proposed system is evaluated on the ICDAR 2011 Robust Reading Competition dataset and a publicly available multilingual dataset; the F-measures are over 76% and 74%, which are significantly better than the state-of-the-art performances of 71% and 65%, respectively.

KW - Distance metric learning

KW - Maximally stable extremal regions

KW - Scene text detection

KW - Single-link clustering

UR - http://www.scopus.com/inward/record.url?scp=84883074303&partnerID=8YFLogxK

U2 - 10.1145/2484028.2484197

DO - 10.1145/2484028.2484197

M3 - Conference Proceeding

AN - SCOPUS:84883074303

SN - 9781450320344

T3 - SIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval

SP - 1091

EP - 1092

BT - SIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval

Y2 - 28 July 2013 through 1 August 2013

ER -

Yin XC, Yin X, Huang K, Hao HW. Accurate and robust text detection: A step-in for text retrieval in natural scene images. In SIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval. 2013. p. 1091-1092. (SIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval). doi: 10.1145/2484028.2484197
