CASIA online and offline Chinese handwriting databases

Cheng Lin Liu*, Fei Yin, Da Han Wang, Qiu Feng Wang

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding, Conference Proceeding, peer-review

432 Citations (Scopus)

Abstract

This paper introduces a pair of online and offline Chinese handwriting databases, containing samples of isolated characters and handwritten texts. The samples were produced by 1,020 writers using Anoto pen on papers for obtaining both online trajectory data and offline images. Both the online samples and offline samples are divided into six datasets, three for isolated characters (DB1.0–1.2) and three for handwritten texts (DB2.0–2.2). The (either online or offline) datasets of isolated characters contain about 3.9 million samples of 7,356 classes (7,185 Chinese characters and 171 symbols), and the datasets of handwritten texts contain about 5,090 pages and 1.35 million character samples. Each dataset is segmented and annotated at character level, and is partitioned into standard training and test subsets. The online and offline databases can be used for the research of various handwritten document analysis tasks.

Original language: English
Title of host publication: Proceedings - 11th International Conference on Document Analysis and Recognition, ICDAR 2011
Pages: 37-41
Number of pages: 5
Publication status: Published - 2011
Externally published: Yes
Event: 11th International Conference on Document Analysis and Recognition, ICDAR 2011 - Beijing, China
Duration: 18 Sept 2011 to 21 Sept 2011

Publication series

Name: Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
ISSN (Print): 1520-5363

Conference

Conference: 11th International Conference on Document Analysis and Recognition, ICDAR 2011
Country/Territory: China
City: Beijing
Period: 18/09/11 to 21/09/11

Keywords

  • Chinese handwriting databases
  • handwritten texts
  • isolated characters
  • offline
  • online
