Word Segmentation for Chinese Judicial Documents

Linxia Yao, Jidong Ge*, Chuanyi Li, Yuan Yao, Zhenhao Li, Jin Zeng, Bin Luo, Victor Chang

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding > Conference Proceeding > peer-review

1 Citation (Scopus)

Abstract

Word segmentation is an integral step in many knowledge discovery applications. However, existing word segmentation methods run into problems when applied to Chinese judicial documents: (1) they rely on large-scale labeled data, which is typically unavailable for judicial documents, and (2) judicial documents have their own language features and writing formats. In this paper, a word segmentation method is proposed for Chinese judicial documents. The proposed method consists of two steps: (1) automatically generating labeled data as legal dictionaries, and (2) applying a hybrid multi-layer neural network that incorporates the legal dictionaries to perform word segmentation. Experiments on a dataset of Chinese judicial documents show that the proposed model achieves better results than existing methods.

Original language: English
Title of host publication: Data Science - 5th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2019, Proceedings
Editors: Xiaohui Cheng, Weipeng Jing, Xianhua Song, Zeguang Lu
Publisher: Springer Verlag
Pages: 466-478
Number of pages: 13
ISBN (Print): 9789811501173
Publication status: Published - 2019
Event: 5th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2019 - Guilin, China
Duration: 20 Sept 2019 to 23 Sept 2019

Publication series

Name: Communications in Computer and Information Science
Volume: 1058
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 5th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2019
Country/Territory: China
City: Guilin
Period: 20/09/19 to 23/09/19

Keywords

  • Chinese word segmentation
  • Judicial documents
  • Knowledge discovery
