Pre-Trained or Adversarial Training: A Comparison of NER Methods on Chinese Drug Specifications

Kok Hoe Wong*, ZhuJia SHENG

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

Abstract

Named Entity Recognition (NER) is widely used in Natural Language Processing (NLP), but most current work focuses on analyzing English-language text. This paper compares different NER models for extracting key contents from Chinese drug specifications. These key contents help users identify important information about the drugs. Three models were initially chosen for this research, namely BiLSTM-CRF, MiniRBT-BiLSTM-CRF, and MiniRBT-CRF. Experimental results show that MiniRBT-CRF outperforms the other two models, achieving high precision and F1 scores. We then worked on optimizing this model with word embedding and adversarial training. First, we replaced MiniRBT with the BERT-Base-Chinese model, and the results show that BERT-CRF achieves a 2% gain in F1 score over MiniRBT-CRF. Next, we augmented the BERT-CRF model with adversarial training. However, the results show that BERT-CRFAdv only increased the precision score by 1%, not the F1 score. The results thus suggest that, to enhance an NER model for Chinese-language text, optimizing the underlying model is the better choice.
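All three compared models share a CRF output layer, which decodes the most likely tag sequence with the Viterbi algorithm. As a rough illustration of that decoding step (the tag set, emission scores, and transition scores below are hypothetical examples, not values from the paper), a minimal sketch in plain Python might look like:

```python
# Minimal Viterbi decoder for a CRF tagging layer (illustrative sketch).
# Tag names, emission scores, and transition scores are made-up examples.

def viterbi_decode(emissions, transitions, tags):
    """Return (best tag sequence, its score) for one sentence.

    emissions:   list of {tag: score} dicts, one per token
    transitions: {(prev_tag, tag): score}
    tags:        list of tag names
    """
    # scores[t] = best score of any path ending in tag t at the current token
    scores = {t: emissions[0][t] for t in tags}
    backptrs = []
    for emit in emissions[1:]:
        new_scores, ptrs = {}, {}
        for t in tags:
            best_prev = max(tags, key=lambda p: scores[p] + transitions[(p, t)])
            new_scores[t] = scores[best_prev] + transitions[(best_prev, t)] + emit[t]
            ptrs[t] = best_prev
        scores = new_scores
        backptrs.append(ptrs)
    # Trace back from the best final tag
    best = max(tags, key=scores.get)
    path = [best]
    for ptrs in reversed(backptrs):
        path.append(ptrs[path[-1]])
    return list(reversed(path)), scores[best]


# Hypothetical 3-token example with a made-up BIO tag set for drug names
tags = ["B-DRUG", "I-DRUG", "O"]
emissions = [
    {"B-DRUG": 2.0, "I-DRUG": 0.1, "O": 0.5},
    {"B-DRUG": 0.2, "I-DRUG": 1.8, "O": 0.4},
    {"B-DRUG": 0.1, "I-DRUG": 0.3, "O": 1.5},
]
transitions = {(p, t): 0.0 for p in tags for t in tags}
transitions[("B-DRUG", "I-DRUG")] = 1.0   # reward B- followed by I-
transitions[("O", "I-DRUG")] = -2.0       # discourage I- without a preceding entity
path, score = viterbi_decode(emissions, transitions, tags)
# path == ["B-DRUG", "I-DRUG", "O"]
```

In the full models, the emission scores come from the BiLSTM or BERT encoder and the transition scores are learned CRF parameters; the decoding logic is the same.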
Original language: English
Title of host publication: IEEE Xplore
Publisher: IEEE
Pages: 70-75
ISBN (Electronic): 979-8-3503-6310-4
ISBN (Print): 979-8-3503-6311-1
DOIs
Publication status: Published - 1 Oct 2024

