A knowledge based approach for tackling mislabeled multi-class big social data

Minyi Guo, Yi Liu, Jie Li, Huakang Li, Bei Xu

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

3 Citations (Scopus)

Abstract

The performance of classification models relies heavily on the quality of the training data. However, label imperfection is an inherent fault of training data, and it cannot be handled manually in a big data environment. Various methods have been proposed to remove label noise in order to improve classification quality, with the side effect of reducing the amount of available data. In this paper, we propose a knowledge based approach for tackling mislabeled multi-class big data, in which a knowledge graph technique is combined with other data correction methods to detect and correct erroneous labels in big data. The knowledge graph is built from medical concepts extracted from online health consulting and medical guidance. Experimental results show that our knowledge graph based approach can effectively improve data quality and classification accuracy. Furthermore, this approach can be applied to other data mining tasks that require deep understanding.
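As a rough illustration of the idea in the abstract (correcting a suspect label instead of discarding the sample), the following is a minimal sketch and not the authors' method. The toy graph, the class names, the function names `score_labels` and `correct_label`, and the `margin` threshold are all assumptions made for this example; the paper's actual graph construction and correction procedure are not described in this record.

```python
from collections import defaultdict

# Hypothetical toy knowledge graph: concept -> {class label: edge weight}.
# The concepts, labels, and weights are invented for illustration only.
KNOWLEDGE_GRAPH = {
    "cough": {"respiratory": 1.0},
    "fever": {"respiratory": 0.5, "infectious": 0.5},
    "rash": {"dermatology": 1.0},
    "itching": {"dermatology": 0.8},
}


def score_labels(text):
    """Sum edge weights of every graph concept found in the text, per class."""
    scores = defaultdict(float)
    tokens = set(text.lower().split())
    for concept, edges in KNOWLEDGE_GRAPH.items():
        if concept in tokens:
            for label, weight in edges.items():
                scores[label] += weight
    return dict(scores)


def correct_label(text, given_label, margin=0.5):
    """Return the graph-preferred label if it beats the given one by `margin`."""
    scores = score_labels(text)
    if not scores:
        return given_label  # no graph evidence: keep the sample and its label
    best = max(scores, key=scores.get)
    if scores[best] - scores.get(given_label, 0.0) >= margin:
        return best
    return given_label


samples = [
    ("persistent cough and fever at night", "dermatology"),
    ("red rash with itching on the arm", "dermatology"),
    ("general question about diet", "respiratory"),
]
corrected = [correct_label(text, label) for text, label in samples]
print(corrected)  # ['respiratory', 'dermatology', 'respiratory']
```

In this sketch every sample is kept: a label is only overwritten when the graph evidence for another class exceeds the evidence for the given class by the chosen margin, which mirrors the abstract's point that correction avoids shrinking the training set.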

Original language: English
Title of host publication: The Semantic Web
Subtitle of host publication: Trends and Challenges - 11th International Conference, ESWC 2014, Proceedings
Publisher: Springer Verlag
Pages: 349-363
Number of pages: 15
ISBN (Print): 9783319074429
Publication status: Published - 2014
Externally published: Yes
Event: 11th International Conference on Semantic Web: Trends and Challenges, ESWC 2014 - Anissaras, Crete, Greece
Duration: 25 May 2014 to 29 May 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8465 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 11th International Conference on Semantic Web: Trends and Challenges, ESWC 2014
Country/Territory: Greece
City: Anissaras, Crete
Period: 25/05/14 to 29/05/14

Keywords

  • classification
  • knowledge graph
  • label correction
  • label imperfection
