Document security identification based on multi-classifier

Kaiwen Gu*, Huakang Li, Guozi Sun

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding; Conference Proceeding; peer-review

Abstract

Data leakage is a potentially important issue for businesses. Numerous companies offer data loss prevention (DLP) solutions to monitor information flow and detect such leakage. Once a secrecy label is attached to a document, a DLP system can use that label to apply security controls and so protect the data effectively. As the number of documents grows every day, manual labeling becomes time-consuming. To address this difficult task, researchers have recently begun to use machine learning for document security identification, which can quickly classify a large number of texts. The contribution of this paper is to explore dimensionality reduction by feature selection and to combine two models, which avoids the process of weighting different types of features. In contrast to training all features with one algorithm, our experimental results demonstrate that the combination of two models can improve classification performance.

Original language: English
Title of host publication: International Conference on Applications and Techniques in Cyber Security and Intelligence - Applications and Techniques in Cyber Security and Intelligence
Editors: Rafiqul Islam, Kim-Kwang Raymond Choo, Jemal Abawajy
Publisher: Springer Verlag
Pages: 122-127
Number of pages: 6
ISBN (Print): 9783319670706
Publication status: Published - 2018
Externally published: Yes
Event: International Conference on Applications and Techniques in Cyber Security and Intelligence, ATCSI 2017 - Ningbo, China
Duration: 16 Jun 2017 to 18 Jun 2017

Publication series

Name: Advances in Intelligent Systems and Computing
Volume: 580
ISSN (Print): 2194-5357

Conference

Conference: International Conference on Applications and Techniques in Cyber Security and Intelligence, ATCSI 2017
Country/Territory: China
City: Ningbo
Period: 16/06/17 to 18/06/17

Keywords

  • Data leakage prevention
  • Document security identification
  • Feature selection
  • Machine learning
  • Model combination
