Categorization of Webpages using dynamic mutation based differential evolution and gradient boost classifier

Ibrahim M. Mehedi*, Mohd Heidir Mohd Shah

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

With the growth of Internet technologies, Website categorization has become a demanding field of research. Webpages with destructive and offensive content, such as violence, phishing, scams, and radicalism, have proliferated over the past several years. In addition, the sheer volume of Webpages on different subjects has prevented data extraction and retrieval approaches from delivering optimal subject-related results. An efficient approach to categorizing Webpages is therefore desirable. In this paper, a gradient boosting classifier (GBC) model is used to categorize Websites. Page content is obtained through optical character recognition and web scraping, followed by a set of nontrivial feature extraction steps based on text mining and histogram of oriented gradients. The proposed GBC is then used to recognize Websites. However, GBC suffers from the hyper-parameter tuning issue; therefore, dynamic mutation based differential evolution is used to tune the GBC hyper-parameters for Website classification. The mutation ratio of the dynamic mutation based differential evolution is selected dynamically using a differential-evolution-based positioning optimization algorithm. The robustness of the proposed and existing models is also validated in the presence of mis-recognized training content. Extensive experiments reveal that the proposed Website categorization model outperforms the competing models.
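The hyper-parameter search described in the abstract can be illustrated with a minimal sketch of differential evolution whose mutation factor changes over the generations. This is not the authors' algorithm: the linearly decaying schedule in `dynamic_mutation_factor` is an assumed stand-in for the paper's differential-evolution-based positioning optimization, and `surrogate_validation_error` is a hypothetical objective standing in for the validation error of a gradient boosting classifier trained with the candidate hyper-parameters. All function names, bounds, and constants are illustrative assumptions.

```python
import random


def surrogate_validation_error(params):
    """Hypothetical stand-in for a classifier's validation error.

    In a real pipeline this would train a gradient boosting classifier
    with the given hyper-parameters and return its validation error.
    """
    learning_rate, max_depth = params
    return (learning_rate - 0.1) ** 2 + 0.01 * (max_depth - 4.0) ** 2


def dynamic_mutation_factor(generation, n_generations, f_max=0.9, f_min=0.4):
    """Assumed schedule: decay linearly from f_max to f_min.

    The paper selects the mutation ratio with its own optimization
    procedure; this simple schedule is only an illustration.
    """
    if n_generations <= 1:
        return f_min
    fraction = generation / (n_generations - 1)
    return f_max - (f_max - f_min) * fraction


def differential_evolution(objective, bounds, pop_size=20, n_generations=60,
                           crossover_rate=0.9, seed=0):
    """DE/rand/1/bin with a generation-dependent mutation factor."""
    rng = random.Random(seed)
    dim = len(bounds)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    scores = [objective(individual) for individual in population]

    for generation in range(n_generations):
        f = dynamic_mutation_factor(generation, n_generations)
        for i in range(pop_size):
            # Pick three distinct individuals, none of them the target i.
            others = [j for j in range(pop_size) if j != i]
            a, b, c = rng.sample(others, 3)
            forced = rng.randrange(dim)  # at least one mutated coordinate
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < crossover_rate:
                    value = population[a][d] + f * (
                        population[b][d] - population[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clip to the bounds
                else:
                    value = population[i][d]
                trial.append(value)
            # Greedy selection: keep the trial only if it is no worse.
            trial_score = objective(trial)
            if trial_score <= scores[i]:
                population[i] = trial
                scores[i] = trial_score

    best = min(range(pop_size), key=lambda k: scores[k])
    return population[best], scores[best]


# Assumed search space: (learning_rate, max_depth treated as continuous).
bounds = [(0.01, 1.0), (1.0, 10.0)]
best_params, best_error = differential_evolution(
    surrogate_validation_error, bounds)
```

A larger mutation factor early on favors exploration of the search space, while a smaller one later favors refinement around good candidates; that trade-off is the usual motivation for varying the factor rather than fixing it.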

Original language: English
Pages (from-to): 8363-8374
Number of pages: 12
Journal: Journal of Ambient Intelligence and Humanized Computing
Volume: 14
Issue number: 7
Publication status: Published - Jul 2023
Externally published: Yes

Keywords

  • Categorization
  • Gradient boost
  • Machine learning
  • Webpage
