Classification and Identification of Phishing Websites based on Machine Learning

Sheng Fang, Tianyang Liu, Yaning Zhu, Wenjun Fan*

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding, Conference Proceeding, peer-review

Abstract

Phishing was the largest network security issue among global cybercrimes in 2022. Its frequency has continued to grow rapidly, making it one of the most important network security issues. State-of-the-art work in this field faces a trade-off between highly accurate classification models and heavy consumption of computing resources. This article therefore aims to balance accuracy against computing resources (performance), achieving accuracy and computational efficiency at the same time. It applies principal component analysis (PCA) to reduce the dimensionality of the sample data, compressing the original feature set, and then runs experiments with several machine learning models. The random forest model trained after PCA achieved a classification accuracy of 97.157% with a performance improvement of 25.1%, effectively balancing accuracy and performance.
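
The pipeline described in the abstract (reduce the feature set with PCA, then train a classifier such as a random forest) can be illustrated with a minimal sketch. This is not the authors' code. It assumes scikit-learn is available, uses a synthetic dataset in place of the paper's phishing data, and picks hypothetical settings (30 features, 95% retained variance, 100 trees) that this record does not specify. It compares accuracy and training time with and without PCA; the numbers it prints say nothing about the paper's reported results.

```python
import time

from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a phishing feature table (assumption: the real
# dataset and its feature count are not given in this record).
X, y = make_classification(
    n_samples=2000, n_features=30, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)


def fit_and_score(model):
    """Fit on the training split; return (test accuracy, fit time in seconds)."""
    start = time.perf_counter()
    model.fit(X_train, y_train)
    elapsed = time.perf_counter() - start
    return model.score(X_test, y_test), elapsed


# Baseline: random forest on the full feature set.
baseline = RandomForestClassifier(n_estimators=100, random_state=0)

# PCA variant: standardize, keep enough components to explain 95% of the
# variance (hypothetical threshold), then train the same random forest.
reduced = make_pipeline(
    StandardScaler(),
    PCA(n_components=0.95),
    RandomForestClassifier(n_estimators=100, random_state=0),
)

acc_base, t_base = fit_and_score(baseline)
acc_pca, t_pca = fit_and_score(reduced)
n_kept = reduced.named_steps["pca"].n_components_

print(f"baseline: accuracy={acc_base:.3f}, fit time={t_base:.2f}s")
print(f"with PCA: accuracy={acc_pca:.3f}, fit time={t_pca:.2f}s, "
      f"components kept={n_kept} of {X.shape[1]}")
```

Whether PCA shortens training or costs accuracy depends on the data and on how many components are retained, so the variance threshold is a parameter to tune and the value used here is only a placeholder.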

Original language: English
Title of host publication: Proceedings - 2023 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery, CyberC 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 397-403
Number of pages: 7
ISBN (Electronic): 9798350308693
Publication status: Published - 2023
Event: 15th International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery, CyberC 2023 - Jiangsu, China
Duration: 2 Nov 2023 to 4 Nov 2023

Publication series

Name: Proceedings - 2023 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery, CyberC 2023

Conference

Conference: 15th International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery, CyberC 2023
Country/Territory: China
City: Jiangsu
Period: 2/11/23 to 4/11/23

Keywords

  • Cyber Crime
  • Machine Learning
  • PCA
  • Phishing Detection
  • Random Forest
