An efficient multistage phishing website detection model based on the CASE feature framework: Aiming at the real web environment

Dong Jie Liu, Guang Gang Geng*, Xiao Bo Jin, Wei Wang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

26 Citations (Scopus)

Abstract

Phishing has become a favored method for hackers to commit data theft, and it continues to evolve. As long as phishing websites continue to operate, more people and companies will suffer privacy leaks or financial losses, so the demand for fast and accurate phishing website detection keeps growing. However, existing phishing detection methods do not fully analyze the features of phishing, and their reported performance and efficiency hold only on certain limited datasets and must be improved before they can be applied to the real web environment. This paper takes the social engineering principles of phishing into full account, proposes a comprehensive and interpretable CASE feature framework, and designs a multistage phishing detection model to detect phishing sites effectively, especially in the real web environment, where high efficiency, high performance, and extremely low false alarm rates are required. To verify the proposed method, two kinds of experiments were carried out. The first was a set of comparative experiments among different features and different detection models on CASE, covering both classic machine learning and deep learning algorithms, based on a constructed complex dataset. The second was a one-year phishing discovery experiment in the real web environment. The proposed method achieves better detection results while significantly shortening execution time, and it works well in real phishing discovery, which demonstrates its high practicality in real-world use.

Original language: English
Article number: 102421
Journal: Computers and Security
Volume: 110
Publication status: Published - Nov 2021

Keywords

  • CASE feature framework
  • Machine learning
  • Multistage model
  • Phishing detection
  • Real web environment
