SHAPE: A Simultaneous Header and Payload Encoding Model for Encrypted Traffic Classification

Jianbang Dai, Xiaolong Xu*, Honghao Gao, Xinheng Wang, Fu Xiao

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

6 Citations (Scopus)

Abstract

Many end-to-end deep learning algorithms for classifying malicious and encrypted traffic have been proposed in recent years. These algorithms require a large number of samples to train a model, and existing methods struggle to fully utilize heterogeneous multimodal input. To this end, we propose the SHAPE model (simultaneous header and payload encoding), which mainly consists of two autoencoders and a transformer layer, to improve model performance. The two autoencoders extract features from heterogeneous inputs (the statistical information of each packet and the byte-form payloads) and convert them into a unified format; a lightweight transformer layer then extracts the relationships hidden in the simultaneous input. In particular, the autoencoder for payload feature extraction contains several depthwise separable residual convolution layers for efficient feature extraction and a token squeeze layer to reduce the computing overhead of the transformer layer. Moreover, we train the SHAPE model using deep metric learning, which pulls samples with the same class label together and separates samples from different classes in the low-dimensional embedding space. Thus, the SHAPE model can naturally handle multitask classification, and its performance is approximately 5.43% better than the current SOTA on traffic type classification of the ISCX-VPN2016 dataset, at the cost of 9.31 times the training time and 1.45 times the inference time.
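The deep metric learning objective mentioned in the abstract can be illustrated with a small sketch. The abstract does not name the specific loss, so the triplet loss used here is an assumption chosen for illustration, and the function names, margin, and toy embeddings are hypothetical rather than taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss (an assumed stand-in for the paper's objective).

    The loss is zero once the negative sample is at least `margin` farther
    from the anchor than the positive sample is; otherwise it is positive,
    which would push same-class samples together and other classes apart.
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative) + margin
    return max(0.0, gap)


# Toy 2-D embeddings (made-up values, not results from the paper).
anchor = [0.0, 0.0]         # embedding of a flow from one class
positive = [0.0, 1.0]       # another flow with the same class label
near_negative = [1.0, 0.0]  # different class, as close as the positive
far_negative = [3.0, 4.0]   # different class, already well separated

loss_near = triplet_loss(anchor, positive, near_negative)
loss_far = triplet_loss(anchor, positive, far_negative)
print(loss_near, loss_far)
```

In this sketch the poorly separated negative yields a nonzero loss while the well separated one contributes nothing, which is the pull-together, push-apart behavior the abstract describes for the embedding space.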

Original language: English
Pages (from-to): 1993-2012
Number of pages: 20
Journal: IEEE Transactions on Network and Service Management
Volume: 20
Issue number: 2
Publication status: Published - 1 Jun 2023

Keywords

  • Traffic classification
  • autoencoder
  • deep metric learning
  • encrypted traffic
  • transformer
