ML-VehicleDet: A Unified Multi-label Vehicle Detection Framework

Runwei Guan; Shanliang Yao; Xiaohui Zhu; Jeremy Smith; Ka Lok Man; Yutao Yue

doi:10.1145/3604078.3604108

Corresponding author for this work: Yutao Yue

Department of Computing

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

Abstract

Deep-learning-based vehicle detection has developed rapidly and largely settled into a fixed pattern: almost all work concentrates on single-label object detection. In the real world, however, a vehicle has multiple attributes from a human perspective. When we observe a car, we tend to perceive its type, color, orientation, and other attributes. The categories of these labels are neither related to one another nor hierarchical. For a neural network, this means outputting multiple labels for a vehicle while regressing its bounding box. For traffic supervisors, a multi-label vehicle detection system could help find a targeted vehicle efficiently. Moreover, there has been no research on multi-label vehicle detection so far. Therefore, we design and develop a unified multi-label vehicle detection framework called ML-VehicleDet, which detects the location (bounding box), type, color, and orientation of a vehicle at the same time. First, we design a hybrid one-stage object detection network with a ViT encoder and a CNN decoder called Swin Only Look Once (SOLO), which is an anchor-free detector. Second, we design a practical loss function framework called MLC Loss, which comprises two loss functions, MLC-OM and MLC-OO, for two different kinds of annotation in multi-label detection; it is designed to alleviate the mutual inhibition problem in multi-label classification. Third, we design a low-cost NMS algorithm called ML-NMS, which merges bounding boxes with multiple labels for one vehicle. Furthermore, we reconstruct UA-DETRAC as a multi-label vehicle detection dataset (benchmark) called UA-DETRAC-ML. To the best of our knowledge, UA-DETRAC-ML is the first unified multi-label vehicle detection dataset. On UA-DETRAC-ML, ML-VehicleDet achieves 70.23% mAP, outperforming YOLOv5-M and YOLOX-M. To promote the development of the community, we release UA-DETRAC-ML at https://github.com/GuanRunwei/UA-DETRAC-ML.

Original language	English
Title of host publication	Proceedings of the 15th International Conference on Digital Image Processing, ICDIP 2023
Publisher	Association for Computing Machinery
ISBN (Electronic)	9798400708237
DOIs	https://doi.org/10.1145/3604078.3604108
Publication status	Published - 19 May 2023
Event	15th International Conference on Digital Image Processing, ICDIP 2023 - Nanjing, China; Duration: 19 May 2023 → 22 May 2023

Publication series

Name	ACM International Conference Proceeding Series

Conference

Conference	15th International Conference on Digital Image Processing, ICDIP 2023
Country/Territory	China
City	Nanjing
Period	19/05/23 → 22/05/23

Keywords

Multi-label object detection
hybrid neural network
multi-label classification
non-maximum suppression

Cite this

@inproceedings{901dcbaabb3444f78e79c362034078bf,
title = "ML-VehicleDet: A Unified Multi-label Vehicle Detection Framework",
abstract = "Vehicle detection based on deep learning has been developed rapidly and basically formed a certain pattern. Almost all works in vehicle detection are concentrated on single-label object detection. However, in the real world, a vehicle has multiple attributes from the perspective of a human being. When we observe a car, we tend to perceive its type, color, orientation, and other attributes. The categories of these labels are neither relevant nor hierarchical. For a neural network, this means that it needs to output multiple labels of a vehicle while regressing its bounding box. For traffic supervisors, a multi-label vehicle detection system could help them find out the targeted vehicle efficiently. Moreover, there has been no research on multi-label vehicle detection so far. Therefore, we design and develop a unified multi-label vehicle detection framework called ML-VehicleDet, which can detect the location (bounding box), type, color and orientation of the vehicle at the same time. In ML-VehicleDet, firstly, we design a hybrid one-stage object detection NN with ViT-encoder and CNN-decoder called Swin Only Look Once, which abbreviates SOLO. Such a SOLO is an anchor-free detector. Secondly, we design a practical loss function framework called MLC Loss, which includes two loss functions namely MLC-OM and MLC-OO for two different annotations of multi-label detection, specialized for alleviating the mutual inhibition problem in multi-label classification. Thirdly, we design a low-cost NMS algorithm called ML-NMS, specialized to merge bounding boxes with multiple labels for one vehicle. Furthermore, we reconstruct UA-DETRAC as a multi-label vehicle detection dataset (benchmark), called UA-DETRAC-ML. To the best of our knowledge, UA-DETRAC-ML is the first unified multi-label vehicle detection dataset. On UA-DETRAC-ML, ML-VehicleDet achieves 70.23\% mAP, outperforming YOLOv5-M and YOLOX-M. To promote the development of the community, we release UA-DETRAC-ML at https://github.com/GuanRunwei/UA-DETRAC-ML.",
keywords = "Multi-label object detection, hybrid neural network, multi-label classification, non-maximum suppression",
author = "Runwei Guan and Shanliang Yao and Xiaohui Zhu and Jeremy Smith and Man, {Ka Lok} and Yutao Yue",
note = "Publisher Copyright: {\textcopyright} 2023 ACM.; 15th International Conference on Digital Image Processing, ICDIP 2023; Conference date: 19-05-2023 Through 22-05-2023",
year = "2023",
month = may,
day = "19",
doi = "10.1145/3604078.3604108",
language = "English",
series = "ACM International Conference Proceeding Series",
publisher = "Association for Computing Machinery",
booktitle = "Proceedings of the 15th International Conference on Digital Image Processing, ICDIP 2023",
}

Guan, R, Yao, S, Zhu, X, Smith, J, Man, KL & Yue, Y 2023, ML-VehicleDet: A Unified Multi-label Vehicle Detection Framework. in Proceedings of the 15th International Conference on Digital Image Processing, ICDIP 2023, 30, ACM International Conference Proceeding Series, Association for Computing Machinery, 15th International Conference on Digital Image Processing, ICDIP 2023, Nanjing, China, 19/05/23. https://doi.org/10.1145/3604078.3604108

ML-VehicleDet: A Unified Multi-label Vehicle Detection Framework. / Guan, Runwei; Yao, Shanliang; Zhu, Xiaohui et al.
Proceedings of the 15th International Conference on Digital Image Processing, ICDIP 2023. Association for Computing Machinery, 2023. 30 (ACM International Conference Proceeding Series).

TY  - GEN
T1  - ML-VehicleDet: A Unified Multi-label Vehicle Detection Framework
T2  - 15th International Conference on Digital Image Processing, ICDIP 2023
AU  - Guan, Runwei
AU  - Yao, Shanliang
AU  - Zhu, Xiaohui
AU  - Smith, Jeremy
AU  - Man, Ka Lok
AU  - Yue, Yutao
PY  - 2023/5/19
Y1  - 2023/5/19
AB  - Vehicle detection based on deep learning has been developed rapidly and basically formed a certain pattern. Almost all works in vehicle detection are concentrated on single-label object detection. However, in the real world, a vehicle has multiple attributes from the perspective of a human being. When we observe a car, we tend to perceive its type, color, orientation, and other attributes. The categories of these labels are neither relevant nor hierarchical. For a neural network, this means that it needs to output multiple labels of a vehicle while regressing its bounding box. For traffic supervisors, a multi-label vehicle detection system could help them find out the targeted vehicle efficiently. Moreover, there has been no research on multi-label vehicle detection so far. Therefore, we design and develop a unified multi-label vehicle detection framework called ML-VehicleDet, which can detect the location (bounding box), type, color and orientation of the vehicle at the same time. In ML-VehicleDet, firstly, we design a hybrid one-stage object detection NN with ViT-encoder and CNN-decoder called Swin Only Look Once, which abbreviates SOLO. Such a SOLO is an anchor-free detector. Secondly, we design a practical loss function framework called MLC Loss, which includes two loss functions namely MLC-OM and MLC-OO for two different annotations of multi-label detection, specialized for alleviating the mutual inhibition problem in multi-label classification. Thirdly, we design a low-cost NMS algorithm called ML-NMS, specialized to merge bounding boxes with multiple labels for one vehicle. Furthermore, we reconstruct UA-DETRAC as a multi-label vehicle detection dataset (benchmark), called UA-DETRAC-ML. To the best of our knowledge, UA-DETRAC-ML is the first unified multi-label vehicle detection dataset. On UA-DETRAC-ML, ML-VehicleDet achieves 70.23% mAP, outperforming YOLOv5-M and YOLOX-M. To promote the development of the community, we release UA-DETRAC-ML at https://github.com/GuanRunwei/UA-DETRAC-ML.
KW  - Multi-label object detection
KW  - hybrid neural network
KW  - multi-label classification
KW  - non-maximum suppression
UR  - http://www.scopus.com/inward/record.url?scp=85179894202&partnerID=8YFLogxK
U2  - 10.1145/3604078.3604108
DO  - 10.1145/3604078.3604108
M3  - Conference Proceeding
AN  - SCOPUS:85179894202
T3  - ACM International Conference Proceeding Series
BT  - Proceedings of the 15th International Conference on Digital Image Processing, ICDIP 2023
PB  - Association for Computing Machinery
Y2  - 19 May 2023 through 22 May 2023
ER  -
